SAS provides many free data sets that you can use with our analytics products to demonstrate software capabilities, test your custom programs and pipelines, and train users. But how do you know which data sets are appropriate for forecasting? Where can you find these data sets? How do you make them ready for forecasting? This post will help you figure out which sample data sets can be used for forecasting.
Forecasting can be done in both SAS Visual Analytics (using the Forecasting object) and in SAS Visual Forecasting. Below I’ll show some of the readily available data sets that will work well for either or both of these products. Some will work straight out of the box. Others will require a bit of massaging.
I’ll discuss three categories of SAS Sample Data Sets:
The first thing you need for forecasting is a historical record over time with a date, time, or datetime variable. This variable must be in the proper format for SAS to use it, i.e., it must be a SAS date, SAS time, or SAS datetime. Ideally, you would also have at least six full cycles of any significant cycle or season. For example, if there are annual cycles and you want to forecast the next year, ideally you would have six years of data.
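As a rough sketch of how you might check these two requirements, the program below inspects the airline table in SASHELP. It assumes SASHELP.AIR is available in your environment and that its time variable is named DATE; the division by 12 assumes monthly data with an annual cycle.

```sas
/* Confirm that the time variable is numeric and carries a date format */
proc contents data=sashelp.air;
run;

/* Summarize the time span and estimate the number of annual cycles */
proc sql;
  select min(date) as first_date format=date9.,
         max(date) as last_date  format=date9.,
         count(*)  as n_periods,
         count(*) / 12 as n_annual_cycles  /* assumes monthly observations */
  from sashelp.air;
quit;
```

If the estimated number of cycles is well below six, treat any seasonal forecast from that table with extra caution.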
While there are a lot of time series data sets, some of them do not have independent variables. This means that you cannot demonstrate either scenario analysis or goal seeking. Others do not have categories and so do not allow you to demonstrate hierarchical forecasting. So to fully understand and illustrate the power of SAS forecasting tools (SAS Visual Analytics forecasting object and SAS Visual Forecasting), your data should include:
The PRICEDATA data set is an excellent choice. The basic table is available in SASHELP. Attribute tables and a data set with segments are available at the following:
SAS Visual Forecasting-specific data sets available online https://github.com/vasepu/SAS-Visual-Forecasting---sample-data-sets
In my opinion, the richest data sets for learning or showcasing the features of SAS Visual Forecasting are the sample data sets in the GitHub repository linked above.
These data are based on sales, profit, etc. over time. They include:
They will download in a .zip file and you can then extract the individual data sets.
SASHELP and SAMPSIO data sets
My second choice is the SASHELP and SAMPSIO data sets, because they ship with the software and are accessible directly from SAS software. They are commonly configured to appear in your libraries, for example in SAS Studio. To see how to access and load these data sets, see my YouTube video.
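One way to hunt for forecasting candidates in these libraries is to look for columns that carry a date-style format. The query below is only a heuristic sketch: it assumes the libraries are assigned in your session, and the list of format prefixes is my own illustrative choice, not an exhaustive one.

```sas
/* List columns in SASHELP and SAMPSIO whose format suggests a date or datetime */
proc sql;
  select libname, memname, name, format
  from dictionary.columns
  where libname in ('SASHELP', 'SAMPSIO')
    and memtype = 'DATA'
    and (upcase(format) like 'DATE%'      /* also matches DATETIME formats */
      or upcase(format) like 'MONYY%'
      or upcase(format) like 'YYMM%'
      or upcase(format) like 'MMDDYY%'
      or upcase(format) like 'YYQ%')
  order by libname, memname, name;
quit;
```

Tables that store dates as plain numbers or text will not show up here, so treat the result as a starting list rather than a complete one.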
Depending on your environment and what has been done in it, you may not see the SAMPSIO library listed. After you run a simple DATA step like the one shown below, you should see SAMPSIO.

/* Reading any SAMPSIO table makes the library appear */
data one;
  set sampsio.hmeq;
run;

SAS Viya Example Data Sets (csv example data sets)
Use a SAS Program in SAS Studio to Create a Dataset
An old-school way to get a data set is to create it with SAS code, and you can download many SAS programs that create data. For example, a plethora of these programs is available in the SAS 9 documentation. A few examples are listed below:
Real Data from Publicly Available Sources
In addition to the many sample data sets that SAS provides, there are many excellent data sets available on the internet that will work with SAS forecasting with just a bit of data munging. One of my favorite sources is the electricity generation data available from the U.S. Energy Information Administration. Another great data set covers US COVID-19 cases by state, brought to my attention by my colleague Stacey Wang.
Accessing, Loading, and Preparing Data for Forecasting
For an exhaustive (and exhausting) demonstration showing you how to find, import and/or load these data sets, see my video.
What if you have a date but it’s not a SAS date? There are a couple of ways to create a SAS date, depending on the format of your original date. For example, as shown in SAS Visual Analytics below, you may use DateFromMDY or TreatAs.
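If you prefer to prepare the data in code before loading it, a DATA step can do a similar job. The sketch below is a Base SAS analogue, not the SAS Visual Analytics operators themselves: the input table and its variable names are hypothetical, MDY builds a date from separate numeric parts, and INPUT with the YYMMDD10. informat converts a date stored as text.

```sas
data work.with_sas_date;
  /* Hypothetical raw input: numeric month, day, year plus a date stored as text */
  input month day year date_text :$10.;

  /* Build a SAS date from its numeric parts */
  sas_date_from_parts = mdy(month, day, year);

  /* Convert a yyyy-mm-dd character value to a SAS date */
  sas_date_from_text = input(date_text, yymmdd10.);

  /* Display the numeric date values in a readable form */
  format sas_date_from_parts sas_date_from_text date9.;
  datalines;
1 15 2023 2023-01-15
2 28 2023 2023-02-28
;
run;

proc print data=work.with_sas_date;
run;
```

Which informat you need depends on how your original date is written, so YYMMDD10. here is only one example.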
For details on creating SAS dates, see Teri Patsilaras’s post Build a date in SAS Visual Analytics Reports.
SUMMARY: Beth’s Favorite SAS Data Sets for Forecasting
We all have our favorites. Our favorite beverage, our favorite sport, our favorite time of year, our favorite child… oops, maybe not that last one. Well, in any case, I have my favorite data sets that work well for forecasting. See the table below to see my favorite categories in order.
The pricedata and skin product data sets from this link https://github.com/vasepu/SAS-Visual-Forecasting---sample-data-sets are my favorite if I want to illustrate many features of SAS Visual Forecasting and SAS Visual Analytics using the Forecasting object. One version of the pricedata set is also available from SASHELP, and is very useful. As you see below, it includes:
However, if my main focus is illustrating the concept of forecasting and the ability to capture seasonality, I prefer the airline data set, which is available in SASHELP.
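For a quick look at that seasonality before you build any forecast, a simple line plot is often enough. This sketch assumes SASHELP.AIR with its DATE and AIR variables is available and that you can run ODS Graphics procedures in your environment.

```sas
/* Plot the monthly airline passenger series to reveal trend and seasonality */
proc sgplot data=sashelp.air;
  series x=date y=air;
  xaxis label="Month";
  yaxis label="Airline passengers";
run;
```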
FOR MORE INFORMATION:
Find more articles from SAS Global Enablement and Learning here.