This article introduces DAFT, a new SAS Custom Step available in the public SAS Custom Step repository on GitHub. DAFT stands for Dynamic Aggregations From Timeseries. It creates analytical variables from time series data at the click of a button, for use in modeling and other analytical work. The following animation showcases a potential DAFT workflow; read on for the details.
Potential DAFT Workflow
Creating a variety of analytical variables for modeling purposes, e.g., for forecasting or prediction models, is crucial for building a good scoring algorithm. The Dynamic Aggregations From Timeseries (DAFT) SAS Studio Custom Step enables SAS Studio Flow users to easily perform dynamic aggregations on time series data at the push of a button.
Let's explain with an example: often the outcome of an event depends on past events. It is therefore important to get historic views of the data, say from four weeks ago, seven weeks ago, etc. In addition, how does the data look when aggregated over two weeks, three weeks, etc.?
Real life examples would be:
As the examples above suggest, it is often not known in advance which time parameters are relevant. It can therefore be important to create many combinations and then let the statistics decide which combinations are influential.
DAFT allows us to calculate a large number of combinations, if necessary. At this point, DAFT allows the following aggregation functions:
The aggregations are based on one of the following time units:
Time series data is usually very granular, so aggregating it to a higher level is often necessary to get the best results for analytical purposes. Which granularity to choose usually depends on the problem.
The output dataset is then made available based on that chosen granularity.
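To make the idea of output granularity concrete, here is a minimal Python sketch of rolling granular records up to a weekly level. It is not DAFT's implementation: the function names, the `(entity, timestamp, value)` record layout, and the use of ISO calendar weeks are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def week_key(ts):
    """Return (ISO year, ISO week) for a timestamp (ISO weeks are an assumption)."""
    iso = ts.isocalendar()
    return (iso[0], iso[1])


def aggregate_by_week(rows):
    """Sum values per (entity, ISO year, ISO week).

    rows: iterable of (entity, timestamp, value) tuples (hypothetical layout).
    """
    totals = defaultdict(float)
    for entity, ts, value in rows:
        totals[(entity, *week_key(ts))] += value
    return dict(totals)


# Hypothetical minute-level precipitation readings for two counties.
rows = [
    ("county_a", datetime(2024, 1, 1, 0, 0), 0.2),   # ISO week 1 of 2024
    ("county_a", datetime(2024, 1, 1, 0, 1), 0.3),   # same week, one minute later
    ("county_a", datetime(2024, 1, 8, 12, 0), 1.0),  # ISO week 2 of 2024
    ("county_b", datetime(2024, 1, 3, 6, 30), 0.5),  # ISO week 1 of 2024
]
weekly = aggregate_by_week(rows)
```

The result has one record per entity and week, which matches the idea of an output dataset delivered at the chosen granularity.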
Weather data is available on a per-minute basis, and the problem at hand requires us to look at the data on a weekly basis. We need the total precipitation over one week and over two weeks, as of both four weeks ago and eight weeks ago. Translated into DAFT terms, this would mean:
DAFT then creates all combinations of aggregation length and lag, and the output variables would look like:
Here, "sum" is the statistic applied to the variable, the number after it is the aggregation length in the selected unit, and "L" marks the lag.
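The combination logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DAFT's code: the function names are hypothetical, the exact variable naming pattern (`sum1_L4` and so on) is a guess built from the description above, and a lag of L is assumed to mean that the aggregation window ends L periods before the current one.

```python
from itertools import product


def lagged_window_sum(series, index, length, lag):
    """Sum `length` consecutive periods, ending `lag` periods before `index`.

    Returns None when there is not enough history (assumed behavior).
    """
    end = index - lag          # last period included in the window
    start = end - length + 1   # first period included in the window
    if start < 0:
        return None
    return sum(series[start:end + 1])


def build_features(series, lengths, lags, stat="sum"):
    """Create one feature per (length, lag) combination for every period."""
    features = []
    for i in range(len(series)):
        row = {}
        for length, lag in product(lengths, lags):
            # Hypothetical naming: statistic + length + "_L" + lag.
            row[f"{stat}{length}_L{lag}"] = lagged_window_sum(series, i, length, lag)
        features.append(row)
    return features


# Twelve weeks of made-up weekly precipitation totals: 1.0, 2.0, ..., 12.0.
weekly_precip = [float(w) for w in range(1, 13)]
features = build_features(weekly_precip, lengths=[1, 2], lags=[4, 8])
last = features[-1]  # features for the most recent week
```

Two aggregation lengths and two lags yield four variables per week. Adding more lengths or lags grows the set multiplicatively, which is why generating the combinations automatically is convenient.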
Since the granularity is "By Week", DAFT creates the following two time-variables:
Additionally, the output dataset contains the variables that describe the entity. In the weather example, this could be the region/county level, the zip code level, etc.
In other examples, e.g., when the transaction data is bank data, the smallest entity could be a person, household, company, or parent company.
DAFT is available as a custom step, which means you only need to upload the “step” file somewhere in SAS Content (see detailed upload instructions here), and it is automatically available under “Shared” Steps in SAS Studio.
The DAFT SAS Studio Custom Step can be downloaded here.
Following are a few screenshots of the user experience when using the custom step. Each screenshot shows one tab in the custom step.
The complete options are spread out over two screenshots:
The “Input Data” tab contains all the parameters that determine which variables are used and in which role.
DAFT Input Data Tab Part 1
DAFT Input Data Tab Part 2
Here the output granularity is determined.
DAFT Output Data Tab
Possible options are:
This is the tab where you set which combinations of aggregations and lags DAFT should produce.
DAFT Processing Options Tab
This tab offers various settings that control process execution.
DAFT Admin Options Tab
This tab contains all the necessary information for using DAFT. It also contains a description of all the parameters and some sample code that produces an example transaction file to play around with DAFT.
DAFT About Tab
DAFT in Action: take a look at the animation.
Everything you need to run DAFT is available on GitHub here. The repository also contains a README file with more information about all the parameters.
Please leave a comment and let me know what you think. Maybe you have some feature ideas? Also, share your experience with aggregating transaction tables. I can't wait to hear from you.