awilli69
Calcite | Level 5

Hi,

While going through the course "Forecasting Using Model Studio in SAS Viya 3.4", I came up with a question about including categorical variables in a forecasting model with SAS. The lesson stated that you can add categorical (Varchar) variables for forecasting (auto or hierarchical, for example) with a "Top Level" reconciliation, meaning that the Gap (my dependent variable) is unchanged and the forecasts by Grade are adjusted for the response, correct?

 

SAS example1.png

 

Looking at the results from, let's say, an autoforecasting model, it seems that forecasts are produced for each level of each Grade by Time ID, like this:

SAS example2.png

 

However, I simply want to treat the VarChar variables as categorical covariates, so that the final forecast is NOT partitioned by Grade level into separate series of my response, but is produced for the overall response/dependent variable itself. Certain levels of Grade start at different dates, etc., and I'm still worried about the integrity of the forecasts/estimation.

 

What I'm asking is:

1. If I were to pre-prepare these data by dummy coding these variables (as numeric 0/1, included as independent variables) versus including them as BY variable attributes, what is the exact difference? Is one option preferred over the other?

2. If I were to keep these as attributes/BY variables for all the categorical variables I want to include to help forecast the response, do the autoforecasting template and the hierarchical template differ only in the precision of the forecasts? What if the levels of the categorical variables all differ in the dates over which they correspond to the dependent variable in the series?

 

Any help would be appreciated.

 

Thanks!

 

 

 
1 ACCEPTED SOLUTION

Cynthia_sas
Diamond | Level 26

Hi:

  Here's some feedback from one of the instructors:

"There might be some confusion about the role of BY variables and categorical input variables in the project. BY variables are not candidate explanatory (aka input or independent) variables for models. They are sub-setting variables that define the hierarchical structure of the data and determine what time series are created in the project. You can add categorical explanatory variables, but you can’t assign character valued variables as explanatory variables in a project. That is,  you need to dummy code them first, and they have to exist as dummy columns in the table that contains the target variable. "

 

Hope this helps,

Cynthia
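
To make the distinction in the instructor's feedback concrete, here is a minimal sketch in plain Python (not SAS or Model Studio code). It assumes a small made-up table with a character column `grade` and a target `gap`; the column names, the values, and the helper functions `split_by` and `dummy_code` are all hypothetical and exist only for illustration. `split_by` mimics what a BY variable does (one separate series per level), while `dummy_code` turns the character column into numeric 0/1 columns that sit in the same table as the target.

```python
from collections import defaultdict

# Hypothetical rows: one record per (date, grade). Names and values are made up.
rows = [
    {"date": "2020-01", "grade": "A", "gap": 10.0},
    {"date": "2020-01", "grade": "B", "gap": 4.0},
    {"date": "2020-02", "grade": "A", "gap": 12.0},
    {"date": "2020-02", "grade": "B", "gap": 5.0},
]


def split_by(rows, by_var, target="gap"):
    """Mimic a BY variable: partition the rows into one series per level."""
    series = defaultdict(list)
    for r in rows:
        series[r[by_var]].append((r["date"], r[target]))
    return dict(series)


def dummy_code(rows, var, drop_first=True):
    """Replace a character column with numeric 0/1 columns in the same table.

    With drop_first=True the first level (in sorted order) becomes the
    reference level and gets no column of its own.
    """
    levels = sorted({r[var] for r in rows})
    kept = levels[1:] if drop_first else levels
    out = []
    for r in rows:
        new = {k: v for k, v in r.items() if k != var}
        for lvl in kept:
            new[f"{var}_{lvl}"] = 1 if r[var] == lvl else 0
        out.append(new)
    return out


by_series = split_by(rows, "grade")   # two separate series, keyed "A" and "B"
dummied = dummy_code(rows, "grade")   # one table with a numeric grade_B column
```

In this sketch the BY-style split yields a separate series for each grade, which matches the per-grade forecasts described in the question, whereas the dummy-coded table keeps a single table with `gap` and a numeric `grade_B` column. Dropping one level as a reference is a common convention to avoid perfectly collinear dummies; whether it is needed in a given Model Studio project is not stated in this thread.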



awilli69
Calcite | Level 5
That's exactly what I thought. Thank you so much Cynthia!