Hi all,
Pretty simple question, just wondering what best practices are here.
I have a simple Time-Series forecast built in Build Models.
When imported into Visual Analytics, the time series always includes predicted values, even when actuals are already posted.
I'd actually like to build a view where the Predicted Values only show up after the current date where Actual Values are posted. I'd also enjoy having the line there showing the end of the Actuals and beginning of the Predicted.
Thanks in advance!
Hey @illmatic! I do this all the time when producing forecasts. The simplest way is to create a new calculated item for your predicted values:
if(notmissing('actual'n) ) return .
else 'predict'n
This will set Predict to missing wherever there are actuals, but you'll have a line break. If you do not want to have a line break, you just need to do a little bit of data prep. Let's use an example of a forecast from PROC ESM:
proc esm data=sashelp.air outfor=outfor;
id date interval=month;
forecast air / model=multseasonal;
run;
Let's scroll down to where actuals end.
DATE | ACTUAL | PREDICT |
OCT1960 | 461 | 433 |
NOV1960 | 390 | 390 |
DEC1960 | 432 | 429 |
JAN1961 | . | 445 |
FEB1961 | . | 419 |
MAR1961 | . | 469 |
To make the lines connect, we need to set predict to missing before the last date of actuals. In other words, we want this:
DATE | ACTUAL | PREDICT |
OCT1960 | 461 | . |
NOV1960 | 390 | . |
DEC1960 | 432 | 429 |
JAN1961 | . | 445 |
FEB1961 | . | 419 |
MAR1961 | . | 469 |
With a little SQL and DATA Step logic, we can do this:
/* Read the last actual date into a macro variable */
proc sql noprint;
select max(date)
into :last_actual_date
from outfor
where actual NE .
;
quit;
/* Set predict to missing before the last actual date */
data outfor2;
set outfor;
if(date < &last_actual_date.) then predict = .;
run;
When you load your data into Visual Analytics, you'll have no fitted values in the past and a connecting line once predictions begin.
This forum is pretty dead 😞
Hey @illmatic! I do this all the time when producing forecasts. The simplest way is to create a new calculated item for your predicted values:
if(notmissing('actual'n) ) return .
else 'predict'n
This will set Predict to missing wherever there are actuals, but you'll have a line break. If you do not want to have a line break, you just need to do a little bit of data prep. Let's use an example of a forecast from PROC ESM:
proc esm data=sashelp.air outfor=outfor;
id date interval=month;
forecast air / model=multseasonal;
run;
Let's scroll down to where actuals end.
DATE | ACTUAL | PREDICT |
OCT1960 | 461 | 433 |
NOV1960 | 390 | 390 |
DEC1960 | 432 | 429 |
JAN1961 | . | 445 |
FEB1961 | . | 419 |
MAR1961 | . | 469 |
To make the lines connect, we need to set predict to missing before the last date of actuals. In other words, we want this:
DATE | ACTUAL | PREDICT |
OCT1960 | 461 | . |
NOV1960 | 390 | . |
DEC1960 | 432 | 429 |
JAN1961 | . | 445 |
FEB1961 | . | 419 |
MAR1961 | . | 469 |
With a little SQL and DATA Step logic, we can do this:
/* Read the last actual date into a macro variable */
proc sql noprint;
select max(date)
into :last_actual_date
from outfor
where actual NE .
;
quit;
/* Set predict to missing before the last actual date */
data outfor2;
set outfor;
if(date < &last_actual_date.) then predict = .;
run;
When you load your data into Visual Analytics, you'll have no fitted values in the past and a connecting line once predictions begin.
thanks @Stu_SAS !
My current issue is that I still get predicted values after realized dates with that logic. I believe it is because of the hierarchies. I'll try to breakout the dataset more and tweak the logic from there.
Since you have some experience with visualizing models, do you have a current best method in regard to automating a refresh?
I'm noticing that I have to reload the dataset, re-run the pipelines, export the dataset, and only then do my visuals have refreshed data. Wondering if it's possible to script it all, current code output scheduling from the pipeline section doesn't seem complete, but I may be wrong.
In Viya I would register the model to Model Manager, then save the scoring code as a .sas file in a folder within SAS Studio. From there, you can schedule it to run as a job in Environment Manager. Your code can then run on a regular batch schedule.
In SAS 9.4 we would use a dedicated scheduler, like Cron in Linux, to run SAS jobs. Since Viya has a built-in batch code runner, we don't need to use it anymore. A great paper on best practices for scheduling in Viya can be found below:
https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2020/4248-2020.pdf
How to run a pipeline in batch can be found here:
https://go.documentation.sas.com/doc/en/vfcdc/v_015/vfug/n194n9wnp2xokzn15f9gwn3ejtjo.htm
Your general code flow would look like this:
<copied batch score code goes here>
caslib _ALL_ assign;
proc casutil incaslib='myvacaslib';
droptable casdata='mytable';
quit;
data myvacaslib.mytable(promote=yes);
set modeloutputcaslib.mymodeldata;
run;
proc casutil incaslib='myvacaslib';
save casdata='mytable' replace;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
See how to use one filter for multiple data sources by mapping your data from SAS’ Alexandria McCall.
Find more tutorials on the SAS Users YouTube channel.