I am using a macro do loop to run numerous Holt-Winters forecasts. I would like to combine the estimates into one table as the loop runs. Each of the 20 forecasts produces output of the following:
_TYPE_ Date CYVR_TCU8
N 2017APR 208
NRESID 2017APR 208
DF 2017APR 194
WEIGHT1 2017APR 0.1055728
WEIGHT2 2017APR 0.1055728
WEIGHT3 2017APR 0.25
SIGMA 2017APR 13109.328
CONSTANT 2017APR 394060.29
LINEAR 2017APR 2039.1086
S_JAN 2017APR 0.9584237
S_FEB 2017APR 0.8701241
S_MAR 2017APR 0.9710871
S_APR 2017APR 0.9186076
S_MAY 2017APR 1.0205678
S_JUN 2017APR 1.0751503
S_JUL 2017APR 1.1659275
S_AUG 2017APR 1.1619254
S_SEP 2017APR 1.0370984
S_OCT 2017APR 0.9766547
S_NOV 2017APR 0.8751276
S_DEC 2017APR 0.9693058
SST 2017APR 2.9012E11
SSE 2017APR 3.334E10
MSE 2017APR 171854482
RMSE 2017APR 13109.328
MAPE 2017APR 2.9196567
MPE 2017APR 0.1482134
MAE 2017APR 9568.9117
ME 2017APR 749.04853
RSQUARE 2017APR 0.8850831
So after 20 runs I will get 20 of these outputs, with the only thing changing is the far right column, it will change depending on the airport code (e.g., CYYX_TCU8). I would like to create a table as the loop runs so that the final output will have _TYPE_ DATE and then all forecast statsitics for each airport. I use the varaible n to loop for each airport in the macro and it works. So I tried to use the n as follows:
%if &n=1 %then
%do;
data egtask.est_hw; set Est_HW_Auto_&i; run;
%end;
%else
%do;
proc sql;
alter table egtask.est_hw add &i._TCU8 numeric; run;
update egtask.est_hw set &i._TCU8=(select &i._TCU8 from Est_HW_Auto_&i); run;
%end;
I first set up the initial table when the first forecast (n=1) is done. Past n=1 I want to add the airport forecasts statistics column only from each forecast to this new table. So I update the combined table called egtask.est_hw by copying the statistics from each forecast run (Est_HW_Auto_&i). I am getting the error "ERROR: Subquery evaluated to more than one row."
I am hoping to get a final table I can review quickly that would look like:
_TYPE_ | Date | CYVR_TCU8 | CYYZ_TCU8 | CYOW_TCU8 |
N | 2017APR | 208 | . | . |
NRESID | 2017APR | 208 | . | . |
DF | 2017APR | 194 | . | . |
WEIGHT1 | 2017APR | 0.1055728 | . | . |
WEIGHT2 | 2017APR | 0.1055728 | . | . |
WEIGHT3 | 2017APR | 0.25 | . | . |
SIGMA | 2017APR | 13109.328 | . | . |
CONSTANT | 2017APR | 394060.29 | . | . |
LINEAR | 2017APR | 2039.1086 | . | . |
S_JAN | 2017APR | 0.9584237 | . | . |
S_FEB | 2017APR | 0.8701241 | . | . |
S_MAR | 2017APR | 0.9710871 | . | . |
S_APR | 2017APR | 0.9186076 | . | . |
S_MAY | 2017APR | 1.0205678 | . | . |
S_JUN | 2017APR | 1.0751503 | . | . |
S_JUL | 2017APR | 1.1659275 | . | . |
S_AUG | 2017APR | 1.1619254 | . | . |
S_SEP | 2017APR | 1.0370984 | . | . |
S_OCT | 2017APR | 0.9766547 | . | . |
S_NOV | 2017APR | 0.8751276 | . | . |
S_DEC | 2017APR | 0.9693058 | . | . |
SST | 2017APR | 2.90E+11 | . | . |
SSE | 2017APR | 3.33E+10 | . | . |
MSE | 2017APR | 171854482 | . | . |
RMSE | 2017APR | 13109.328 | . | . |
MAPE | 2017APR | 2.9196567 | . | . |
MPE | 2017APR | 0.1482134 | . | . |
MAE | 2017APR | 9568.9117 | . | . |
ME | 2017APR | 749.04853 | . | . |
RSQUARE | 2017APR | 0.8850831 | . | . |
Of course, all would be populated properly and there would be 20 airport model statistics as well.
Thanks for any help
-Bill
If you don't have an aversion to using a data step over proc sql, I think this accomplishes what you're looking for.
%if &n=1 %then
%do;
data egtask.est_hw; set Est_HW_Auto_&i; run;
%end;
%else
%do;
*Presort data prior to merge;
proc sort data = egtask.est_hw;
by _TYPE_ Date;
run;
proc sort data = Est_HW_Auto_&i;
by _TYPE_ Date;
run;
data egtask.est_hw;
merge egtask.est_hw
Est_HW_Auto_&i ( keep = _TYPE_ Date &i._TCU8 );
by _TYPE_ Date;
run;
%end;
Below two code samples how you could do this.
%macro sample1();
proc datasets lib=work nolist nowarn;
delete result_collection_1;
run;quit;
%do i=1 %to 20;
data forcast_&i;
iter=&i;
some_var='something 1';
output;
some_var='something 2';
output;
run;
proc append base=result_collection_1 data=forcast_&i;
run;
%end;
%mend;
%sample1();
%macro sample2();
proc datasets lib=work nolist nowarn;
delete result_collection_2 forcast_:;
run;quit;
%do i=1 %to 20;
data forcast_&i;
iter=&i;
some_var='something 1';
output;
some_var='something 2';
output;
run;
%end;
data result_collection_2;
set forcast_:;
run;
%mend;
%sample2();
Thanks to all who responded. As it turns out I was able to get what I wanted via:
%if &n=1 %then
%do;
data egtask.est_hw_Auto; set Est_HW_Auto_&i; run;
%end;
%else
%do;
data tmp_hw; set Est_HW_Auto_&i (keep=&i._TCU8); run;
data egtask.est_hw_auto; set egtask.est_hw; set tmp_hw; run;
%end;
So, on the first loop (n=1), set up the dataset I want, which contains the estimates (the order of which never changes), and the date (which never changes). Then the next time the loop comes by (n>1), create a temporary dataset and put in it only the estimates (&i._TCU8), and then just dump them into the dataset made when n=1. I did not realize that SAS will let you create the same dataset with reference to one of the same name via:
data egtask.est_hw_auto; set egtask.est_hw; set tmp_hw; run;
Works a treat!
-Bill
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.