Hi @Jim975
You have a library with many data sets pulled monthly from a data source, and you want to build a Data Step that creates a new data set by appending a number of consecutive monthly data sets. Further, you want to use a Data _NULL_ step to the create the Data Step code and execute it immediately by utilizing the CALL EXECUTE function. You want this Data Step:
data WholeData;
set
PulledData_202206
PulledData_202205
...
;
run;
You are not explicit about which months should be pulled, so I suppose the first and last month to append could be given as yyyymm values in 2 macro variables.
The argument to the CALL EXECUTE function is a string containing SAS code. It can be a little tricky to build the string so it contains syntactically correct code, and it is always a good idea to write the generated code to the log, so it is easy to review the result, and then convert to CALL EXECUTE afterwards, when the code looks as expected. Here is some code that writes the wanted Data Step code:
* Parameters to control operation;
%let MonthFirst = 202111;
%let MonthLast = 202206;
* Data _null_ step to generate code;
data _null_;
FirstMonthStartDate = input("&MonthFirst"||'01 ',anydtdte.);
* Initiate generated Data Step code;
put 'data WholeData;';
put @3 'set';
* Mechanism to loop from &MonthFirst to &MonthLast regardless of shift in year
and add name of monthly data set to generated code;
i = 0;
do while (ThisMonth ne "&MonthLast");
ThisMonthStartDate = intnx('month',FirstMonthStartDate,i,'B');
ThisMonth = put(ThisMonthStartDate,yymmn6.);
ThisMonthDS = 'PulledData_' || ThisMonth;
put @5 ThisMonthDS;
i = i + 1;
if i > 20 then leave;
end;
* Finish generated Data Step;
put @3 ';';
put 'run;';
run;
It writes the following code, which is as expected:
data WholeData;
set
PulledData_202111
PulledData_202112
PulledData_202201
PulledData_202202
PulledData_202203
PulledData_202204
PulledData_202205
PulledData_202206
;
run;
Next step is to change the Data Step, so the generated code is executed instead of written to the log. The code in the example above contains five PUT statements, and the third PUT is executed several times in a loop. So the question is: How should the code be modified to use the CALL EXECUTE function instead of PUT?.
Many people think it is necessary to build a string containing the full code, and then use one CALL EXECUTE to send the whole code to execution. But CALL EXECUTE does not run the code, it inserts the argument in the input stream after the step containing the CALL EXECUTE function, and one can use as many CALL EXECUTE statements as necessary to write a whole program to the input stream. So we get the desired result just by changing all the PUT <something> statements to CALL EXECUTE(<something>) instead:
%let MonthFirst = 202111;
%let MonthLast = 202206;
data _null_;
FirstMonthStartDate = input("&MonthFirst"||'01 ',anydtdte.);
* Initiate generated Data Step code;
call execute('data WholeData;');
call execute('set');
* Mechanism to loop from &MonthFirst to &MonthLast regardless of shift in year
and add name of monthly data set to generated code;
i = 0;
do while (ThisMonth ne "&MonthLast");
ThisMonthStartDate = intnx('month',FirstMonthStartDate,i,'B');
ThisMonth = put(ThisMonthStartDate,yymmn6.);
ThisMonthDS = 'PulledData_' || ThisMonth;
call execute(ThisMonthDS);
i = i + 1;
if i > 20 then leave;
end;
* Finish generated Data Step;
call execute(';');
call execute('run;');
run;
When I submit the code, I get a log full of errors, because I don't have the monthly data sets, but you can see it works as expected:
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds
NOTE: CALL EXECUTE generated line.
1 + data WholeData;
2 + set
3 + PulledData_202111
4 + PulledData_202112
5 + PulledData_202201
6 + PulledData_202202
7 + PulledData_202203
8 + PulledData_202204
9 + PulledData_202205
10 + PulledData_202206
11 + ;
ERROR: File WORK.PULLEDDATA_202111.DATA does not exist.
ERROR: File WORK.PULLEDDATA_202112.DATA does not exist.
ERROR: File WORK.PULLEDDATA_202201.DATA does not exist.
ERROR: File WORK.PULLEDDATA_202202.DATA does not exist.
ERROR: File WORK.PULLEDDATA_202203.DATA does not exist.
ERROR: File WORK.PULLEDDATA_202204.DATA does not exist.
ERROR: File WORK.PULLEDDATA_202205.DATA does not exist.
ERROR: File WORK.PULLEDDATA_202206.DATA does not exist.
12 + run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.WHOLEDATA may be incomplete. When this step was stopped there were
0 observations and 0 variables.
WARNING: Data set WORK.WHOLEDATA was not replaced because this step was stopped.
... View more