I have a SAS program that updates data for a month. Once a year, I need to update six months of data on the same day and then run a second program that generates reports. I have found through trial and error that the total run time for updating six months of data is minimized by running three iterations of that program simultaneously, with each iteration looping through two months (3 x 2 = 6 months).
I want to ensure that the second program does not start generating the reports until all 6 months have been updated. The issue is that there is no way to predict which of the three iterations of the update program will finish last.
I had an idea that some form of counter could be engineered, but I am not sure how to tell the report program to continually check the value of the counter and only start running when it reaches "6". Any ideas on how to trigger the report program?
BASE SAS 9.4 (TS1M6)
Thanks
So how do you run the three iterations in parallel currently? One way of doing this is to make use of SAS/CONNECT and its ability to create "child" SAS sessions from a parent SAS session. You do asynchronous remote submits from the parent session then use the WAITFOR statement to pause all remaining processing until all iterations complete. No counting or monitoring is required. Not an option if you don't have SAS/CONNECT though.
Thanks, but I do not have SAS/CONNECT. I run three iterations by simply opening three SAS sessions and then submitting each one.
This is THE task for your institution's scheduling software. Running the same program in parallel with three different parameter sets and waiting for all three to complete successfully is what schedulers are built for.
Alternatively, use a shell script (UNIX example):
/sasconf/Lev1/SASApp/BatchServer/sasbatch.sh program1.sas&
/sasconf/Lev1/SASApp/BatchServer/sasbatch.sh program2.sas&
/sasconf/Lev1/SASApp/BatchServer/sasbatch.sh program3.sas&
wait
/sasconf/Lev1/SASApp/BatchServer/sasbatch.sh final_program.sas
wait is a shell built-in (available in bash) that pauses the script until all background processes started by that shell have terminated.
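Since the concern in this thread is that an update can finish with errors, here is a minimal sketch of a variant that checks each job's exit status before starting the final program. The function name run_updates_then_report is hypothetical, and the sketch assumes that each command exits with a non-zero status when it fails (on UNIX, SAS normally returns a non-zero status for warnings as well as errors, so warnings would also block the report).

```shell
#!/bin/sh
# Sketch only: run update commands in parallel, then run the final command
# only if every update command exited with status 0.
# Hypothetical usage: run_updates_then_report FINAL_CMD UPDATE_CMD...
run_updates_then_report() {
    local final_cmd="$1" cmd pid pids="" failed=0
    shift

    # Start each update command in the background and remember its PID.
    for cmd in "$@"; do
        sh -c "$cmd" &
        pids="$pids $!"
    done

    # "wait PID" returns the exit status of that job, so failures can be counted.
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done

    if [ "$failed" -gt 0 ]; then
        echo "$failed update job(s) failed; report not started" >&2
        return 1
    fi

    # All update jobs succeeded: run the final command.
    sh -c "$final_cmd"
}
```

With the paths from the example above, a call might look like run_updates_then_report "/sasconf/Lev1/SASApp/BatchServer/sasbatch.sh final_program.sas" "/sasconf/Lev1/SASApp/BatchServer/sasbatch.sh program1.sas" and so on for program2.sas and program3.sas. After fixing and re-running a failed update by hand, the final program would then be started manually.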
One possibility is to start your parallel tasks with the SYSTASK statement and then have your main program wait for all tasks to finish, using WAITFOR:
SYSTASK COMMAND '<command to start task 1>' TASKNAME=task1;
SYSTASK COMMAND '<command to start task 2>' TASKNAME=task2;
SYSTASK COMMAND '<command to start task 3>' TASKNAME=task3;
WAITFOR _ALL_ task1 task2 task3;
/* code for reports goes here */
The linked documentation here is for Windows, but I think the facility is available under UNIX as well.
Thanks, those are all excellent suggestions. However, real-life testing shows that the data update programs can sometimes complete with errors caused by bad JSON calls, internet interruptions, etc. While I have built traps that address 99% of errors, one got through.
My original idea solved the issue. I added code to the data update program that builds a permanent counter file and adds to the count only if there are no errors, so all three iterations contribute to the counter. In the second program, I built a DO UNTIL loop that periodically checks the counter value and lets the reports run only once all the data runs have finished error free. That allows me to re-run one of the data update programs manually; when it finishes with no errors, the counter triggers the secondary program.
Data programs generate a counter:
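LIBNAME EGTASK 'e:';
/* Increment the permanent counter EGTASK.&dsn, creating it */
/* from WORK.&dsn the first time the macro is called        */
%macro checkds(dsn);
  %if %sysfunc(exist(EGTASK.&dsn)) %then %do;
    /* Counter already exists: add 1 to it */
    data EGTASK.&dsn;
      set EGTASK.&dsn;
      month + 1;
    run;
  %end;
  %else %do;
    /* First run: copy the seed data set and add 1 */
    data EGTASK.&dsn;
      set &dsn;
      month + 1;
    run;
  %end;
%mend checkds;
/* Seed data set in WORK with the counter at 0; it is used */
/* only if the permanent counter does not exist yet        */
data b;
  month=0;
run;
%checkds(b)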
Secondary program checks the counter value:
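LIBNAME EGTASK 'e:';
%macro get_e;
  /* Repeat until the counter reaches the target value */
  %do %until (&nmonth=40);
    /* Read the current counter value into macro variable nmonth */
    data _null_;
      set EGTASK.b;
      call symputx('nmonth', month);
      stop;
    run;
    /* Pause for 5 minutes before the condition is evaluated */
    data _null_;
      min=5;
      wait_sec=(60*min);
      zzz=sleep(wait_sec);
    run;
  %end;
%mend get_e;
%get_e;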
Thanks to all.