Hi Patrick,
Please find the code below. I know this is not the best way to write a SAS program. It is legacy code, and I want to see whether parallel processing makes any difference. I want to process the "J" loop in parallel.
%MACRO TEST;
LIBNAME test 'path/location/';

DATA DS1;
  SET test.inp_1;
RUN;

DATA DS2;
  SET test.inp_2;
RUN;

PROC SORT DATA=DS1;
  By var;
run;

PROC SORT DATA=DS2;
  By var;
run;

DATA ds3;
  MERGE ds1(in=a) ds2(in=b);
  by var;
  if a and b;
RUN;

PROC SORT DATA=ds3 nodupkey out=ds4;
  By var2;
RUN;

DATA _null_;
  SET ds3;
  CALL SYMPUT('var3',compress(_n_));
RUN;

%DO j=1 %TO &var3; ********** 50 times;

  DATA _null_;
    set ds3(firstobs=&j obs=&j);
    CALL SYMPUT('var4',compress(var4));
    CALL SYMPUT('var5',compress(var5));
  RUN;

  DATA ds5;
    SET ds3;
    WHERE var4="&var4" and var5="&var5";
  RUN;

  PROC SORT DATA=ds5 nodupkey out=ds6;
    BY var4 var5 var6;
  RUN;

  DATA _null_;
    SET ds6;
    CALL SYMPUT('var7',compress(_n_));
  RUN;

  %DO i=1 %TO &var7; *************** 100 times;

    DATA _null_;
      SET ds6;
      CALL SYMPUT('var4',compress(var4));
      CALL SYMPUT('var5',compress(var5));
      CALL SYMPUT('var8',compress(var8));
    RUN;

    DATA ds7;
      set ds5;
      WHERE var4="&var4" and var5="&var5" and var8="&var8";
    RUN;

    DATA ds7;
      set ds7;
      do i=1 to count;
        output;
      end;
      if a > . then call symput ('a',compress(a));
      if b > . then call symput ('b',compress(b));
    RUN;

    PROC SUMMARY data=ds7;
      var date;
      output out=ds8 min=mindate max=maxdate;
    RUN;

    ****** ODS FOR GRAPH PROC CAPABILITY PROC APPEND; ******;

  %END;
%END;
%MEND;
%TEST;
These are macro %DO loops that just generate SAS code. The loops themselves won't take long to run. It's the generated SAS code (50*100 repetitions of almost the same steps, each with many passes through the data) that takes up the time.
Looking at the code you've shared, I'm fairly certain you could get rid of all the macro processing and do this in "normal" SAS using only BY-group processing. That will also perform much better. Fixing the code is where you should spend your time.
It's a bit hard to provide fixed code without representative sample data and the desired result. I've mocked up something below, but it will likely not fully match what you need. It should show you the way to go.
/* Mock-up input 1: one row per key with two character attributes */
data inp_1;
  infile datalines truncover dlm=',' dsd;
  input var (var4 var5) ($);
datalines;
1,x,y
2,x,y
3,a,b
;

/* Mock-up input 2: several dated rows per key */
data inp_2;
  infile datalines truncover dlm=',' dsd;
  input var var2 $ date :date9.;
  format date date9.;
datalines;
1,a,01jan2021
1,a,01feb2021
1,b,01mar2021
1,a,01apr2021
2,a,01jan2021
2,c,01feb2021
2,c,01mar2021
4,a,01jan2021
;

/* Point libref TEST at WORK so the legacy steps run against the mock-up data */
libname test "%sysfunc(pathname(work))";

data ds1;
  set test.inp_1;
run;

data ds2;
  set test.inp_2;
run;

proc sort data=ds1;
  by var;
run;

proc sort data=ds2;
  by var;
run;

/* Keep only keys present in both inputs */
data ds3;
  merge ds1(in=a) ds2(in=b);
  by var;
  if a and b;
run;

/* One pass over the data: min/max date per var2 value, no macro loop needed */
proc summary data=ds3;
  class var2;
  var date;
  ways 1;
  output out=ds8 min=mindate max=maxdate;
run;
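If the real requirement is one min/max pair per combination of several grouping variables (which is what the two nested macro loops appear to be stepping through), a single PROC SUMMARY with several CLASS variables and the NWAY option might be enough. The sketch below is only an illustration: ds3_demo and ds8_demo are made-up names, and var2 stands in for the var8 of the legacy code, which the mock-up data does not contain.

```sas
/* Hypothetical stand-in for the merged data set ds3 */
data ds3_demo;
  infile datalines truncover dlm=',' dsd;
  input var4 $ var5 $ var2 $ date :date9.;
  format date date9.;
datalines;
x,y,a,01jan2021
x,y,a,01feb2021
x,y,b,01mar2021
x,y,a,01apr2021
x,y,c,01feb2021
x,y,c,01mar2021
;

/* NWAY keeps only the rows for the full var4*var5*var2 combination, */
/* so every group gets its min and max date in one pass              */
proc summary data=ds3_demo nway;
  class var4 var5 var2;
  var date;
  output out=ds8_demo(drop=_type_ _freq_) min=mindate max=maxdate;
run;

proc print data=ds8_demo;
run;
```

Because the output data set already holds one row per group, there is nothing left to loop over or to append.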
In BY processing, nesting is done by listing multiple variables.
by a b;
will trigger a group change whenever a or b changes, and because of the preceding sort with the same BY, all b groups within the first a group will be dealt with first, then all b groups within the second a group, and so on.
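A minimal sketch of this nesting in a DATA step, using made-up data (the data set names demo and group_totals and the variable value are assumptions for illustration only):

```sas
/* Hypothetical sample data with two grouping levels */
data demo;
  input a $ b $ value;
datalines;
g1 x 1
g1 x 2
g1 y 3
g2 x 4
g2 y 5
g2 y 6
;

proc sort data=demo;
  by a b;
run;

data group_totals;
  set demo;
  by a b;
  /* first.b is set at the start of each b group and also whenever a changes */
  if first.b then total = 0;
  total + value;          /* sum statement: total is retained across rows */
  if last.b then output;  /* one output row per (a, b) group */
  drop value;
run;
```

With this input, group_totals should contain four rows: g1/x with total 3, g1/y with 3, g2/x with 4, and g2/y with 11.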
You really need to get an understanding of BY first before you engage in such unwieldy and inefficient macro coding.