Hello, I would like to use the same macro across multiple threads in Enterprise Guide to utilise parallel processing. For example, I have a macro:
%macro loop(thread_var);
data test_&thread_var;
set input_data;
where var = "&thread_var";
run;
%mend;
I would like to submit this macro into memory then execute it on several threads using different programs on the same process flow all linked to the macro.
So I can then run:
%loop(a);
%loop(b);
%loop(c);
In three different programs, all running in parallel.
Any thoughts on how to do this?
My SAS server is on Linux and my client is Windows, so I can't use %include.
I don't want to copy and paste the macro as I want to only change the code in one place.
Thanks
R
Setting aside the macro language, can you describe how you would do this parallel processing in EG without any macro code?
Even though your client (EG) is running on windows and your SAS session is Linux, you can still use %include. But since the SAS session is on Linux, the %include would read from .sas files that are sitting on the Linux box. (Not suggesting that %include is necessarily the best approach).
There is no location for me to store SAS files on the Linux box.
This isn't my code, but it can largely be broken up into different where statements, which can be processed individually or as "var in (a,b,c)".
/* running on thread 1 */
data test_a;
  set input_data;
  where var = "a";
run;
/* running on thread 2 */
data test_b;
  set input_data;
  where var = "b";
run;
/* running on thread 3 */
data test_c;
  set input_data;
  where var = "c";
run;
Just to note, this is how I run it currently: I open three sessions and run them simultaneously, which is three times faster than running them sequentially in one session, and twice as quick as one big intensive query in a single statement.
I see, so it's an EG parallel processing question, more than a macro question.
Do you have XCMD enabled on the server? (you can check with : %put %sysfunc(getoption(xcmd)); )
If so, then you could use SYSTASK to start multiple parallel SAS sessions. I think you would still need a place to store the macro definition on the Linux box (even a home drive could work). If you have SAS/CONNECT licensed, that might be another option. I don't know if EG has built-in functionality for this sort of parallel processing. You might try asking this in the EG forum if you don't get better answers than I can provide.
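As a rough sketch of the SYSTASK idea, assuming XCMD is enabled: the parent session below starts three batch SAS sessions and waits for all of them. The launch command `sas`, the path `/home/myuser`, and the child program `child.sas` are all hypothetical; the child is assumed to read the value to process from `&sysparm`.

```sas
/* Assumptions: "sas" starts SAS on the server, and /home/myuser is writable. */
%let sascmd = sas;                     /* hypothetical launch command        */
%let home   = /home/myuser;            /* hypothetical directory             */

/* Each SYSTASK starts asynchronously (NOWAIT is the default). */
systask command "&sascmd -sysin &home/child.sas -sysparm a -log &home/child_a.log"
        taskname=t_a status=rc_a;
systask command "&sascmd -sysin &home/child.sas -sysparm b -log &home/child_b.log"
        taskname=t_b status=rc_b;
systask command "&sascmd -sysin &home/child.sas -sysparm c -log &home/child_c.log"
        taskname=t_c status=rc_c;

/* Block until all three child sessions have finished. */
waitfor _all_ t_a t_b t_c;

%put NOTE: return codes a=&rc_a b=&rc_b c=&rc_c;
```

Each child session has its own WORK library, so `input_data` and the output tables would need to live in a permanent library that all sessions can reach.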
I should think it is possible to define the macro in one program, and then run the individual calls in different threads. So you would have
%macro loop(thread_var);
data test_&thread_var;
set input_data;
where var = "&thread_var";
run;
%mend;
in one program, which then forks out to the three calls of the macro.
That is, if you know beforehand what the parameters are; otherwise you will have to be more roundabout. One possibility is to put the parameters in global macro variables:
proc sql noprint;
select distinct var into :thread_var1-:thread_var999 trimmed from input_data;
%let nvalues=&sqlobs;
quit;
%let num_threads=3;
%macro loop(thread_num);
  %local i thread_var;
  /* Each thread starts at its own number and steps by the thread count */
  %do i=&thread_num %to &nvalues %by &num_threads;
    %let thread_var=&&thread_var&i;
    data test_&thread_var;
      set input_data;
      where var="&thread_var";
    run;
  %end;
%mend;
You will then have three (as num_threads = 3 in this example) calls in the fork, doing
%loop(1), %loop(2) and %loop(3), respectively. So if there are 5 values, the first loop will do values #1 and #4, the second will do values #2 and #5, and the last thread will do the third value.
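To check that distribution without touching any data, a dry-run version of the loop can simply print which value numbers each thread would take. This is only an illustration: the macro name `show` is made up, and the counts are hard-coded to the five-value, three-thread case described above.

```sas
%let nvalues     = 5;   /* pretend PROC SQL found five distinct values */
%let num_threads = 3;

%macro show(thread_num);
  %local i;
  /* Same %DO bounds as the real loop, but only writes to the log */
  %do i=&thread_num %to &nvalues %by &num_threads;
    %put NOTE: thread &thread_num handles value #&i;
  %end;
%mend show;

%show(1)
%show(2)
%show(3)
```

The log should list values 1 and 4 for thread 1, values 2 and 5 for thread 2, and value 3 for thread 3.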
@Rhys I'm confused. You said there was no location on the Linux box where you can save .sas files.
But there is a location on the Linux box where you can save a permanent SAS catalog file?
Seems like if you could save a catalog file, you could save a .sas file. Or maybe it's just a business rule that they don't want you saving code on the Linux server (even though you can save compiled code in a catalog)?
You're right. There is a file location but there's different security on it and it's not really used for storing code. I didn't want to create a new file structure to keep code there, under a growing number of projects with version control that is just a copy of another directory.
A macro catalogue seems a little neater and I can store them without the source code.
Not that it's fully working: I can utilise multiple threads if I start each program separately, but running from a branch returns errors.
No worries, automated parallel processing isn't meant to be.
Interesting. So if you run multiple parallel batch jobs, they fail with an error about the macro catalog being locked/inaccessible? I suppose it's possible that macro catalogs are locked by a single session. If you did store the macro code, I wouldn't think that would be locked.
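If locking of the macro catalog does turn out to be the cause (that is a guess, since the actual error text isn't shown), one thing that may be worth trying is to have each session assign the stored-macro library read-only. The libref `maclib` and the path below are hypothetical.

```sas
/* Hypothetical path: the directory holding the SASMACR catalog of compiled macros */
libname maclib "/hypothetical/path/to/macros" access=readonly;

/* Tell SAS to look for stored compiled macros in that library */
options mstored sasmstore=maclib;

%loop(a)
```

The macro itself would have been compiled once, in a separate session with write access, using `%macro loop(thread_var) / store;`.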
Troy Hughes has a book and a bunch of papers about this sort of automated parallel processing, so I wouldn't give up too soon. e.g.:
http://support.sas.com/resources/papers/proceedings17/0870-2017.pdf
In theory you wouldn't need to save code permanently on the server, so you could avoid maintaining code in two places. When your parent program starts, it could copy the .sas source code for the shared macros to a temporary shared location on the server, then spawn child sessions which can read the macro definitions from the shared location, then at the end the parent program could delete all the files from the shared location. You might even be able to use the WORK library of the parent session as the location for storing shared code. The parent would just need to tell the child sessions the path to look in.
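A minimal sketch of the parent side of that idea, using the parent's WORK directory as the temporary shared location. The file name `loop.sas` and the libref `mylib` are hypothetical, and `mylib` is assumed to be a permanent library that every session can assign.

```sas
/* Physical path of the parent session's WORK library */
%let shared = %sysfunc(pathname(work));

/* Write the macro source to a file. The single quotes stop the macro
   processor from resolving %macro and &thread_var while writing.    */
data _null_;
  file "&shared/loop.sas";
  put '%macro loop(thread_var);';
  put '  data mylib.test_&thread_var;';
  put '    set mylib.input_data;';
  put '    where var = "&thread_var";';
  put '  run;';
  put '%mend loop;';
run;

%put NOTE: macro source written to &shared/loop.sas;
```

A child session that is handed the path and a value, for example through `-sysparm "<path>|<value>"` (a made-up convention), could then pick up the definition like this:

```sas
libname mylib "/hypothetical/permanent/path";   /* hypothetical location */
%include "%scan(&sysparm, 1, |)/loop.sas";
%loop(%scan(&sysparm, 2, |))
```

The parent's WORK directory is removed when the parent session ends, so the parent would have to wait for all child sessions to finish before it terminates.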
Those are just ideas. I haven't played with this sort of roll-your-own parallelism myself.