Macros and Parallel processing

Contributor
Posts: 22

Hello, I would like to use the same macro across multiple threads in Enterprise Guide to utilise parallel processing. For example:

 

I have a macro:

 

%macro loop(thread_var);
    data test_&thread_var;
        set input_data;
        where var = "&thread_var";
    run;
%mend;

 

I would like to submit this macro into memory once, then execute it on several threads from different programs in the same process flow, all linked to the program that defines the macro.

 

So I can then run:

 

%loop(a);

%loop(b);

%loop(c);

 

In three different programs, all running in parallel. 

 

 

Any thoughts on how to do this?

 

My SAS server is on Linux and my client is Windows, so I can't use %include.

I don't want to copy and paste the macro, as I want to change the code in only one place.

 

Thanks


All Replies
PROC Star
Posts: 1,456

Re: Macros and Parallel processing

Setting aside the macro language, can you describe how you would do this parallel processing in EG without any macro code?

 

Even though your client (EG) is running on Windows and your SAS session is on Linux, you can still use %include. But since the SAS session is on Linux, the %include would read from .sas files that are sitting on the Linux box. (I'm not suggesting that %include is necessarily the best approach.)

Contributor
Posts: 22

Re: Macros and Parallel processing

There is no location for me to store SAS files on the Linux box.

This isn't my code, but it can largely be broken up into different where statements, which can be processed individually or as "var in (a,b,c)".

/* running on thread 1 */
data test_a;
    set input_data;
    where var = "a";
run;

/* running on thread 2 */
data test_b;
    set input_data;
    where var = "b";
run;

/* running on thread 3 */
data test_c;
    set input_data;
    where var = "c";
run;

 

Just to note, this is how I would run it currently: opening up 3 sessions and running them simultaneously, which is 3 times faster than running them sequentially in one session, or twice as quick as one big intensive query in a single statement.

PROC Star
Posts: 1,456

Re: Macros and Parallel processing

I see, so it's an EG parallel processing question, more than a macro question.

 

Do you have XCMD enabled on the server? You can check with: %put %sysfunc(getoption(xcmd));

 

If so, then you could use SYSTASK to start multiple parallel SAS sessions. I think you would still need a place to store the macro definition on the Linux box (even a home drive could work). If you have SAS/CONNECT licensed, that might be another option. I don't know if EG has built-in functionality for this sort of parallel processing. You might try asking this in the EG forum if you don't get better answers than I can provide.
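A minimal sketch of the SYSTASK approach, assuming XCMD is enabled and that the `sas` command is on the server's PATH. The paths, the task names and the `run_loop.sas` driver program are hypothetical; the driver is assumed to define (or %include) the macro and then call `%loop(&sysparm)`.

```sas
/* Assumption: XCMD is enabled and "sas" can be invoked from the shell.   */
/* Hypothetical paths: /home/myuser/run_loop.sas is a driver program that */
/* defines the macro and calls %loop(&sysparm).                           */
systask command "sas -sysin /home/myuser/run_loop.sas -sysparm a -log /home/myuser/loop_a.log"
    taskname=task_a status=rc_a;
systask command "sas -sysin /home/myuser/run_loop.sas -sysparm b -log /home/myuser/loop_b.log"
    taskname=task_b status=rc_b;
systask command "sas -sysin /home/myuser/run_loop.sas -sysparm c -log /home/myuser/loop_c.log"
    taskname=task_c status=rc_c;

/* SYSTASK returns immediately by default, so block until all three finish. */
waitfor _all_ task_a task_b task_c;

/* Each STATUS= macro variable holds the return code of its task. */
%put NOTE: return codes: &rc_a &rc_b &rc_c;
```

Each child session has its own WORK library, so the driver would need to write test_a, test_b and test_c to a permanent library for the parent to see them afterwards.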

Contributor
Posts: 22

Re: Macros and Parallel processing

No XCMD, although I could use Python to do something similar. However, I wanted to keep it all in one project.
PROC Star
Posts: 260

Re: Macros and Parallel processing

I should think it is possible to define the macro in one program, and then run the individual calls in different threads. So you would have

%macro loop(thread_var);
    data test_&thread_var;
        set input_data;
        where var = "&thread_var";
    run;
%mend;

in one program, which then forks out to the three calls of the macro.

That works if you know beforehand what the parameters are; otherwise you will have to be more roundabout. One possibility is to put the parameters in global macro variables:

proc sql noprint;
  select distinct var into :thread_var1-:thread_var999 trimmed from input_data;
  %let nvalues=&sqlobs;
quit;

%let num_threads=3;

%macro loop(thread_num);
  %local i thread_var;
  /* each thread handles every num_threads-th value, starting at its own number */
  %do i=&thread_num %to &nvalues %by &num_threads;
    %let thread_var=&&thread_var&i;
    data test_&thread_var;
      set input_data;
      where var="&thread_var";
    run;
  %end;
%mend;

You will then have three calls in the fork (as num_threads = 3 in this example), doing %loop(1), %loop(2) and %loop(3), respectively. So if there are 5 values, the first thread will do values #1 and #4, the second will do values #2 and #5, and the third will do value #3.

Contributor
Posts: 22

Re: Macros and Parallel processing

This is exactly what I've been trying to do. However, in EG, when you have enabled parallel processing, the macro is dropped from memory after it has been called the first time.

I will ask in the EG forums.
Solution
‎01-24-2018 09:36 AM
PROC Star
Posts: 260

Re: Macros and Parallel processing

Sorry, did not know that. Is it possible to use the MSTORED and SASMSTORE options, saving the compiled macro in a permanent library? Then it may be available for all the parallel sessions.
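A sketch of what that could look like, assuming a permanent directory on the server that the SAS session can write to. The libref `mymacs` and the path are hypothetical; the macro body is the one from the original question.

```sas
/* Run once: compile the macro into a permanent catalog.                 */
/* Hypothetical libref and path; any writable server directory would do. */
libname mymacs "/path/to/macro/library";
options mstored sasmstore=mymacs;

%macro loop(thread_var) / store;
    data test_&thread_var;
        set input_data;
        where var = "&thread_var";
    run;
%mend loop;

/* In each parallel program: point the session at the catalog, then call. */
/* ACCESS=READONLY is an assumption here; it may help avoid catalog       */
/* locking between sessions, but that has not been verified in EG.        */
libname mymacs "/path/to/macro/library" access=readonly;
options mstored sasmstore=mymacs;

%loop(a)
```

Without the SOURCE option on the %MACRO statement, only the compiled macro is kept in the catalog, so the source would still need to be maintained somewhere else.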
PROC Star
Posts: 1,456

Re: Macros and Parallel processing

@Rhys I'm confused.  You said there was no location on the Linux box where you can save .sas files. 

 

But there is a location on the Linux box where you can save a permanent SAS catalog file?

 

Seems like if you could save a catalog file, you could save a .sas file.  Or maybe it's just a business rule that they don't want you saving code on the Linux server (even though you can save compiled code in a catalog)?

Contributor
Posts: 22

Re: Macros and Parallel processing

You're right. There is a file location, but there's different security on it and it's not really used for storing code. I didn't want to create a new file structure there to keep code for a growing number of projects under version control, when it would just be a copy of another directory.

 

A macro catalogue seems a little neater and I can store them without the source code. 

 

Not that it's working: I can utilise multiple threads if I start each program separately, but running from the branch returns errors.

 

No worries, automated parallel processing isn't meant to be. 

 

PROC Star
Posts: 1,456

Re: Macros and Parallel processing

Interesting.  So if you run multiple parallel batch jobs, they fail with an error about the macro catalog being locked/inaccessible? I suppose it's possible that macro catalogs are locked by a single session.  If you did store the macro code, I wouldn't think that would be locked.

 

Troy Hughes has a book and a bunch of papers about this sort of automated parallel processing, so I wouldn't give up too soon.  e.g.:

http://support.sas.com/resources/papers/proceedings17/0870-2017.pdf

 

In theory you wouldn't need to save code permanently on the server, so you could avoid maintaining code in two places.  When your parent program starts, it could copy the .sas source code for the shared macros to a temporary shared location on the server, then spawn child sessions which can read the macro definitions from the shared location, then at the end the parent program could delete all the files from the shared location.  You might even be able to use the WORK library of the parent session as the location for storing shared code.  The parent would just need to tell the child sessions the path to look in.
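The copy-then-include idea above might be sketched as follows, untested and under several assumptions: the parent's WORK directory is readable by the child sessions, the file name `loop.sas` and the fileref `msrc` are hypothetical, and the way the parent hands `shared_dir` to each child (for example through -sysparm) is not shown.

```sas
/* Parent session: write the macro source into the parent's WORK directory. */
%let shared_dir = %sysfunc(pathname(work));
filename msrc "&shared_dir/loop.sas";

data _null_;
    file msrc;
    /* Single quotes stop the macro processor from resolving these lines here. */
    put '%macro loop(thread_var);';
    put '    data test_&thread_var;';
    put '        set input_data;';
    put '        where var = "&thread_var";';
    put '    run;';
    put '%mend loop;';
run;

/* Child session: shared_dir must be supplied by the parent (not shown). */
%include "&shared_dir/loop.sas";
%loop(a)

/* Parent session, once all children have finished: remove the shared file. */
data _null_;
    rc = fdelete('msrc');
run;
```

The macro source then lives only in the parent program, and the copy on the server exists just for the duration of the run.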

 

Those are just ideas.  I haven't played with this sort of roll-your-own parallelism myself.

Super User
Posts: 5,876

Re: Macros and Parallel processing

One way to accomplish parallelism in code is to use MP CONNECT statements, that is, signing on to and submitting code to new SAS sessions on the current server.
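A minimal MP CONNECT sketch, assuming SAS/CONNECT is licensed and that new sessions may be started with the same command as the current one ("!sascmd"). The wrapper macro `run_remote`, the task names and the libref `pwork` are hypothetical. Because the RSUBMIT block sits inside a macro, `&value` is resolved locally before the code is sent, so the macro only has to be defined once, in the parent session.

```sas
%macro run_remote(task, value);
    /* Start a new SAS session on the same server. */
    signon &task sascmd="!sascmd";

    /* WAIT=NO returns control immediately; INHERITLIB exposes the */
    /* parent's WORK library to the child session as PWORK.        */
    rsubmit &task wait=no inheritlib=(work=pwork);
        data pwork.test_&value;
            set pwork.input_data;
            where var = "&value";
        run;
    endrsubmit;
%mend run_remote;

%run_remote(task1, a)
%run_remote(task2, b)
%run_remote(task3, c)

/* Block until all three sessions finish, then close them. */
waitfor _all_ task1 task2 task3;
signoff _all_;
```

The outputs land in the parent's WORK library as test_a, test_b and test_c, which matches the dataset names used earlier in the thread.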
Data never sleeps
☑ This topic is solved.
