parallel processing for loops & under macro

Frequent Contributor
Posts: 95

Hi SAS users,

 

Due to heavy run time, I was thinking of trying parallel processing. My code is structured as below, and I am not sure whether it can be run in parallel.

 

 

do loop i  

do loop j  

do loop k

 

macro test1     (macros test1 to test10 are the same except for one parameter, so the macro runs 10 times)

macro  test2

macro test3

..

..

macro test10

 

do loop ends   k

do loop ends   j

do loop ends   i

 

Thanks,

Ana

Super User
Posts: 7,766

Re: parallel processing for loops & under macro

Before embarking on parallel processing, take a good look at why your current process performs badly. If your process is already running at saturation level of your I/O subsystem, parallelization will only make it worse.

So you have to check whether you have the computing infrastructure across which you can spread your parallel processes:

- multiple CPU cores that are not yet used

- multiple I/O paths that are not yet used, or I/O capability in reserve

- memory that is not yet used

and so on.

 

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Frequent Contributor
Posts: 95

Re: parallel processing for loops & under macro

Posted in reply to KurtBremser

Hi Kurt,

 

The UNIX system is very good, and the database, where I connect and do 50% of the work in my code, is also fine.

 

Since the data size is 75K to 1 million records per month, I thought parallel processing might help in running macro test1 to macro test10.

Super User
Posts: 7,766

Re: parallel processing for loops & under macro


Define "very good". I could have a system with 16 POWER cores, 512 GB of RAM and a nominal I/O throughput of 1GB/sec, and 2 silly SQLs could bring it to a standstill by riding one disk to death.

You need to identify what parts of your process take so long, and how your server(s) perform during those steps.

 

One approach that you could take would be this:

 

Suppose your outer loop performs 10 iterations.

Parameterize those loops (i.e. retrieve loop_start and loop_end from command-line parameters).

Run the program as two parallel batch jobs with suitable command-line options (1 to 5 and 6 to 10) and measure your performance.

If performance improves, increase parallelization until performance stops getting better.
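
A minimal sketch of that parameterization, assuming the range is passed through the -sysparm system option. The program name loops.sas, the log names, and the macro run_range are made up for illustration; the real inner loops and macro calls are not shown in this thread.

```sas
/* Hypothetical batch program loops.sas.                          */
/* The outer loop range is read from the -sysparm option, e.g.    */
/*   sas loops.sas -sysparm "1 5"  -log loops_1_5.log             */
/*   sas loops.sas -sysparm "6 10" -log loops_6_10.log            */

/* Split the SYSPARM string into start and end values. */
%let loop_start = %scan(&sysparm, 1, %str( ));
%let loop_end   = %scan(&sysparm, 2, %str( ));

%macro run_range(start, end);
  %local i;
  %do i = &start %to &end;
    %put NOTE: processing outer iteration &i;
    /* the inner loops and the calls to the test macros go here */
  %end;
%mend run_range;

%run_range(&loop_start, &loop_end)
```

The sketch does no validation: if -sysparm is missing, the %DO loop fails, so a real job would want a check or a default range. Start both invocations at the same time (for example as background jobs from the shell) and compare elapsed times from the logs.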

Valued Guide
Posts: 505

Re: parallel processing for loops & under macro

Posted in reply to KurtBremser

Multitask 8 batch jobs.

 

My inexpensive workstation (SAS calls it PC-SAS) has 64 GB of RAM, dual Xeons (3 GHz) and two RAID 0 arrays (an off-lease Dell T7400, about $600).

 

Because the macro cascon was compute bound, the 8 processes below cut the elapsed time by about a factor of eight.

Eight small datasets (4 million obs) were created, which you will need to either append or keep as a view after the tasks complete.

 

If the largest temp or perm table is less than 1 TB (Big Data), you may be able to run on a workstation; otherwise I suggest EG GRID on a SAS server. If you are doing a large simulation that requires more than 16 cores, you should also consider the EG GRID.

 

You need to think about multiple SPDE libnames, each with partitioned data, if you have more I/O-intensive work.

SPDE does not support multiple tasks? Only the server addition does?


%let _s=%sysfunc(compbl(C:\Progra~1\SASHome\SASFoundation\9.4\sas.exe -sysin c:\nul -sasautos c:\oto -autoexec c:\oto\Tut_Oto.sas
-work d:\wrk));


options noxwait noxsync;
%let tym=%sysfunc(time());
systask kill sys1 sys2 sys3 sys4 sys5 sys6 sys7 sys8;
systask command "&_s -termstmt %nrstr(%cascon(beg=0000001,end=0125000);) -log d:\log\a1.log" taskname=sys1;
systask command "&_s -termstmt %nrstr(%cascon(beg=0125001,end=0250000);) -log d:\log\a2.log" taskname=sys2;
systask command "&_s -termstmt %nrstr(%cascon(beg=0250001,end=0375000);) -log d:\log\a3.log" taskname=sys3;
systask command "&_s -termstmt %nrstr(%cascon(beg=0375001,end=0500000);) -log d:\log\a4.log" taskname=sys4;
systask command "&_s -termstmt %nrstr(%cascon(beg=0500001,end=0625000);) -log d:\log\a5.log" taskname=sys5;
systask command "&_s -termstmt %nrstr(%cascon(beg=0625001,end=0750000);) -log d:\log\a6.log" taskname=sys6;
systask command "&_s -termstmt %nrstr(%cascon(beg=0750001,end=0875000);) -log d:\log\a7.log" taskname=sys7;
systask command "&_s -termstmt %nrstr(%cascon(beg=0875001,end=1000000);) -log d:\log\a8.log" taskname=sys8;
waitfor sys1 sys2 sys3 sys4 sys5 sys6 sys7 sys8;
%put %sysevalf( %sysfunc(time()) - &tym);
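
Once the eight tasks have finished, the partial datasets still have to be combined. A rough sketch of the "keep as a view" option follows. The libref out, its path, and the dataset names part1 to part8 are assumptions, since the output of %cascon is not shown here.

```sas
/* Hypothetical permanent library that each of the 8 tasks wrote to. */
libname out "d:\out";

/* A DATA step view that concatenates the eight partial datasets.    */
/* No data is copied; the parts are read when the view is used.      */
data out.all_parts / view=out.all_parts;
  set out.part1 out.part2 out.part3 out.part4
      out.part5 out.part6 out.part7 out.part8;
run;
```

If a physical table is preferred, PROC APPEND run once per part is the usual alternative, at the cost of writing the data a second time.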

 

 

Super User
Posts: 7,942

Re: parallel processing for loops & under macro


What do the macros resolve to? That is the key question here.  The code they resolve to is run once per inner loop * middle loop * outer loop iteration, so it could run many times.  I find it highly unlikely that this is a good way of working, but without seeing it all I can't really say.  Moving to parallel processing *may* help, but it won't change the macro part, as macro is just a text replacement facility, and if there is heavy read/write then it won't help.  It is hard to advise without seeing some test data (in the form of a data step) and the code.

 

Also, just re-reading your post: if you're running the same macro but with a different parameter, then a simple change to your data structure, so that each parameter is a row rather than a column, can sometimes a) reduce your coding effort and b) be far more efficient than coding each item (due to BY-group processing).
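
To illustrate the row-versus-column idea, here is a small sketch on made-up data. The table wide, the columns p1-p3, and the use of PROC MEANS are all assumptions for illustration only; the actual macros in this thread run SQL against a database and may not map onto this directly.

```sas
/* Hypothetical wide table: one column per parameter. */
data wide;
  input id p1 p2 p3;
  datalines;
1 10 20 30
2 11 21 31
3 12 22 32
;
run;

/* Reshape so that each parameter becomes a row. */
data long (keep=id param value);
  set wide;
  array p{3} p1-p3;
  length param $8;
  do i = 1 to dim(p);
    param = vname(p{i});   /* name of the source column */
    value = p{i};
    output;
  end;
run;

proc sort data=long;
  by param;
run;

/* One step covers every parameter through BY-group processing, */
/* instead of one macro call per parameter.                     */
proc means data=long noprint;
  by param;
  var value;
  output out=stats mean=mean_value;
run;
```

The point is only the shape of the program: one pass with a BY statement replaces ten near-identical macro calls.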

Super User
Posts: 5,498

Re: parallel processing for loops & under macro

The answer is short and easy:  No.

 

There is nothing about macro loops that creates parallel processing.  SAS programs (whether macro language is involved or not) process one DATA or PROC step at a time, sequentially.  There are SAS language techniques that can parallelize a single step in some cases, but introducing macro language does not bring any of those SAS language techniques into play.

 

If you want to post whichever macro is taking longest to run, you could probably get some suggestions on how to speed up the SAS steps within.

Frequent Contributor
Posts: 95

Re: parallel processing for loops & under macro

Posted in reply to Astounding
Thanks Asto. The macro is pretty simple, with only DB connectivity and insert, update & delete SQLs. But the data is huge, and running 7 loops is taking extra hours.
Super User
Posts: 5,498

Re: parallel processing for loops & under macro

In that case, it becomes a matter of strategy.  As others have mentioned above:

 

  • Consider whether several steps could logically be combined into one.  (How to adjust the program to combine those steps is a secondary issue.)
  • Consider splitting up the job into several jobs.  SAS will run several jobs in parallel, although there may be contention for either processing power or for access to the database that is being updated.

 
