I would like to parallelize inside a SAS proc. I want to run:
proc lifereg data=main; by SampleNo; model ...; run;
so that all samples are processed at the same time. Is that possible? Thank you.
I went through this paper: http://www.lexjansen.com/wuss/2018/15_Final_Paper_PDF.pdf One of its main takeaways is that any procs or data steps that don't rely on one another can be run at the same time.
Hello @thewan ,
A BY statement is very efficient! It's probably not processing your samples in parallel (as there's only one SAS session in 'the game'), but it is much more efficient than looping over your samples.
If you are asking about parallel processing, you should mention your SAS version (SAS 9.4 Mx or SAS Viya 3.5+).
Submit
%PUT &=sysvlong4;
to find out.
Anyway I think you can use MP Connect (Multi-Process Connect) if you have SAS/Connect.
It works very well and I use it a lot.
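To check whether SAS/CONNECT is available before trying this, something like the following should do. It is only a quick sketch: %SYSPROD reports whether a product is licensed, which does not by itself prove that it is installed on your machine.

```sas
/* %SYSPROD returns 1 if the product is licensed, 0 if it is not,
   and -1 if the product name is not recognized */
%put NOTE: SAS/CONNECT licensed flag = %sysprod(connect);

/* PROC SETINIT writes the full list of licensed products to the log */
proc setinit;
run;
```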
Here's some info:
If you have six samples, you launch six concurrent SAS sessions, each one dealing with one sample.
Like this:
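options sascmd="sas";   /* command used to start each server session */

/* Process 1 */
signon task1;
rsubmit task1 wait=no;   /* wait=no: do not block, so the next task can start right away */
   PROC LIFEREG DATA=;   /* the data set must be in a library the server session can see */
      where SampleNumber="1";
      MODEL ...;
   RUN;
endrsubmit;

/* Process 2 */
signon task2;
rsubmit task2 wait=no;
   PROC LIFEREG DATA=;
      where SampleNumber="2";
      MODEL ...;
   RUN;
endrsubmit;

/* ... ... ... */

/* Process 6 */
signon task6;
rsubmit task6 wait=no;
   PROC LIFEREG DATA=;
      where SampleNumber="6";
      MODEL ...;
   RUN;
endrsubmit;

/* wait until every listed task has finished, then close all sessions */
waitfor _all_ task1 task2 task3 task4 task5 task6;
signoff _all_;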
/* end of program */
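With more than a handful of samples, the repeated blocks can be generated by a macro loop instead of being copied. What follows is a minimal sketch, not something tested against your data: the toy data set WORK.MAIN, the variables t, censor and x, and the Weibull model are all hypothetical stand-ins for your real data and MODEL statement, and SampleNo is assumed to be numeric. INHERITLIB= is used so that each server session can read the client's WORK library under the libref PWORK.

```sas
/* Hypothetical toy data: 6 samples of 200 observations each */
data work.main;
   call streaminit(2024);
   do SampleNo = 1 to 6;
      do id = 1 to 200;
         x      = rand('normal');
         t      = rand('weibull', 1.5, exp(0.5*x));  /* survival time */
         censor = (rand('uniform') < 0.2);           /* 1 = censored  */
         output;
      end;
   end;
run;

/* Assumes SAS/CONNECT is licensed and "sas" starts SAS on this machine */
options sascmd="sas";

%macro run_samples(n=6);
   %local i;
   %do i = 1 %to &n;
      /* the server session sees the client WORK library as PWORK */
      signon task&i inheritlib=(work=pwork);
      /* The RSUBMIT block sits inside a macro, so &i is resolved
         in the client session before the code is sent */
      rsubmit task&i wait=no;
         proc lifereg data=pwork.main;
            where SampleNo = &i;
            model t*censor(1) = x / dist=weibull;   /* placeholder model */
         run;
      endrsubmit;
   %end;

   /* build the task list task1 ... task&n and wait for all of them */
   waitfor _all_ %do i = 1 %to &n; task&i %end; ;
   signoff _all_;
%mend run_samples;

%run_samples(n=6)
```

Every session has a startup cost, so with many small samples it may pay to give each session a range of SampleNo values and keep the BY statement inside the remote step.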
Kind regards,
Koen
Thank you, your example code was what I needed!
The list of multithreaded procedures is available here:
https://support.sas.com/rnd/scalability/procs/
If you can't use one of these, you could:
1. Ask SAS R&D to add PROC LIFEREG to the list
2. Try to program the logic you need using PROC DS2
3. Break the table into subsets, and run them in parallel using MP Connect as shown by @sbxkoenk
You could also launch parallel jobs using call system(), but you lose any ability to synchronise the jobs.
Note that:
- You first need to ensure that your bottleneck is the CPU. If it's the disk, you'll only make things worse by multithreading.
- There are overheads when multithreading, and the gains may be smaller than expected.
call dosubl() is still limited to sequential processing sadly.
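One way to check the CPU-versus-disk point before going parallel is to run the plain BY-group version with FULLSTIMER switched on and compare real time with CPU time in the log. A small sketch, assuming hypothetical toy data (the variables t, censor and x and the Weibull model are placeholders for the real ones):

```sas
/* Hypothetical toy data, already in SampleNo order for BY processing */
data work.main;
   call streaminit(1);
   do SampleNo = 1 to 6;
      do id = 1 to 200;
         x      = rand('normal');
         t      = rand('weibull', 1.5, exp(0.5*x));
         censor = (rand('uniform') < 0.2);   /* 1 = censored */
         output;
      end;
   end;
run;

options fullstimer;      /* detailed timing and memory statistics in the log */

proc lifereg data=work.main;
   by SampleNo;
   model t*censor(1) = x / dist=weibull;     /* placeholder model */
run;

options nofullstimer;    /* back to the short timing note */
```

If real time is close to the sum of user and system CPU time, the step is CPU-bound and splitting it across sessions can help. If real time is much larger than CPU time, the step is probably waiting on I/O, and more sessions would compete for the same disk.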