SASAna
Quartz | Level 8

Hi SAS users,

 

I'd like some help and tips on running this process in parallel. I am passing 15 parts, and each part has 5 million records in the data extract. After the extract, each part runs the intermediate macros, holds the results in datasets, and finally loads the data into the database. The whole process runs for over 2 days.

 

 

How can I run the parts in parallel to speed up the process?

 

%Macro Exec(part);
    %Let Cd=%Bquote('&part');


%data extract

 

data extract; set data_extract; run;
    
Proc Sql Noprint; Select Count(*) Into:Obs From extract; Quit;

%If %Eval(&Obs Gt 0) %Then %Do; 

 

  %Loop( Obs=&Obs, Num_Per_Loop=1000, Input_Dataset=extract, Variable=id, Variable_Type=C, Macro_To_Call= data_pull,
             Output_Dataset=All_data);


%Delete_Dataset(DS_Name=extract,Lib_Name=Work);
      

%macro1

%macro2

%macro3

%macro4
  

%hold_for_output(dsin=All_data,dsout=all_details);            
       
%End;


%Mend Exec;

 

Thank you.

3 REPLIES
PeterClemmensen
Tourmaline | Level 20

To really help you with a problem like this, we need to know more about what you want to do. There are a few macros here whose purpose we don't know, so please describe your problem in more detail.

 

The only thing that jumps out at me is this:

 

data extract; set data_extract; run;

There is absolutely no need to do this. You can simply use the data_extract data set directly in PROC SQL.
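
As a minimal sketch of that suggestion, the count can be taken straight from data_extract with no intermediate copy. This assumes data_extract is a data set or view that PROC SQL can see in the current session. The TRIMMED option (SAS 9.3 and later) only strips leading blanks from the macro variable.

```sas
/* Count rows directly in the source table; no copy into WORK.EXTRACT. */
proc sql noprint;
  select count(*) into :Obs trimmed
  from data_extract;
quit;

%put NOTE: data_extract has &Obs rows.;
```

Later steps that read extract would then need to read data_extract instead.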

RW9
Diamond | Level 26

This also doesn't seem ideal:
 %Loop( Obs=&Obs, Num_Per_Loop=1000, Input_Dataset=extract, Variable=id, Variable_Type=C, Macro_To_Call= data_pull,
             Output_Dataset=All_data);

Judging by Num_Per_Loop=1000, it looks like it is looping over the data in batches of 1000, which for 5 million records would mean thousands of iterations, possibly creating datasets each time. That would be time-intensive, and some alterations to that process alone could save you a lot of resources, i.e. one big dataset to process rather than lots of small ones.

Can't tell anything else from that code.
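
Purely as a sketch of the "one big dataset" idea: if data_pull fetches detail rows for each batch of id values, a single join could do the same work in one pass. Both the table name src.details and the assumption that data_pull is an id lookup are hypothetical; the thread does not show what the macro actually does.

```sas
/* Hypothetical: src.details is the table that data_pull reads from. */
/* One set-based join replaces thousands of 1000-row lookups.        */
proc sql;
  create table All_data as
  select d.*
  from src.details as d
       inner join extract as e
       on d.id = e.id;
quit;
```

If src.details lives in a database, it may also be worth loading the ids into a temporary table on the database side so the join runs there, but that depends on the setup.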

Astounding
PROC Star

I could be wrong here, but I think you are barking up the wrong tree.

 

Just the little bit of code you have shown contains obvious inefficiencies. Paying attention to them might eliminate a major portion of the run time, without the need for parallel processing. A couple of examples:

 

Why create the data set EXTRACT?  Why not just use the data set DATA_EXTRACT? 

 

There are faster ways to get a count of all the observations in a data set, other than using COUNT(*).
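
One such way, sketched here on the assumption that data_extract is a native SAS data set (not a view or a database table), is to read the observation count from the data set header with the NOBS= option. No rows are read at all.

```sas
/* NOBS= is filled in at compile time from the data set header. */
/* "if 0" means the SET statement never actually executes.      */
data _null_;
  if 0 then set data_extract nobs=n;
  call symputx('Obs', n);
  stop;
run;

%put NOTE: data_extract has &Obs observations.;
```

Note that NOBS= counts physical observations, so it includes any that have been marked for deletion, and it is not available for views or tables read through a database engine. In those cases COUNT(*) is still the fallback.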

 

It's just a guess, since we don't really see the contents of %MACRO1, ..., %MACRO4. But I suspect it would be easy to cut out much of the time without parallel processing. Even if you do find a parallel approach, it would still run much faster if the original programming were made faster.
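
If a parallel run is still wanted after the code has been tightened, one common pattern is to start a separate batch SAS session per part with SYSTASK and wait for each group with WAITFOR. The sketch below rests on several assumptions that are not in the original post: the parts are numbered 1 to 15, the XCMD system option is enabled, the `sas` command is on the path, and a hypothetical driver program run_part.sas defines the macros and ends with `%Exec(&sysparm)`.

```sas
/* Launch parts &first..&last as separate batch sessions, then wait. */
%macro launch_parts(first=, last=);
  %local i tasks;
  %do i = &first %to &last;
    /* run_part.sas is a hypothetical driver that calls %Exec(&sysparm) */
    systask command "sas -sysin run_part.sas -sysparm &i -log part_&i..log"
            nowait taskname=part&i;
    %let tasks = &tasks part&i;
  %end;
  /* Block until every session in this group has finished. */
  waitfor _all_ &tasks;
%mend launch_parts;

/* Three groups of five, so at most five sessions run at once. */
%launch_parts(first=1,  last=5)
%launch_parts(first=6,  last=10)
%launch_parts(first=11, last=15)
```

Each batch session gets its own WORK library, so data sets such as extract and All_data do not collide between parts. How many sessions to run at once depends on the server and on how much concurrent load the database can take. SAS/CONNECT with RSUBMIT is an alternative if it is licensed.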
