JJP1
Pyrite | Level 9

Hi,

Could you please help me with the coding for the steps below?

 

data _null_;
set xx.ZBB nobs=nobs; /* dataset has 800+ million records */
call symputx('nobs',nobs);
run;
%do i=1 %to 8; /* pseudocode: a %do loop has to run inside a macro definition */
*create sas code to sort nobs/8 at a time;
* run the sas code;
%end;
*create a wait code to ensure 8 datasets are created;
*run the following four datasteps in parallel;
data four1;
merge eight1-eight2;
by key;
run;
data four2;
merge eight3-eight4;
by key;
run;
data four3;
merge eight5-eight6;
by key;
run;
data four4;
merge eight7-eight8;
by key;
run;

 

3 REPLIES
Kurt_Bremser
Super User
data
  eight1
  eight2
  eight3
  eight4
  eight5
  eight6
  eight7
  eight8
;
set xx.zbb;
select (mod(_n_,8));
  when (1) output eight1;
  when (2) output eight2;
  when (3) output eight3;
  when (4) output eight4;
  when (5) output eight5;
  when (6) output eight6;
  when (7) output eight7;
  when (0) output eight8;
end;
run;

This takes care of the splitting in one step, without having to count the observations at all.
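To cover the sorting step from the question, the eight subsets could then be sorted one at a time. This is only a minimal sketch: the macro name sort_parts and its n parameter are made up for illustration, KEY is assumed to be the sort variable (as in the question's BY statements), and the sorts run one after another rather than in parallel.

```sas
/* Hypothetical helper: sort eight1 ... eight<n> in place by KEY. */
/* The %do loop is legal here because it sits inside a macro.     */
%macro sort_parts(n=8);
  %local i;
  %do i = 1 %to &n;
    proc sort data=eight&i;
      by key;
    run;
  %end;
%mend sort_parts;

/* Sort all eight subsets sequentially */
%sort_parts(n=8)
```

Because each PROC SORT finishes before the next one starts, no separate wait step is needed in this sequential form.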

Astounding
PROC Star

There are several items you need to address besides the splitting of the data.

 

If you plan to put all the data sets back together again as the final step, there is no advantage in creating FOUR1, FOUR2, FOUR3 and FOUR4 first. You might as well combine all 8 subsets in one step.
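As a rough sketch of that single-step combination, assuming EIGHT1 through EIGHT8 exist in the WORK library and each one is already sorted by KEY (the output name combined is hypothetical). It uses SET with a BY statement, which interleaves the subsets in KEY order:

```sas
/* Interleave all eight sorted subsets in a single DATA step.  */
/* Assumes eight1-eight8 are each already sorted by KEY.       */
data combined;          /* hypothetical output name            */
  set eight1-eight8;    /* numbered-range list of data sets    */
  by key;               /* keeps the result in KEY order       */
run;
```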

 

MERGE is the wrong tool for the job. It is slower, and it does the wrong thing: observations from different data sets that share a KEY value are combined into a single observation. That explains why your results dropped a handful of observations, and it means that KEY is not unique. I don't know whether it is supposed to be unique or not, but some KEYs have more than 1 observation in your original data set. The right tool for the job is SET instead of MERGE (but keep the BY statement in the program). If your boss told you to use MERGE, you need someone on the team with more SAS knowledge.
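The difference is easy to see on a toy example. The sketch below uses two small made-up data sets, a and b, that share one KEY value; neither the data nor the data set names come from the original post.

```sas
/* Hypothetical toy data: KEY=2 occurs in both subsets */
data a;
  input key val $;
  datalines;
1 a1
2 a2
;

data b;
  input key val $;
  datalines;
2 b2
3 b3
;

/* MERGE joins rows with the same KEY into one observation. */
/* Result: 3 observations; for KEY=2, val from b overwrites */
/* val from a, so the row "2 a2" is lost.                   */
data merged;
  merge a b;
  by key;
run;

/* SET with BY interleaves the rows instead.                */
/* Result: 4 observations in KEY order, nothing is lost.    */
data interleaved;
  set a b;
  by key;
run;
```

Both steps require the inputs to be sorted by KEY, which is already the case for the toy data.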

 

 
