BookmarkSubscribeRSS Feed
yadavdharmendra
Calcite | Level 5
I am converting SAS code to Spark. In one of the data step SAS is doing merge (many to many). I need to achieve same in Spark. Please help me to get this done.
proc sort data=dat1;by mbr_sys_id clm_aud_nbr2;run;
proc sort data=dat2;by mbr_sys_id clm_aud_nbr2;run;

data want;
 merge dat1(in=a)
    dat2(in=b);
    by mbr_sys_id clm_aud_nbr2;
    if a and b;
run;


Capture.PNG

 

I want to get this same output using proc sql. Can you please help me in here.

 

4 REPLIES 4
maguiremq
SAS Super FREQ

DATA step merges cannot do many-to-many joins.

 

Note: The MERGE statement does not produce a Cartesian product on a many-to-many match-merge. Instead, it performs a one-to-one merge while there are observations in the BY group in at least one data set. When all observations in the BY group have been read from one data set and there are still more observations in another data set, SAS performs a one-to-many merge until all BY group observations have been read.

 

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1i8w2bwu1fn5kn1gpxj18xttbb0.htm

 

What you're showing is an inner join.

 

You also need to be careful if you share any variable names in the data sets in a MERGE statement. SAS takes the right-most value.

SASKiwi
PROC Star

The following SQL will do a many-to-many join which is similar but not the same as a SAS MERGE:

proc sql;
  create table want as 
  select *
  from dat1 as A
  inner join dat2 as B
  on A.mbr_sys_id = B.mbr_sys_id
  and A.clm_aud_nbr2 = B.clm_aud_nbr2
  ;
quit;
  
yadavdharmendra
Calcite | Level 5

I can do this inner join but how can I achieve same out which sas merge produces .

Actually bigger picture is I am converting SAS code to SPARK 3 I do not have merge in spark

only i have joins, if you can replicate SAS merge output into sql , it will be helpful for me.

LinusH
Tourmaline | Level 20

As already stated, SQL join and data step merge are not the same.

Under many circumstances they produce the same result.

But for you to be able to migrate in a consistent way, you need to understand the requirement behind the SAS code, and how your input data behaves,

Data never sleeps

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1802 views
  • 0 likes
  • 4 in conversation