BookmarkSubscribeRSS Feed
Sarah2913
Calcite | Level 5

I have the following SAS code 

 

data merged_data;

merge data1 (in=_1) data2 (in_2) ;

by ID ;

if _1 and _2;

run; 

 

I believe this is similiar to an inner join in SQL but not always the same. Could anyone please explain the difference? 

Thank in advance.

Sarah 

6 REPLIES 6
novinosrin
Tourmaline | Level 20

data a;
input a;
cards;
1
1
2
2
3
3
;

data b;
input a;
cards;
1
1
2
2
3
3
;

data want;
merge a(in=in1) b(in=in2);
by a;
if in1 and in2;
run;

/*please notice n*m many to many*/
proc sql;
create table w as
select a.a,b.a as b
from a a,b b
where a.a=b.a;
quit;
novinosrin
Tourmaline | Level 20

So, as long as it is one to many both would mostly likely yield the same results. However many to many makes sql yield different results for the reason it does a cartesian between by groups n*m as opposed to merge that joins by position

Reeza
Super User

SAS data step doesn't do a many to many match, whereas SQL does. Otherwise most joins can be replicated with a data step merge. 

 

I generally join using only SQL, more control and easier to understand in my opinion. 

 


@Sarah2913 wrote:

I have the following SAS code 

 

data merged_data;

merge data1 (in=_1) data2 (in_2) ;

by ID ;

if _1 and _2;

run; 

 

I believe this is similiar to an inner join in SQL but not always the same. Could anyone please explain the difference? 

Thank in advance.

Sarah 


 

Sarah2913
Calcite | Level 5
Thanks fpr the reply. Is there anyway to repicate the merge exactly using sql?
Reeza
Super User

I don't know but in my experience, no one ever wants the data to do what the SAS data step is doing. 

If you're doing straight code conversion for someone, flag this as a possible issue with the model/data.

 

If you're doing this as homework, my question would be does this actually occur in the data? Is this an issue you're worried about? 
If you don't have a many to many, but a left or inner join, those can usually be replicated in some form. 

 

proc sql;
create table want as
select a.*, b.*
from have1 as t1
inner join have2 as t2
on t1.ID=t2.ID;
quit;

 


@Sarah2913 wrote:
Thanks fpr the reply. Is there anyway to repicate the merge exactly using sql?

 

novinosrin
Tourmaline | Level 20

HI @Sarah2913   You can replicate if you have rownum for each by group akin to DB sql like oracle or teradata i.e partition by 

 



data a;
input a;
cards;
1
1
2
2
2
2
2
3
3
;
/*Partition by rownum for each by group*/
data a;
set a;
by a;
if first.a then n=1;
else n+1;
run;


data b;
input a;
cards;
1
1
1
2
3
3
3
;
/*Partition by rownum for each by group*/
data b;
set b;
by a;
if first.a then n1=1;
else n1+1;
run;


data want_merge(drop=n:);
merge a(in=in1) b(in=in2);
by a;
if in1 and in2;
run;

/*replicate merge using many to many to make it one to many*/
proc sql;
create table want_sql(drop=n2) as
select distinct a.a,b.a as b,(max(n,n1)) as n2
from a a inner join b b
on  a.a=b.a;
quit; 

Of course the above does involve some logic , nevertheless that's a fun experiment

 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 795 views
  • 0 likes
  • 3 in conversation