I am using full join in Proc Sql to join two datasets. There are multiple same values of key variable in both datasets so "many to many" join has been observed. Here is the small part of datasets (ie. subset of data set ).
data a;
input id var1 date;
cards;
A Car 1
A Car 2
;
run;
data b;
input id var1 va2;
cards;
A Car 100
A Car 200
;
run;
Here is the code I am using.
proc sql noprint;
create table t1 as
select a.*,b.var2
from a as a
full join
b as b
on a.id=b.id and
a.var1=b.var1
quit;
Desired output:
id var1 date var2
A Car 1 100
A Car 2 200
Can many to many merge be prevented in full join with multiple key variables?
Thanks !
SQL operations cannot depend on observations order.; but data step programs can:
data t2;
set a;
set b;
by id var1;
run;
proc print data=t2 noobs; run;
@PGStats. Thank you. I have already applied full join in my large data set. The problem is occuring only when both key variable (s) have multiple same values and third key variable is not availabe for joining purpose.
@LinusH. Is there any way I can create third variable to merge in data step. Thanks !
Usually, it's hard to create a new key with a lower granular level on an existing table. Your sample data show too little of the possibilities. But it depends on the data. The best if you could get back to the data provider and ask for more details, and have tables that have a true primary key.
The other "leg" is what you actually want to achieve by matching those tables. What is the business rule/requirement?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.