Hi,
I have two data set I will like to merge. Data one have two ID i will like to use to merge but only one of the ID is in Data two. I would like to merge data two to Data one. I have provided a sample data and what my expected output should be. I use proc sql one-many merge but i did not get the expected result.
/*Data one*/
data dads;
input fid obs name $ inc ;
cards;
1 2 Art 21000
1 3 Art 22000
1 4 Art 23000
2 5 Bill 30000
2 6 Bill 30000
2 7 Bill 31000
3 2 Paul 25000
3 5 Paul 25000
3 4 Paul 25000
;
run;
data kids;
input famid kidname $ birth age wt yr ;
cards;
1 Beth 1 9 60 2007
1 Bob 2 6 40 2008
1 Barb 3 3 20 2009
2 Andy 1 8 80 2007
2 Al 2 6 50 2008
2 Ann 3 2 20 2009
3 Pete 1 6 60 2007
3 Pam 2 4 40 2008
3 Phil 3 2 20 2009
;
run;
/*Expected output*/
fid obs name $ inc famid kidname $ birth age wt sex $
1 2 Art 21000 1 Beth 1 9 60 2007
1 3 Art 22000 1 Bob 2 6 40 2008
1 4 Art 23000 1 Barb 3 3 20 2009
2 5 Bill 30000 2 Andy 1 8 80 2007
2 6 Bill 30000 2 Al 2 6 50 2008
2 7 Bill 31000 2 Ann 3 2 20 2009
3 2 Paul 25000 3 Pete 1 6 60 2007
3 5 Paul 25000 3 Pam 2 4 40 2008
3 4 Paul 25000 3 Phil 3 2 20 2009
/*Sample code*/
proc sql;
create table dadkid2 as
select *
from dads, kids
where dads.fid=kids.famid
order by dads.fid, kids.kidname;
quit;
It seems that the rows match one to one and you don't need to merge by key.
If that's not the case, provide a different sample.
Otherwise simply run:
data WANT;
merge DAD KIDS;
run;
The "DADS" dataset seems messed up. What family has three dads? And if they did what are the odds that they are all named ART?
If you reduce the DADS dataset to one observation per family then a simple data step merge will work. You will need to make sure the two versions of the family id are using the same variable name however.
data dads;
input fid obs name $ inc ;
cards;
1 2 Art 21000
1 3 Art 22000
1 4 Art 23000
2 5 Bill 30000
2 6 Bill 30000
2 7 Bill 31000
3 2 Paul 25000
3 5 Paul 25000
3 4 Paul 25000
;
data kids;
input famid kidname $ birth age wt yr ;
cards;
1 Beth 1 9 60 2007
1 Bob 2 6 40 2008
1 Barb 3 3 20 2009
2 Andy 1 8 80 2007
2 Al 2 6 50 2008
2 Ann 3 2 20 2009
3 Pete 1 6 60 2007
3 Pam 2 4 40 2008
3 Phil 3 2 20 2009
;
proc summary data=dads(rename=(fid=famid)) nway;
by famid name ;
var inc;
output out=one_dad(drop=_type_ _freq_) max=;
run;
data want;
merge one_dad kids;
by famid;
run;
Results
Obs famid name inc kidname birth age wt yr 1 1 Art 23000 Beth 1 9 60 2007 2 1 Art 23000 Bob 2 6 40 2008 3 1 Art 23000 Barb 3 3 20 2009 4 2 Bill 31000 Andy 1 8 80 2007 5 2 Bill 31000 Al 2 6 50 2008 6 2 Bill 31000 Ann 3 2 20 2009 7 3 Paul 25000 Pete 1 6 60 2007 8 3 Paul 25000 Pam 2 4 40 2008 9 3 Paul 25000 Phil 3 2 20 2009
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.