Hi All,
I'm trying to merge 2 data sets
data one;
input id $ year ;
datalines;
a 1
a 2
;
run;
data two;
input id $ sc pt;
datalines;
a 1 50
a 2 60
a 3 40
;
run;
This is the code I used
proc sort data= one;by id;run;
proc sort data= two;by id;run;
data one_two;
merge one(in=a) two(in=b);
if a and b;
by id;
run;
This is the output I got
id | year | sc | pt |
---|---|---|---|
a | 1 | 1 | 50 |
a | 2 | 2 | 60 |
a | 2 | 3 | 40 |
Instead, How do I get this output
id | year | sc | pt |
---|---|---|---|
a | 1 | 1 | 50 |
a | 2 | 2 | 60 |
a | . | 3 | 40 |
THANK YOU!!
When doing a MANY to MANY merge with a data step SAS will match the records one by one until one of the datasets does not have enough records in the group. The values from that dataset are then retained onto the rest of the records in the group.
To get what you want you just need to add an OUTPUT and CALL MISSING to the end of the data step.
data one_two;
merge one(in=a) two(in=b);
by id;
if a and b;
output;
call missing(of _all_) ;
run;
You can't get that output in a secure way unless you can't define an extended primary key. Is year and sc related?
When doing a MANY to MANY merge with a data step SAS will match the records one by one until one of the datasets does not have enough records in the group. The values from that dataset are then retained onto the rest of the records in the group.
To get what you want you just need to add an OUTPUT and CALL MISSING to the end of the data step.
data one_two;
merge one(in=a) two(in=b);
by id;
if a and b;
output;
call missing(of _all_) ;
run;
Hi, another way to get the desired output for those small data sets is to get rid of the BY statement .(by the way, you don't need a RUN statement after a DATALINES file) ...
data one;
input id $ year @@;
datalines;
a 1 a 2
;
data two;
input id $ sc pt @@;
datalines;
a 1 50 a 2 60 a 3 40
;
data one_two;
merge one two;
run;
DATA SET ONE-TWO ...
Obs id year sc pt
1 a 1 1 50
2 a 2 2 60
3 a . 3 40
data one;
input id $ year @@;
datalines;
a 1 a 2
;
data two;
input id $ sc pt @@;
datalines;
a 1 50 a 2 60 a 3 40
;
data one;
set one;
sc=year;
run;
data want;
merge one two;
by id sc;
run;
data one;
input id $ year @@;
datalines;
a 1 a 2
;
data two;
input id $ sc pt @@;
datalines;
a 1 50 a 2 60 a 3 40
;
run;
/****Alternative Way*****/
proc sql;
create table _want as
select b.*,a.year
from one as a right join two as b
on a.id=b.id and a.year=b.sc;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.