Hi everyone,
I am asking for your help with the following question:
I have two datasets to merge in SAS 9.4. The first contains matched cases and controls, and a control can be matched to more than one case. The second contains the same IDs, possibly with multiple observations per ID, but does not contain Match_ID. The two datasets and the expected output are shown below.
Thanks a lot for your help!
dataset 1
ID | Match_ID | case |
1 | 1a | 0 |
2 | 1a | 1 |
1 | 1b | 0 |
3 | 1b | 1 |
dataset 2
ID | var_date |
1 | 01APR2008 |
1 | 10OCT2009 |
2 | 11JUL2008 |
2 | 09OCT2010 |
3 | 02JAN2011 |
Expected output
ID | Match_ID | Case | var_date |
1 | 1a | 0 | 01APR2008 |
1 | 1a | 0 | 10OCT2009 |
2 | 1a | 1 | 11JUL2008 |
2 | 1a | 1 | 09OCT2010 |
1 | 1b | 0 | 01APR2008 |
1 | 1b | 0 | 10OCT2009 |
3 | 1b | 1 | 02JAN2011 |
Do a SQL join:
data d1;
input ID Match_ID $ case;
datalines;
1 1a 0
2 1a 1
1 1b 0
3 1b 1
;
data d2;
input ID var_date :date9.;
format var_date date9.;
datalines;
1 01APR2008
1 10OCT2009
2 11JUL2008
2 09OCT2010
3 02JAN2011
;
proc sql;
create table d3 as
select d1.*, d2.var_date
from d1 inner join d2 on d1.id=d2.id
order by match_id, id, var_date;
select * from d3;
quit;
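The inner join above keeps only IDs that appear in both tables, which matches the expected output here. If some matched IDs might have no dated records at all, a left join would probably be the safer choice, since it keeps those rows with a missing var_date. A minimal sketch, assuming hypothetical tables d1x and d2x (not the poster's data) in which ID 4 has no dates:

```sas
/* Hypothetical example data: ID 4 is in the matching table but has no dates */
data d1x;
input ID Match_ID $ case;
datalines;
1 1a 0
2 1a 1
4 1c 1
;
data d2x;
input ID var_date :date9.;
format var_date date9.;
datalines;
1 01APR2008
1 10OCT2009
2 11JUL2008
;
proc sql;
/* Left join: every row of d1x is kept; var_date is missing when d2x has no match */
create table d3x as
select d1x.*, d2x.var_date
from d1x left join d2x on d1x.id=d2x.id
order by match_id, id, var_date;
select * from d3x;
quit;
```

With an inner join, the row for ID 4 would be dropped instead of kept with a missing date.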
Thank you very much PG. It works perfectly.
Henri