Hi,
I have two datasets that each have ID and a date variable. Dataset 1 also has Var1 (which is continous) and Dataset 2 also has Var2 (a dummy variable).
Dataset 1
ID Date Var1
1 1/5/12 30
1 3/9/13 54
1 10/4/14 3
2 2/1/12 9
2 1/1/13 23
Dataset 2
ID Date Var2
1 1/5/12 0
1 2/7/12 0
1 3/9/13 1
2 2/1/12 1
2 4/3/12 1
3 4/5/13 1
3 10/4/13 1
The goal is to have for each ID/date combo as much information as possible--so ideally I'd want the values of both Var1 and Var2 on a given date, but if a person only has Var1 or Var2 on a given date, that should be displayed. Therefore, I joined them using proc sql full join on ID and date:
proc sql;
create table want as
select * from Dataset1 a
full join Dataset2 b
on (a.ID=b.ID) and (a.date=b.date);
quit;
This gave me the output dataset I wanted:
ID Date Var1 Var2
1 1/5/12 30 0
1 2/7/12 0
1 3/9/13 54 1
1 10/4/14 3
2 2/1/12 9 1
2 4/3/12 1
2 1/1/13 23
3 4/5/13 1
3 10/4/13 1
The issue is that there are some people like ID 3, who are only in Dataset 2 but not Dataset 1. Ideally, I'd like to exclude these people from my final output dataset altogether, but I'm not sure how to do that with a full outer join. Alternatively, if I just had a way of identifying which people came from Dataset 2 only, that would be helpful as well.
Any help is much appreciated.
I didn't find the proc sql code you included produced the results you stated.
However, this version of the code produced what you stated:
proc sql;
create table want as
select coalesce(a.id, b.id) 'id',
coalesce(a.date, b.date) 'date',
a.var1, b.var2
from temp1 a
full join temp2 b
on (a.ID=b.ID) and (a.date=b.date);
quit;
AND if you add this where clause it eliminates the records from the 2nd data set that you stated you didn't want included.
proc sql;
create table want as
select coalesce(a.id, b.id) 'id',
coalesce(a.date, b.date) 'date',
a.var1, b.var2
from temp1 a
full join temp2 b
on (a.ID=b.ID) and (a.date=b.date)
where coalesce(a.id,b.id) in (select id from temp1);
quit;
Hope this helps,
David
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.