data have1;
input id year death reg;
datalines;
1 2000 0 11
1 2001 0 12
1 2002 0 11
1 2003 0 11
2 2000 1 11
2 2001 1 12
2 2002 1 13
2 2003 1 14
3 2000 0 .
3 2001 0 11
3 2002 0 11
3 2003 0 11
5 2000 1 .
5 2001 1 .
5 2002 1 11
5 2003 1 11
;
data have2;
input id year reg sex incs;
datalines;
1 2000 11 1 1000
2 2000 11 1 2000
4 2000 12 1 4000
;data want;
input id year reg death sex incs;
datalines;
1 2000 11 0 1 1000
1 2001 12 0 1 1000
1 2002 11 0 1 1000
1 2003 11 0 1 1000
2 2000 11 1 1 2000
2 2001 12 1 1 2000
2 2002 13 1 1 2000
2 2003 14 1 1 2000
4 2000 12 . 1 4000
;I'd like to have a dataset like want.
The logic is that I want to keep IDs only from have2, but want to keep all person-year information for the IDs in have1. Except for the reg variable, sex and incs are time-invariant information, so I want to have the same values for all years, but in case of reg it's all changing over years, so I want to keep information from have1. Meanwhile, there may be some IDs only shown in have2, in that case, I want to keep only the first row (as shown in have2) not making other years.
Thank you so much in advance.
According to ID=4 wanted output it seems that you select DEATH only from have1 and neglect it from have2. Try next code:
proc sql;
create table want as
select a.ID, a.year, a.reg, a. sex, a.incs,
b.death
from have2 as a
left join have1 as b
on a.ID = b.ID;
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.