Hi all,
I have data in the following format (some has dates missing):
ID Date 1 Date 2 Date 3
1201 | 11/29/2007 | 11/1/2007 | 11/1/2007 |
1220 | 6/1/2002 | ||
1230 | 8/1/2002 | 8/1/2002 | 8/1/2002 |
1230 | 4/1/2000 | 4/1/2000 | 4/1/2000 |
1237 | 7/1/2003 | 7/1/2003 | 7/1/2003 |
I want to do the following:
If one ID has more than one date i's (for i = 1,2,3), I want to keep only the earlier date.
ID Date 1 Date 2 Date 3
1201 | 11/29/2007 | 11/1/2007 | 11/1/2007 |
1220 | 6/1/2002 | ||
1230 | 4/1/2000 | 4/1/2000 | 4/1/2000 |
1237 | 7/1/2003 | 7/1/2003 | 7/1/2003 |
How can I do this?
Thanks,
Chris
Then it becomes a classic SQL solution:
data have;
infile cards missover;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
input id $ (date1-date3) (: mmddyy10.);
cards;
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 8/1/2002 8/1/1998 8/1/2002
1230 4/1/2000 4/1/2000 4/1/1999
1237 7/1/2003 7/1/2003 7/1/1999
;
proc sql;
create table want as
select id, min(date1) as date1 format mmddyy10., min(date2) as date2 format mmddyy10.,
min(date3) as date3 format mmddyy10. from have
group by id;
quit;
proc print;run;
Regards,
Haikuo
Basically you need to sort your data based on ID and date, then utilize first.variable or last.variable to capture what you want.
data have;
infile cards missover;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
input id $ (date1-date3) (: mmddyy10.);
cards;
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 8/1/2002 8/1/2002 8/1/2002
1230 4/1/2000 4/1/2000 4/1/2000
1237 7/1/2003 7/1/2003 7/1/2003
;
proc sort data=have;
by id date1;
run;
data want;
set have;
by id date1;
if first.id;
run;
proc print;run;
Regards,
Haikuo
I provided a bad example in my question.
it's not always the case that a row with the earliest date1 also has the earliest date 2 and date 3.
for example,
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 8/1/2002 "1/1/1998" 8/1/2002
1230 4/1/2000 4/1/2000 4/1/2000
1237 7/1/2003 7/1/2003 7/1/2003
should become:
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 4/1/2000 "1/1/1998" 4/1/2000
1237 7/1/2003 7/1/2003 7/1/2003
please see my response to Hai.kuo's answer. I gave a bad example... and I'm giving a better example there. thanks
do you want something like this?
data have;
infile cards missover;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
input id $ (date1-date3) (: mmddyy10.);
cards;
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 4/1/2000 1/1/1998 4/1/2000
1237 7/1/2003 7/1/2003 7/1/2003
;
data want(keep=id date);
set have;
date=smallest(1,of date1-date3);
format date mmddyy10.;
run;
proc print;run;
Obs id date
1 1201 11/01/2007
2 1220 06/01/2002
3 1230 01/01/1998
4 1237 07/01/2003
Linlin
no. this is different from the example I posted.
Then it becomes a classic SQL solution:
data have;
infile cards missover;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
input id $ (date1-date3) (: mmddyy10.);
cards;
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 8/1/2002 8/1/1998 8/1/2002
1230 4/1/2000 4/1/2000 4/1/1999
1237 7/1/2003 7/1/2003 7/1/1999
;
proc sql;
create table want as
select id, min(date1) as date1 format mmddyy10., min(date2) as date2 format mmddyy10.,
min(date3) as date3 format mmddyy10. from have
group by id;
quit;
proc print;run;
Regards,
Haikuo
thanks!
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.