Hi all,
I have data in the following format (some has dates missing):
ID Date 1 Date 2 Date 3
1201 | 11/29/2007 | 11/1/2007 | 11/1/2007 |
1220 | 6/1/2002 | ||
1230 | 8/1/2002 | 8/1/2002 | 8/1/2002 |
1230 | 4/1/2000 | 4/1/2000 | 4/1/2000 |
1237 | 7/1/2003 | 7/1/2003 | 7/1/2003 |
I want to do the following:
If one ID has more than one date i's (for i = 1,2,3), I want to keep only the earlier date.
ID Date 1 Date 2 Date 3
1201 | 11/29/2007 | 11/1/2007 | 11/1/2007 |
1220 | 6/1/2002 | ||
1230 | 4/1/2000 | 4/1/2000 | 4/1/2000 |
1237 | 7/1/2003 | 7/1/2003 | 7/1/2003 |
How can I do this?
Thanks,
Chris
Then it becomes a classic SQL solution:
data have;
infile cards missover;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
input id $ (date1-date3) (: mmddyy10.);
cards;
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 8/1/2002 8/1/1998 8/1/2002
1230 4/1/2000 4/1/2000 4/1/1999
1237 7/1/2003 7/1/2003 7/1/1999
;
proc sql;
create table want as
select id, min(date1) as date1 format mmddyy10., min(date2) as date2 format mmddyy10.,
min(date3) as date3 format mmddyy10. from have
group by id;
quit;
proc print;run;
Regards,
Haikuo
Basically you need to sort your data based on ID and date, then utilize first.variable or last.variable to capture what you want.
data have;
infile cards missover;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
input id $ (date1-date3) (: mmddyy10.);
cards;
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 8/1/2002 8/1/2002 8/1/2002
1230 4/1/2000 4/1/2000 4/1/2000
1237 7/1/2003 7/1/2003 7/1/2003
;
proc sort data=have;
by id date1;
run;
data want;
set have;
by id date1;
if first.id;
run;
proc print;run;
Regards,
Haikuo
I provided a bad example in my question.
it's not always the case that a row with the earliest date1 also has the earliest date 2 and date 3.
for example,
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 8/1/2002 "1/1/1998" 8/1/2002
1230 4/1/2000 4/1/2000 4/1/2000
1237 7/1/2003 7/1/2003 7/1/2003
should become:
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 4/1/2000 "1/1/1998" 4/1/2000
1237 7/1/2003 7/1/2003 7/1/2003
please see my response to Hai.kuo's answer. I gave a bad example... and I'm giving a better example there. thanks
do you want something like this?
data have;
infile cards missover;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
input id $ (date1-date3) (: mmddyy10.);
cards;
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 4/1/2000 1/1/1998 4/1/2000
1237 7/1/2003 7/1/2003 7/1/2003
;
data want(keep=id date);
set have;
date=smallest(1,of date1-date3);
format date mmddyy10.;
run;
proc print;run;
Obs id date
1 1201 11/01/2007
2 1220 06/01/2002
3 1230 01/01/1998
4 1237 07/01/2003
Linlin
no. this is different from the example I posted.
Then it becomes a classic SQL solution:
data have;
infile cards missover;
format date1 mmddyy10.
date2 mmddyy10.
date3 mmddyy10.;
input id $ (date1-date3) (: mmddyy10.);
cards;
1201 11/29/2007 11/1/2007 11/1/2007
1220 6/1/2002
1230 8/1/2002 8/1/1998 8/1/2002
1230 4/1/2000 4/1/2000 4/1/1999
1237 7/1/2003 7/1/2003 7/1/1999
;
proc sql;
create table want as
select id, min(date1) as date1 format mmddyy10., min(date2) as date2 format mmddyy10.,
min(date3) as date3 format mmddyy10. from have
group by id;
quit;
proc print;run;
Regards,
Haikuo
thanks!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.