Dear all,
I have a longitudinal dataset in which each patient is recorded with multiple dates. Each date is counted as a separate row. Note that a patient could have multiple entries for the same date. The data layout is as follow:
ID Date Var1 Var2
1 2002 4 5
1 2002 5 7
1 2003 7 9
2 2001 4 8
2 2002 3 8
Ideally, I would like to delete observations from the first recored date for each patient, so the output data would look like:
ID Date Var1 Var2
1 2003 7 9
2 2002 3 8
I have tried proc sort with nodupkey as well as proc sql distinct but none of them gave me the output i want. I was also trying to come up with an indicator to code for the first date within each patient but did not succeed. I'd really appreciate if anyone could help!
data have;
input ID Date Var1 Var2;
cards;
1 2002 4 5
1 2002 5 7
1 2003 7 9
2 2001 4 8
2 2002 3 8
;
run;
data want;
set have;
by id date notsorted;
if first.id then n=0;
n+first.date;
if n ne 1;
drop n;
run;
See PROC SORT examples especially with the option NODUPKEY.
proc sort data = have;
by id date;
run;
data want;
set have;
if last.id;
by id;
run;
Hi. I think that the only problem with this is if there are multiple occurrences of dates after the first date, for example ...
data have;
input ID Date Var1 Var2 @@;
cards;
1 2002 4 5 1 2002 5 7 1 2003 7 9 1 2003 9 9
2 2001 4 8 2 2002 3 8 2 2002 9 9
;
data want;
set have;
by id;
if last.id;
run;
WANT would be ...
Obs ID Date Var1 Var2
1 1 2003 9 9
2 2 2002 9 9
and miss one "not first date" for each ID. How about ...
data want (drop=first_date);
set have;
by id;
first_date + first.id*date;
if date ne first_date then output;
if last.id then call missing(first_date);
run;
WANT would be ...
Obs ID Date Var1 Var2
1 1 2003 7 9
2 1 2003 9 9
3 2 2002 3 8
4 2 2002 9 9
(same answer you'd get with Xia's data step answer or my PROC SQL answer)
data have;
input ID Date Var1 Var2;
cards;
1 2002 4 5
1 2002 5 7
1 2003 7 9
2 2001 4 8
2 2002 3 8
;
run;
data want;
set have;
by id date notsorted;
if first.id then n=0;
n+first.date;
if n ne 1;
drop n;
run;
Thank you so much. This is exactly what I was looking for.
Hi. You tried SQL. Did you try this ...
proc sql;
create table want as select * from have group by id having date ^= min(date);
quit;
Mike, I suppose you want GT (>) instead of ^=, just in case some dates are missing.
Sure, thanks.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.