Hello,
I need to filter repeated measures if the dates (by id) are not 60 days apart. There are hundreds of id's in the dataset; I am showing three here.
Person 1 can be reduced to a single date (because they're the same). Person 2 can be reduced to a single date (I figure choosing the earliest date makes the most sense). Person 3 has 2 dates that are far enough apart to keep, but a third that would cause that row to be dropped (again wanting to keep the earliest date). Dates can be repeated and a person can have any number of dates from 1-10 (although no one has 7-9 oddly enough).
DATA have;
input id date mmddyy10.;
format date mmddyy10.;
CARDS;
1 2/20/2016
1 2/20/2016
2 1/25/2016
2 1/28/2016
2 2/4/2016
2 2/4/2016
3 3/18/2016
3 3/20/2016
3 9/6/2016
;
RUN;
DATA want;
input id date mmddyy10.;
format date mmddyy10.;
CARDS;
1 2/20/2016
2 1/25/2016
3 3/18/2016
3 9/6/2016
;
RUN;
Here's my logic for comparing dates, and an example of a person with three sorted dates:
Case1:
If date2-date1 <= 60, keep date1 and delete date2 For the next comparison date1 would be used again.
If date3-date1 <=60, keep date1 delete date3
====================================
Case2:
If date2-date1 >= 60, keep both dates(rows) At this point, when keeping both dates, the next comparison must be done using the latest date.
If date3-date2 >=60, keep both dates In this situation all three dates are retained. But had date3-date2 been <= 60 I would've dropped date3.
Any help is appreciated.
This is a case where you have to look ahead one observation to see if the upcoming gap is more than 60 days (or if the next observation starts a new id). You can do this via a MERGE statement where you merge observation1 with observation2 (with ID and DATE variables renamed), and then merge observation 2 with observation 3, etc. That is supported via the "FIRSTOBS=2" option in one of the objects of the MERGE statement below:
DATA have;
input id date mmddyy10.;
format date mmddyy10.;
CARDS;
1 2/20/2016
1 2/20/2016
2 1/25/2016
2 1/28/2016
2 2/4/2016
2 2/4/2016
3 3/18/2016
3 3/20/2016
3 9/6/2016
;
RUN;
data want (drop=_:);
do until (_nxt_id^=id or date+60<_nxt_date);
merge have
have (firstobs=2 keep=id date rename=(id=_nxt_id date=_nxt_date));
if _init_date=. then _init_date=date;
end;
date=_init_date;
run;
This is a case where you have to look ahead one observation to see if the upcoming gap is more than 60 days (or if the next observation starts a new id). You can do this via a MERGE statement where you merge observation1 with observation2 (with ID and DATE variables renamed), and then merge observation 2 with observation 3, etc. That is supported via the "FIRSTOBS=2" option in one of the objects of the MERGE statement below:
DATA have;
input id date mmddyy10.;
format date mmddyy10.;
CARDS;
1 2/20/2016
1 2/20/2016
2 1/25/2016
2 1/28/2016
2 2/4/2016
2 2/4/2016
3 3/18/2016
3 3/20/2016
3 9/6/2016
;
RUN;
data want (drop=_:);
do until (_nxt_id^=id or date+60<_nxt_date);
merge have
have (firstobs=2 keep=id date rename=(id=_nxt_id date=_nxt_date));
if _init_date=. then _init_date=date;
end;
date=_init_date;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.