Quartz | Level 8

How to get records with time gap of certain number of days

Hi All,

I have a list of records with dates related to each person and want to only keep the records that give at least time gap of 31 days. The HAVE data set is what I have, and WANT is the target data file. Can anyone guide how to go from HAVE to WANT? Thank you in advance!

data have;
input person_id occurance_date :mmddyy.;
format occurance_date mmddyy.;
cards;
1 1/5/2023
2 8/21/2023
2 9/25/2023
3 1/5/2023
3 2/1/2023
4 1/22/2023
4 1/29/2023
4 3/9/2023
4 3/10/2023
4 3/14/2023
4 5/1/2023
4 6/11/2023
4 6/15/2023
;
run;
data want;
input person_id eff_occurance_date :mmddyy.;
format eff_occurance_date mmddyy.;
cards;
1 1/5/2023
2 8/21/2023
2 9/25/2023
3 1/5/2023
4 1/22/2023
4 3/9/2023
4 5/1/2023
4 6/11/2023
;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
PROC Star

Re: How to get records with time gap of certain number of days

If you have 3 records dates 20MAR, 01APR, and 28APR, do you keep the third record (it is more than 31 days after the last kept record (20MAR), but only 27 days after the last record encountered (01APR).  If the answer is yes, and if your data are sorted by ID/DATE, then:

``````data have;
input person_id occurance_date :mmddyy.;
format occurance_date mmddyy.;
cards;
1 1/5/2023
2 8/21/2023
2 9/25/2023
3 1/5/2023
3 2/1/2023
4 1/22/2023
4 1/29/2023
4 3/9/2023
4 3/10/2023
4 3/14/2023
4 5/1/2023
4 6/11/2023
4 6/15/2023
run;

data want (drop=_:);
set have;
by person_id;
retain _allowable_date ;
if first.person_id=1 or occurance_date>_allowable_date;
_allowable_date=occurance_date+32;
run;``````

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
3 REPLIES 3
Tourmaline | Level 20

Re: How to get records with time gap of certain number of days

Like this?

``````data WANT;
set HAVE;
by PERSON_ID;
DIF = dif(OCCURANCE_DATE);
if DIF>=35 or first.PERSON_ID ;
run;``````

Super User

Re: How to get records with time gap of certain number of days

@lichee wrote:

Hi All,

I have a list of records with dates related to each person and want to only keep the records that give at least time gap of 31 days. The HAVE data set is what I have, and WANT is the target data file. Can anyone guide how to go from HAVE to WANT? Thank you in advance!

I think you missing a detail or two.

Since there is only one record at all with Person_id=1 there is no gap at all. So how does it have a gap of "at least 31 days"?

You say "records that give at least time of 31 days" but the Want shows a single record for Person_id=3. What "gap" was used for that determination as the only other Person_id=3 is less than 31 days later. So why is that record chosen to keep.

PROC Star

Re: How to get records with time gap of certain number of days

If you have 3 records dates 20MAR, 01APR, and 28APR, do you keep the third record (it is more than 31 days after the last kept record (20MAR), but only 27 days after the last record encountered (01APR).  If the answer is yes, and if your data are sorted by ID/DATE, then:

``````data have;
input person_id occurance_date :mmddyy.;
format occurance_date mmddyy.;
cards;
1 1/5/2023
2 8/21/2023
2 9/25/2023
3 1/5/2023
3 2/1/2023
4 1/22/2023
4 1/29/2023
4 3/9/2023
4 3/10/2023
4 3/14/2023
4 5/1/2023
4 6/11/2023
4 6/15/2023
run;

data want (drop=_:);
set have;
by person_id;
retain _allowable_date ;
if first.person_id=1 or occurance_date>_allowable_date;
_allowable_date=occurance_date+32;
run;``````

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Discussion stats
• 3 replies
• 330 views
• 1 like
• 4 in conversation