TineKopp
Fluorite | Level 6

Hi

I have retrieved data on abortions (date of abortion = d_inddto) from a hospital discharge registry. Sometimes there are several abortions (dates) per person within a short period of time (for example, 4 dates in 10 days), which I know is not right. On the other hand, one person can have multiple abortions, but there should be at least 12 weeks between them. So I want a rule that says that within a period of 12 weeks (84 days), only one abortion (that is, only one date) should remain in my dataset per person (PNR), and it should be the first date. How do I do that? My data looks like this:

PNR  d_inddto
1    13/11/2013
1    14/11/2013
1    20/11/2013
1    22/11/2013
2    24/05/2015
3    01/09/2006

Accepted Solutions
Kurt_Bremser
Super User
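data want;
  set have;
  by PNR;
  /* carry the last kept date forward across observations */
  retain last_date;
  if first.PNR
  then last_date = d_inddto; /* first record per person is always kept */
  /* intck('week',...) counts the week boundaries crossed between the two dates */
  else if intck('week',last_date,d_inddto) le 12
  then delete;
  else last_date = d_inddto; /* far enough apart: keep it and use it as the new reference date */
run;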

Untested. For tested code, supply example data in a data step.
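
For reference, the example data from the question could be supplied as a data step roughly like this. This is only a sketch: it assumes PNR can be read as a numeric variable and that the dates are in dd/mm/yyyy form, and it sorts the data because BY-group processing with first.PNR needs the input ordered by PNR (and, for this problem, by date within PNR).

```sas
/* Example data from the question; PNR is assumed to be numeric here */
data have;
    input PNR d_inddto :ddmmyy10.;   /* read the date as dd/mm/yyyy */
    format d_inddto ddmmyy10.;       /* display it the same way */
    datalines;
1 13/11/2013
1 14/11/2013
1 20/11/2013
1 22/11/2013
2 24/05/2015
3 01/09/2006
;
run;

/* Sort by person and date so that first.PNR marks the earliest date */
proc sort data=have;
    by PNR d_inddto;
run;
```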

TineKopp
Fluorite | Level 6

Thank you very much. It seems to work perfectly:-)

jmhorstman
Obsidian | Level 7

How about something like this?  

 

data want;
	set have;
	by pnr;
	_prevdate = lag(d_inddto);
	if first.pnr or (d_inddto - _prevdate) >= 84;
	drop _prevdate;
run;

This is just based on 84 days, without regard to week boundaries.  It assumes the input dataset is already sorted by ascending values of pnr and d_inddto.
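
To illustrate what "without regard to week boundaries" means, here is a small sketch comparing intck('week', ...) with a plain day difference. By default, INTCK counts how many week boundaries (Sundays) lie between the two dates rather than how many full 7-day spans have passed. The variable names are made up for the illustration.

```sas
data _null_;
    /* Saturday to the following Sunday: one day apart, one boundary crossed */
    sat = '16NOV2013'd;
    sun = '17NOV2013'd;
    weeks_short = intck('week', sat, sun);
    days_short  = sun - sat;

    /* A Sunday and the date 90 days later (15FEB2014) */
    later      = sun + 90;
    weeks_long = intck('week', sun, later);
    days_long  = later - sun;

    put weeks_short= days_short= weeks_long= days_long=;
run;
```

The log should show weeks_short=1 for dates only one day apart, and weeks_long=12 for dates 90 days apart. A test such as `le 12` on the week count would therefore still treat the 90-day pair as too close, whereas a test on the day difference (>= 84) would keep it. Which behaviour is wanted depends on how the 12-week rule is meant to be read.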

 

Josh

 

EDIT: syntax correction to code above

TineKopp
Fluorite | Level 6

Thank you for your answer! Unfortunately, it removes all date lines except for the first one. Or maybe I haven't used it correctly. Nevertheless, I got an answer from Kurt_Bremser that seems to work better on my data. Again, thank you!

jmhorstman
Obsidian | Level 7

My mistake - I made a slight error in the code.  I've corrected it above.  My apologies - I was in too much of a hurry to be the first responder and earn precious badges.  🙂

 

Josh

TineKopp
Fluorite | Level 6

It is correct that in the very small dataset, I would expect only 3 records. However, if the dataset looked like this:

 

PNR  d_inddto
1    13/11/2013
1    14/11/2013
1    20/11/2013
1    22/11/2013
1    03/05/2017
2    24/05/2015
3    01/09/2006

 

there should be 4 records left. I haven't tried it myself since I'm just trying to help a colleague. She said it didn't work on her data. But as I wrote, we may have used it incorrectly.
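
One way to check this expectation is to run the extended example through a day-based rule. The sketch below is not either poster's exact code: it combines the RETAIN idea with the 84-day comparison, so each date is compared with the last date that was kept for the person. The names have2, want2 and last_kept are made up for the illustration, and PNR is assumed to be numeric.

```sas
/* Extended example data from this post */
data have2;
    input PNR d_inddto :ddmmyy10.;
    format d_inddto ddmmyy10.;
    datalines;
1 13/11/2013
1 14/11/2013
1 20/11/2013
1 22/11/2013
1 03/05/2017
2 24/05/2015
3 01/09/2006
;
run;

proc sort data=have2;
    by PNR d_inddto;
run;

data want2;
    set have2;
    by PNR;
    retain last_kept;                 /* last date kept for the current person */
    if first.PNR or (d_inddto - last_kept) >= 84
        then last_kept = d_inddto;    /* keep the record and restart the 84-day window */
    else delete;                      /* within 84 days of the last kept date */
    drop last_kept;
run;
```

On this data, person 1 keeps 13/11/2013 and 03/05/2017, and persons 2 and 3 keep their single date, so 4 records remain. Note that comparing with the last kept date is not the same as comparing with the immediately preceding record: for a chain of dates that are each less than 84 days apart but span more than 84 days in total, the two rules give different results.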

jmhorstman
Obsidian | Level 7
Nope, you used it correctly. It was my error. See above for the corrected version. Thanks!
TineKopp
Fluorite | Level 6

Ok - that makes sense:-) Thanks again!
