Count instances at least 30 days apart

Contributor
Posts: 28

I have data like this:

ID   relapse_Date
1    8/28/2012
1    8/30/2012
1    11/5/2012
1    1/2/2013
1    2/13/2013
1    3/18/2013
1    4/15/2013
2    5/1/2008
2    5/20/2008
2    6/14/2008

I would like to count the number of unique dates on which each ID has a relapse, but only count relapses that are 30 days or more apart from each other. So even though ID 1 has 7 unique relapse dates, I only want to count 5 of them (8/28/2012, 11/5/2012, 1/2/2013, 2/13/2013, 3/18/2013; the relapses on 8/30/2012 and 4/15/2013 are within 30 days of other relapse dates). I've been trying to use lag, retain, and multiple set statements but can't get any of them to solve my problem. Thanks in advance for any help.


All Replies
Super User
Posts: 21,530

Re: Count instances at least 30 days apart

You don't specify what you want as output, so here's one way that you can then modify to your needs.

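data want;
  set have;
  by id relapse_date;

  /* Days since the previous row's relapse date */
  dif_date = dif(relapse_date);
  retain counter;

  if first.id then do;
    dif_date = .;   /* no previous relapse within this ID */
    counter = 1;
  end;
  /* Counts a relapse only when it is 30+ days after the immediately preceding row */
  else if dif_date >= 30 then counter + 1;
run;
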
PROC Star
Posts: 829

Re: Count instances at least 30 days apart

Not sure what you want in your output; you should provide an output sample too. Try this and modify it as needed:

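data have;
  input ID relapse_Date :mmddyy10.;
  format relapse_Date mmddyy10.;
  datalines;
1 8/28/2012
1 8/30/2012
1 11/5/2012
1 1/2/2013
1 2/13/2013
1 3/18/2013
1 4/15/2013
2 5/1/2008
2 5/20/2008
2 6/14/2008
;

data want;
  /* Pair each row with the following row to look ahead at the next relapse date. */
  /* Caution: with a BY statement this is a match-merge, so FIRSTOBS=2 shifts the
     rows by one only within the first ID; later IDs are paired row-for-row. */
  merge have have(firstobs=2 rename=(relapse_Date=_relapse_Date));
  by id;
  if first.id then count=1;
  if intck('day',relapse_Date,_relapse_Date)>=30 then count+1;
  drop _:;
run;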

Contributor
Posts: 28

Re: Count instances at least 30 days apart

[ Edited ]
Posted in reply to novinosrin

Thank you! This is close but here is an example of a tricky situation. After I run Reeza's code, I have this:

ID   Relapse_date   dif_date   counter
33   5/8/2007       .          1
33   6/5/2007       28         1
33   7/7/2007       32         2
33   8/7/2007       31         3
33   9/11/2007      35         4
33   10/10/2007     29         4
33   11/8/2007      29         4
33   12/4/2007      26         4
33   12/27/2007     23         4

And what I would like to have is this:

ID   Relapse_date   dif_date   counter
33   5/8/2007       .          1
33   6/5/2007       28         1
33   7/7/2007       32         2
33   8/7/2007       31         3
33   9/11/2007      35         4
33   10/10/2007     29         4
33   11/8/2007      29         5
33   12/4/2007      26         5
33   12/27/2007     23         6

I would like the program to ignore the relapse on 10/10/2007, because it's less than 30 days after the relapse on 9/11/2007, but then recognize that the relapse on 11/8/2007 IS more than 30 days after the 9/11/2007 relapse and count it. The same issue applies to the relapse on 12/27/2007: it's less than 30 days after the one on 12/4/2007 but more than 30 days after the one on 11/8/2007. Thanks so much in advance!

Solution
‎09-07-2017 11:48 AM
Trusted Advisor
Posts: 1,148

Re: Count instances at least 30 days apart

I deleted my first reply because I hadn't read the problem correctly.  But you apparently need to retain a cutoff date, which is updated only when a relapse date is more than 30 days after the START of the previous relapse regime:

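data want;
  set have;
  by id;                   /* HAVE must be sorted by id and relapse_date */
  retain cutoff counter;   /* carry both values from row to row */
  /* Days since the previous row, for reference only. DIF sits inside IFN so it
     runs on every row and its queue stays in step with the data. */
  difdate=ifn(first.id,.,dif(relapse_date));
  /* Count a relapse only when its date falls after the cutoff, i.e. more than
     30 days after the last COUNTED relapse. Use >= instead of > if a relapse
     exactly 30 days later should also count. */
  if first.id or relapse_date>cutoff then do;
    counter=ifn(first.id,1,counter+1);
    cutoff=relapse_date+30;
  end;
  format cutoff date9.;
run;
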

But what if you have a series of relapse_dates on, say, 6 consecutive Wednesdays?  Do you really want the 5th Wednesday after the first (= original Wed plus 35 days) to increment the COUNTER, even though it trails the preceding Wed by only 7 days?  That's what I understand your request to mean.
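
To make the consecutive-Wednesdays case concrete, here is a small sketch that runs the same cutoff logic over six weekly dates. The data set names weekly and weekly_counted, the ID value 99, and the start date are all made up for illustration; only the 7-day spacing matters.

```sas
/* Hypothetical test data: six consecutive Wednesdays for one made-up ID */
data weekly;
  id = 99;
  do week = 0 to 5;
    relapse_date = '05JAN2011'd + 7*week;   /* 05JAN2011 was a Wednesday */
    output;
  end;
  format relapse_date date9.;
  drop week;
run;

/* Same cutoff logic as above, applied to the weekly data */
data weekly_counted;
  set weekly;
  by id;
  retain cutoff counter;
  if first.id or relapse_date > cutoff then do;
    counter = ifn(first.id, 1, counter + 1);
    cutoff  = relapse_date + 30;   /* next countable relapse must fall after this */
  end;
  format cutoff date9.;
run;

proc print data=weekly_counted noobs;
run;
```

With this data the counter stays at 1 for the dates 7 to 28 days after the first one and moves to 2 only on the date 35 days after it, since that is the first date past the cutoff of first date + 30.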

Contributor
Posts: 28

Re: Count instances at least 30 days apart

Thank you so much, that worked! To answer your question, yes, I would only want the 5th Wednesday (that was 35 days since the original relapse) to trigger the counter, which is what your code does. I can't thank you enough! :)
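
If a single count per ID is wanted rather than a running counter on every row, the accepted logic could be reduced to one output row per ID. This is only a sketch under assumptions: HAVE is taken to be sorted by id and relapse_date, and the names relapse_counts and n_relapses are made up for this example.

```sas
/* Sketch: one row per ID holding the final count of relapses that fall   */
/* more than 30 days after the previously counted relapse.                */
/* Assumes HAVE is sorted by id and relapse_date; RELAPSE_COUNTS and      */
/* N_RELAPSES are hypothetical names.                                     */
data relapse_counts;
  set have;
  by id;
  retain cutoff n_relapses;
  if first.id then do;
    n_relapses = 1;                  /* the first relapse always counts */
    cutoff = relapse_date + 30;
  end;
  else if relapse_date > cutoff then do;
    n_relapses + 1;                  /* sum statement: adds 1 to the retained count */
    cutoff = relapse_date + 30;
  end;
  if last.id then output;            /* keep only the final count for each ID */
  keep id n_relapses;
run;
```

On the sample data from the original question this should give 5 for ID 1 and 2 for ID 2. Changing > to >= would also count a relapse that falls exactly 30 days after the last counted one.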
