DATA Step, Macro, Functions and more

identify observations in rolling 12 month period

Reply
Frequent Contributor
Posts: 76

identify observations in rolling 12 month period

[ Edited ]

Hi All,

 

I have a data that each patient has multiple records.

how can I do:  if the variable time has more than four records (include four) within any continuous 12 month for the patient, then the patient defined as "Serious"?

 

Thanks,

 

data abc;
  informat date mmddyy10.;
  informat time anydtdtm16.;
  input ID Date time;
  format date mmddyy10. time MDYAMPM.;
  datalines;
14022 6/22/2010 10/11/2004 9:02
14022 6/22/2010 3/25/2008 12:53
14022 6/22/2010 4/29/2008 9:04
14022 6/22/2010 6/17/2008 9:14
14022 6/22/2010 7/8/2008 10:41
14022 6/22/2010 8/12/2008 12:30
14022 6/22/2010 9/16/2008 14:24
14022 6/22/2010 10/21/2008 1:44
14022 6/22/2010 11/18/2008 2:11
14022 6/22/2010 12/30/2008 9:53
14022 6/22/2010 2/10/2009 13:07
14022 6/22/2010 4/7/2009 9:45
14022 6/22/2010 6/2/2009 7:35
14022 6/22/2010 7/28/2009 10:11
14022 6/22/2010 9/29/2009 10:46
14022 6/22/2010 12/1/2009 10:51
14022 6/22/2010 1/26/2010 8:49
14022 6/22/2010 5/23/2010 7:48
14022 6/22/2010 6/22/2010 7:46
14032 2/7/2012 7/24/2002 10:25
14032 2/7/2012 1/21/2003 12:00
14032 2/7/2012 1/28/2003 9:25
14032 2/7/2012 1/29/2008 9:00
14032 2/7/2012 3/20/2008 13:52
14032 2/7/2012 5/1/2008 8:53
14032 2/7/2012 6/12/2008 14:01
14032 2/7/2012 7/24/2008 15:00
14032 2/7/2012 9/11/2008 11:02
14032 2/7/2012 9/20/2011 9:25
14032 2/7/2012 11/1/2011 10:40
14032 2/7/2012 1/8/2012 12:58
14032 2/7/2012 2/7/2012 10:20
14032 2/7/2012 3/7/2012 9:18
14032 2/7/2012 5/15/2012 12:43
14032 2/7/2012 6/19/2012 9:19
14032 2/7/2012 7/31/2012 11:48
14032 2/7/2012 9/18/2012 12:15
14032 2/7/2012 10/16/2012 12:35
14032 2/7/2012 10/18/2012 11:12
14032 2/7/2012 10/19/2012 5:00
14032 2/7/2012 10/20/2012 4:33
14032 2/7/2012 10/21/2012 4:19
14032 2/7/2012 10/22/2012 4:30
14032 2/7/2012 10/23/2012 5:03
14032 2/7/2012 10/24/2012 3:14
14032 2/7/2012 10/30/2012 11:10
14032 2/7/2012 11/27/2012 12:30
14032 2/7/2012 1/8/2013 9:58
14032 2/7/2012 2/19/2013 10:07
14032 2/7/2012 4/2/2013 13:45
14032 2/7/2012 5/14/2013 10:18
14601 1/30/2014 4/26/2005 8:40
14601 1/30/2014 6/28/2005 9:05
14601 1/30/2014 7/26/2005 8:21
14601 1/30/2014 8/9/2005 8:24
14601 1/30/2014 8/23/2005 8:10
14601 1/30/2014 10/18/2005 9:20
14601 1/30/2014 12/29/2013 9:32
14601 1/30/2014 12/30/2013 9:04
14601 1/30/2014 2/16/2014 8:44
14601 1/30/2014 5/2/2006 10:31
14601 1/30/2014 7/12/2006 7:31
14601 1/30/2014 9/7/2006 8:55
14601 1/30/2014 10/31/2006 8:43
14601 1/30/2014 2/6/2007 9:15
14601 1/30/2014 5/3/2007 7:54
14601 1/30/2014 6/28/2007 7:52
14601 1/30/2014 8/23/2007 8:10
14601 1/30/2014 10/18/2007 8:17
14601 1/30/2014 11/19/2007 2:07
14601 1/30/2014 12/13/2007 8:16
14601 1/30/2014 1/24/2008 10:23
14601 1/30/2014 1/31/2008 10:07
14601 1/30/2014 2/12/2008 12:31
14601 1/30/2014 3/25/2008 9:52
;
run;

 

PROC Star
Posts: 258

Re: identify observations in rolling 12 month period

may be something like this.

proc sql;
select a.*, case when a.id =b.id and b.cnt ge 4 then 'serious'
else 'ok'
end as condition from
(select * from abc)a
left join
(select distinct  id, year(date) as year, count(*) as cnt 
from abc
group by id, year(date))b
on a.id =b.id;
Trusted Advisor
Posts: 1,630

Re: identify observations in rolling 12 month period

[ Edited ]

Seems to work if you only look at calendar year ... but as I understood the problem, the rolling 12 month period could start at any date, for example if there were 4 visit during the period 3/27/14 to 3/26/15, that would also count as SERIOUS.

 

I get the feeling that PROC EXPAND could provide a solution, for example if there was one observation for each patient at each date of the time period, and a value of 1 if the patient had a visit on that date, and 0 otherwise. Then PROC EXPAND could compute rolling sums over the last 365 days of the 0-1 variable for each starting date and any time this rolling sum was 4 or greater, you get a SERIOUS flag. I guess this fails for leap years, but otherwise that seems to work in my mind ... of course getting it to work with actual code seems a little harder, but doable, and I don't have the time to do it right now.

Frequent Contributor
Posts: 76

Re: identify observations in rolling 12 month period

Yes, you are right!

Even  leap years should be include in my result. As your example, if there were 4 visit during the period 3/27/14 to 3/26/15, that would also count as SERIOUS.

 

Really no idea how to do it.....

 

 

PROC Star
Posts: 7,364

Re: identify observations in rolling 12 month period

[ Edited ]

Here is a way to do it using a FIFO stack:

data want (drop=stack:);
  retain stack0-stack3;
  array stack(0:3) stack0-stack3;
  length serious $3;
  do until(last.id);
    set abc;
    by id;
    if first.id then do;
      serious='No';
      n=1;
      call missing(of stack(*));
    end;
    else n+1;
    stack(mod(n,4))=time;
    if n ge 4 and serious='No' then do;
if yrdif(datepart(min(of stack(*))),datepart(max(of stack(*))),'age') le 1 then
serious='Yes'; end; end; do until(last.id); set abc; by id; output; end; run;

Art, CEO, AnalystFinder.com

 

Trusted Advisor
Posts: 1,630

Re: identify observations in rolling 12 month period

Although I don't understand @art297's code, and I have never heard of a FIFO stack in SAS, I do know what FIFO means and yes that should work if it is implemented properly (and its hard to imagine a situation where art297's code is wrong).

 

With regards to PROC EXPAND, accounting for leap years would be done by using one moving sum for 365 days and one for 366 days; when the starting date plus one year includes leap day, you would use the 366 day moving sum, otherwise you would use the 365 day moving sum.

 

 

PROC Star
Posts: 7,364

Re: identify observations in rolling 12 month period

@PaigeMiller: Actually, the code I originally posted (but have now changed) was wrong.

 

I stole the FIFO logic from @ArtC!  If you have an array (in this case called STACK), ranging from 0 to N-1, and fill the mod(n,N) cell with the current value, you will always be replacing the FI (first in) cell.

 

Therefore, if the file is in date order, yrdif(datepart(min(of stack(*))),datepart(max(of stack(*))),'age') le 1 will only be le 1 if the four entries capture a period that is under one year.

 

Art, CEO, AnalystFinder.com

 

Trusted Advisor
Posts: 1,630

Re: identify observations in rolling 12 month period

Always good to learn new techniques and new ways of thinking. Back in the days before I had SAS/ETS (and therefore I didn't have PROC EXPAND), I probably would have had to create an immensely complicated data step to do this calculation, and even with PROC EXPAND, it would require multiple steps before PROC EXPAND could be used. So I am bookmarking this and I hope I get to use it someday.

Frequent Contributor
Posts: 76

Re: identify observations in rolling 12 month period

Hi Art,

 

Thanks so much for your new method. is it proper for "the patient has more than four records (includes four) in a period of 12 month?" I wrote the patient has four records in a period of 12 month, which is wrong (I've changed it now). Sorry for my mistake.

 

Thanks,

L

Frequent Contributor
Posts: 76

Re: identify observations in rolling 12 month period

Hi Art,

 

I tried to run your solution, but the output seems not correct. for example, in my real data, id = 14238 should not be marked as Serious since there is no more 4 records for the patient in any continuous 12 month. Any idea about this?

 

Yes142385/21/2007
Yes142386/14/2007
Yes142387/12/2007
Yes142387/15/2009
Yes142387/16/2009
Super User
Posts: 9,687

Re: identify observations in rolling 12 month period

 What do you mean by " if the variable time has four records within 12 month for the patient" ?

data abc;
  informat date mmddyy10.;
  informat time anydtdtm16.;
  input ID Date time;
  format date mmddyy10. time MDYAMPM.;
  datalines;
14022 6/22/2010 10/11/2004 9:02
14022 6/22/2010 3/25/2008 12:53
14022 6/22/2010 4/29/2008 9:04
14022 6/22/2010 6/17/2008 9:14
14022 6/22/2010 7/8/2008 10:41
14022 6/22/2010 8/12/2008 12:30
14022 6/22/2010 9/16/2008 14:24
14022 6/22/2010 10/21/2008 1:44
14022 6/22/2010 11/18/2008 2:11
14022 6/22/2010 12/30/2008 9:53
14022 6/22/2010 2/10/2009 13:07
14022 6/22/2010 4/7/2009 9:45
14022 6/22/2010 6/2/2009 7:35
14022 6/22/2010 7/28/2009 10:11
14022 6/22/2010 9/29/2009 10:46
14022 6/22/2010 12/1/2009 10:51
14022 6/22/2010 1/26/2010 8:49
14022 6/22/2010 5/23/2010 7:48
14022 6/22/2010 6/22/2010 7:46
14032 2/7/2012 7/24/2002 10:25
14032 2/7/2012 1/21/2003 12:00
14032 2/7/2012 1/28/2003 9:25
14032 2/7/2012 1/29/2008 9:00
14032 2/7/2012 3/20/2008 13:52
14032 2/7/2012 5/1/2008 8:53
14032 2/7/2012 6/12/2008 14:01
14032 2/7/2012 7/24/2008 15:00
14032 2/7/2012 9/11/2008 11:02
14032 2/7/2012 9/20/2011 9:25
14032 2/7/2012 11/1/2011 10:40
14032 2/7/2012 1/8/2012 12:58
14032 2/7/2012 2/7/2012 10:20
14032 2/7/2012 3/7/2012 9:18
14032 2/7/2012 5/15/2012 12:43
14032 2/7/2012 6/19/2012 9:19
14032 2/7/2012 7/31/2012 11:48
14032 2/7/2012 9/18/2012 12:15
14032 2/7/2012 10/16/2012 12:35
14032 2/7/2012 10/18/2012 11:12
14032 2/7/2012 10/19/2012 5:00
14032 2/7/2012 10/20/2012 4:33
14032 2/7/2012 10/21/2012 4:19
14032 2/7/2012 10/22/2012 4:30
14032 2/7/2012 10/23/2012 5:03
14032 2/7/2012 10/24/2012 3:14
14032 2/7/2012 10/30/2012 11:10
14032 2/7/2012 11/27/2012 12:30
14032 2/7/2012 1/8/2013 9:58
14032 2/7/2012 2/19/2013 10:07
14032 2/7/2012 4/2/2013 13:45
14032 2/7/2012 5/14/2013 10:18
14601 1/30/2014 4/26/2005 8:40
14601 1/30/2014 6/28/2005 9:05
14601 1/30/2014 7/26/2005 8:21
14601 1/30/2014 8/9/2005 8:24
14601 1/30/2014 8/23/2005 8:10
14601 1/30/2014 10/18/2005 9:20
14601 1/30/2014 12/29/2013 9:32
14601 1/30/2014 12/30/2013 9:04
14601 1/30/2014 2/16/2014 8:44
14601 1/30/2014 5/2/2006 10:31
14601 1/30/2014 7/12/2006 7:31
14601 1/30/2014 9/7/2006 8:55
14601 1/30/2014 10/31/2006 8:43
14601 1/30/2014 2/6/2007 9:15
14601 1/30/2014 5/3/2007 7:54
14601 1/30/2014 6/28/2007 7:52
14601 1/30/2014 8/23/2007 8:10
14601 1/30/2014 10/18/2007 8:17
14601 1/30/2014 11/19/2007 2:07
14601 1/30/2014 12/13/2007 8:16
14601 1/30/2014 1/24/2008 10:23
14601 1/30/2014 1/31/2008 10:07
14601 1/30/2014 2/12/2008 12:31
14601 1/30/2014 3/25/2008 9:52
;
run;

proc sql;
select *,case when 
(select count(*) from abc
 where id=a.id and time between a.time and intnx('dtmonth',a.time,12,'s')) > 3 then 1
else 0 end as flag
 from abc as a;
quit;   
Frequent Contributor
Posts: 76

Re: identify observations in rolling 12 month period

[ Edited ]

it means in the whole treatment for the patient (time from first record to last record for the patient), if the patient has more than four records in any continuous 12 month, the patient will be marked as serious. 

Even  leap years should be include in my result. for example, if there were 4 visit during the period 3/27/14 to 3/26/15, that would also count as SERIOUS.

 

Thanks,

PROC Star
Posts: 7,364

Re: identify observations in rolling 12 month period

[ Edited ]

As shown below, after adding the new records you mentioned, that new case results in NOT being identified as being SERIOUS, so I'm not sure what you ran.

 

In the code below, I made the changes that would be needed to accomodate your changed requirement (i.e., more than 4 months rather than four months or more):

 

data abc;
  informat date mmddyy10.;
  informat time anydtdtm16.;
  input ID Date time;
  format date mmddyy10. time MDYAMPM.;
  datalines;
14022 6/22/2010 10/11/2004 9:02
14022 6/22/2010 3/25/2008 12:53
14022 6/22/2010 4/29/2008 9:04
14022 6/22/2010 6/17/2008 9:14
14022 6/22/2010 7/8/2008 10:41
14022 6/22/2010 8/12/2008 12:30
14022 6/22/2010 9/16/2008 14:24
14022 6/22/2010 10/21/2008 1:44
14022 6/22/2010 11/18/2008 2:11
14022 6/22/2010 12/30/2008 9:53
14022 6/22/2010 2/10/2009 13:07
14022 6/22/2010 4/7/2009 9:45
14022 6/22/2010 6/2/2009 7:35
14022 6/22/2010 7/28/2009 10:11
14022 6/22/2010 9/29/2009 10:46
14022 6/22/2010 12/1/2009 10:51
14022 6/22/2010 1/26/2010 8:49
14022 6/22/2010 5/23/2010 7:48
14022 6/22/2010 6/22/2010 7:46
14032 2/7/2012 7/24/2002 10:25
14032 2/7/2012 1/21/2003 12:00
14032 2/7/2012 1/28/2003 9:25
14032 2/7/2012 1/29/2008 9:00
14032 2/7/2012 3/20/2008 13:52
14032 2/7/2012 5/1/2008 8:53
14032 2/7/2012 6/12/2008 14:01
14032 2/7/2012 7/24/2008 15:00
14032 2/7/2012 9/11/2008 11:02
14032 2/7/2012 9/20/2011 9:25
14032 2/7/2012 11/1/2011 10:40
14032 2/7/2012 1/8/2012 12:58
14032 2/7/2012 2/7/2012 10:20
14032 2/7/2012 3/7/2012 9:18
14032 2/7/2012 5/15/2012 12:43
14032 2/7/2012 6/19/2012 9:19
14032 2/7/2012 7/31/2012 11:48
14032 2/7/2012 9/18/2012 12:15
14032 2/7/2012 10/16/2012 12:35
14032 2/7/2012 10/18/2012 11:12
14032 2/7/2012 10/19/2012 5:00
14032 2/7/2012 10/20/2012 4:33
14032 2/7/2012 10/21/2012 4:19
14032 2/7/2012 10/22/2012 4:30
14032 2/7/2012 10/23/2012 5:03
14032 2/7/2012 10/24/2012 3:14
14032 2/7/2012 10/30/2012 11:10
14032 2/7/2012 11/27/2012 12:30
14032 2/7/2012 1/8/2013 9:58
14032 2/7/2012 2/19/2013 10:07
14032 2/7/2012 4/2/2013 13:45
14032 2/7/2012 5/14/2013 10:18
14238 2/7/2012 5/21/2007 10:18
14238 2/7/2012 6/14/2007 10:18
14238 2/7/2012 7/12/2007 10:18
14238 2/7/2012 7/15/2009 10:18
14238 2/7/2012 7/16/2009 10:18
14601 1/30/2014 4/26/2005 8:40
14601 1/30/2014 6/28/2005 9:05
14601 1/30/2014 7/26/2005 8:21
14601 1/30/2014 8/9/2005 8:24
14601 1/30/2014 8/23/2005 8:10
14601 1/30/2014 10/18/2005 9:20
14601 1/30/2014 12/29/2013 9:32
14601 1/30/2014 12/30/2013 9:04
14601 1/30/2014 2/16/2014 8:44
14601 1/30/2014 5/2/2006 10:31
14601 1/30/2014 7/12/2006 7:31
14601 1/30/2014 9/7/2006 8:55
14601 1/30/2014 10/31/2006 8:43
14601 1/30/2014 2/6/2007 9:15
14601 1/30/2014 5/3/2007 7:54
14601 1/30/2014 6/28/2007 7:52
14601 1/30/2014 8/23/2007 8:10
14601 1/30/2014 10/18/2007 8:17
14601 1/30/2014 11/19/2007 2:07
14601 1/30/2014 12/13/2007 8:16
14601 1/30/2014 1/24/2008 10:23
14601 1/30/2014 1/31/2008 10:07
14601 1/30/2014 2/12/2008 12:31
14601 1/30/2014 3/25/2008 9:52
;
run;

data want (drop=stack:);
  retain stack0-stack4;
  array stack(0:4) stack0-stack4;
  length serious $3;
  do until(last.id);
    set abc;
    by id;
    if first.id then do;
      serious='No';
      n=1;
      call missing(of stack(*));
    end;
    else n+1;
    stack(mod(n,5))=time;
    if n ge 5 and serious='No' then do;
      if yrdif(datepart(min(of stack(*))),datepart(max(of stack(*))),'age') le 1 then
        serious='Yes';
    end;
  end;
  do until(last.id);
    set abc;
    by id;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

Ask a Question
Discussion stats
  • 12 replies
  • 192 views
  • 0 likes
  • 5 in conversation