DATA Step, Macro, Functions and more

continuous enrollment counting

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 104
Accepted Solution

continuous enrollment counting

Hi SAS community,

I would like to identify the individuals who have been continuously enrolled beginning from their start date to either end of study (12/31/2015) or date of event. It is possible for a subject to be enrolled prior to the start date. However, I do not want to count the months before the listed start date .Please advise on suggested coding tips.  Below is a simplified version of the dataset.  Using this example, the Individuals who meet the criteria, include AA, BB, EE.

data have;

input SubID startdate eventdate enddate mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12;

datalines;

AA 03/10/2015 . 12/31/2015 . . 1 1 1 1 1 1 1 1 1 1

BB 05/20/2015 08/02/2015 12/31/2015 . . . 1 1 1 1 1 1 1 1 1

CC 01/18/2015 . 12/31/2015 1 1 1 1 . . . . . 1 . .

DD 08/01/2015 . 12/31/2015 . . . . . . 1 1 . 1

EE 02/25/2015 . 05/11/2015 12/31/2015 . 1 1 1 1 1 1 1 1 . . 1;

FF 04/09/2015 11/10/2015 12/31/2015 . 1 1 1 1 . 1 1 1 1 1 .

;;

Thanks!


Accepted Solutions
Solution
‎08-06-2015 12:34 PM
Valued Guide
Posts: 858

Re: continuous enrollment counting

I think I have a solution for you, let me know if this does it, first I got the applicable range, then made sure that your mon1-12 variable within that range was populated:

data have;

infile datalines ;

informat subid $2. startdate eventdate enddate mmddyy10.;;

format subid $2. startdate eventdate enddate mmddyy10.;

input SubID$ startdate eventdate enddate mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12;

datalines;

AA 03/10/2015 . 12/31/2015 . . 1 1 1 1 1 1 1 1 1 1

BB 05/20/2015 08/02/2015 12/31/2015 . . . 1 1 1 1 1 1 1 1 1

CC 01/18/2015 . 12/31/2015 1 1 1 1 . . . . . 1 . .

DD 08/01/2015 . 12/31/2015 . . . . . . 1 1 . 1 . .

EE 02/25/2015 05/11/2015 12/31/2015 . 1 1 1 1 1 1 1 1 . . 1

FF 04/09/2015 11/10/2015 12/31/2015 . 1 1 1 1 . 1 1 1 1 1 .

;

data range;

set have;

start = month(startdate);

if not missing(eventdate) then end = month(eventdate);

    else end = month(enddate);

run;

proc transpose data=range out=tran_range;by subid start end;var mon:;

data compress;

set tran_range;

where start <= input(compress(_NAME_,'a','l'),8.) <= end;

if missing(col1) then col1 = 0;

run;

proc sql;

create table want as

select distinct subid,'Yes' as Eligible,end - start + 1 as sum_months

from compress

group by subid

having min(col1) = 1;

data final;

merge want(in=a)

      have(in=b);

by subid;

if b;

if b and not a then Eligible = 'No';

run;

Message was edited by: Mark Johnson

View solution in original post


All Replies
Super User
Posts: 17,960

Re: continuous enrollment counting

What do mon1 mon2 represent, how are they derived?

How do you determine/define continuous enrollment?

Super User
Posts: 10,550

Re: continuous enrollment counting

Also provide an example of what you want the output to look like.

Frequent Contributor
Posts: 104

Re: continuous enrollment counting

Thanks Reeza and Ballardw for your questions. See my responses to your questions.  Let me know if I should clarify further!  Thank you!

1. What do mon1 mon2 represent, how are they derived?

mon1, mon2, . . . mon12 = the months in 2013 where mon1 = Jan 2013, mon2 = Feb 2013 ... mon12=Dec 2013.  1 = enrolled; . = not enrolled.

For example, subject AA was enrolled from mon3-mon12 or from March 2013 to Dec 2013.

2. How do you determine/define continuous enrollment?

Continuous enrollment is define by enrollment (i.e. value=1) for at least 1 day in every month from start date to end date or date of event. For example, Subject AA was continuously enrolled for the defined period because there is a 1 from Mon3-Mon12 and AA start date is 3/10/2015 (Mon3) and end date is 12/31/2015 (Mon12).  In contrast, Subject CC was not continuously enrolled for the desired time frame because CC had start date and end dates from Jan 2013-Dec 2013 and was not enrolled in mon5-mon9, and mon11-mon12.

3. Provide an example of what you want the output to look like.

Ideally, the output will contain at least 2 new variables: eligible (i.e. met the continuously enrollment criteria) and if yes, the sum of months from start to end or start to event.  For example, see the same dataset below with two new variables.

data want;

input SubID startdate eventdate enddate mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12 eligible sum_months;

datalines;

AA 03/10/2015 . 12/31/2015 . . 1 1 1 1 1 1 1 1 1 1 yes 10

BB 05/20/2015 08/02/2015 12/31/2015 . . . 1 1 1 1 1 1 1 1 1 yes 4

CC 01/18/2015 . 12/31/2015 1 1 1 1 . . . . . 1 . . no .

DD 08/01/2015 . 12/31/2015 . . . . . . 1 1 . 1 no .

EE 02/25/2015 . 05/11/2015 12/31/2015 . 1 1 1 1 1 1 1 1 . . 1 yes 4

FF 04/09/2015 11/10/2015 12/31/2015 . 1 1 1 1 . 1 1 1 1 1 . no .

;;

Solution
‎08-06-2015 12:34 PM
Valued Guide
Posts: 858

Re: continuous enrollment counting

I think I have a solution for you, let me know if this does it, first I got the applicable range, then made sure that your mon1-12 variable within that range was populated:

data have;

infile datalines ;

informat subid $2. startdate eventdate enddate mmddyy10.;;

format subid $2. startdate eventdate enddate mmddyy10.;

input SubID$ startdate eventdate enddate mon1 mon2 mon3 mon4 mon5 mon6 mon7 mon8 mon9 mon10 mon11 mon12;

datalines;

AA 03/10/2015 . 12/31/2015 . . 1 1 1 1 1 1 1 1 1 1

BB 05/20/2015 08/02/2015 12/31/2015 . . . 1 1 1 1 1 1 1 1 1

CC 01/18/2015 . 12/31/2015 1 1 1 1 . . . . . 1 . .

DD 08/01/2015 . 12/31/2015 . . . . . . 1 1 . 1 . .

EE 02/25/2015 05/11/2015 12/31/2015 . 1 1 1 1 1 1 1 1 . . 1

FF 04/09/2015 11/10/2015 12/31/2015 . 1 1 1 1 . 1 1 1 1 1 .

;

data range;

set have;

start = month(startdate);

if not missing(eventdate) then end = month(eventdate);

    else end = month(enddate);

run;

proc transpose data=range out=tran_range;by subid start end;var mon:;

data compress;

set tran_range;

where start <= input(compress(_NAME_,'a','l'),8.) <= end;

if missing(col1) then col1 = 0;

run;

proc sql;

create table want as

select distinct subid,'Yes' as Eligible,end - start + 1 as sum_months

from compress

group by subid

having min(col1) = 1;

data final;

merge want(in=a)

      have(in=b);

by subid;

if b;

if b and not a then Eligible = 'No';

run;

Message was edited by: Mark Johnson

Frequent Contributor
Posts: 104

Re: continuous enrollment counting

Hi Mark!

Your solution works great.  Thank you for your time!

One other follow-up question. I actually have multiple years of data that I'm working with so my start and end dates frequently have different years.  For example, subject MM start date= 06/05/2013 and end date = 12/31/2015.  Do you have a suggestion on how to work with those differences?

Thanks again!

Valued Guide
Posts: 858

Re: continuous enrollment counting

What do your variables look like, var1 var2?

Frequent Contributor
Posts: 104

Re: continuous enrollment counting

I'm sorry for not mentioning the multiple years of data.  I was trying to simplify things . . .

As for your question, the only difference in my variables is how the months are labeled.  There are actually 36 months with Mon1=Jan 2013 and Mon36=Dec2015 everything else is the same.

Frequent Contributor
Posts: 104

Re: continuous enrollment counting

Hi Mark! I figured out a solution! Thanks again!

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 918 views
  • 0 likes
  • 4 in conversation