DATA Step, Macro, Functions and more

count consecutive months

Accepted Solution Solved
Reply
Contributor
Posts: 58
Accepted Solution

count consecutive months

Hello,

So I am working with insurance data and I am trying to figure out how to count consecutive months. So in order to be included in my sample, the person needs to have 6 consecutive months of coverage. I am unsure how to approach my coding for this.

So for example:

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

So in the above example, we have 12 variables for the 12 months. So we see ID#1 3 and 4 meets the criteria, since they have 6 consecutive months of coverage. While ID 2 only have 2 3 consecutive months coverage and 5 only have 2 months coverage, so they will not be included in the sample.

My first thought is just sum all the months, but that will only tell me if they have atleast 6 months of coverage. So I am a bit lost on seeing how can I code it for it to be consecutive. Does anyone have any experience doing this and can provide me with any input?

Thank you so much for your help

Tan


Accepted Solutions
Solution
‎07-11-2013 08:52 PM
Super User
Posts: 10,044

Re: count consecutive months

Tom's code should be OK. it is also suited for your new situation.

But mine need change a little bit .

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0 2001
1 1 1 1 0 1 1 0 0 0 0 0 0 2002
2 0 1 1 1 0 0 0 1 1 1 0 0 2002
3 0 0 0 1 1 1 1 1 1 1 0 0 2004
4 0 1 1 1 1 1 0 0 0 0 0 0 2001
5 0 1 1 0 0 0 0 0 0 0 0 0 2004
6 0 1 1 0 0 1 1 1 0 0 0 0 2010
7 0 1 1 0 0 0 0 0 0 0 0 1 2001
8 0 1 1 0 0 0 0 0 1 1 1 1 2004
8 0 1 1 0 0 0 1 1 1 1 1 1 2005
;
   run;
data temp(keep=id year name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,count(*) as n from temp 
   where value=1
    group by count
     having calculated n ge 6
      order by id,year,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id year start end n);
 set x;
 by id year;
 length start end $ 10;
 retain start end;
 if first.year then start=name;
 if last.year then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id year;
run;

Ksharp

View solution in original post


All Replies
Regular Contributor
Posts: 180

Re: count consecutive months

Try this:

data want;

  set fakedata;

  array m(12) month1-month12;

  count=m(1);

  max=0;

  do i=2 to 12;

    if m(i) then count+1;

else do; if count>max then  max=count; count= 0; end;

  end;

  if count>max then max=count;

  if max ge 6;

  drop count i;

run;

Regards,

Super User
Posts: 19,860

Re: count consecutive months

Its better to exit the loop as soon as you find the 6 months, if your criteria is at least 6 months. If you want the max amount of coverage then CTorres solution will work.

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

data want;

set fakedata;

array month(12) month1-month12;

flag=0;count=0;

do i=1 to 12 while (flag ne 1);

    if month(i)=1 then count=count+1;

    else count=0;

    if count eq 6 then flag=1;

end;

run;

Contributor
Posts: 58

Re: count consecutive months

So I don't know anything about do loops. Would the do loop you written, work for consecutive months? Just reading the code, I think I need to read up on do loops and have a better undertanding on what the code is doing itself. I'm out of the office today and would play with that tomorrow

Respected Advisor
Posts: 3,799

Re: count consecutive months

Maybe treat months as a string.  This finds max coverage.

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;
   length x $12;
  
x = cats(of monthSmiley Happy;
   drop month:;
   do _s = 1 by 0;
      _r = findc(x||
' ','1',_s);
      if _r eq 0 then leave;
      _t = findc(x||' ','1',_r,'K');
      _l = _t-_r;
      m = max(m,_l);
      _s=_t;
     
end;
  
drop _:;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0
2 0 1 1 1 0 0 0 1 1 1 0 0
3 0 0 0 1 1 1 1 1 1 1 0 0
4 0 1 1 1 1 1 0 0 0 0 0 0
5 0 1 1 0 0 0 0 0 0 0 0 0
6 0 1 1 0 0 1 1 1 0 0 0 0
7 0 1 1 0 0 0 0 0 0 0 0 1
8 0 1 1 0 0 0 0 0 1 1 1 1
;
  
run;
Super User
Super User
Posts: 7,076

Re: count consecutive months

An interesting way to do this is to use a character string to record the coverage.

data fakedata;

input id month1-month12;

cards;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

data want;

  set fakedata;

  length coverstring $12;

  coverstring = cats(of month1-month12);

  has6months = 0 ^= index(coverstring,repeat('1',6-1));

  put (id has6months coverstring) (=);

run;

id=1 has6months=1 coverstring=111111000000

id=2 has6months=0 coverstring=011100011100

id=3 has6months=1 coverstring=000111111100

id=4 has6months=0 coverstring=011111000000

id=5 has6months=0 coverstring=011000000000

Super User
Posts: 10,044

Re: count consecutive months

if care about the start month and end month which has more than 6 series month.

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0
2 0 1 1 1 0 0 0 1 1 1 0 0
3 0 0 0 1 1 1 1 1 1 1 0 0
4 0 1 1 1 1 1 0 0 0 0 0 0
5 0 1 1 0 0 0 0 0 0 0 0 0
6 0 1 1 0 0 1 1 1 0 0 0 0
7 0 1 1 0 0 0 0 0 0 0 0 1
8 0 1 1 0 0 0 0 0 1 1 1 1
;
   run;
data temp(keep=id name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,sum(value=1) as n from temp 
   group by count
    having calculated n ge 6
      order by id,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id start end n);
 set x;
 by id;
 length start end $ 10;
 retain start end;
 if first.id then start=name;
 if last.id then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id;
run;

Ksharp

Regular Contributor
Posts: 195

Re: count consecutive months

Hi,

How about the following one...

proc transpose data = have

               out  = have(rename = (_NAME_ = Month COL1 = VAL));

      by id;

      var month1-month12;

run;

data want;

   set have;

   by id;

   _temp = dif(val);

   if _temp = . then _temp = 0;

   if first.id or _temp = 1 then con_mnth = 1;

   else con_mnth + 1;

   if _temp < 0 then con_mnth = .;

run;

proc sql;

   create table want as

   select id,max(con_mnth) as con_mnth

   from want

   where val = 1

   group by id;

quit;

-Urvish

Contributor
Posts: 58

Re: count consecutive months

Wow thanks for everyone's response.. I am going to take all of your suggestions and play around with it.

So things just got a bit more complex. So some patients might have 2 records, so I am assuming the best option is combine them into 1.

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0 2001

1 0 1 1 1 1 1 0 0 0 0 0 0 2002;

run;


as an example.  I assume I have to create variables like Yr1_month1..etc...


I am unsure if converting the values to strings is an option (method that Tom suggested). I only use the dataline to illustrate what the dataset looks like. But I can give it a try others string suggestion also.


Tan

Solution
‎07-11-2013 08:52 PM
Super User
Posts: 10,044

Re: count consecutive months

Tom's code should be OK. it is also suited for your new situation.

But mine need change a little bit .

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0 2001
1 1 1 1 0 1 1 0 0 0 0 0 0 2002
2 0 1 1 1 0 0 0 1 1 1 0 0 2002
3 0 0 0 1 1 1 1 1 1 1 0 0 2004
4 0 1 1 1 1 1 0 0 0 0 0 0 2001
5 0 1 1 0 0 0 0 0 0 0 0 0 2004
6 0 1 1 0 0 1 1 1 0 0 0 0 2010
7 0 1 1 0 0 0 0 0 0 0 0 1 2001
8 0 1 1 0 0 0 0 0 1 1 1 1 2004
8 0 1 1 0 0 0 1 1 1 1 1 1 2005
;
   run;
data temp(keep=id year name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,count(*) as n from temp 
   where value=1
    group by count
     having calculated n ge 6
      order by id,year,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id year start end n);
 set x;
 by id year;
 length start end $ 10;
 retain start end;
 if first.year then start=name;
 if last.year then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id year;
run;

Ksharp

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 2180 views
  • 4 likes
  • 7 in conversation