BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tpham
Quartz | Level 8

Hello,

So I am working with insurance data and I am trying to figure out how to count consecutive months. So in order to be included in my sample, the person needs to have 6 consecutive months of coverage. I am unsure how to approach my coding for this.

So for example:

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

So in the above example, we have 12 variables for the 12 months. So we see ID#1 3 and 4 meets the criteria, since they have 6 consecutive months of coverage. While ID 2 only have 2 3 consecutive months coverage and 5 only have 2 months coverage, so they will not be included in the sample.

My first thought is just sum all the months, but that will only tell me if they have atleast 6 months of coverage. So I am a bit lost on seeing how can I code it for it to be consecutive. Does anyone have any experience doing this and can provide me with any input?

Thank you so much for your help

Tan

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Tom's code should be OK. it is also suited for your new situation.

But mine need change a little bit .

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0 2001
1 1 1 1 0 1 1 0 0 0 0 0 0 2002
2 0 1 1 1 0 0 0 1 1 1 0 0 2002
3 0 0 0 1 1 1 1 1 1 1 0 0 2004
4 0 1 1 1 1 1 0 0 0 0 0 0 2001
5 0 1 1 0 0 0 0 0 0 0 0 0 2004
6 0 1 1 0 0 1 1 1 0 0 0 0 2010
7 0 1 1 0 0 0 0 0 0 0 0 1 2001
8 0 1 1 0 0 0 0 0 1 1 1 1 2004
8 0 1 1 0 0 0 1 1 1 1 1 1 2005
;
   run;
data temp(keep=id year name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,count(*) as n from temp 
   where value=1
    group by count
     having calculated n ge 6
      order by id,year,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id year start end n);
 set x;
 by id year;
 length start end $ 10;
 retain start end;
 if first.year then start=name;
 if last.year then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id year;
run;

Ksharp

View solution in original post

9 REPLIES 9
CTorres
Quartz | Level 8

Try this:

data want;

  set fakedata;

  array m(12) month1-month12;

  count=m(1);

  max=0;

  do i=2 to 12;

    if m(i) then count+1;

else do; if count>max then  max=count; count= 0; end;

  end;

  if count>max then max=count;

  if max ge 6;

  drop count i;

run;

Regards,

Reeza
Super User

Its better to exit the loop as soon as you find the 6 months, if your criteria is at least 6 months. If you want the max amount of coverage then CTorres solution will work.

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

data want;

set fakedata;

array month(12) month1-month12;

flag=0;count=0;

do i=1 to 12 while (flag ne 1);

    if month(i)=1 then count=count+1;

    else count=0;

    if count eq 6 then flag=1;

end;

run;

Tpham
Quartz | Level 8

So I don't know anything about do loops. Would the do loop you written, work for consecutive months? Just reading the code, I think I need to read up on do loops and have a better undertanding on what the code is doing itself. I'm out of the office today and would play with that tomorrow

data_null__
Jade | Level 19

Maybe treat months as a string.  This finds max coverage.

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;
   length x $12;
  
x = cats(of month:);
   drop month:;
   do _s = 1 by 0;
      _r = findc(x||
' ','1',_s);
      if _r eq 0 then leave;
      _t = findc(x||' ','1',_r,'K');
      _l = _t-_r;
      m = max(m,_l);
      _s=_t;
     
end;
  
drop _:;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0
2 0 1 1 1 0 0 0 1 1 1 0 0
3 0 0 0 1 1 1 1 1 1 1 0 0
4 0 1 1 1 1 1 0 0 0 0 0 0
5 0 1 1 0 0 0 0 0 0 0 0 0
6 0 1 1 0 0 1 1 1 0 0 0 0
7 0 1 1 0 0 0 0 0 0 0 0 1
8 0 1 1 0 0 0 0 0 1 1 1 1
;
  
run;
Tom
Super User Tom
Super User

An interesting way to do this is to use a character string to record the coverage.

data fakedata;

input id month1-month12;

cards;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

data want;

  set fakedata;

  length coverstring $12;

  coverstring = cats(of month1-month12);

  has6months = 0 ^= index(coverstring,repeat('1',6-1));

  put (id has6months coverstring) (=);

run;

id=1 has6months=1 coverstring=111111000000

id=2 has6months=0 coverstring=011100011100

id=3 has6months=1 coverstring=000111111100

id=4 has6months=0 coverstring=011111000000

id=5 has6months=0 coverstring=011000000000

Ksharp
Super User

if care about the start month and end month which has more than 6 series month.

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0
2 0 1 1 1 0 0 0 1 1 1 0 0
3 0 0 0 1 1 1 1 1 1 1 0 0
4 0 1 1 1 1 1 0 0 0 0 0 0
5 0 1 1 0 0 0 0 0 0 0 0 0
6 0 1 1 0 0 1 1 1 0 0 0 0
7 0 1 1 0 0 0 0 0 0 0 0 1
8 0 1 1 0 0 0 0 0 1 1 1 1
;
   run;
data temp(keep=id name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,sum(value=1) as n from temp 
   group by count
    having calculated n ge 6
      order by id,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id start end n);
 set x;
 by id;
 length start end $ 10;
 retain start end;
 if first.id then start=name;
 if last.id then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id;
run;

Ksharp

UrvishShah
Fluorite | Level 6

Hi,

How about the following one...

proc transpose data = have

               out  = have(rename = (_NAME_ = Month COL1 = VAL));

      by id;

      var month1-month12;

run;

data want;

   set have;

   by id;

   _temp = dif(val);

   if _temp = . then _temp = 0;

   if first.id or _temp = 1 then con_mnth = 1;

   else con_mnth + 1;

   if _temp < 0 then con_mnth = .;

run;

proc sql;

   create table want as

   select id,max(con_mnth) as con_mnth

   from want

   where val = 1

   group by id;

quit;

-Urvish

Tpham
Quartz | Level 8

Wow thanks for everyone's response.. I am going to take all of your suggestions and play around with it.

So things just got a bit more complex. So some patients might have 2 records, so I am assuming the best option is combine them into 1.

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0 2001

1 0 1 1 1 1 1 0 0 0 0 0 0 2002;

run;


as an example.  I assume I have to create variables like Yr1_month1..etc...


I am unsure if converting the values to strings is an option (method that Tom suggested). I only use the dataline to illustrate what the dataset looks like. But I can give it a try others string suggestion also.


Tan

Ksharp
Super User

Tom's code should be OK. it is also suited for your new situation.

But mine need change a little bit .

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0 2001
1 1 1 1 0 1 1 0 0 0 0 0 0 2002
2 0 1 1 1 0 0 0 1 1 1 0 0 2002
3 0 0 0 1 1 1 1 1 1 1 0 0 2004
4 0 1 1 1 1 1 0 0 0 0 0 0 2001
5 0 1 1 0 0 0 0 0 0 0 0 0 2004
6 0 1 1 0 0 1 1 1 0 0 0 0 2010
7 0 1 1 0 0 0 0 0 0 0 0 1 2001
8 0 1 1 0 0 0 0 0 1 1 1 1 2004
8 0 1 1 0 0 0 1 1 1 1 1 1 2005
;
   run;
data temp(keep=id year name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,count(*) as n from temp 
   where value=1
    group by count
     having calculated n ge 6
      order by id,year,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id year start end n);
 set x;
 by id year;
 length start end $ 10;
 retain start end;
 if first.year then start=name;
 if last.year then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id year;
run;

Ksharp

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 7334 views
  • 4 likes
  • 7 in conversation