BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tpham
Quartz | Level 8

Hello,

So I am working with insurance data and I am trying to figure out how to count consecutive months. So in order to be included in my sample, the person needs to have 6 consecutive months of coverage. I am unsure how to approach my coding for this.

So for example:

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

So in the above example, we have 12 variables for the 12 months. So we see ID#1 3 and 4 meets the criteria, since they have 6 consecutive months of coverage. While ID 2 only have 2 3 consecutive months coverage and 5 only have 2 months coverage, so they will not be included in the sample.

My first thought is just sum all the months, but that will only tell me if they have atleast 6 months of coverage. So I am a bit lost on seeing how can I code it for it to be consecutive. Does anyone have any experience doing this and can provide me with any input?

Thank you so much for your help

Tan

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Tom's code should be OK. it is also suited for your new situation.

But mine need change a little bit .

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0 2001
1 1 1 1 0 1 1 0 0 0 0 0 0 2002
2 0 1 1 1 0 0 0 1 1 1 0 0 2002
3 0 0 0 1 1 1 1 1 1 1 0 0 2004
4 0 1 1 1 1 1 0 0 0 0 0 0 2001
5 0 1 1 0 0 0 0 0 0 0 0 0 2004
6 0 1 1 0 0 1 1 1 0 0 0 0 2010
7 0 1 1 0 0 0 0 0 0 0 0 1 2001
8 0 1 1 0 0 0 0 0 1 1 1 1 2004
8 0 1 1 0 0 0 1 1 1 1 1 1 2005
;
   run;
data temp(keep=id year name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,count(*) as n from temp 
   where value=1
    group by count
     having calculated n ge 6
      order by id,year,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id year start end n);
 set x;
 by id year;
 length start end $ 10;
 retain start end;
 if first.year then start=name;
 if last.year then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id year;
run;

Ksharp

View solution in original post

9 REPLIES 9
CTorres
Quartz | Level 8

Try this:

data want;

  set fakedata;

  array m(12) month1-month12;

  count=m(1);

  max=0;

  do i=2 to 12;

    if m(i) then count+1;

else do; if count>max then  max=count; count= 0; end;

  end;

  if count>max then max=count;

  if max ge 6;

  drop count i;

run;

Regards,

Reeza
Super User

Its better to exit the loop as soon as you find the 6 months, if your criteria is at least 6 months. If you want the max amount of coverage then CTorres solution will work.

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

data want;

set fakedata;

array month(12) month1-month12;

flag=0;count=0;

do i=1 to 12 while (flag ne 1);

    if month(i)=1 then count=count+1;

    else count=0;

    if count eq 6 then flag=1;

end;

run;

Tpham
Quartz | Level 8

So I don't know anything about do loops. Would the do loop you written, work for consecutive months? Just reading the code, I think I need to read up on do loops and have a better undertanding on what the code is doing itself. I'm out of the office today and would play with that tomorrow

data_null__
Jade | Level 19

Maybe treat months as a string.  This finds max coverage.

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;
   length x $12;
  
x = cats(of month:);
   drop month:;
   do _s = 1 by 0;
      _r = findc(x||
' ','1',_s);
      if _r eq 0 then leave;
      _t = findc(x||' ','1',_r,'K');
      _l = _t-_r;
      m = max(m,_l);
      _s=_t;
     
end;
  
drop _:;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0
2 0 1 1 1 0 0 0 1 1 1 0 0
3 0 0 0 1 1 1 1 1 1 1 0 0
4 0 1 1 1 1 1 0 0 0 0 0 0
5 0 1 1 0 0 0 0 0 0 0 0 0
6 0 1 1 0 0 1 1 1 0 0 0 0
7 0 1 1 0 0 0 0 0 0 0 0 1
8 0 1 1 0 0 0 0 0 1 1 1 1
;
  
run;
Tom
Super User Tom
Super User

An interesting way to do this is to use a character string to record the coverage.

data fakedata;

input id month1-month12;

cards;

1 1 1 1 1 1 1 0 0 0 0 0 0

2 0 1 1 1 0 0 0 1 1 1 0 0

3 0 0 0 1 1 1 1 1 1 1 0 0

4 0 1 1 1 1 1 0 0 0 0 0 0

5 0 1 1 0 0 0 0 0 0 0 0 0

;

run;

data want;

  set fakedata;

  length coverstring $12;

  coverstring = cats(of month1-month12);

  has6months = 0 ^= index(coverstring,repeat('1',6-1));

  put (id has6months coverstring) (=);

run;

id=1 has6months=1 coverstring=111111000000

id=2 has6months=0 coverstring=011100011100

id=3 has6months=1 coverstring=000111111100

id=4 has6months=0 coverstring=011111000000

id=5 has6months=0 coverstring=011000000000

Ksharp
Super User

if care about the start month and end month which has more than 6 series month.

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0
2 0 1 1 1 0 0 0 1 1 1 0 0
3 0 0 0 1 1 1 1 1 1 1 0 0
4 0 1 1 1 1 1 0 0 0 0 0 0
5 0 1 1 0 0 0 0 0 0 0 0 0
6 0 1 1 0 0 1 1 1 0 0 0 0
7 0 1 1 0 0 0 0 0 0 0 0 1
8 0 1 1 0 0 0 0 0 1 1 1 1
;
   run;
data temp(keep=id name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,sum(value=1) as n from temp 
   group by count
    having calculated n ge 6
      order by id,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id start end n);
 set x;
 by id;
 length start end $ 10;
 retain start end;
 if first.id then start=name;
 if last.id then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id;
run;

Ksharp

UrvishShah
Fluorite | Level 6

Hi,

How about the following one...

proc transpose data = have

               out  = have(rename = (_NAME_ = Month COL1 = VAL));

      by id;

      var month1-month12;

run;

data want;

   set have;

   by id;

   _temp = dif(val);

   if _temp = . then _temp = 0;

   if first.id or _temp = 1 then con_mnth = 1;

   else con_mnth + 1;

   if _temp < 0 then con_mnth = .;

run;

proc sql;

   create table want as

   select id,max(con_mnth) as con_mnth

   from want

   where val = 1

   group by id;

quit;

-Urvish

Tpham
Quartz | Level 8

Wow thanks for everyone's response.. I am going to take all of your suggestions and play around with it.

So things just got a bit more complex. So some patients might have 2 records, so I am assuming the best option is combine them into 1.

data fakedata;

input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;

datalines;

1 1 1 1 1 1 1 0 0 0 0 0 0 2001

1 0 1 1 1 1 1 0 0 0 0 0 0 2002;

run;


as an example.  I assume I have to create variables like Yr1_month1..etc...


I am unsure if converting the values to strings is an option (method that Tom suggested). I only use the dataline to illustrate what the dataset looks like. But I can give it a try others string suggestion also.


Tan

Ksharp
Super User

Tom's code should be OK. it is also suited for your new situation.

But mine need change a little bit .

data fakedata;
   input id month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 year;
   datalines;
1 1 1 1 1 1 1 0 0 0 0 0 0 2001
1 1 1 1 0 1 1 0 0 0 0 0 0 2002
2 0 1 1 1 0 0 0 1 1 1 0 0 2002
3 0 0 0 1 1 1 1 1 1 1 0 0 2004
4 0 1 1 1 1 1 0 0 0 0 0 0 2001
5 0 1 1 0 0 0 0 0 0 0 0 0 2004
6 0 1 1 0 0 1 1 1 0 0 0 0 2010
7 0 1 1 0 0 0 0 0 0 0 0 1 2001
8 0 1 1 0 0 0 0 0 1 1 1 1 2004
8 0 1 1 0 0 0 1 1 1 1 1 1 2005
;
   run;
data temp(keep=id year name value);
 set fakedata ;
 length name $ 10;
 array m{*} month1-month12;
 do i=1 to dim(m);
  name=vname(m{i});value=m{i};output;
 end;
run;
data temp;
 set temp;
 if value ne lag(value) or id ne lag(id) then count+1;
run;
proc sql;
 create table x as
  select *,count(*) as n from temp 
   where value=1
    group by count
     having calculated n ge 6
      order by id,year,input(compress(name, ,'kd'),best8.);
quit;
data xx(keep=id year start end n);
 set x;
 by id year;
 length start end $ 10;
 retain start end;
 if first.year then start=name;
 if last.year then do;end=name;output;end;
run;
data want;
 merge fakedata xx;
 by id year;
run;

Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 6693 views
  • 4 likes
  • 7 in conversation