Re: First and Last date in a wide format data set

Reply
Super User
Super User
Posts: 6,500

Re: First and Last date in a wide format data set

data have;

input id month1-month12;

cards;

1 0 0 1 1 1 1 0 0 0 0 0 0

2 0 0 1 1 1 1 1 1 1 0 0 0

3 1 1 1 1 1 1 1 0 0 0 0 0

4 0 0 0 0 0 0 0 0 0 0 0 0

5 1 1 1 1 1 1 1 1 1 1 1 1

;

run;

data want ;

  set have ;

  length str $12 ;

  str = cats(of month1-month12);

  first = index(str,'1');

  last = first + indexc(cats(substr(str,first+1),'0'),'0') -1 ;

  put (id first last str) (3*3. +1 $12.);

run;

  1  3  6 001111000000

  2  3  9 001111111000

  3  1  7 111111100000

  4  0  0 000000000000

  5  1 12 111111111111

Respected Advisor
Posts: 3,777

Re: First and Last date in a wide format data set

I think features of the FINDC function make this a little bit easier.

data need;
   set have;
   length s $12;
   s=cats(of monthSmiley Happy;
   f=findc(s,
'1',1);
   if f gt 0
     
then l=findc(cats(s,'0'),'0',f)-1;
     
else l=0;
  
run;
Contributor
Posts: 20

Re: First and Last date in a wide format data set

Thank you Tom, this does work for what I wanted. Unfortunately, the plot thickens. I am trying to construct episodes of coverage for patients in an insurance database. My episodes of coverage are continuous periods of time that a patient was enrolled in an insurance plan. This is what I have:

ID   YEAR   ENRIND1-ENRIND12    MEMDAYS1-MEMDAYS12

101 2008    0 0 1 1 1 1 1 1 1 1 1 1   0 0 31 30……………..31

101 2009    1 1 1 1 1 0 0 1 1 1 1 1   31 28…0 0……………31

101 2010    1 1 1 1 1 1 1 1 1 0 0 0   31 28………………0 0 0

What I want to make is two episodes of continuous coverage for this:

ID     FIRST1     LAST1      FIRST2      LAST2      …  FIRSTN    LASTN

101  2/1/2008  5/31/2009    8/1/2009   9/30/2010

The MEMDAYS1-12 variables might be necessary to create the day for LAST1-N date variables. The FIRST1-N will always start with the first of the month.


Complications being that the duplications for IDs are necessary for other parts of the analysis and I think it will be quite fine to end up with something like this:

ID     FIRST1     LAST1      FIRST2      LAST2      …  FIRSTN    LASTN

101  2/1/2008  5/31/2009    8/1/2009   9/30/2010

101  2/1/2008  5/31/2009    8/1/2009   9/30/2010

101  2/1/2008  5/31/2009    8/1/2009   9/30/2010

I added the FIRSTN LASTN because some people may have > 2 periods of continuous coverage.

Any assistance here is appreciated, I only wish I could repay in some way.

Thank you in advance.

Ask a Question
Discussion stats
  • 2 replies
  • 192 views
  • 1 like
  • 3 in conversation