BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sophia_SAS
Obsidian | Level 7

Dear SAS users,

 

Do you have suggestions on how I can create different start and stop dates based on noncontinuous data points?  For example, for each observation I want to create variables mon_start1, mon_end1, mon_start2, mon_end2, mon_start3, mon_end3, etc.  I can create mon_start1 and mon_end1, but cannot figure out how to calculate the remaining variables (mon_start2, mon_end2, mon_start3, mon_end3, etc.) 

 

Here's a sample dataset for months Jan 2015 - Dec 2016 along with the beginning sas code where I created variables mon_start1 and mon_end1.   For example, for ID=AA, the start/end variables should be as follows:

mon_start1=1, mon_end1=6,

mon_start2=10, mon_end2=18,

mon_start3=22,  mon_end3=24

 

data have;

input id $ dob mmddyy10. total_mon mon1-mon24;

format dob monyy7.;

datalines;

AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1

BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .

DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .

;

 

data have2;

set have;

start=1;

 

Mon_Start1=(intck('month','01Jan2015'd,dob))+1;

 

array month mon:;

do m=Mon_Start1-start+1 to dim(month) while(month[m]);

end;

 

if m>=24 then Mon_end1=total_mon;

else Mon_End1=m-1;

run;

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Like any question where you have many columns, it is better to drop the "Excel thinking" and normalise your data, i.e. have data going down in observations.  Now the below code, loops over the array and creates a normalised data structure with start/stop times, if the data was normalised in the first place it would be even simpler than this.  Remember your output report files can be transposed before output and so don't need to look like the data you work with.

data have;
  input id $ dob mmddyy10. total_mon mon1-mon24;
  format dob monyy7.;
datalines;
AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1
BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .
DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .
;
run;

data want (keep=id dob start_month end_month);
  set have;
  array mon{24};
  length start_month end_month $20;
  do i=1 to 24;
    if mon{i}=1 and start_month="" then start_month=vname(mon{i});
    else if (mon{i}=. and end_month="" and start_month ne "") then do;
      end_month=vname(mon{i-1});
      output;
      end_month="";
      start_month="";
    end;
    else if i=24 and end_month="" and start_month ne "" then do;
      end_month=vname(mon{i});
      output;
    end;
 end;
run;

View solution in original post

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Like any question where you have many columns, it is better to drop the "Excel thinking" and normalise your data, i.e. have data going down in observations.  Now the below code, loops over the array and creates a normalised data structure with start/stop times, if the data was normalised in the first place it would be even simpler than this.  Remember your output report files can be transposed before output and so don't need to look like the data you work with.

data have;
  input id $ dob mmddyy10. total_mon mon1-mon24;
  format dob monyy7.;
datalines;
AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1
BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .
DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .
;
run;

data want (keep=id dob start_month end_month);
  set have;
  array mon{24};
  length start_month end_month $20;
  do i=1 to 24;
    if mon{i}=1 and start_month="" then start_month=vname(mon{i});
    else if (mon{i}=. and end_month="" and start_month ne "") then do;
      end_month=vname(mon{i-1});
      output;
      end_month="";
      start_month="";
    end;
    else if i=24 and end_month="" and start_month ne "" then do;
      end_month=vname(mon{i});
      output;
    end;
 end;
run;
sophia_SAS
Obsidian | Level 7

This is a beautiful code and does even more than I had hoped for with my initial question! And yes, you are correct, I need to reframe how I think of my data - i.e. drop the Excel thinking.'

 

Thank you for putting it together so quickly. 

 

art297
Opal | Level 21

I wouldn't be so quick in deciding that @RW9's code does everything you want to accomplish. I added two example records where the code fails to work.

 

Of course, I'm not suggesting that my code is perfect either, but it doesn't require restructuring your data:

 

data have;
  input id $ dob mmddyy10. total_mon mon1-mon24;
  format dob monyy7.;
  datalines;
AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1
BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .
DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .
FF 03/12/2016 8 . . . . . . . . . . . . . . . . . . . . . . . 1
GG 03/12/2016 8 . . . . . . . . . . . . . . . . . . . . . . . .
;

data want (drop=i start_month period);
  set have;
  array month{24} mon1-mon24;
  array mon_start(3);
  array mon_end(3);
  period=1;
  call missing(start_month);
  do i=1 to 24;
    if month{i}=1 then do;
      if missing(start_month) then do;
        start_month=i;
        mon_start{period}=i;
      end;
      if i eq 24 then mon_end{period}=i;
    end;
    else if missing(month{i}) and missing(mon_end{period}) and not missing(start_month) then do;
      mon_end{period}=i-1;
      period+1;
      call missing(start_month);
    end;
 end;
run;

Art, CEO, AnalystFinder.com

 

sophia_SAS
Obsidian | Level 7
Hi Art297,

Thanks for responding to my inquiry! You are right in that the earlier code did not catch your two additional IDs - i.e. FF and GG. In my dataset while I never have any GG instances, I do have instances similar to FF (i.e. only 1 data point in the last month). For these examples, I added in a dummy month variable (month25).



Thank you for carefully considering this question and your excellent code. I'm now trying to figure out which to use given my current data structure and plan forward.


art297
Opal | Level 21

@Ksharp's code works well too and provides the result in a long form like the one that @RW9 proposed.

 

Art, CEO, AnalystFinder.com

 

Ksharp
Super User
options missing='x';
data have;
  input id $ dob mmddyy10. total_mon mon1-mon24;
  format dob monyy7.;
  datalines;
AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1
BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .
DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .
FF 03/12/2016 8 . . . . . . . . . . . . . . . . . . . . . . . 1
GG 03/12/2016 8 . . . . . . . . . . . . . . . . . . . . . . . .
;

data want;
 set have;
 temp=cats(of mon1-mon24);
 pid=prxparse('/1+/');
 s=1;e=length(temp);
 call prxnext(pid,s,e,temp,p,l);
 do while(p>0);
  start=p;end=p+l-1; output;
  call prxnext(pid,s,e,temp,p,l);
 end;
drop temp pid s e p l;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1651 views
  • 0 likes
  • 4 in conversation