Obsidian | Level 7

## Creating dates based on different start and stop intervals

Dear SAS users,

Do you have suggestions on how I can create different start and stop dates based on noncontinuous data points?  For example, for each observation I want to create variables mon_start1, mon_end1, mon_start2, mon_end2, mon_start3, mon_end3, etc.  I can create mon_start1 and mon_end1, but cannot figure out how to calculate the remaining variables (mon_start2, mon_end2, mon_start3, mon_end3, etc.)

Here's a sample dataset for months Jan 2015 - Dec 2016 along with the beginning sas code where I created variables mon_start1 and mon_end1.   For example, for ID=AA, the start/end variables should be as follows:

mon_start1=1, mon_end1=6,

mon_start2=10, mon_end2=18,

mon_start3=22,  mon_end3=24

data have;

input id \$ dob mmddyy10. total_mon mon1-mon24;

format dob monyy7.;

datalines;

AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1

BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .

DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .

;

data have2;

set have;

start=1;

Mon_Start1=(intck('month','01Jan2015'd,dob))+1;

array month mon:;

do m=Mon_Start1-start+1 to dim(month) while(month[m]);

end;

if m>=24 then Mon_end1=total_mon;

else Mon_End1=m-1;

run;

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Diamond | Level 26

## Re: Creating dates based on different start and stop intervals

Like any question where you have many columns, it is better to drop the "Excel thinking" and normalise your data, i.e. have data going down in observations.  Now the below code, loops over the array and creates a normalised data structure with start/stop times, if the data was normalised in the first place it would be even simpler than this.  Remember your output report files can be transposed before output and so don't need to look like the data you work with.

```data have;
input id \$ dob mmddyy10. total_mon mon1-mon24;
format dob monyy7.;
datalines;
AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1
BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .
DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .
;
run;

data want (keep=id dob start_month end_month);
set have;
array mon{24};
length start_month end_month \$20;
do i=1 to 24;
if mon{i}=1 and start_month="" then start_month=vname(mon{i});
else if (mon{i}=. and end_month="" and start_month ne "") then do;
end_month=vname(mon{i-1});
output;
end_month="";
start_month="";
end;
else if i=24 and end_month="" and start_month ne "" then do;
end_month=vname(mon{i});
output;
end;
end;
run;```
6 REPLIES 6
Diamond | Level 26

## Re: Creating dates based on different start and stop intervals

Like any question where you have many columns, it is better to drop the "Excel thinking" and normalise your data, i.e. have data going down in observations.  Now the below code, loops over the array and creates a normalised data structure with start/stop times, if the data was normalised in the first place it would be even simpler than this.  Remember your output report files can be transposed before output and so don't need to look like the data you work with.

```data have;
input id \$ dob mmddyy10. total_mon mon1-mon24;
format dob monyy7.;
datalines;
AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1
BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .
DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .
;
run;

data want (keep=id dob start_month end_month);
set have;
array mon{24};
length start_month end_month \$20;
do i=1 to 24;
if mon{i}=1 and start_month="" then start_month=vname(mon{i});
else if (mon{i}=. and end_month="" and start_month ne "") then do;
end_month=vname(mon{i-1});
output;
end_month="";
start_month="";
end;
else if i=24 and end_month="" and start_month ne "" then do;
end_month=vname(mon{i});
output;
end;
end;
run;```
Obsidian | Level 7

## Re: Creating dates based on different start and stop intervals

This is a beautiful code and does even more than I had hoped for with my initial question! And yes, you are correct, I need to reframe how I think of my data - i.e. drop the Excel thinking.'

Thank you for putting it together so quickly.

Opal | Level 21

## Re: Creating dates based on different start and stop intervals

I wouldn't be so quick in deciding that @RW9's code does everything you want to accomplish. I added two example records where the code fails to work.

Of course, I'm not suggesting that my code is perfect either, but it doesn't require restructuring your data:

```data have;
input id \$ dob mmddyy10. total_mon mon1-mon24;
format dob monyy7.;
datalines;
AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1
BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .
DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .
FF 03/12/2016 8 . . . . . . . . . . . . . . . . . . . . . . . 1
GG 03/12/2016 8 . . . . . . . . . . . . . . . . . . . . . . . .
;

data want (drop=i start_month period);
set have;
array month{24} mon1-mon24;
array mon_start(3);
array mon_end(3);
period=1;
call missing(start_month);
do i=1 to 24;
if month{i}=1 then do;
if missing(start_month) then do;
start_month=i;
mon_start{period}=i;
end;
if i eq 24 then mon_end{period}=i;
end;
else if missing(month{i}) and missing(mon_end{period}) and not missing(start_month) then do;
mon_end{period}=i-1;
period+1;
call missing(start_month);
end;
end;
run;
```

Art, CEO, AnalystFinder.com

Obsidian | Level 7

## Re: Creating dates based on different start and stop intervals

Hi Art297,

Thanks for responding to my inquiry! You are right in that the earlier code did not catch your two additional IDs - i.e. FF and GG. In my dataset while I never have any GG instances, I do have instances similar to FF (i.e. only 1 data point in the last month). For these examples, I added in a dummy month variable (month25).

Thank you for carefully considering this question and your excellent code. I'm now trying to figure out which to use given my current data structure and plan forward.

Opal | Level 21

## Re: Creating dates based on different start and stop intervals

@Ksharp's code works well too and provides the result in a long form like the one that @RW9 proposed.

Art, CEO, AnalystFinder.com

Super User

## Re: Creating dates based on different start and stop intervals

``````options missing='x';
data have;
input id \$ dob mmddyy10. total_mon mon1-mon24;
format dob monyy7.;
datalines;
AA 01/11/2015 18 1 1 1 1 1 1 . . . 1 1 1 1 1 1 1 1 1 . . . 1 1 1
BB 01/29/2015 24 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
CC 08/23/2015 13 . . . . . . . 1 1 1 1 1 1 1 1 1 . . 1 1 1 1 . .
DD 02/11/2015 20 . 1 1 1 . . . 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
EE 03/12/2016 8 . . . . . . . . . . . . . . 1 1 1 1 . 1 1 1 1 .
FF 03/12/2016 8 . . . . . . . . . . . . . . . . . . . . . . . 1
GG 03/12/2016 8 . . . . . . . . . . . . . . . . . . . . . . . .
;

data want;
set have;
temp=cats(of mon1-mon24);
pid=prxparse('/1+/');
s=1;e=length(temp);
call prxnext(pid,s,e,temp,p,l);
do while(p>0);
start=p;end=p+l-1; output;
call prxnext(pid,s,e,temp,p,l);
end;
drop temp pid s e p l;
run;``````
Discussion stats
• 6 replies
• 1354 views
• 0 likes
• 4 in conversation