BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
vraj1
Quartz | Level 8
Hi,

I want to split duration of the id on basis on day1-4 and the next day 5-7. 
This needs to be done only for the id at the start(first.id). If the duration for first.id is lesser than 7 than it should take from the next observation of same id.
Any help?
Input: data have; infile datalines dsd truncover; input id num (start end) (:date9.); format start end date9.; datalines4; 101,2,12MAR2018,14MAR201 101,3,15MAR2018,19MAR2018 101,4,20MAR2018,29MAR2018 102,2,12JAN2018,19JAN2018 102,3,20JAN2018,28JAN2018 102,4,29JAN2018,12FEB2018 102,5,13FEB2018,18FEB2018 102,6,19FEB2018,28FEB2018 ;;;; run; Output: data wanted; infile datalines dsd truncover; input id num (start end) (:date9.); format start end date9.; datalines4; 101,2,12MAR2018,15MAR2018 101,3,16MAR2018,18MAR2018 101,3,19MAR2018,19MAR2018 101,4,20MAR2018,29MAR2018 102,2,12JAN2018,15JAN2018 102,2,16JAN2018,18JAN2018 102,2,19JAN2018,19JAN2018 102,3,20JAN2018,28JAN2018 102,4,29JAN2018,12FEB2018 102,5,13FEB2018,18FEB2018 102,6,19FEB2018,28FEB2018 ;;;; run;
1 ACCEPTED SOLUTION

Accepted Solutions
data_null__
Jade | Level 19

This is what I was able to come up with.  It matches your WANTED.

 

data wanted;
   infile datalines dsd;
   input id num (start end) (:date9.);
   format start end date9.;
   datalines4;
101,2,12MAR2018,15MAR2018
101,3,16MAR2018,18MAR2018
101,3,19MAR2018,19MAR2018
101,4,20MAR2018,29MAR2018
102,2,12JAN2018,15JAN2018
102,2,16JAN2018,18JAN2018
102,2,19JAN2018,19JAN2018
102,3,20JAN2018,28JAN2018
102,4,29JAN2018,12FEB2018
102,5,13FEB2018,18FEB2018
102,6,19FEB2018,28FEB2018
;;;;
run;

data have;
   infile datalines dsd;
   input id num (start end) (:date9.);
   format start end date9.;
   datalines4;
101,2,12MAR2018,14MAR2018
101,3,15MAR2018,19MAR2018
101,4,20MAR2018,29MAR2018
102,2,12JAN2018,19JAN2018
102,3,20JAN2018,28JAN2018
102,4,29JAN2018,12FEB2018
102,5,13FEB2018,18FEB2018
102,6,19FEB2018,28FEB2018
;;;;
   run;
data testV / view=testv;
   d=0;
   do until(last.id);
      set have;
      by id start;
      do date=start to end;
         if d lt 8 then d + 1;
         if 1 le d le 4 then g=1;
         else if d le 7 then g=2;
         else                g=3;
         if g eq 3 then do; output; leave; end;
         else               output;
         end;
      end;
   format date date9.;
   run;
data test;
   do until(last.id);
      set testV;
      by id g start;
      if g in(1,2) then do;
         if first.g then do; nstart=date; nnum=num; end;
         if last.g then do; end=date; start=nstart; num=nnum; output; end;
         end; 
      else do;
         start=date;
         output;
         end;
      end;
   format nstart date9.;
   drop g d nstart nnum date;
   run;
proc print;
   run;
proc compare base=wanted compare=test listequalvars;
   run;

View solution in original post

5 REPLIES 5
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Are the dates always continuous?  If so pull out the first date and last date, then just split that by 4 days and then 3 more days.  It seems a very simple problem, what have you tried?

vraj1
Quartz | Level 8

I only need to split 2 times day1 -4 and 5-7 and keep the remaining dates as it is. was trying something like this but not sure what conditions need to be kept

proc sort data=have out = start;
	by id num;
run;

data v7_8;
	merge start (in=a)
	      start (in=b where=(num8=2) rename=(endt=endt8 std=std8 num=num8) keep= id num std endt)
		  ;
	by id;
	if first.id then new8=.;
	** Do the date calc **;
	dur=endt-std;
    if dur lt 7 then 

	retain new8;
	if num=3 then std = coalesce(new8, std);
run;	
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

does the split allow for roaming spans.

what I mean is record 1 Jan1-Jan-3, then record 2 has Jan-4 -- Jan 5, then record 3 has Jan-7 -- Jan 12.  Note that Jan-6 is missing how do you account for that?

do you want Jan-1 through Jan-4 as record 1

then do you want Jan-5 through Jan 5 as the second record.  or do you allow the missing day to be included which is Jan-6?

 

vraj1
Quartz | Level 8

Hi, No it will not miss roaming spans and is not sure how much duration for each id it has num which says each record and if the duration is not enough than it needs to take from next num and fit the split of 7 days which is day1-4 and then 5 to 7

 

 

data_null__
Jade | Level 19

This is what I was able to come up with.  It matches your WANTED.

 

data wanted;
   infile datalines dsd;
   input id num (start end) (:date9.);
   format start end date9.;
   datalines4;
101,2,12MAR2018,15MAR2018
101,3,16MAR2018,18MAR2018
101,3,19MAR2018,19MAR2018
101,4,20MAR2018,29MAR2018
102,2,12JAN2018,15JAN2018
102,2,16JAN2018,18JAN2018
102,2,19JAN2018,19JAN2018
102,3,20JAN2018,28JAN2018
102,4,29JAN2018,12FEB2018
102,5,13FEB2018,18FEB2018
102,6,19FEB2018,28FEB2018
;;;;
run;

data have;
   infile datalines dsd;
   input id num (start end) (:date9.);
   format start end date9.;
   datalines4;
101,2,12MAR2018,14MAR2018
101,3,15MAR2018,19MAR2018
101,4,20MAR2018,29MAR2018
102,2,12JAN2018,19JAN2018
102,3,20JAN2018,28JAN2018
102,4,29JAN2018,12FEB2018
102,5,13FEB2018,18FEB2018
102,6,19FEB2018,28FEB2018
;;;;
   run;
data testV / view=testv;
   d=0;
   do until(last.id);
      set have;
      by id start;
      do date=start to end;
         if d lt 8 then d + 1;
         if 1 le d le 4 then g=1;
         else if d le 7 then g=2;
         else                g=3;
         if g eq 3 then do; output; leave; end;
         else               output;
         end;
      end;
   format date date9.;
   run;
data test;
   do until(last.id);
      set testV;
      by id g start;
      if g in(1,2) then do;
         if first.g then do; nstart=date; nnum=num; end;
         if last.g then do; end=date; start=nstart; num=nnum; output; end;
         end; 
      else do;
         start=date;
         output;
         end;
      end;
   format nstart date9.;
   drop g d nstart nnum date;
   run;
proc print;
   run;
proc compare base=wanted compare=test listequalvars;
   run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1505 views
  • 0 likes
  • 4 in conversation