Hi, I want to split duration of the id on basis on day1-4 and the next day 5-7. This needs to be done only for the id at the start(first.id). If the duration for first.id is lesser than 7 than it should take from the next observation of same id. Any help?
Input: data have; infile datalines dsd truncover; input id num (start end) (:date9.); format start end date9.; datalines4; 101,2,12MAR2018,14MAR201 101,3,15MAR2018,19MAR2018 101,4,20MAR2018,29MAR2018 102,2,12JAN2018,19JAN2018 102,3,20JAN2018,28JAN2018 102,4,29JAN2018,12FEB2018 102,5,13FEB2018,18FEB2018 102,6,19FEB2018,28FEB2018 ;;;; run; Output: data wanted; infile datalines dsd truncover; input id num (start end) (:date9.); format start end date9.; datalines4; 101,2,12MAR2018,15MAR2018 101,3,16MAR2018,18MAR2018 101,3,19MAR2018,19MAR2018 101,4,20MAR2018,29MAR2018 102,2,12JAN2018,15JAN2018 102,2,16JAN2018,18JAN2018 102,2,19JAN2018,19JAN2018 102,3,20JAN2018,28JAN2018 102,4,29JAN2018,12FEB2018 102,5,13FEB2018,18FEB2018 102,6,19FEB2018,28FEB2018 ;;;; run;
This is what I was able to come up with. It matches your WANTED.
data wanted;
infile datalines dsd;
input id num (start end) (:date9.);
format start end date9.;
datalines4;
101,2,12MAR2018,15MAR2018
101,3,16MAR2018,18MAR2018
101,3,19MAR2018,19MAR2018
101,4,20MAR2018,29MAR2018
102,2,12JAN2018,15JAN2018
102,2,16JAN2018,18JAN2018
102,2,19JAN2018,19JAN2018
102,3,20JAN2018,28JAN2018
102,4,29JAN2018,12FEB2018
102,5,13FEB2018,18FEB2018
102,6,19FEB2018,28FEB2018
;;;;
run;
data have;
infile datalines dsd;
input id num (start end) (:date9.);
format start end date9.;
datalines4;
101,2,12MAR2018,14MAR2018
101,3,15MAR2018,19MAR2018
101,4,20MAR2018,29MAR2018
102,2,12JAN2018,19JAN2018
102,3,20JAN2018,28JAN2018
102,4,29JAN2018,12FEB2018
102,5,13FEB2018,18FEB2018
102,6,19FEB2018,28FEB2018
;;;;
run;
data testV / view=testv;
d=0;
do until(last.id);
set have;
by id start;
do date=start to end;
if d lt 8 then d + 1;
if 1 le d le 4 then g=1;
else if d le 7 then g=2;
else g=3;
if g eq 3 then do; output; leave; end;
else output;
end;
end;
format date date9.;
run;
data test;
do until(last.id);
set testV;
by id g start;
if g in(1,2) then do;
if first.g then do; nstart=date; nnum=num; end;
if last.g then do; end=date; start=nstart; num=nnum; output; end;
end;
else do;
start=date;
output;
end;
end;
format nstart date9.;
drop g d nstart nnum date;
run;
proc print;
run;
proc compare base=wanted compare=test listequalvars;
run;
Are the dates always continuous? If so pull out the first date and last date, then just split that by 4 days and then 3 more days. It seems a very simple problem, what have you tried?
I only need to split 2 times day1 -4 and 5-7 and keep the remaining dates as it is. was trying something like this but not sure what conditions need to be kept
proc sort data=have out = start; by id num; run; data v7_8; merge start (in=a) start (in=b where=(num8=2) rename=(endt=endt8 std=std8 num=num8) keep= id num std endt) ; by id; if first.id then new8=.; ** Do the date calc **; dur=endt-std; if dur lt 7 then retain new8; if num=3 then std = coalesce(new8, std); run;
does the split allow for roaming spans.
what I mean is record 1 Jan1-Jan-3, then record 2 has Jan-4 -- Jan 5, then record 3 has Jan-7 -- Jan 12. Note that Jan-6 is missing how do you account for that?
do you want Jan-1 through Jan-4 as record 1
then do you want Jan-5 through Jan 5 as the second record. or do you allow the missing day to be included which is Jan-6?
Hi, No it will not miss roaming spans and is not sure how much duration for each id it has num which says each record and if the duration is not enough than it needs to take from next num and fit the split of 7 days which is day1-4 and then 5 to 7
This is what I was able to come up with. It matches your WANTED.
data wanted;
infile datalines dsd;
input id num (start end) (:date9.);
format start end date9.;
datalines4;
101,2,12MAR2018,15MAR2018
101,3,16MAR2018,18MAR2018
101,3,19MAR2018,19MAR2018
101,4,20MAR2018,29MAR2018
102,2,12JAN2018,15JAN2018
102,2,16JAN2018,18JAN2018
102,2,19JAN2018,19JAN2018
102,3,20JAN2018,28JAN2018
102,4,29JAN2018,12FEB2018
102,5,13FEB2018,18FEB2018
102,6,19FEB2018,28FEB2018
;;;;
run;
data have;
infile datalines dsd;
input id num (start end) (:date9.);
format start end date9.;
datalines4;
101,2,12MAR2018,14MAR2018
101,3,15MAR2018,19MAR2018
101,4,20MAR2018,29MAR2018
102,2,12JAN2018,19JAN2018
102,3,20JAN2018,28JAN2018
102,4,29JAN2018,12FEB2018
102,5,13FEB2018,18FEB2018
102,6,19FEB2018,28FEB2018
;;;;
run;
data testV / view=testv;
d=0;
do until(last.id);
set have;
by id start;
do date=start to end;
if d lt 8 then d + 1;
if 1 le d le 4 then g=1;
else if d le 7 then g=2;
else g=3;
if g eq 3 then do; output; leave; end;
else output;
end;
end;
format date date9.;
run;
data test;
do until(last.id);
set testV;
by id g start;
if g in(1,2) then do;
if first.g then do; nstart=date; nnum=num; end;
if last.g then do; end=date; start=nstart; num=nnum; output; end;
end;
else do;
start=date;
output;
end;
end;
format nstart date9.;
drop g d nstart nnum date;
run;
proc print;
run;
proc compare base=wanted compare=test listequalvars;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.