data test;
input id$2. start end ;
attrib start format =date9. informat=date9.;
attrib end format =date9. informat=date9.;
datalines;
1 01JAN2015 14FEB2015
1 18FEB2015 30APR2015
1 05MAY2015 30AUG2015
2 01jan2015 28feb2015
2 01apr2015 30apr2015
3 01JAN2015 14FEB2015
3 15FEB2015 15MAR2015
3 20MAR2015 30APR2015
4 01JAN2015 31JAN2015
4 01JAN2015 15APR2015
;
run;
I want to create a dataset of continuous enrollment periods, allowing a gap of up to 7 days between the end of one period and the start of the next. I want to keep one row per id, reflecting only the first continuous period.
Desired output:
1 01JAN2015 30AUG2015
2 01jan2015 28feb2015
3 01JAN2015 30APR2015
4 01JAN2015 15APR2015
/* Sample enrollment periods, one row per period */
data test;
input id $2. start end;
attrib start format=date9. informat=date9.;
attrib end format=date9. informat=date9.;
datalines;
1 01JAN2015 14FEB2015
1 18FEB2015 30APR2015
1 05MAY2015 30AUG2015
2 01jan2015 28feb2015
2 01apr2015 30apr2015
3 01JAN2015 14FEB2015
3 15FEB2015 15MAR2015
3 20MAR2015 30APR2015
4 01JAN2015 31JAN2015
4 01JAN2015 15APR2015
;
run;

/* Step 1: start a new group at each new id, or when the gap since the previous end exceeds 7 days */
data temp;
set test;
by id;
if start-lag(end)>7 or first.id then group+1;
run;

/* Step 2: collapse each group to one row (start of its first row, end of its last row) */
data temp;
do until(last.group);
set temp(rename=(start=_start));
by id group;
if first.group then start=_start;
end;
format start date9.;
drop _start group;
run;

/* Step 3: keep only the first continuous period per id */
data want;
set temp;
by id;
if first.id;
run;
You said you want to keep one period per id, reflecting only the first period.
Then what is this?
3 01JAN2015 14FEB2015
3 15FEB2015 30APR2015
Sorry, the error has been corrected.
Does your data ever include overlapping periods? Or nested periods? If so, that makes the problem a little harder.
Yes, I do have overlapping or nested periods.
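One case worth guarding against with nested periods: if a long period is followed by a shorter one that ends earlier (a hypothetical pair such as 01JAN2015 to 15APR2015 followed by 10JAN2015 to 31JAN2015), comparing each start only with the previous row's end, and taking the end from the last row of the group, could understate the period. The following is a minimal sketch, not a tested replacement, that instead carries forward the latest end seen so far. It assumes the `test` dataset from the question; `sorted`, `want_nested`, `spell_start`, `spell_end`, and `closed` are illustrative names that do not appear in the thread.

```sas
/* Sort so that, within each id, periods are visited in start order */
proc sort data=test out=sorted;
  by id start end;
run;

data want_nested;
  /* Rename the inputs so the retained spell variables are clearly separate */
  set sorted(rename=(start=_start end=_end));
  by id;
  retain spell_start spell_end closed;
  format spell_start spell_end date9.;

  if first.id then do;
    /* The first row of an id opens the first spell */
    spell_start = _start;
    spell_end   = _end;
    closed      = 0;
  end;
  else if not closed then do;
    /* A gap of more than 7 days closes the first spell for good */
    if _start - spell_end > 7 then closed = 1;
    /* Otherwise extend the spell, never shrinking its end date */
    else spell_end = max(spell_end, _end);
  end;

  /* One row per id: the first continuous spell */
  if last.id then output;
  keep id spell_start spell_end;
run;
```

On the sample data in the question this should give the same first period per id as the three-step approach; the two would only differ when a nested period ends before the period that contains it.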
Thank you @Ksharp, you are the best!