data test;
input id $2. start end;
attrib start format=date9. informat=date9.;
attrib end format=date9. informat=date9.;
datalines;
1 01JAN2015 14FEB2015
1 18FEB2015 30APR2015
1 05MAY2015 30AUG2015
2 01jan2015 28feb2015
2 01apr2015 30apr2015
3 01JAN2015 14FEB2015
3 15FEB2015 15MAR2015
3 20MAR2015 30APR2015
4 01JAN2015 31JAN2015
4 01JAN2015 15APR2015
;
run;
I want to create a dataset of continuous enrollment periods, allowing a gap of up to 7 days between the end of one period and the start of the next. I want to keep one period per id, reflecting only the first continuous period.
Desired output:
1 01JAN2015 30AUG2015
2 01jan2015 28feb2015
3 01JAN2015 30APR2015
4 01JAN2015 15APR2015
data test;
input id $2. start end;
attrib start format=date9. informat=date9.;
attrib end format=date9. informat=date9.;
datalines;
1 01JAN2015 14FEB2015
1 18FEB2015 30APR2015
1 05MAY2015 30AUG2015
2 01jan2015 28feb2015
2 01apr2015 30apr2015
3 01JAN2015 14FEB2015
3 15FEB2015 15MAR2015
3 20MAR2015 30APR2015
4 01JAN2015 31JAN2015
4 01JAN2015 15APR2015
;
run;
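/* Step 1: start a new group whenever the id changes or the gap
   from the previous period's end exceeds 7 days. */
data temp;
  set test;
  by id;
  if start-lag(end)>7 or first.id then group+1;
run;

/* Step 2: collapse each group to one row, taking start from its
   first record and end from its last record. */
data temp;
  do until(last.group);
    set temp(rename=(start=_start));
    by id group;
    if first.group then start=_start;
  end;
  format start date9.;
  drop _start group;
run;

/* Step 3: keep only the first continuous period for each id. */
data want;
  set temp;
  by id;
  if first.id;
run;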
You say you want to keep one period per id, reflecting only the first period.
Then what is this?
3 01JAN2015 14FEB2015
3 15FEB2015 30APR2015
Sorry, the error has been corrected.
Does your data ever include overlapping periods? Or nested periods? If so, that makes the problem a little harder.
Yes, I do have overlapping or nested periods.
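With overlapping or nested periods, a comparison against only the previous row's end can go wrong: if a short period sits inside a longer earlier one, the previous row's end is earlier than the true end of the span so far. One possible way around this, offered as a minimal sketch rather than a tested solution, is to sort by id and start and carry the latest end date seen so far in retained variables. It assumes the TEST dataset from the question; the names sorted, want_nested, span_start, span_end and done are made up for illustration.

```sas
/* Sketch only: first continuous period per id, allowing gaps of up to 7 days. */
/* Assumes TEST has id, start and end as defined in the question.              */
proc sort data=test out=sorted;
  by id start end;
run;

data want_nested;
  set sorted;
  by id;
  /* Hypothetical helper variables, carried across rows within an id. */
  retain span_start span_end done;
  format span_start span_end date9.;
  if first.id then do;
    span_start = start;
    span_end   = end;
    done       = 0;
  end;
  else if not done then do;
    /* Compare with the running latest end, not just the previous row's end. */
    if start - span_end > 7 then done = 1;   /* first span is closed */
    else span_end = max(span_end, end);      /* extend, never shorten */
  end;
  if last.id then output;
  keep id span_start span_end;
run;
```

On the sample data this should give the same four periods as the steps above. The difference would only show up when a later row ends before an earlier one in the same span.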
Thank you @Ksharp, you are the best!