I added some notes to the sample data, and generated the following:
data have;
input Inst_id :$1. Per_id $1. Movein :date9. Moveout :date9. ;
format move: date9.;
datalines;
A A 01JAN2022 10JAN2022
B A 03JAN2022 07JAN2022 /* To be Dropped */
A B 04JAN2022 08JAN2022
B B 07JAN2022 11JAN2022 /* Change movein to 08JAN2022, based on preceding moveout*/
C C 05JAN2022 08JAN2022 /* Combine this one and ... */
C C 08JAN2022 09JAN2022 /* ... this one into a single stay 05JAN2022-09Jan2022 */
E D 02JAN2022 02JAN2022
run;
Assuming the data are sorted by PER_ID/MOVEIN, then this produces what I think you want:
data want (drop=_:);
set have;
by per_id inst_id notsorted;
retain _min_movein _max_moveout;
if first.inst_id then call missing(_min_movein,_max_movout);
/* Delete totally "contained" reservations */
if first.per_id=0 and _min_movein<=movein and moveout<=_max_moveout then delete;
if first.per_id=1 then do;
_min_movein=movein;
_max_moveout=moveout;
end;
if first.per_id=0 and inst_id^=lag(inst_id) then do; /*First non-deleted record for this inst_id */
_min_movein=max(movein,_max_moveout); /*Remove possible overlap with preceding rec*/
_max_moveout=moveout;
end;
else _max_moveout=max(_max_moveout,moveout);
if last.inst_id;
movein=_min_movein;
moveout=_max_moveout;
run;
First, please provide sample data in the form of a working DATA step, so that suggested code can be tested by your respondents. And expected output to confirm the description of your objective. Help us help you.
I added some notes to the sample data, and generated the following:
data have;
input Inst_id :$1. Per_id $1. Movein :date9. Moveout :date9. ;
format move: date9.;
datalines;
A A 01JAN2022 10JAN2022
B A 03JAN2022 07JAN2022 /* To be Dropped */
A B 04JAN2022 08JAN2022
B B 07JAN2022 11JAN2022 /* Change movein to 08JAN2022, based on preceding moveout*/
C C 05JAN2022 08JAN2022 /* Combine this one and ... */
C C 08JAN2022 09JAN2022 /* ... this one into a single stay 05JAN2022-09Jan2022 */
E D 02JAN2022 02JAN2022
run;
Assuming the data are sorted by PER_ID/MOVEIN, then this produces what I think you want:
data want (drop=_:);
set have;
by per_id inst_id notsorted;
retain _min_movein _max_moveout;
if first.inst_id then call missing(_min_movein,_max_movout);
/* Delete totally "contained" reservations */
if first.per_id=0 and _min_movein<=movein and moveout<=_max_moveout then delete;
if first.per_id=1 then do;
_min_movein=movein;
_max_moveout=moveout;
end;
if first.per_id=0 and inst_id^=lag(inst_id) then do; /*First non-deleted record for this inst_id */
_min_movein=max(movein,_max_moveout); /*Remove possible overlap with preceding rec*/
_max_moveout=moveout;
end;
else _max_moveout=max(_max_moveout,moveout);
if last.inst_id;
movein=_min_movein;
moveout=_max_moveout;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.