I added some notes to the sample data, and generated the following:
data have;
input Inst_id :$1. Per_id $1. Movein :date9. Moveout :date9. ;
format move: date9.;
datalines;
A A 01JAN2022 10JAN2022
B A 03JAN2022 07JAN2022 /* To be Dropped */
A B 04JAN2022 08JAN2022
B B 07JAN2022 11JAN2022 /* Change movein to 08JAN2022, based on preceding moveout*/
C C 05JAN2022 08JAN2022 /* Combine this one and ... */
C C 08JAN2022 09JAN2022 /* ... this one into a single stay 05JAN2022-09Jan2022 */
E D 02JAN2022 02JAN2022
run;
Assuming the data are sorted by PER_ID/MOVEIN, then this produces what I think you want:
data want (drop=_:);
set have;
by per_id inst_id notsorted;
retain _min_movein _max_moveout;
if first.inst_id then call missing(_min_movein,_max_movout);
/* Delete totally "contained" reservations */
if first.per_id=0 and _min_movein<=movein and moveout<=_max_moveout then delete;
if first.per_id=1 then do;
_min_movein=movein;
_max_moveout=moveout;
end;
if first.per_id=0 and inst_id^=lag(inst_id) then do; /*First non-deleted record for this inst_id */
_min_movein=max(movein,_max_moveout); /*Remove possible overlap with preceding rec*/
_max_moveout=moveout;
end;
else _max_moveout=max(_max_moveout,moveout);
if last.inst_id;
movein=_min_movein;
moveout=_max_moveout;
run;
First, please provide sample data in the form of a working DATA step, so that suggested code can be tested by your respondents. And expected output to confirm the description of your objective. Help us help you.
I added some notes to the sample data, and generated the following:
data have;
input Inst_id :$1. Per_id $1. Movein :date9. Moveout :date9. ;
format move: date9.;
datalines;
A A 01JAN2022 10JAN2022
B A 03JAN2022 07JAN2022 /* To be Dropped */
A B 04JAN2022 08JAN2022
B B 07JAN2022 11JAN2022 /* Change movein to 08JAN2022, based on preceding moveout*/
C C 05JAN2022 08JAN2022 /* Combine this one and ... */
C C 08JAN2022 09JAN2022 /* ... this one into a single stay 05JAN2022-09Jan2022 */
E D 02JAN2022 02JAN2022
run;
Assuming the data are sorted by PER_ID/MOVEIN, then this produces what I think you want:
data want (drop=_:);
set have;
by per_id inst_id notsorted;
retain _min_movein _max_moveout;
if first.inst_id then call missing(_min_movein,_max_movout);
/* Delete totally "contained" reservations */
if first.per_id=0 and _min_movein<=movein and moveout<=_max_moveout then delete;
if first.per_id=1 then do;
_min_movein=movein;
_max_moveout=moveout;
end;
if first.per_id=0 and inst_id^=lag(inst_id) then do; /*First non-deleted record for this inst_id */
_min_movein=max(movein,_max_moveout); /*Remove possible overlap with preceding rec*/
_max_moveout=moveout;
end;
else _max_moveout=max(_max_moveout,moveout);
if last.inst_id;
movein=_min_movein;
moveout=_max_moveout;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.