Thanks, novinosrin, for your quick response.

The synthetic data are shelter records, not hospital ones. Some shelters require people to register every day; others do not. A shelter may also require daily registration only during specific periods (for example, winter). Because of that, the data contain a combination of the two types of stays across people and shelters. A record with the same check-in and check-out date could mean that a person came to the door at midnight, that the date was entered incorrectly, or something else unknown. Unless these records are clearly incorrect, I plan to keep all observations.

Actually, my final dataset should look like the following, with the start and end date of each continuous stay.

Final data format:

id datin datout
1 1 6
1 9 13
1 14 17

Corresponding to the "cleaned" data:

clientid datin datout
1 1 2
1 2 3
1 3 4
1 4 5 (deleted) (A mistake in my original post. Revision has been done)
1 5 6
1 9 10
1 10 11
1 11 12
1 12 13 (deleted)
1 14 15
1 15 16
1 16 17

@novinosrin wrote:

Hi @NanZ

I am not quite getting the intuition of your exercise, as the sample suggests only a 1-day interval between datein and dateout? Hmm, such lucky patients or hotel customers who occupy just for a day. Anyway, FWIW:

data h1;
input clientid datin datout ;
datalines;
1 1 2
1 2 2
1 3 3
1 4 4
1 5 7
1 6 6
1 9 10
1 11 11
1 12 12
1 13 13
1 14 16
1 15 15
1 16 16
1 16 17
;
data want;
do until(last.clientid);
set h1(drop=datout);
by clientid;
_min=min(_min,datin);
_max=max(_max,datin);
end;
do _n_=_min to _max;
datin=_n_;
datout=datin+1; /* reuse the original variable name datout rather than creating a new variable dateout */
output;
end;
drop _:;
run;

A little more comprehensive information would help.
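For reference, the step from the "cleaned" records to the final start/end format described in the post could be sketched roughly as follows. This is only a minimal sketch under a few assumptions, not a definitive solution: it assumes all twelve cleaned rows listed in the post are present, that a stay continues whenever a record's datin is on or before the running datout of the current stay, and that the dataset name cleaned, the output dataset stays, and the variables start_dat and end_dat are hypothetical names chosen here for illustration.

```sas
/* Cleaned records as listed in the post (assumed to include every row shown) */
data cleaned;
input clientid datin datout;
datalines;
1 1 2
1 2 3
1 3 4
1 4 5
1 5 6
1 9 10
1 10 11
1 11 12
1 12 13
1 14 15
1 15 16
1 16 17
;
run;

/* BY-group processing needs the records ordered by client and check-in date */
proc sort data=cleaned;
by clientid datin;
run;

/* Collapse consecutive records into one row per continuous stay.   */
/* start_dat and end_dat are hypothetical names for the stay bounds. */
data stays(keep=clientid start_dat end_dat);
set cleaned;
by clientid;
retain start_dat end_dat;
if first.clientid then do;
  /* first record of a client opens a new stay */
  start_dat=datin;
  end_dat=datout;
end;
else if datin<=end_dat then do;
  /* record touches or overlaps the current stay: extend it */
  end_dat=max(end_dat,datout);
end;
else do;
  /* gap found: write out the finished stay, then open a new one */
  output;
  start_dat=datin;
  end_dat=datout;
end;
/* the last record of a client closes the stay still open */
if last.clientid then output;
run;
```

With the twelve rows above, this should give three stays for client 1 (1 to 6, 9 to 13, and 14 to 17), which matches the final format in the post. Same-day records (datin equal to datout) are kept and simply absorbed into whichever stay they touch.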