*Update: I am dealing with bad data and trying to clean it up. The have data represents scenarios where there were multiple admits/releases within an hour. It is assumed these were errors (e.g. released the wrong person so then readmitted them (red) or used the wrong release reason so readmitted and then re-released them with the correct reason (purple).
Hello, I am using the following program to combine records that overlap allowing 1 hour between them. I am not getting the expected results. I suspect it has to do with the datetime format. Is 3600 not the number I should be using to represent 1 hour?
data a4;
do until (last.id);
set a3;
by id;
retain _start _end;
format _start _end datetime20.;
if first.id then
do;
_start=move_datetime;
_end=next_datetime;
end;
if move_datetime > _end+3600 then
do; *cases with only one hour (3600 seconds);
output;
_start=move_datetime;
_end=next_datetime;
end;
else _end=max(_end, next_datetime);
end;
output;
*drop move_datetime next_move_datetime;
rename _start=admit_date _end=release_date;
run;
| Have1: | |||
| ID | Move | move_datetime | |
| 1 | Admit | 01JAN2019:12:00:00 | |
| 1 | Release | 01JAN2019:12:01:00 | |
| 1 | Admit | 15JAN2019:12:00:00 | |
| 1 | Release | 17JAN2019:21:56:00 | |
| 1 | Admit | 17JAN2019:21:57:00 | |
| 1 | Release | 20JAN2019:08:05:00 | |
| 1 | Admit | 01FEB2019:01:20:00 | |
| 1 | Release | 15FEB2019:21:50:00 | |
| 1 | Admit | 15FEB2019:21:50:00 | |
| 1 | Release | 15FEB2019:21:50:00 | |
| I created the next_datetime variable by lagging the move_datetime. | |||
| Have2: | |||
| ID | Move | move_datetime | next_datetime | 
| 1 | Admit | 01JAN2019:12:00:00 | 01JAN2019:12:01:00 | 
| 1 | Release | 01JAN2019:12:01:00 | 15JAN2019:12:00:00 | 
| 1 | Admit | 15JAN2019:12:00:00 | 17JAN2019:21:56:00 | 
| 1 | Release | 17JAN2019:21:56:00 | 17JAN2019:21:57:00 | 
| 1 | Admit | 17JAN2019:21:57:00 | 20JAN2019:08:05:00 | 
| 1 | Release | 20JAN2019:08:05:00 | 01FEB2019:01:20:00 | 
| 1 | Admit | 01FEB2019:01:20:00 | 15FEB2019:21:50:00 | 
| 1 | Release | 15FEB2019:21:50:00 | 15FEB2019:21:50:00 | 
| 1 | Admit | 15FEB2019:21:50:00 | 15FEB2019:21:50:00 | 
| 1 | Release | 15FEB2019:21:50:00 | . | 
| *There is no next move for the last record so I replace the missing with the dummy 01JAN3000:00:00:00 | |||
| Want: | |||
| ID | admit_date | release_date | |
| 1 | 01JAN2019:12:00:00 | 01JAN2019:12:01:00 | |
| 1 | 15JAN2019:12:00:00 | 20JAN2019:08:05:00 | |
| 1 | 01FEB2019:01:20:00 | 15FEB2019:21:50:00 | |
Thanks in advance!
Please describe what you want to do with the datetime values.
While 3600 seconds does represent one hour you may be wanting to look at the INTNX function to provide actual interval boundaries.
result = intnx('hour',_start,1,'B')
would yield a result that is the beginning of the next hour on a clock. Use an appropriate format for result to see the results. Also the companion function INTCK returns the number of intervals of a given type occur between two date, time or datetime values.
It would likely be a good idea to provide a few rows of starting data for the variables you need to use and what the result should look like.
Sample data explaining what needs to happen and what is not happening would be good.
@ChristyN wrote:
Thank you I added data.
Your "want" data does not appear to show anything concerning either exactly one hour or hour-by-clock values. So you need to provide some clearer description about what your concern with one hour actually is and show the results in the "want" data.
Do you actually have two release times the same with an admit the same?
Here is one way to provide data step of example data:
data have; input ID $ Move $ move_datetime :datetime18.; format move_datetime datetime20.; datalines; 1 Admit 01JAN2019:12:00:00 1 Release 01JAN2019:12:01:00 1 Admit 15JAN2019:12:00:00 1 Release 17JAN2019:21:56:00 1 Admit 17JAN2019:21:57:00 1 Release 20JAN2019:08:05:00 1 Admit 01FEB2019:01:20:00 1 Release 15FEB2019:21:50:00 1 Admit 15FEB2019:21:50:00 1 Release 15FEB2019:21:50:00 ;
I'm not going to clean up tabs left from pasting "tables" of what ever sort.
Below code creates your table Have2. It also demonstrates how you can look-ahead.
I didn't fully understand the logic which leads to the Want dataset but you could extend the look-ahead logic below to pick the row with the release date you're after.
data have;
  infile datalines truncover dsd;
  input ID Move:$20. move_datetime:datetime20.;
  format move_datetime datetime20.;
datalines;
1,Admit,01JAN2019:12:00:00
1,Release,01JAN2019:12:01:00
1,Admit,15JAN2019:12:00:00
1,Release,17JAN2019:21:56:00
1,Admit,17JAN2019:21:57:00
1,Release,20JAN2019:08:05:00
1,Admit,01FEB2019:01:20:00
1,Release,15FEB2019:21:50:00
1,Admit,15FEB2019:21:50:00
1,Release,15FEB2019:21:50:00
;
data have2;
  set have;
  by id move_datetime;
  if not last.id then
    do;
      n_obs=_n_+1;
      set have
        (keep=move_datetime rename=(move_datetime=next_datetime))
        point=n_obs;
    end;
  if last.id then call missing(next_datetime);
run;
  
proc print;
run;Is this what you are looking for
DATA WORK.test;
    LENGTH
        ID               $ 14
        Move               $ 12 
         Move_datetime   6.;
    FORMAT
        ID               $CHAR14.
        Move               $CHAR12. 
         Move_datetime datetime18.;
    INFORMAT
        ID               $CHAR14.
        Move               $CHAR12. 
		Move_datetime datetime18.;
    INPUT
        ID               : $CHAR14.
        Move               : $CHAR12. 
        Move_datetime :datetime18.;
DATALINES4;
1	Admit	01JAN2019:12:00:00	 
1	Release	01JAN2019:12:01:00	 
1	Admit	15JAN2019:12:00:00	 
1	Release	17JAN2019:21:56:00	 
1	Admit	17JAN2019:21:57:00	 
1	Release	20JAN2019:08:05:00	 
1	Admit	01FEB2019:01:20:00	 
1	Release	15FEB2019:21:50:00	 
1	Admit	15FEB2019:21:50:00	 
1	Release	15FEB2019:21:50:00	 
;;;;
run;
proc sort; by ID Move_datetime; run;
data want;
set test nobs=obs;
call symputx ('nobs',obs);
stop;
run;
data want(keep= ID Admit_Date Release_Date);
set test;
by ID Move_datetime;
array Admit_date_array{&nobs} ;
array release_date_array{&nobs};
retain release_date_array: Admit_date_array: count 0;
if first.id then count=1;
if strip(Move)='Admit' then Admit_date_array[count] = Move_datetime;
if strip(Move)='Release' then do;
  Release_date_array[count] = Move_datetime;
  count=count+1;
end;
format Admit_date Release_date datetime18.;
if last.id then do;
     Admit_date = Admit_date_array[1];
	 Release_date = Release_date_array[1];
     do val = 1 to count-1;
	    
		if Admit_date_array[min(count,val+1)] - Release_date > 3600 then do;
		    Release_date = Release_date_array[Val];
			output;
			Admit_date = Admit_date_array[min(count-1,val+1)];
			Release_date = Release_date_array[min(count-1,val)];
			val=val+1;
			put '*********************************************';
		    put _all_;
			put '##############################################';
		end;
		else if val>=count-1 then do;
		   Release_date = Release_date_array[min(count-1,val)];
		   output;
		end;
		
	end;
end;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
