- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
In one data set I have datetime (dt) ranges for hospital days (hday). I another dataset I have datetimes of events. How do I merge the events to hospital days?:
(hday, start_dt, end_dt) X (event_dt) -> the event happened on which hospital day?
Thank you!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
you could use proc sql to do that:
proc sql;
create table output as
select d1.hday, d1.start, d1.end, d2.event
from d2 left join d1 on d1.start<=d2.event<=d1.end;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please provide, in the form of working data step, the two starting data sets, and what the resulting data set should look like.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here you go:
data d1; infile datalines delimiter=' '; input hday start end; datalines; 1 1 3
2 4 6
3 7 9 ;
run;
data d2;
infile datalines delimiter=' ';
input event;
2
3
5
8
;
run;
output:
hday start end event
1 1 3 2
1 1 3 3
2 4 6 5
3 7 9 8
Many thanks!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for provide sample data (you missed a datalines statement for d2).
This code works, and works quickly if your data sets are sorted chronologically and the time ranges in d1 do not overlap:
data d1;
infile datalines delimiter=' ';
input hday start end;
datalines;
1 1 3
2 4 6
3 7 9
;
run;
data d2;
infile datalines delimiter=' ';
input event;
datalines;
2
3
5
8
;
run;
data want;
if 0 then merge d1 d2 ;
set d1 (keep=start rename=(start=event) in=d1_start)
d2 (in=event_rec)
d1 (keep=end rename=(end=event) in=d1_end);
by event;
if d1_start then set d1;
if d1_end then call missing(hday,start,end);
if event_rec;
run;
If you have a d2 event outside of the d1 ranges, the observation will be recorded with missing hday, start, and end.
The "if 0 then merge" statement is there just to produce a desired variable order. Values would be unchanged if you remove it.
The key concept here is the "if d1_start then set d1" statement which reads in all the D! variables (unrenamed). They are automatically retained until the next time D!_START=1.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you, this works, but my example was over-simplified. In reality, the start would be something like 2:45 pm, and end - 3:00 pm. I would like to capture an event = 2:55 pm. Is there a way to merge conditionally to hospital day if event is between start and end?:
hday start end event
1 2:45 3:00 2:55
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
you could use proc sql to do that:
proc sql;
create table output as
select d1.hday, d1.start, d1.end, d2.event
from d2 left join d1 on d1.start<=d2.event<=d1.end;
quit;