BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
lizzy28
Quartz | Level 8

Hi all,

 

I have multiple rows of admission dates (admission_dt) and discharge dates (discharge_dt) per person (identified by DummyID) and wanted to identify if the timeframe (between admission_dt and discharge_dt) across multiple rows have any overlapped days. Any suggestion is greatly appreciated!

 

Below is the sample data:

DummyID admission_dt discharge_dt
1 1/1/2018 2/28/2018
1 1/1/2018  
1 1/27/2018 1/30/2018
1 3/15/2018  
1 4/1/2018  
1 4/7/2018  
1 5/1/2018  
1 6/1/2018  
1 7/1/2018  
1 8/1/2018  
1 9/1/2018  
1 10/1/2018  
1 11/1/2018  
1 12/1/2018  
2 1/18/2018 5/22/2018
2 3/11/2018 4/30/2018
2 4/11/2018 6/22/2018
1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

If missing discharge_dt means the individual has not been discharged, then you need to provide an "end_of_study" date:  Also, for the sample data you provided, there will be no days without overlap for dummyid=1.

 

This program creates an array indexed by the range of dates for your study.  It then populates each date with the count of obs having that date.  Then all one has to do is step through the completed date history of counts:

 

data have;
  infile datalines truncover dsd;
  input DummyID (admission_dt discharge_dt) (:mmddyy.);
  format admission_dt discharge_dt date9.;
  datalines;
1,1/1/2018,2/28/2018
1,1/1/2018, 
1,1/27/2018,1/30/2018
1,3/15/2018, 
1,4/1/2018, 
1,4/7/2018, 
1,5/1/2018, 
1,6/1/2018, 
1,7/1/2018, 
1,8/1/2018, 
1,9/1/2018, 
1,10/1/2018, 
1,11/1/2018, 
1,12/1/2018, 
2,1/18/2018,5/22/2018
2,3/11/2018,4/30/2018
2,4/11/2018,6/22/2018
;

%let beg_study=01jan2018;
%let end_study=31dec2018;

data want (keep=dummyid overlap_level beg_phase end_phase);
  array _overlap_history {%sysevalf("&beg_study"d):%eval(1+%sysevalf("&end_study"d))} ;
  do until (last.dummyid);
    set have;
    by dummyid;
    if discharge_dt=. then discharge_dt="&end_study"d;
    _mind=min(admission_dt,_mind);
    _maxd=max(discharge_dt,_maxd);
    do d=admission_dt to discharge_dt;
      _overlap_history{d}=sum(_overlap_history{d},1);
    end;
  end;

  format beg_phase end_phase _mind _maxd date9.;
  end_phase=_mind-1;
  do until (end_phase=_maxd);
    beg_phase=end_phase+1;
    overlap_level=_overlap_history{beg_phase};
    do end_phase=beg_phase to "&end_study"d until(_overlap_history{end_phase+1}^=overlap_level);
    end;
    output;
  end;
run;

BTW, if you are so inclined, you can replace the two statements

  end_phase=_mind-1;
  do until (end_phase=_maxd);

with

  do end_phase=_mind-1 by 0 until (end_phase=_maxd);

Note the array is defined to go one day beyond END_STUDY.  This is to support the until condition in

    do end_phase=beg_phase to "&end_study"d-1 until(_overlap_history{end_phase+1}^=overlap_level);

all the way through end_phase="*end_study"d.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
Patrick
Opal | Level 21

I wasn't sure what missing discharge days mean so I've excluded these rows. If the meaning of missings is not yet discharged then all rows would overlap in your sample data.

 

One of below two options below should "show you the way".

data have;
  infile datalines truncover dsd;
  input DummyID (admission_dt discharge_dt) (:mmddyy.);
  format admission_dt discharge_dt date9.;
  datalines;
1,1/1/2018,2/28/2018
1,1/1/2018, 
1,1/27/2018,1/30/2018
1,3/15/2018, 
1,4/1/2018, 
1,4/7/2018, 
1,5/1/2018, 
1,6/1/2018, 
1,7/1/2018, 
1,8/1/2018, 
1,9/1/2018, 
1,10/1/2018, 
1,11/1/2018, 
1,12/1/2018, 
2,1/18/2018,5/22/2018
2,3/11/2018,4/30/2018
2,4/11/2018,6/22/2018
;

proc sql;
  select 
    t1.DummyID,
    t1.admission_dt,
    t1.discharge_dt,
    t2.admission_dt as t2_admission_dt format=date9.,
    t2.discharge_dt as t2_discharge_dt format=date9.
  from 
    have t1
    left join 
    have t2
  on
    not missing (t1.discharge_dt)
    and
    (
      t2.admission_dt < t1.admission_dt < t2.discharge_dt
      or
      t2.admission_dt < t1.discharge_dt < t2.discharge_dt
    )
  order by t1.DummyID, t1.admission_dt
  ;
quit;


proc sql;
  select 
    t1.DummyID,
    t1.admission_dt,
    t1.discharge_dt,
    count(*) as n_overlaps
  from 
    have t1
    left join 
    have t2
  on
    not missing (t1.discharge_dt)
    and
    (
      t2.admission_dt < t1.admission_dt < t2.discharge_dt
      or
      t2.admission_dt < t1.discharge_dt < t2.discharge_dt
    )
  group by t1.DummyID, t1.admission_dt, t1.discharge_dt
  ;
quit;

Patrick_0-1657240697073.png

 

lizzy28
Quartz | Level 8
Thank you, Patrick!

I forgot to mention that missing discharge date means one has not been discharged. So this case the rows are technically all overlapped per individual.
mkeintz
PROC Star

If missing discharge_dt means the individual has not been discharged, then you need to provide an "end_of_study" date:  Also, for the sample data you provided, there will be no days without overlap for dummyid=1.

 

This program creates an array indexed by the range of dates for your study.  It then populates each date with the count of obs having that date.  Then all one has to do is step through the completed date history of counts:

 

data have;
  infile datalines truncover dsd;
  input DummyID (admission_dt discharge_dt) (:mmddyy.);
  format admission_dt discharge_dt date9.;
  datalines;
1,1/1/2018,2/28/2018
1,1/1/2018, 
1,1/27/2018,1/30/2018
1,3/15/2018, 
1,4/1/2018, 
1,4/7/2018, 
1,5/1/2018, 
1,6/1/2018, 
1,7/1/2018, 
1,8/1/2018, 
1,9/1/2018, 
1,10/1/2018, 
1,11/1/2018, 
1,12/1/2018, 
2,1/18/2018,5/22/2018
2,3/11/2018,4/30/2018
2,4/11/2018,6/22/2018
;

%let beg_study=01jan2018;
%let end_study=31dec2018;

data want (keep=dummyid overlap_level beg_phase end_phase);
  array _overlap_history {%sysevalf("&beg_study"d):%eval(1+%sysevalf("&end_study"d))} ;
  do until (last.dummyid);
    set have;
    by dummyid;
    if discharge_dt=. then discharge_dt="&end_study"d;
    _mind=min(admission_dt,_mind);
    _maxd=max(discharge_dt,_maxd);
    do d=admission_dt to discharge_dt;
      _overlap_history{d}=sum(_overlap_history{d},1);
    end;
  end;

  format beg_phase end_phase _mind _maxd date9.;
  end_phase=_mind-1;
  do until (end_phase=_maxd);
    beg_phase=end_phase+1;
    overlap_level=_overlap_history{beg_phase};
    do end_phase=beg_phase to "&end_study"d until(_overlap_history{end_phase+1}^=overlap_level);
    end;
    output;
  end;
run;

BTW, if you are so inclined, you can replace the two statements

  end_phase=_mind-1;
  do until (end_phase=_maxd);

with

  do end_phase=_mind-1 by 0 until (end_phase=_maxd);

Note the array is defined to go one day beyond END_STUDY.  This is to support the until condition in

    do end_phase=beg_phase to "&end_study"d-1 until(_overlap_history{end_phase+1}^=overlap_level);

all the way through end_phase="*end_study"d.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 572 views
  • 2 likes
  • 3 in conversation