DATA Step, Macro, Functions and more

Collapsing inpatient admissions

Accepted Solution Solved
Reply
Contributor
Posts: 34
Accepted Solution

Collapsing inpatient admissions

I have a tricky problem collapsing inpatient hospital admissions and haven't had success getting a solution from previous blog posts. What I would like to do, is collapse rows where the ADMISSION_START_DT = the ADMISSION_END_DT of the preceeding row. The idea is that this is really one admission. In this event I would like to keep the APR_MDC_DESC and HOSPITAL_ID from the more current, second row.

 

In the event that the ADMISSION_START_DT is < 15 days from the ADMISSION_END_DT from the preceeding row, I would like to populate a new variable, READMIT_FLAG_15_DAY so that we can track readmissions. 

 

I have some sample data below. Any help would be greatly appreciated.

 

 

Have

PATIENT_ID ADMISSION_START_DT ADMISSION_END_DT APR_MDC_DESC SERVICE_CAT HOSPITAL_ID
ABC 24-Nov-15 26-Nov-15 Diabetes MED 123
ABC 26-Nov-15 3-Dec-15 Surgery SURG 456
ABC 16-Jan-16 22-Jan-16 Hypertension  MED 123
ABC 25-Feb-16 29-Feb-16 Diabetes MED 123
ABC 29-Feb-16 29-Feb-16 Observation MED 123
ABC 2-Mar-16 2-Mar-16 Hypertension  MED 123
ABC 19-Mar-16 23-Mar-16 Observation MED 123

 

Want

PATIENT_ID ADMISSION_START_DT ADMISSION_END_DT APR_MDC_DESC SERVICE_CAT HOSPITAL_ID READMIT_FLAG_15_DAY
ABC 24-Nov-15 3-Dec-15 Surgery MED 456  
ABC 16-Jan-16 22-Jan-16 Hypertension  MED 123  
ABC 25-Feb-16 29-Feb-16 Observation MED 123  
ABC 2-Mar-16 2-Mar-16 Hypertension  MED 123  
ABC 19-Mar-16 23-Mar-16 Observation MED 123 Y

Accepted Solutions
Solution
‎01-23-2017 03:20 PM
Trusted Advisor
Posts: 1,019

Re: Collapsing inpatient admissions

Posted in reply to mikemangini

Why does the 5th record of your desired output have the flag='Y'?  Its start date is 19-mar-16 and the enddate of the preceding record is 2-mar-16, which is more than 15 days separation.  I believe you want the flag='Y' for the 4th record, yes?  (2-feb-16 vs 2-mar-16).  If so then this works:

 


data want (drop=next_stdate temp_:);

  do recnum=1 by 1 until (last.patient_id or next_stdate>admission_end_dt);
    set have (keep=patient_id);
    by patient_id;

    merge have
          have(firstobs=2 keep=admission_start_dt 
          rename=(admission_start_dt=next_stdate));
    if recnum=1 then do;
      temp_date=admission_start_dt;
      temp_cat=service_cat;
    end;
  end;

  if lag(patient_id)=patient_id 
     and admission_start_dt-15 < lag(admission_end_dt)
     then readmit_flag_15_day='Y';

  admission_start_dt=temp_date;
  service_cat=temp_cat;
run;

 

 

View solution in original post


All Replies
Contributor
Posts: 34

Re: Collapsing inpatient admissions

Posted in reply to mikemangini

Hi 

 

This is easily solved with the LAG function. Note that this code puts the Y in the second-last row, where I suppose it ought to be according to your spec, and not in the last.

 

* Just to get your data into SAS - ;

data have;

    input PATIENT_ID $3. @5 ADMISSION_START_DT anydtdte9. @15 ADMISSION_END_DT anydtdte9.

        @25 APR_MDC_DESC $12. @38 SERVICE_CAT $4. @43 HOSPITAL_ID 8.;

    format ADMISSION_START_DT ADMISSION_END_DT ddmmyy10.;

    cards;

ABC 24-Nov-15 26-Nov-15 Diabetes     MED  123

ABC 26-Nov-15 3-Dec-15  Surgery      SURG 456

ABC 16-Jan-16 22-Jan-16 Hypertension MED  123

ABC 25-Feb-16 29-Feb-16 Diabetes     MED  123

ABC 29-Feb-16 29-Feb-16 Observation  MED  123

ABC 2-Mar-16  2-Mar-16  Hypertension MED  123

ABC 19-Mar-16 23-Mar-16 Observation  MED  123;

run;

 

* Sort in ascending date order pr patient - just to make sure..;

proc sort data=have; by PATIENT_ID ADMISSION_START_DT;

run;

 

* And here is where the LAG function comes in handy..;

data want (drop=Old_End); set have; by PATIENT_ID;

    Old_End = lag(ADMISSION_END_DT);

        if not first.PATIENT_ID then do;

            if ADMISSION_START_DT = Old_End then delete;

            else if (ADMISSION_START_DT - Old_End) < 15 then READMIT_FLAG_15_DAY = 'Y';

        end;

run;

 

  PATIENT_ID ADMISSION_START_DT ADMISSION_END_DT APR_MDC_DESC SERVICE_CAT HOSPITAL_ID READMIT_FLAG_15_DAY
1 ABC 24/11/2015 26/11/2015 Diabetes MED 123  
2 ABC 16/01/2016 22/01/2016 Hypertension MED 123  
3 ABC 25/02/2016 29/02/2016 Diabetes MED 123  
4 ABC 02/03/2016 02/03/2016 Hypertension MED 123 Y
5 ABC 19/03/2016 23/03/2016 Observation MED 123  

 

 

Trusted Advisor
Posts: 1,019

Re: Collapsing inpatient admissions

Posted in reply to ErikLund_Jensen

@ErikLund_Jensen

 

Your program deletes the latter record and keeps the earlier record.  I believe the OP wants the opposite.  I don't think lag alone is sufficient.  You need a way to look ahead and delete the prior record when it is known that the subsequent record has the desired data.

 

 

Contributor
Posts: 34

Re: Collapsing inpatient admissions

You are right. I missed that one.
Solution
‎01-23-2017 03:20 PM
Trusted Advisor
Posts: 1,019

Re: Collapsing inpatient admissions

Posted in reply to mikemangini

Why does the 5th record of your desired output have the flag='Y'?  Its start date is 19-mar-16 and the enddate of the preceding record is 2-mar-16, which is more than 15 days separation.  I believe you want the flag='Y' for the 4th record, yes?  (2-feb-16 vs 2-mar-16).  If so then this works:

 


data want (drop=next_stdate temp_:);

  do recnum=1 by 1 until (last.patient_id or next_stdate>admission_end_dt);
    set have (keep=patient_id);
    by patient_id;

    merge have
          have(firstobs=2 keep=admission_start_dt 
          rename=(admission_start_dt=next_stdate));
    if recnum=1 then do;
      temp_date=admission_start_dt;
      temp_cat=service_cat;
    end;
  end;

  if lag(patient_id)=patient_id 
     and admission_start_dt-15 < lag(admission_end_dt)
     then readmit_flag_15_day='Y';

  admission_start_dt=temp_date;
  service_cat=temp_cat;
run;

 

 

Contributor
Posts: 34

Re: Collapsing inpatient admissions

Thank you for assisting with my question. I am still testing but your solution seems to be working great! You are correct on the 'Y' flag being put on the wrong record. As I was simplifying and deidentifying the records to create an example, I let that slip. Thanks again for your assistance!

Contributor
Posts: 34

Re: Collapsing inpatient admissions

Posted in reply to mikemangini

@mkeintz is right. It requires look-ahead. But I fear that using a merge inside a loop in the data step would not be very efficient with large data sets. I suggest doing it in a previous step instead:

 

proc sql;

   create table inter as

       select

            a.PATIENT_ID,

            a.ADMISSION_START_DT,

            a.ADMISSION_END_DT,

            coalesce(b.APR_MDC_DESC, a.APR_MDC_DESC) as APR_MDC_DESC,

            a.SERVICE_CAT,

            coalesce(b.HOSPITAL_ID, a.HOSPITAL_ID) as HOSPITAL_ID

   from have as a left join have as b

   on  a.PATIENT_ID = b.PATIENT_ID

       and a.ADMISSION_END_DT = b.ADMISSION_START_DT

       order by a.PATIENT_ID, a.ADMISSION_START_DT;

quit;

 

data want2 (drop=Old_End); set inter; by PATIENT_ID;

   Old_End = lag(ADMISSION_END_DT);

   if not first.PATIENT_ID then do;

       if ADMISSION_START_DT = Old_End then delete;

       else if (ADMISSION_START_DT - Old_End) < 15 then READMIT_FLAG_15_DAY = 'Y';

   end;

run;

Trusted Advisor
Posts: 1,019

Re: Collapsing inpatient admissions

Posted in reply to ErikLund_Jensen

@ErikLund_Jensen

 

Why would a self-merge inside a loop be inefficient?  The two data streams (3 if you count the SET statement) are synchronzied, taking advantage of the data set order.  The same disk i/o buffer would be used for all data streams, reducing disk input/output operations.  Once a small group of records is completely processed, they are flushed from memory.  This all makes the operating system very happy.   It's effectively one pass of the data, requires no creation of an intermediate data set, and doesn't use much memory.

 

The alternative you propose has two unsynchronized reads of data set have in the proc sql.  And SQL (apparently) does not know the data set order, so it has to hold all the keys (and data) of one side of the join in memory (or in utility files on disk) in order to have all those keys available for matching against every record in the other side of the join.  It also writes and rereads an intermediate data set on the way to data set WANT.

 

I admit, when data set have is sorted in a useful way, I am NOT a fan of sql in these situations.

Contributor
Posts: 34

Re: Collapsing inpatient admissions

@mkeintz

 

I humbly admit you are right. I tried both methods on a large unsorted data set with 1000000 Patient-ID's, and your approach is about 1.8 times faster on an empty machine and almost 4 times faster on a heavy loaded machine. Thanks for the lesson!.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 238 views
  • 2 likes
  • 3 in conversation