BookmarkSubscribeRSS Feed
NewUsrStat
Quartz | Level 8

Hi guys, 

suppose to have the following: 

data DB1;
  input ID Index_date Code Admission Discharge Status Date;
  format Admission Discharge date9.;
cards;
0001 1 49121 11JAN2018 07FEB2018 Died  .
0001 1 4660  11JAN2018 07FEB2018 Died  .
0001 0 4821  23MAY2021 21JUN2021 Died  .
0002 1 4660  01OCT2017 10OCT2017 Died  .
0003 1 4659  30MAY2017 7JUN2017  Died  .
0003 0 4659  01JAN2018 10JAN2018 Died  .
0004 1 V0182 11NOV2021 17NOV2021 Died  .
0004 1 V0182 11NOV2021 17NOV2021 Died  .
0004 1 4829  11NOV2021 17NOV2021 Died  .
;


data DB2;
  input ID Index_date Code Admission Discharge Status Date;
  format Admission Discharge date9.;
cards;
0001 1 49121 11JAN2018 07FEB2018  Died 22JUN2021
0001 1 4660  11JAN2018 07FEB2018  Died 22JUN2021
0001 0 4821  23MAY2021 21JUN2021    .      .
0002 1 4660  01OCT2017 10OCT2017 Died 11OCT2017
0003 1 4659  30MAY2017 07JUN2017 Died 11JAN2018
0003 0 4659  01JAN2018 10JAN2018    .     .
0004 1 V0182 11NOV2021 17NOV2021 Died  18NOV2021
0004 1 V0182 11NOV2021 17NOV2021 Died  18NOV2021
0004 1 4829  11NOV2021 17NOV2021 Died  18NOV2021
;

The desired output id DB2. 

I would like to assign the death date as the day after the last recorded discharge date for each ID. It could happen that there is only one Admission-Discharge date for a patient like for ID = 002. Doesn't matter. It could also happen that the Admission-Discharge date is repeated equally (es: ID: 004). This happens because of different recorded codes. Doesn't matter. The death date should be the first day after the last (and repeated) discharge date. Patients are sorted by ID and Admission date. Note that there is also an Index_date that indicate the first admission for that patient.

Finally the format of the table DB2 should be changed with respect to DB1. The death date and the word "Died" should be added to the row where Index_date = 1.

 

Can anyone help me please? 

 

Thank you in advance

 

3 REPLIES 3
tarheel13
Rhodochrosite | Level 12
data DB1;
  input ID Index_date Code $ Admission :date9. Discharge :date9. Status $ Date :date9.;
  format Admission Discharge Date date9.;
  datalines;
0001 1 49121 11JAN2018 07FEB2018 Died .
0001 1 4660 11JAN2018 07FEB2018 Died .
0001 0 4821 23MAY2021 21JUN2021 Died .
0002 1 4660 01OCT2017 10OCT2017 Died .
0003 1 4659 30MAY2017 7JUN2017 Died .
0003 0 4659 01JAN2018 10JAN2018 Died .
0004 1 V0182 11NOV2021 17NOV2021 Died .
0004 1 V0182 11NOV2021 17NOV2021 Died .
0004 1 4829 11NOV2021 17NOV2021 Died .
;
run;

*Sort so we can use last.id; proc sort data = db1; by id admission discharge; run;
*Creating death dataset by getting last discharge date per ID and adding 1 day to it; data deaths; set db1; by id admission discharge;
*Derive death_date variable; death_date = discharge+1; if last.id; format death_date date9.; keep id admission discharge death_date; run;
*Merge death dataset back to original dataset; data db2; merge db1 deaths(in=b); by id;

*if in death dataset and index_date = 1 then set status to 'Died' and date to death_date; if b and index_date=1 then do; date = death_date; status='Died'; end;
*Leave as missing if index_date is not 1; else do; date=.; status=''; end; drop death_date; /*dropping unnecessary variables*/ run;
mkeintz
PROC Star

Why does index_date have values like "V0182"?  That is not a date value, formatted or unformatted.

 

I misread the index_date variable.  Here is a single step solution.  It assumes

  1. data are sorted by ID
  2. the index_date=0 record is always the last one for a given ID.

 

data want;;
  set db1 (in=firstpass)  db1 (in=secondpass);
  by id;
  retain death_date;  
  format death_date date9. ;

  if first.id then death_date=. ;
  if firstpass then death_date=max(death_date,discharge+1);

  if secondpass;
  if index_date=0 then call missing(status,death_date);
run;

 

m

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tarheel13
Rhodochrosite | Level 12
Index_date has values of 0 and 1. You are talking about the code variable

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 219 views
  • 0 likes
  • 3 in conversation