BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
epardo
Calcite | Level 5

Hello everyone, I would love some guidance.  I have a dataset of ER visits and hospital admissions. I have transposed the data from long to wide. Here is an example below:

patient_id

encounter_type1

encounter_type2

admtdate1

admtdate2

dschdate1

dschdate2

status1

status2

1

er

 

1/1/2020

 

 

1/1/2020

died

 

2

er

inpatient

3/2/2020

3/6/2020

3/2/2020

5/3/2020

alive

died

3

er

er

4/4/2020

4/5/202

4/4/2020

4/5/2020

alive

died

4

inpatient

er

5/2/2020

5/12/2020

5/7/2020

5/12/2020

alive

alive

 

encounter_type = ER visit or inpatient (i.e. hospital admission)

admtdate = date of admission (inpatient) or service (ER)

dschdate = date of discharge

status = alive or not

I now want to: 1) Create a new variable that sums how many ER visits someone had within the month prior to a hospital admission (if they were admitted); and 2) Create a new variable that specifies where a person died (ER or inpatient)

I am imagining the output looking something like this:

 

patient_id

encounter_type1

encounter_type2

admtdate1

admtdate2

dschdate1

dschdate2

status1

status2

newvar1

newvar2

1

er

 

1/1/2020

 

 

1/1/2020

died

 

.

er

2

er

inpatient

3/2/2020

3/6/2020

3/2/2020

5/3/2020

alive

died

1

inpatient

3

er

er

4/4/2020

4/5/202

4/4/2020

4/5/2020

alive

died

.

er

4

inpatient

er

5/2/2020

5/12/2020

5/7/2020

5/12/2020

alive

alive

0

 

I thank you in advance for your time and expertise. 

1 ACCEPTED SOLUTION

Accepted Solutions
Quentin
Super User

One way to approach this is with "double DoW loop". You read the data twice in one data step, one patient at a time.

 

data have;
input
  patient_id $
  encounter_type :$10.
  admtdate :mmddyy10.
  dschdate :mmddyy10.
  status $
;
format
  admtdate
  dschdate yymmdd10.
;
datalines;
1 er 1/1/2020 1/1/2020 died
2 er 3/2/2020 3/2/2020 alive
2 inpatient 3/6/2020 5/3/2020 died
3 er 4/4/2020 4/4/2020 alive
3 er 4/5/2020 4/5/2020 died
4 inpatient 5/2/2020 5/7/2020 alive
4 er 5/12/2020 5/12/2020 alive
;

data want (keep=patient_id inpatientdate ercount deathlocation);
  *find the inpatient admission date for the patient (will be null if no inpatient admission) ;
  do until (last.patient_id) ;
    set have ;
    by patient_id ;
    if encounter_type='inpatient' then inpatientdate=admtdate ;
  end ;

  ercount=0 ;

  *look through all the records for the patient again, count the er addmisions within 30 days of inpatient ;
  *and assign the death location ;
  do until (last.patient_id) ;
    set have ;
    by patient_id ;
    if encounter_type='er' and (not missing(inpatientdate)) then do ;
      if (0<=(inpatientdate-admtdate)<=30) then ercount=ercount+1 ;
    end ;   
    if status='died'       then deathlocation=encounter_type ;
  end ;

  output ;

  format inpatientdate yymmdd10. ;
run ;

proc print data=want ;
run ;

Returns:

       patient_
Obs       id       inpatientdate    ercount    deathlocation

 1        1                  .         0         er
 2        2         2020-03-06         1         inpatient
 3        3                  .         0         er
 4        4         2020-05-02         0
BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

View solution in original post

10 REPLIES 10
epardo
Calcite | Level 5

Please see below:

 

patient_id

encounter_type

admtdate

dschdate

status

1

er

1/1/2020

1/1/2020

died

2

er

3/2/2020

3/2/2020

alive

2

inpatient

3/6/2020

5/3/2020

died

3

er

4/4/2020

4/4/2020

alive

3

er

4/5/2020

4/5/2020

died

4

inpatient

5/2/2020

5/7/2020

alive

4

er

5/12/2020

5/12/2020

alive

Kurt_Bremser
Super User

See this code:

data have;
input
  patient_id $
  encounter_type :$10.
  admtdate :mmddyy10.
  dschdate :mmddyy10.
  status $
;
format
  admtdate
  dschdate yymmdd10.
;
datalines;
1 er 1/1/2020 1/1/2020 died
2 er 3/2/2020 3/2/2020 alive
2 inpatient 3/6/2020 5/3/2020 died
3 er 4/4/2020 4/4/2020 alive
3 er 4/5/2020 4/5/2020 died
4 inpatient 5/2/2020 5/7/2020 alive
4 er 5/12/2020 5/12/2020 alive
;

%let start = %sysfunc(inputn(2000-01-01,yymmdd10.));
%let end = %sysfunc(today());

data want;
array visits {&start.:&end.} _temporary_;
set have;
by patient_id;
retain
  newvar1
  newvar2
;
if first.patient_id
then do;
  call missing(of visits{*});
  newvar1 = .;
  newvar2 = "          ";
end;
visits{admtdate} = (encounter_type = "er");
if encounter_type = "inpatient" then newvar1 = sum(of visits{*});
if status = "died" then newvar2 = encounter_type;
if last.patient_id
then do;
  output;
end;
keep patient_id newvar1 newvar2;
run;

proc print data=want noobs;
run;

Result:

patient_id	newvar1	newvar2
1	.	er
2	1	inpatient
3	.	er
4	0	 

If you need to deal with multiple "inpatient" encounters, the code needs to be modified for this (resetting the array, OUTPUTting).

epardo
Calcite | Level 5

Thank you so much for taking the time to help me with this code. The one outstanding question I have is how to ensure that only ER visits within the 30 days prior to the inpatient visit are counted? Thus no ER visits after the inpatient visit are counted nor are ER visits 30+ days prior to the inpatient visit counted.

Quentin
Super User

Can a patient have two in-patient visits?  

 

If a patient has an ER visit on Jan 1, then in-patient on Jan 10, then ER visit January 15 and  and in-patient on January 25, how would you want to count the ER visits?  Would you want 1 ER visit prior to the JAN 10 in-patient visit and 2 ER visits prior to the January 25 in-patient visit (so essentially double- count the Jan 1 ER visit)?  Or would you want the January 25 in-patient visit to have just 1 prior ER visit?  

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
epardo
Calcite | Level 5

This is a great question. Sorry I didn't address it in my original post. For right now we are focusing on how many ER visits a given patient has in the 30 days prior to their initial inpatient admission. Thus, this dataset only includes one inpatient visit per rln (patient). 

Quentin
Super User

One way to approach this is with "double DoW loop". You read the data twice in one data step, one patient at a time.

 

data have;
input
  patient_id $
  encounter_type :$10.
  admtdate :mmddyy10.
  dschdate :mmddyy10.
  status $
;
format
  admtdate
  dschdate yymmdd10.
;
datalines;
1 er 1/1/2020 1/1/2020 died
2 er 3/2/2020 3/2/2020 alive
2 inpatient 3/6/2020 5/3/2020 died
3 er 4/4/2020 4/4/2020 alive
3 er 4/5/2020 4/5/2020 died
4 inpatient 5/2/2020 5/7/2020 alive
4 er 5/12/2020 5/12/2020 alive
;

data want (keep=patient_id inpatientdate ercount deathlocation);
  *find the inpatient admission date for the patient (will be null if no inpatient admission) ;
  do until (last.patient_id) ;
    set have ;
    by patient_id ;
    if encounter_type='inpatient' then inpatientdate=admtdate ;
  end ;

  ercount=0 ;

  *look through all the records for the patient again, count the er addmisions within 30 days of inpatient ;
  *and assign the death location ;
  do until (last.patient_id) ;
    set have ;
    by patient_id ;
    if encounter_type='er' and (not missing(inpatientdate)) then do ;
      if (0<=(inpatientdate-admtdate)<=30) then ercount=ercount+1 ;
    end ;   
    if status='died'       then deathlocation=encounter_type ;
  end ;

  output ;

  format inpatientdate yymmdd10. ;
run ;

proc print data=want ;
run ;

Returns:

       patient_
Obs       id       inpatientdate    ercount    deathlocation

 1        1                  .         0         er
 2        2         2020-03-06         1         inpatient
 3        3                  .         0         er
 4        4         2020-05-02         0
BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
epardo
Calcite | Level 5

Thank you for the beautiful solution and for explaining the code. On to the next part of the analysis!

Kurt_Bremser
Super User

This is the array code, adapted to deal with the 30 days window and multiple "inpatient" encounters:

data want;
array visits {&start.:&end.} _temporary_;
set have;
by patient_id;
if first.patient_id
then do;
  call missing(of visits{*});
  newvar1 = .;
  newvar2 = "          ";
end;
visits{admtdate} = (encounter_type = "er");
if status = "died" then newvar2 = encounter_type;
if encounter_type = "inpatient"
then do;
  do i = admtdate - 30 to admtdate - 1;
    newvar1 + sum(0,visits{i});
  end;
  output;
  call missing(of visits{*});
  newvar1 = .;
  newvar2 = "";
end;
if last.patient_id and (newvar1 ne . or newvar2 ne "")
then output;
retain
  newvar2
;
keep patient_id newvar1 newvar2;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 592 views
  • 0 likes
  • 3 in conversation