Hello Great SAS community experts!!
I have a dataset with hospital admissions and discharges by different service levels. I am trying to answer multiple questions with this dataset and am hoping that the great minds here can help me code efficiently and smartly.
First the questions: There are many questions I would like to be able to answer, but I will start with two. 1) how many unique people who have an inpatient (service=IN) discharge are admitted to an outpatient (service=OU) within in 30 days of the inpatient discharge I would like to know this by each hospital as well as overall. So if a patient visited Hospital A and was admitted and discharged under both the cateogries in hospital A I would give credit to hospital A. Also if a client was discharged from inpatient from Hospital F but had their admission at hospital G I would give Hospital F the credit. The other question (2) would be how many people had multiple inpatient discharge and outpatient admissions again looking at it both ways, within the same hospital and overall.
Here is the date set example. following this I have an example of what I think the final dataset would look like. But I am open to ideas.
data have; length Hosp $1. ID $3. service $2. ; input Hosp ID service admission :mmddyy10. discharge :mmddyy10.; format admission mmddyy10. discharge mmddyy10.; datalines; A 123 IN 01/01/2019 01/05/2019 A 123 ER 04/01/2019 04/03/2019 A 123 OU 06/05/2019 06/30/2019 A 123 OU 08/01/2019 08/29/2019 A 123 ER 09/01/2019 09/05/2019 A 123 IN 10/01/2019 10/05/2019 A 456 IN 01/05/2019 01/15/2019 A 456 OU 01/25/2019 01/30/2019 A 456 IN 02/05/2019 02/15/2019 A 456 ER 02/16/2018 02/18/2019 B 234 IN 01/01/2019 01/05/2019 B 234 ER 04/01/2019 04/03/2019 B 234 IN 06/05/2019 06/30/2019 B 234 IN 08/01/2019 08/29/2019 B 234 OU 09/01/2019 09/05/2019 B 234 IN 10/01/2019 10/05/2019 B 345 IN 10/01/2019 10/05/2019 B 345 OU 12/15/2019 12/16/2019 C 567 IN 02/03/2019 02/05/2019 C 567 ER 02/07/2019 02/15/2019 C 567 OU 02/20/2019 03/01/2019 C 567 OU 03/06/2019 03/10/2019 D 567 IN 04/01/2019 04/03/2019 D 567 OU 04/15/2019 04/16/2019 E 789 OU 05/01/2019 05/15/2019 E 789 IN 07/01/2019 07/15/2019 E 789 OU 08/01/2019 08/13/2019 F 234 IN 04/05/2019 04/15/2019 F 234 ER 01/25/2019 01/30/2019 G 234 OU 04/30/2019 05/01/2019 ; run;
I would think that maybe if we can get to the table below, where I have a flag for an inpatient row that has the 30 days outpatient visit, and then another indicator that would flag the 30 day outpatient. That way I may be able to run a sql program including the indicators to answer the above 2 questions within the hospital and overall, right?
HOSP | ID | Service | admission | discharge | inpatient with 30 day outpatient | the 30 day outpatient |
A | 123 | IN | 1/1/2019 | 1/5/2019 | 0 | 0 |
A | 123 | ER | 4/1/2019 | 4/3/2019 | 0 | 0 |
A | 123 | OU | 6/5/2019 | 6/30/2019 | 0 | 0 |
A | 123 | OU | 8/1/2019 | 8/29/2019 | 0 | 0 |
A | 123 | ER | 9/1/2019 | 9/5/2019 | 0 | 0 |
A | 123 | IN | 10/1/2019 | 10/5/2019 | 0 | 0 |
A | 456 | IN | 1/5/2019 | 1/15/2019 | 1 | 0 |
A | 456 | OU | 1/25/2019 | 1/30/2019 | 0 | 1 |
A | 456 | IN | 2/5/2019 | 2/15/2019 | 0 | 0 |
A | 456 | ER | 2/16/2018 | 2/18/2019 | 0 | 0 |
B | 234 | IN | 1/1/2019 | 1/5/2019 | 0 | 0 |
B | 234 | ER | 4/1/2019 | 4/3/2019 | 0 | 0 |
B | 234 | IN | 6/5/2019 | 6/30/2019 | 0 | 0 |
B | 234 | IN | 8/1/2019 | 8/29/2019 | 1 | 0 |
B | 234 | OU | 9/1/2019 | 9/5/2019 | 0 | 1 |
B | 234 | IN | 10/1/2019 | 10/5/2019 | 0 | 0 |
B | 345 | IN | 10/1/2019 | 10/5/2019 | 1 | 0 |
B | 345 | OU | 10/15/2019 | 10/16/2019 | 0 | 1 |
C | 567 | IN | 2/3/2019 | 2/5/2019 | 1 | 0 |
C | 567 | ER | 2/7/2019 | 2/15/2019 | 0 | 0 |
C | 567 | OU | 2/20/2019 | 3/1/2019 | 0 | 1 |
C | 567 | OU | 3/6/2019 | 3/10/2019 | 0 | 0 |
D | 567 | IN | 4/1/2019 | 4/3/2019 | 1 | 0 |
D | 567 | OU | 4/15/2019 | 4/16/2019 | 0 | 1 |
E | 789 | OU | 5/1/2019 | 5/15/2019 | 0 | 0 |
E | 789 | IN | 7/1/2019 | 7/15/2019 | 1 | 0 |
E | 789 | OU | 8/1/2019 | 8/13/2019 | 0 | 1 |
F | 234 | IN | 4/5/2019 | 4/15/2019 | 1 | 0 |
F | 234 | ER | 1/25/2019 | 1/30/2019 | 0 | 0 |
G | 234 | OU | 4/30/2019 | 5/1/2019 | 0 | 1 |
To me it seems that you can get the answers you want directly in SQL.
You may want to start with finding all the valid combinations of discharge from IN and admission to OUT:
proc sql;
create table all_in2out as select
in.id,in.hosp as in_hosp,in.admission as in_adm, in.discharge as in_disc,
out.hosp as out_hosp, out.admission as out_adm, out.discharge as out_disc
from have in join have out
on in.id=out.id and
out.admission between in.discharge and in.discharge+30 and
in.service='IN' and
out.service='OU'
;
quit;
It should then be easy to find the answers to your questions, like
how many unique people who have an inpatient (service=IN) discharge are admitted to an outpatient (service=OU) within in 30 days of the inpatient discharge
select count(distinct id) from all_in2out;
I would like to know this by each hospital as well as overall.
select in_hosp,count(distinct id) from all_in2out
group by in_hosp;
how many people had multiple inpatient discharge and outpatient admissions
select count(*) from
(select distinct id from all_in2out
group by id
having count(*)>1);
Just be aware that real-life data do not always come as clean as your example - there may be cases where a patient has been discharged from IN and the admitted to more than one OU placement within the same 30 days. And even if you could solve that by only including the first OU admission in the ALL_IN2OUT table, you may still get the problem that a patient was admitted to two different OU records on the same day. So you should take some time to study the real-life data and look for such problems.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.