Hello Community,
I am working with some healthcare data and I would like to create a variable that indicates the total number of healthcare encounters per each patient in my dataset according to 2 conditions:
1) encounter_type=’Telemedicine’ or outpatient=’Yes’
AND
2) intnx('month',index_date,-6,'s') <= encounter_date <= index_date;
Could anyone please help me with a Proc Sql code for doing this? I’ve provided below an example of my current dataset and my desired output. Any help with this would be greatly appreciated!
data have;
input Prescription_date Encounter_ID Index_date Encounter_date Encounter_type outpatient;
datalines;
01 123 15JUN2020 01JAN2020 . Yes
01 425 15JUN2020 02FEB2020 Telemedicine No
01 376 15JUN2020 15MAR2020 . Yes
01 974 15JUN2020 20MAY2020 . No
01 173 15JUN2020 02OCT2020 . Yes
02 298 02DEC2019 01SEP2019 Telemedicine Yes
02 245 02DEC2019 30OCT2019 . Yes
02 846 02DEC2019 01NOV2019 . Yes
02 935 02DEC2019 11NOV2019 . No
02 275 02DEC2019 02DEC2019 Telemedicine No
;Have:
Patient_ID | Encounter_ID | Index_date | Encounter_date | Encounter_type | outpatient |
01 | 123 | 15JUN2020 | 01JAN2020 | . | Yes |
01 | 425 | 15JUN2020 | 02FEB2020 | Telemedicine | No |
01 | 376 | 15JUN2020 | 15MAR2020 | . | Yes |
01 | 974 | 15JUN2020 | 20MAY2020 | . | No |
01 | 173 | 15JUN2020 | 02OCT2020 | . | Yes |
02 | 298 | 02DEC2019 | 01SEP2019 | Telemedicine | Yes |
02 | 245 | 02DEC2019 | 30OCT2019 | . | Yes |
02 | 846 | 02DEC2019 | 01NOV2019 | . | Yes |
02 | 935 | 02DEC2019 | 11NOV2019 | . | No |
02 | 275 | 02DEC2019 | 02DEC2019 | Telemedicine | No |
Want:
Patient_ID | New_var |
01 | 2 |
02 | 4 |
data have;
infile cards expandtabs truncover;
input Patient_ID Encounter_ID Index_date :date9. Encounter_date :date9. Encounter_type :$20. outpatient :$20.;
format Index_date Encounter_date :date9.;
datalines;
01 123 15JUN2020 01JAN2020 . Yes
01 425 15JUN2020 02FEB2020 Telemedicine No
01 376 15JUN2020 15MAR2020 . Yes
01 974 15JUN2020 20MAY2020 . No
01 173 15JUN2020 02OCT2020 . Yes
02 298 02DEC2019 01SEP2019 Telemedicine Yes
02 245 02DEC2019 30OCT2019 . Yes
02 846 02DEC2019 01NOV2019 . Yes
02 935 02DEC2019 11NOV2019 . No
02 275 02DEC2019 02DEC2019 Telemedicine No
;
proc sql;
select Patient_ID,
sum((intnx('month',index_date,-6,'s')<=encounter_date<=index_date) and
(encounter_type='Telemedicine' or outpatient='Yes') ) as New_var
from have
group by Patient_ID ;
quit;
data have;
infile cards expandtabs truncover;
input Patient_ID Encounter_ID Index_date :date9. Encounter_date :date9. Encounter_type :$20. outpatient :$20.;
format Index_date Encounter_date :date9.;
datalines;
01 123 15JUN2020 01JAN2020 . Yes
01 425 15JUN2020 02FEB2020 Telemedicine No
01 376 15JUN2020 15MAR2020 . Yes
01 974 15JUN2020 20MAY2020 . No
01 173 15JUN2020 02OCT2020 . Yes
02 298 02DEC2019 01SEP2019 Telemedicine Yes
02 245 02DEC2019 30OCT2019 . Yes
02 846 02DEC2019 01NOV2019 . Yes
02 935 02DEC2019 11NOV2019 . No
02 275 02DEC2019 02DEC2019 Telemedicine No
;
proc sql;
select Patient_ID,
sum((intnx('month',index_date,-6,'s')<=encounter_date<=index_date) and
(encounter_type='Telemedicine' or outpatient='Yes') ) as New_var
from have
group by Patient_ID ;
quit;
@Ksharp This is perfect, thank you!
Why isn't the count for ID 1 three?
For ID 1 I see one telemedicine and three outpatient encounters for a count of 4, but then the last one has an encounter date which isn't less than the index date.
Given a data step of
data have;
input patient_id encounter_id
index_date:anydtdte12. encounter_date:anydtdte12.
encounter_type:$20. outpatient:$3.;
format index_date encounter_date date10.;
datalines;
1 123 15-Jun-20 1-Jan-20 M Yes
1 425 15-Jun-20 2-Feb-20 Telemedicine No
1 376 15-Jun-20 15-Mar-20 M Yes
1 974 15-Jun-20 20-May-20 M No
1 173 15-Jun-20 2-Oct-20 M Yes
2 298 2-Dec-19 1-Sep-19 Telemedicine Yes
2 245 2-Dec-19 30-Oct-19 M Yes
2 846 2-Dec-19 1-Nov-19 M Yes
2 935 2-Dec-19 11-Nov-19 M No
2 275 2-Dec-19 2-Dec-19 Telemedicine No
;
run;
I think SQL like
proc sql;
select patient_id,
sum(case
when encounter_type= "Telemedicine" or outpatient= "Yes"
then 1
else 0
end) as new_var
from have
where
intnx('month',index_date,-6,'s') <= encounter_date <= index_date
group by patient_id;
quit;
produces
| patient_id | new_var |
|---|---|
| 1 | 3 |
| 2 | 4 |
@HB the first entry for ID 1 has an encounter date more than 6 months before the index date. Therefore, the two dates for this observation are out of the specified range and would not count. Does this makes sense? I currently have the following code below. Do you think this is right?
proc sql;
select Patient_ID,
sum((intnx('month',index_date,-6,'s')<=encounter_date<=index_date) and
(encounter_type='Telemedicine' or outpatient='Yes') ) as New_var
from have
group by Patient_ID ;
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.