Hello,
I am new to complex SAS coding and would appreciate help writing code to match cases to controls using risk set sampling rather than propensity score matching.
I have two datasets, cases and controls (please see the examples below). In each dataset, index_date is min(discharge_time), and I have already formatted the date. I also grouped by Patient_id. Patient_id is a unique patient identifier, and a patient can have multiple encounters.
I created the age_low and age_high variables, as well as index_date_low and index_date_high, in the Controls dataset.
I need to do the following based on the required procedure:
1. Rank the cases by index_date.
2. temp: For each case, pull all encounters within an interval around the case's index date, excluding the case's own encounters (remove the case's patient_id).
3. From temp, draw encounters that match on sex, hospitalID and age, taking the encounter closest to the index date.
4. Randomly sample matching encounters.
Unfortunately, I don't know how to do that. Would you be able to provide code for this?
Here is an example for each dataset.
Cases dataset is as follows:
EncounterID | Patient_id | age | sex | hospitalID | index_date | Diabetes_Event |
123 | 1 | M | 44 | 20053 | 1 | |
124 | 1 | M | 44 | 20053 | 1 | |
125 | 1 | M | 45 | 20053 | 1 | |
12345 | 2 | F | 67 | 19875 | 1 | |
12367 | 2 | F | 67 | 19875 | 1 |
Controls Dataset is:
EncounterID | Patient_id | age | sex | hospitalID | index_date | Diabetes_Event | Age_low | Age_high | Index_date_low |
333 | 4 | M | 40 | 20049 | 0 | 38 | 42 | 20019 | |
33667 | 5 | F | 66 | 19849 | 0 | 64 | 68 | 19819 | |
33885 | 6 | F | 66 | 19849 | 0 | 64 | 68 | 19819 | |
33889 | 6 | F | 66 | 19849 | 0 | 64 | 68 | 19819 | |
33999 | 6 | F | 66 | 19849 | 0 | 64 | 68 | 19819 | |
1238 | 7 | M | 46 | 18920 | 0 | 44 | 48 | 18890 | |
1239 | 7 | M | 46 | 18920 | 0 | 44 | 48 | 18890 | |
1300 | 7 | M | 46 | 18920 | 0 | 44 | 48 | 18890 | |
145 | 8 | F | 30 | 20207 | 0 | 28 | 32 | 20177 | |
146 | 8 | F | 30 | 20207 | 0 | 28 | 32 | 20177 |
Thanks,
Schtroumpfette
If possible, after filling in the age values, could you show the expected result of each of steps 1-3 as data, based on the Controls dataset you posted?
I don't fully understand what kind of processing you have in mind.
For example, what kind of process is "Rank the cases by index_date"?
Is it enough to rank them in order from 1?
How are the results of this process stored?
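In the meantime, here is a minimal sketch of one possible reading of steps 1-4, offered as a starting point rather than a finished solution. It assumes the datasets are named cases and controls, that the variable names follow your headers (encounterid, patient_id, age, sex, hospitalid, index_date, age_low, age_high, index_date_low, index_date_high), and that "closest" means the smallest absolute difference in index_date, with ties broken at random. The macro variables seed and k are hypothetical names introduced only for this illustration.

```sas
%let seed = 20250101;  /* hypothetical seed so the random tie-break is reproducible */
%let k    = 1;         /* hypothetical number of controls to keep per case */

/* Step 1: keep one row per case patient, to be processed in index_date order */
proc sort data=cases out=case_sorted;
  by patient_id index_date encounterid;
run;

data case_index;
  set case_sorted;
  by patient_id;
  if first.patient_id;   /* earliest encounter per case patient */
run;

/* Steps 2-3: for each case, list control encounters that fall inside the
   date and age windows, match on sex and hospitalID, and belong to a
   different patient. Assumes the window variables exist in controls. */
proc sql;
  create table candidates as
  select a.patient_id   as case_id,
         a.index_date   as case_index_date,
         b.encounterid  as control_encounterid,
         b.patient_id   as control_id,
         abs(b.index_date - a.index_date) as date_gap,
         ranuni(&seed)  as rand_key
  from case_index as a
       inner join controls as b
       on  a.sex        = b.sex
       and a.hospitalid = b.hospitalid
       and a.age        between b.age_low and b.age_high
       and a.index_date between b.index_date_low and b.index_date_high
       and a.patient_id ne b.patient_id
  order by case_index_date, case_id, date_gap, rand_key;
quit;

/* Step 4: keep the first k candidates per case, i.e. the closest in time,
   with equally close candidates ordered at random */
data matched;
  set candidates;
  by case_index_date case_id;
  if first.case_id then n_picked = 0;
  n_picked + 1;
  if n_picked <= &k;
run;
```

Note that this lets the same control be selected for more than one case, and with k greater than 1 it could pick several encounters from the same control patient, so adjust it if your protocol does not allow that. It also does not let future cases serve as controls before their own index date, which a strict risk set design would.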