Hello,
I am new to complex SAS coding and would appreciate help with code that matches cases to controls using risk set sampling rather than propensity score matching.
I have two datasets, cases and controls; please see the examples below. In each dataset, index_date is min(discharge_time), and I have already formatted it as a date. I also grouped the data by Patient_id. Patient_id is a unique patient identifier, and a patient can have multiple encounters.
I created age_low and age_high variables, as well as index_date_low and index_date_high, in the Controls dataset.
I need to do the following based on the required procedure:
1. Rank the cases by index_date.
2. temp: For each case, pull all encounters within an interval around the case's index_date, excluding the case's own encounters (remove the case's Patient_id).
3. From temp, draw encounters that match the case on sex, hospitalID and age, taking the encounter closest to the index_date.
4. Randomly sample matching encounters.
Unfortunately, I don't know how to do that. Could you provide code for this?
Here is an example for each dataset.
Cases dataset is as follows:
EncounterID | Patient_id | sex | age | hospitalID | index_date | Diabetes_Event
123 | 1 | M | 44 | | 20053 | 1
124 | 1 | M | 44 | | 20053 | 1
125 | 1 | M | 45 | | 20053 | 1
12345 | 2 | F | 67 | | 19875 | 1
12367 | 2 | F | 67 | | 19875 | 1
Controls Dataset is:
EncounterID | Patient_id | sex | age | hospitalID | index_date | Diabetes_Event | Age_low | Age_high | Index_date_low
333 | 4 | M | 40 | | 20049 | 0 | 38 | 42 | 20019
33667 | 5 | F | 66 | | 19849 | 0 | 64 | 68 | 19819
33885 | 6 | F | 66 | | 19849 | 0 | 64 | 68 | 19819
33889 | 6 | F | 66 | | 19849 | 0 | 64 | 68 | 19819
33999 | 6 | F | 66 | | 19849 | 0 | 64 | 68 | 19819
1238 | 7 | M | 46 | | 18920 | 0 | 44 | 48 | 18890
1239 | 7 | M | 46 | | 18920 | 0 | 44 | 48 | 18890
1300 | 7 | M | 46 | | 18920 | 0 | 44 | 48 | 18890
145 | 8 | F | 30 | | 20207 | 0 | 28 | 32 | 20177
146 | 8 | F | 30 | | 20207 | 0 | 28 | 32 | 20177
Thanks,
Schtroumpfette
If possible, after filling in the AGE, could you show the expected result of each of steps 1-3 as data, based on the Controls dataset you presented?
I do not fully understand what kind of processing you have in mind.
For example, what exactly does "Rank the cases by index_date" involve?
Is it enough to number them sequentially starting from 1?
How should the results of this step be stored?
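In the meantime, here is a minimal sketch of how steps 1-4 might look, so that you can correct whatever does not fit your procedure. It rests on several assumptions that are not stated in your post: the datasets are named cases and controls; age_low, age_high, index_date_low and index_date_high in controls define the matching windows; "closest" means the smallest absolute difference between index dates; and one control is drawn per case. The names case_index, candidates, closest, matched, case_rank and date_gap, and the seed value, are placeholders I made up.

```sas
/* Step 1: one row per case patient, ranked by index_date.              */
/* Assumes sex and hospitalID are constant within a patient; the        */
/* earliest index_date and the lowest age are kept for each patient.    */
proc sql;
   create table case_index as
   select Patient_id,
          sex,
          hospitalID,
          min(index_date) as case_index_date,
          min(age)        as case_age
   from cases
   group by Patient_id, sex, hospitalID
   order by case_index_date, Patient_id;
quit;

data case_index;
   set case_index;
   case_rank = _n_;   /* sequential rank: 1 = earliest index_date */
run;

/* Steps 2 and 3: for each case, pull control encounters inside the     */
/* date and age windows that match on sex and hospitalID, excluding     */
/* the case's own Patient_id.                                           */
proc sql;
   create table candidates as
   select c.case_rank,
          c.Patient_id  as case_id,
          c.case_index_date,
          k.Patient_id  as control_id,
          k.EncounterID as control_encounter,
          k.index_date  as control_index_date,
          abs(k.index_date - c.case_index_date) as date_gap
   from case_index as c
        inner join controls as k
        on  k.sex        = c.sex
        and k.hospitalID = c.hospitalID
        and c.case_age between k.age_low and k.age_high
        and c.case_index_date between k.index_date_low and k.index_date_high
        and k.Patient_id ne c.Patient_id;
quit;

/* Keep only the closest encounter for each case/control-patient pair.  */
proc sort data=candidates;
   by case_rank control_id date_gap;
run;

data closest;
   set candidates;
   by case_rank control_id;
   if first.control_id;
run;

/* Step 4: randomly draw one control per case (seed is arbitrary).      */
proc surveyselect data=closest out=matched
                  method=srs sampsize=1 seed=20250101;
   strata case_rank;
run;
```

Because each case is sampled independently in its own stratum, the same control patient can be drawn for more than one case. That is generally acceptable under risk set sampling, but if your procedure forbids reuse, the cases would have to be processed one at a time in case_rank order, removing each selected control before moving to the next case. To draw more than one control per case, raise sampsize, keeping in mind that PROC SURVEYSELECT stops with an error when a stratum has fewer candidates than requested unless the SELECTALL option is added.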