Schtroumpfette
Obsidian | Level 7

Hello,

 

I am new to complex SAS coding and would appreciate help with code to match cases to controls using risk set sampling rather than propensity score matching.

 

I have two datasets, cases and controls; some examples are shown below. In each dataset, index_date is the min(discharge_time), and I have already formatted the date. I also grouped by Patient_id. Each Patient_id is a unique patient identifier, and a patient can have multiple encounters.

 

I created age_low and age_high variables, as well as index_date_low and index_date_high, in the Controls dataset.

 

  • A patient who later becomes a case may serve as a control, as long as this is prior to the index date of the would-be case.
  • I can select controls within a range around the index date (for example, between index_date_low and the index_date of the case) to provide a suitable range for index_date.
  • I also added age_low and age_high to have a range for age.
  • I need to match exactly on sex and hospital_id.
  • I would keep all cases, even if a case has only 1 control.
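
The criteria above could be sketched as a single PROC SQL join. This is only a rough sketch under assumptions, not a tested solution: it assumes the datasets are named `cases` and `controls`, that the variables are spelled as in the sample tables (with `index_date_low` and `index_date_high` on the controls), and that "within range" means the case's age and index_date fall inside the control's low/high bounds. The output dataset `candidates` and its column names are made up for illustration.

```sas
/* Sketch: one row per eligible case-control encounter pair.        */
/* Dataset names and output column names are assumptions.           */
proc sql;
  create table candidates as
  select ca.EncounterID as case_encounter,
         ca.Patient_id  as case_patient,
         ca.index_date  as case_index_date,
         co.EncounterID as control_encounter,
         co.Patient_id  as control_patient,
         co.index_date  as control_index_date
  from cases as ca
       inner join controls as co
         on  co.sex        = ca.sex               /* exact match on sex      */
         and co.hospitalID = ca.hospitalID        /* exact match on hospital */
         and ca.age        between co.age_low and co.age_high
         and ca.index_date between co.index_date_low and co.index_date_high
         and co.Patient_id ne ca.Patient_id       /* never match a patient to itself */
  /* Risk-set rule: a patient stays eligible as a control only      */
  /* until their own index date as a case.                          */
  where not exists
        (select *
         from cases as prior
         where prior.Patient_id = co.Patient_id
           and prior.index_date <= ca.index_date);
quit;
```

Note that SAS treats a missing numeric value as smaller than any number, so a case with a missing age would not satisfy the `between` condition and would get no candidates. Cases with no candidates are dropped by the inner join; a left join would keep them.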

 

I need to do the following based on the required procedure:

1. Rank the cases by index_date.

2. temp: For each case, pull all encounters within an interval around the index date, excluding the case itself (remove the case's patient_id).

3. From temp, draw encounters that match on sex, hospitalID and age, taking the encounter closest to the index date.

4. Randomly sample matching encounters.
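
One possible reading of steps 1, 3 and 4 is sketched below. It assumes a hypothetical dataset `candidates` that already holds one row per eligible case-control pair (the result of step 2 plus the matching rules), with illustrative variables `case_encounter`, `case_index_date`, `control_encounter` and `control_index_date`. The macro variable `n_controls` and the seed are also assumptions. Sorting by `case_index_date` stands in for ranking the cases; within each case, candidates are ordered by closeness to the index date, with a random number breaking ties.

```sas
%let n_controls = 2;   /* assumption: maximum controls kept per case */

/* Attach a distance-to-index and a random number to each candidate pair */
data shuffled;
  set candidates;
  if _n_ = 1 then call streaminit(12345);   /* arbitrary seed, for reproducibility */
  gap = abs(control_index_date - case_index_date);
  u   = rand('uniform');
run;

/* Cases in index_date order; within a case, closest first, ties random */
proc sort data=shuffled;
  by case_index_date case_encounter gap u;
run;

/* Keep at most &n_controls candidates per case encounter */
data matched;
  set shuffled;
  by case_index_date case_encounter;
  if first.case_encounter then pick = 0;
  pick + 1;
  if pick <= &n_controls;
run;
```

Removing `gap` from the BY statement in PROC SORT would turn this into a purely random draw among the eligible encounters. As written, the same control can be selected for more than one case, and a case with fewer than `&n_controls` candidates keeps whatever it has.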

 

Unfortunately, I don't know how to do that. Would you be able to provide code for this?

 

Here is an example for each dataset. 

Cases dataset is as follows:

 

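EncounterID | Patient_id | age | sex | hospitalID | index_date | Diabetes_Event
------------|------------|-----|-----|------------|------------|---------------
123         | 1          |     | M   | 44         | 20053      | 1
124         | 1          |     | M   | 44         | 20053      | 1
125         | 1          |     | M   | 45         | 20053      | 1
12345       | 2          |     | F   | 67         | 19875      | 1
12367       | 2          |     | F   | 67         | 19875      | 1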

 

Controls Dataset is:

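EncounterID | Patient_id | age | sex | hospitalID | index_date | Diabetes_Event | Age_low | Age_high | Index_date_low
------------|------------|-----|-----|------------|------------|----------------|---------|----------|---------------
333         | 4          |     | M   | 40         | 20049      | 0              | 38      | 42       | 20019
33667       | 5          |     | F   | 66         | 19849      | 0              | 64      | 68       | 19819
33885       | 6          |     | F   | 66         | 19849      | 0              | 64      | 68       | 19819
33889       | 6          |     | F   | 66         | 19849      | 0              | 64      | 68       | 19819
33999       | 6          |     | F   | 66         | 19849      | 0              | 64      | 68       | 19819
1238        | 7          |     | M   | 46         | 18920      | 0              | 44      | 48       | 18890
1239        | 7          |     | M   | 46         | 18920      | 0              | 44      | 48       | 18890
1300        | 7          |     | M   | 46         | 18920      | 0              | 44      | 48       | 18890
145         | 8          |     | F   | 30         | 20207      | 0              | 28      | 32       | 20177
146         | 8          |     | F   | 30         | 20207      | 0              | 28      | 32       | 20177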

 

Thanks, 

Schtroumpfette

2 REPLIES
japelin
Rhodochrosite | Level 12

If possible, after filling in the AGE, could you show the expected results of each of steps 1-3 as data, based on the Control data set you presented?


I do not fully understand what kind of processing you have in mind.

 

For example, what kind of process is "Rank the cases by index_date"?
Is it enough to rank them in order from 1?
How are the results of this process stored?

 

Schtroumpfette
Obsidian | Level 7
Thanks Kawakami,

I will get back to you with more information, hopefully shortly. There has been a change of plan on this.

Thanks,
