Hi everyone,
I am working on a case-control study in which I need to create a survey ID variable for each record. For every case in dataset1, I randomly pull 3 controls from a source dataset (via PROC SURVEYSELECT), which gives me a dataset of cases (dataset1) and a dataset of controls (dataset2). The survey ID naming scheme should be as follows:
Case1: 12345-1
Control1: 12345-1-1
Control2: 12345-1-2
Control3: 12345-1-3
Case2: 12345-2
Control1: 12345-2-1
Control2: 12345-2-2
Control3: 12345-2-3
This naming scheme can be applied to each dataset separately or can be applied to the combined dataset of all records. A key detail is that the controls are simply frequency matches and NOT paired matches. How would you all code this?
Thanks!
Keep the data set long, not wide. You don't want separate variables here, you want separate records.
Beyond that, a more concrete explanation of the data, or showing us a portion of what the data set would look like, would be a great help.
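In the meantime, here is a minimal sketch of one way to do it, not a definitive solution. It assumes that 12345 is a fixed study prefix, that dataset2 holds exactly 3 controls per case, and that, because the controls are frequency matched rather than paired, it does not matter which controls end up under which case number. The names study_id, case_num, ctrl_num, survey_id, cases_id, controls_id, and all_records are made up for illustration.

```sas
/* Assumed fixed prefix for every survey ID */
%let study_id = 12345;

/* Cases: number them in their current order -> 12345-1, 12345-2, ... */
data cases_id;
    set dataset1;
    length survey_id $20;
    case_num  = _n_;
    survey_id = catx('-', "&study_id", case_num);
run;

/* Controls: every block of 3 consecutive records shares a case number. */
/* The assignment is arbitrary, which is fine for frequency matching.   */
data controls_id;
    set dataset2;
    length survey_id $20;
    case_num  = ceil(_n_ / 3);        /* 1,1,1,2,2,2,... */
    ctrl_num  = mod(_n_ - 1, 3) + 1;  /* 1,2,3,1,2,3,... */
    survey_id = catx('-', "&study_id", case_num, ctrl_num);
run;

/* Stack into one long data set: one record per case or control */
data all_records;
    set cases_id controls_id;
run;

/* ctrl_num is missing for cases, so each case sorts ahead of its controls */
proc sort data=all_records;
    by case_num ctrl_num;
run;
```

CATX strips blanks and inserts the delimiter only between non-missing arguments, so the numeric counters can be passed in directly. If the number of controls per case is not always 3, the CEIL/MOD arithmetic would need to be replaced with something driven by the actual counts.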