05-20-2016 01:12 PM
I need help in merging two SAS data sets:
(1) data set 1 (matched cases and controls): cases & controls matched 1:2 on age groups, chemo, and radiation
(2) data set 2 (main): main data set containing all patients and their characteristics.
Data set 1 looks like this, where two rows represent one pair (case and its matched control):
caseid controlid agegrp chemo rad num
0001 00052 45+ 1 1 1
0001 00082 45+ 1 1 2
0002 00045 25-30 1 0 1
0002 00036 25-30 1 0 2
Data set 2 looks like this (main pool of patients with baseline characteristics):
id stage er pr status
0001 1 0 1 0
0002 2 1 0 1
How do I merge data set 1 with the data set 2 to incorporate the baseline characteristics of patients?
I would want to keep the same format as in the dataset 1 (two rows for one pair).
Thank you for your help!
05-20-2016 01:58 PM
Easiest for me is
proc sql; create table want as select dataset1.*, dataset2.* from dataset1 left join dataset2 on dataset1.Caseid = Dataset2.Id; quit;
This may reorder your data if the caseids aren't sorted.
05-20-2016 02:03 PM
Assuming both of your data sets are sorted, you could try a simple merge:
merge dataset1 dataset2 (rename=(id=caseid));
That might be what you are looking for. But even if it isn't, it would give you a starting point to describe how it is different from what you want.
05-20-2016 02:29 PM
Thanks, Astounding and Ballard!
I dont think i properly explained this :smileyhappy .
By merging the two dataset, I want to create a final dataset which would contain characteristics or cases and their matched controls.
Since I have two controls per case, my final dataset should contain the sets of matched pairs. For example:
row 1 case characteristics of this case in the next columns
row 2 control characteristics of this control in the next columns
row 3 control characteristics of this control in the next column
So these three rows make one set of matched case with its controls.
05-20-2016 02:35 PM
For that, you will have to spell out exactly which variables you want and what values they should take on for each of your rows. SAS forces you to keep all variables for all rows. Just give an example for one ID (3 rows) ... that should be enough.