Hi,
I need help in merging two SAS data sets:
(1) data set 1 (matched cases and controls): cases & controls matched 1:2 on age groups, chemo, and radiation
(2) data set 2 (main): main data set containing all patients and their characteristics.
Data set 1 looks like this, where two rows represent one pair (case and its matched control):
caseid controlid agegrp chemo rad num
0001 00052 45+ 1 1 1
0001 00082 45+ 1 1 2
0002 00045 25-30 1 0 1
0002 00036 25-30 1 0 2
Data set 2 looks like this (main pool of patients with baseline characteristics):
id stage er pr status
0001 1 0 1 0
0002 2 1 0 1
How do I merge data set 1 with the data set 2 to incorporate the baseline characteristics of patients?
I would want to keep the same format as in the dataset 1 (two rows for one pair).
Thank you for your help!
Tony
Easiest for me is
proc sql;
create table want as
select dataset1.*, dataset2.*
from dataset1 left join dataset2
on dataset1.Caseid = Dataset2.Id;
quit;
This may reorder your data if the caseids aren't sorted.
Assuming both of your data sets are sorted, you could try a simple merge:
data want;
merge dataset1 dataset2 (rename=(id=caseid));
by caseid;
run;
That might be what you are looking for. But even if it isn't, it would give you a starting point to describe how it is different from what you want.
Thanks, Astounding and Ballard!
I dont think i properly explained this :smileyhappy :).
By merging the two dataset, I want to create a final dataset which would contain characteristics or cases and their matched controls.
Since I have two controls per case, my final dataset should contain the sets of matched pairs. For example:
row 1 case characteristics of this case in the next columns
row 2 control characteristics of this control in the next columns
row 3 control characteristics of this control in the next column
So these three rows make one set of matched case with its controls.
Thanks
For that, you will have to spell out exactly which variables you want and what values they should take on for each of your rows. SAS forces you to keep all variables for all rows. Just give an example for one ID (3 rows) ... that should be enough.
Did you try the code?
Why do you have 3 rows per case when you had 2 before?
Try this:
data want;
set dataset2 (in=in1)
dataset1
;
if in1 then type='Case';
else type='Control';
run;
proc sort data=want;
by id type;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.