BookmarkSubscribeRSS Feed
JD
Calcite | Level 5 JD
Calcite | Level 5

Hi,

I need help in merging two SAS data sets:

(1) data set 1 (matched cases and controls): cases & controls matched 1:2 on age groups, chemo, and radiation

(2) data set 2 (main): main data set containing all patients and their characteristics.

 

Data set 1 looks like this, where two rows represent one pair (case and its matched control):

caseid controlid agegrp chemo rad num
0001   00052     45+         1        1     1
0001   00082     45+         1        1     2

0002   00045    25-30      1        0     1
0002   00036    25-30      1        0     2

 

Data set 2 looks like this (main pool of patients with baseline characteristics):

id        stage   er   pr    status
0001     1        0    1       0
0002     2        1    0       1

 

How do I merge data set 1 with the data set 2 to incorporate the baseline characteristics of patients?

I would want to keep the same format as in the dataset 1 (two rows for one pair).

 

Thank you for your help!

Tony



6 REPLIES 6
ballardw
Super User

Easiest for me is

proc sql;
   create table want as
   select dataset1.*, dataset2.*
   from dataset1 left join dataset2 
        on dataset1.Caseid = Dataset2.Id;
quit;

This may reorder your data if the caseids aren't sorted.

 

Astounding
PROC Star

Assuming both of your data sets are sorted, you could try a simple merge:

 

data want;

merge dataset1 dataset2 (rename=(id=caseid));

by caseid;

run;

 

That might be what you are looking for.  But even if it isn't, it would give you a starting point to describe how it is different from what you want.

JD
Calcite | Level 5 JD
Calcite | Level 5

Thanks, Astounding and Ballard!

 

I dont think i properly explained this :smileyhappy :). 

 

By merging the two dataset, I want to create a final dataset which would contain characteristics or cases and their matched controls. 

Since I have two controls per case, my final dataset should contain the sets of matched pairs. For example:

 

row 1     case        characteristics of this case in the next columns

row 2     control    characteristics of this control in the next columns

row 3     control    characteristics of this control in the next column

 

So these three rows make one set of matched case with its controls. 

 

Thanks

Astounding
PROC Star

For that, you will have to spell out exactly which variables you want and what values they should take on for each of your rows.  SAS forces you to keep all variables for all rows.  Just give an example for one ID (3 rows) ... that should be enough.

Reeza
Super User

Did you try the code? 

 

Why do you have 3 rows per case when you had 2 before?

ballardw
Super User

Try this:

data want;
   set dataset2 (in=in1)
       dataset1 
   ;
   if in1 then type='Case';
   else type='Control';
run;
proc sort data=want;
  by id type;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1381 views
  • 0 likes
  • 4 in conversation