Hi
I am a new SAS user. I have been trying to match cancer cases to controls in a ratio of 1:5. I have sucessfully matched them in one data set that looks like this,
study_id control_id num
11 13 1
11 111 2
11 112 3
11 113 4
11 13 5
14 15 1
14 16 2
14 19 3
14 20 4
14 115 5
29 30 1
29 31 2
29 32 3
29 129 4
2 131 5
However, I would like to combine the study_id and control_id in one column, and also generate a new variable called set_id that appears as in the table below (this will be used as the strata in subsequent analysis). How do i transform my data to have the variables ID and SET_ID as in the table below? Thanks
ID SET_ID CANCER
11 1 1
13 1 0
111 1 0
112 1 0
113 1 0
14 2 1
15 2 0
16 2 0
19 2 0
20 2 0
115 2 0
29 3 1
30 3 0
31 3 0
32 3 0
129 3 0
131 3 0
If the dataset is sorted by study_id, try this:
data want;
set have;
by study_id;
if first.study_id then do;
set_id + 1;
id = study_id;
cancer = 1;
end;
else do;
id = control_id;
cancer = 0;
end;
drop study_id control_id num;
run;
If the dataset is sorted by study_id, try this:
data want;
set have;
by study_id;
if first.study_id then do;
set_id + 1;
id = study_id;
cancer = 1;
end;
else do;
id = control_id;
cancer = 0;
end;
drop study_id control_id num;
run;
data have;
input study_id control_id num;
cards;
11 13 1
11 111 2
11 112 3
11 113 4
11 13 5
14 15 1
14 16 2
14 19 3
14 20 4
14 115 5
29 30 1
29 31 2
29 32 3
29 129 4
29 131 5
;
data want;
set have;
by study_id;
set_id+first.study_id;
if first.study_id then do;
id=study_id;cancer=1;output;
end;
id=control_id;cancer=0;output;
keep id set_id cancer;
run;
Thank you so much, this works too
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.