Hi! I used data merge to match two tables with a ratio of 1:4, but the merged table presented a ratio of 1:6 or even more. I would like to only keep the first 4 rows for each case ID. How could I do it?
Here are the codes that I used for data merge:
proc sql ;
create view _controls
as select control_ID,agegrp, sex, A, B, C, rand('uniform') as random1, year
from &controls
order by year,agegrp, sex, A, B, C,random1;
quit;
data _cases;
set &cases;
do rep=1 to 4;
random2=rand('uniform');
output;
end;
keep case_id year agegrp sex A B C rep random2;
run;
proc sort data=_cases;
by year agegrp sex A B C rep random2;
run;
data &dataout &missing;
merge _cases(in=in1) _controls(in=in2);
by year agegrp sex A B C;
if in1=1 and in2=1 then output &dataout;
else if in1=1 and in2=0 then output &missing;
run;
proc sort data=&dataout;
by case_id;
run;
then the merged table looks like this:
control_id case_id rep
0001 1 1
0002 1 2
0003 1 3
0004 1 4
0005 1 4
0006 1 4
0007 2 1
0008 2 2
0009 2 3
0010 2 4
0011 2 4
0012 3 1
I only want the first 4 rows for each unique case_id.
Thanks!
data want;
set &dataout.;
by case_id;
if first.case_id
then count = 1;
else count + 1;
if count le 4;
drop count;
run;
data want;
set &dataout.;
by case_id;
if first.case_id
then count = 1;
else count + 1;
if count le 4;
drop count;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.