I am trying to do frequency matching for a case-control study. I have the frequency distributions of three variables (Gender, Age, Region) from Dataset A:
Gender: Male 30%, Female 70%
Age: >=65 40%, <65 60%
Region: West 10%, Northeast 20%, Southwest 40%, Midwest 25%, Southeast 5%
Please note that I do not have access to Dataset A itself, only the frequency distributions of these three variables.
I have Dataset B, and I need to create a subset of it that has the same frequency distribution for those three variables as Dataset A. In other words, when I run PROC FREQ on age, gender, and region in the subset, it should give the same results as shown above for Dataset A.
Could you please suggest the best way to do that?
Thanks.
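%let sample_size=1000;

/* For each level of n, PROC PLAN emits a random permutation of 1..10,
   so each OUT= data set has sample_size x 10 rows and every value
   1..10 appears equally often. */
proc plan seed=27371;
    factors n=&sample_size. ordered sex=10 / noprint;
    output out=sex;
    factors n=&sample_size. ordered age=10 / noprint;
    output out=age;
run;

/* Values 1-3 (30%) become Male, values 4-10 (70%) become Female */
data sex;
    set sex;
    length char_sex $6;
    char_sex=ifc(sex in (1:3),'Male','Female');
    keep char_sex;
run;

/* Values 1-4 (40%) become Age>=65, values 5-10 (60%) become Age <65 */
data age;
    set age;
    length char_age $7;
    char_age=ifc(age in (1:4),'Age>=65','Age <65');
    keep char_age;
run;

/* One-to-one merge (no BY statement): pair the two columns row by row */
data want;
    merge sex age;
run;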
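%let sample_size=1000;

/* Each FACTORS statement produces a random permutation of 1..sample_size */
proc plan seed=27371;
    factors sex=&sample_size. / noprint;
    output out=sex;
    factors age=&sample_size. / noprint;
    output out=age;
    factors Region=&sample_size. / noprint;
    output out=Region;
quit;

/* One-to-one merge (no BY statement): pair the three columns row by row */
data temp;
    merge sex age region;
run;

/* Bin each permutation into 100 equal groups (ranks 0-99), 1% of rows each */
proc rank data=temp out=temp2 groups=100;
    var sex age region;
    ranks r_sex r_age r_region;
run;

/* Map rank ranges to categories so each range width equals the target percentage */
data want;
    set temp2;
    /* Declare lengths up front; otherwise char_Region takes the length of the
       first value assigned and the longer region names are truncated */
    length char_sex $6 char_age $7 char_Region $9;
    char_sex=ifc(r_sex in (0:29),'Male','Female');
    char_age=ifc(r_age in (0:39),'Age>=65','Age <65');
    select;
        when(r_region in (0:9))   char_Region='West';
        when(r_region in (10:29)) char_Region='Northeast';
        when(r_region in (30:69)) char_Region='Southwest';
        when(r_region in (70:94)) char_Region='Midwest';
        when(r_region in (95:99)) char_Region='Southeast';
        otherwise;
    end;
    keep char_:;
run;

/* Check the resulting marginal frequencies */
proc freq data=want;
    table char_:;
run;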
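
The code above generates new rows with the target margins; it does not draw them from Dataset B. A minimal sketch of subsetting Dataset B itself follows, under assumptions that are not in the original post: the data set is named datasetB; it has character variables gender, agegrp, and region coded exactly as in the target tables below; and the three variables are treated as independent, so each cell target is sample_size times the product of the three marginal proportions. Independence is only one of many joint distributions that reproduce these margins, and the sketch also assumes every cell of Dataset B has at least as many rows as its target.

```sas
%let sample_size = 1000;

/* Target marginal proportions, taken from the question */
data gender_t;
    length gender $6;
    input gender $ p_gender;
    datalines;
Male 0.30
Female 0.70
;

data age_t;
    length agegrp $4;
    input agegrp $ p_age;
    datalines;
>=65 0.40
<65 0.60
;

data region_t;
    length region $9;
    input region $ p_region;
    datalines;
West 0.10
Northeast 0.20
Southwest 0.40
Midwest 0.25
Southeast 0.05
;

/* Cross the three margins (2 x 2 x 5 = 20 cells).
   ASSUMPTION: independence, so target = n * p_gender * p_age * p_region.
   _NSIZE_ is the variable name PROC SURVEYSELECT reads stratum sizes from. */
proc sql;
    create table alloc as
    select g.gender, a.agegrp, r.region,
           round(&sample_size. * g.p_gender * a.p_age * r.p_region) as _NSIZE_
    from gender_t as g, age_t as a, region_t as r;
quit;

/* Both data sets must list the strata in the same sorted order */
proc sort data=alloc;
    by gender agegrp region;
run;

/* datasetB is a hypothetical name for the user's Dataset B */
proc sort data=datasetB out=b_sorted;
    by gender agegrp region;
run;

/* Simple random sample of the requested size within each stratum */
proc surveyselect data=b_sorted out=want method=srs
                  sampsize=alloc seed=27371;
    strata gender agegrp region;
run;

/* Check the resulting marginal frequencies */
proc freq data=want;
    tables gender agegrp region;
run;
```

With sample_size=1000 every cell target is a whole number (for example, Male, >=65, West gets 1000 x 0.30 x 0.40 x 0.10 = 12 rows), so the margins of the subset match the targets exactly. For other sample sizes, rounding may shift the percentages slightly.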