I am working on my doctoral dissertation. My dataset (wide format, 12,000 subjects) contains Subject ID, race, age, and sex. My goal is to randomize the Subject IDs into 2 equal groups (1:1 ratio), assigning each subject to a treatment arm (Group A or Group B). I would like the treatment arms to be closely balanced on race (White, Black, Asian, Other), age (continuous variable), and sex (binary). I assume I need to recode age into age groups of some sort to help with the balance. I have been reading up on randomization methods in SAS, and it looks like PROC PLAN or PROC SURVEYSELECT is the way to do it. At the moment, I can't figure out the differences between the two approaches, which one would get this task done correctly, or how exactly to write the code to create the treatment arm variable. Can someone help me figure out how to do this? I haven't done any randomization before, so I am completely lost here.
The screenshot below is my example. I have the dataset, and I want to randomize the 12,000 subjects into either Group A or Group B and store that as a treatment arm variable. When comparing Group A vs. Group B with a t-test (age) or a chi-square test (race/sex), the p-values should be non-significant. The goal is to make the 2 groups as comparable as possible. I know I won't get a perfectly non-significant p-value, but overall I want P > 0.05.
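An example:
/* Bin the continuous age into 20-year bands to use as a stratum */
data heart;
  set sashelp.heart;
  agegrp = int(AgeAtStart/20);
run;
/* STRATA requires the input to be sorted by the stratification variables */
proc sort data=heart; by Sex agegrp Smoking_Status; run;
/* GROUPS=2 randomly splits each stratum into two groups; GroupID is added to the output */
proc surveyselect data=heart out=assigned groups=2 seed=96876;
  where Smoking_Status is not missing;
  strata Sex agegrp Smoking_Status;
run;
/* Check that each stratum is split evenly between the two groups */
proc freq data=assigned;
  tables (Sex agegrp Smoking_Status) * GroupID / list nocum;
run;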
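To show that the randomization worked in the sense the question describes (t-test for age, chi-square for the categorical variables), a minimal sketch of the balance checks on the `assigned` dataset from the example above might look like this. It assumes `GroupID` is the group variable that PROC SURVEYSELECT added; the choice of tests simply follows the question and is not the only way to assess balance.

```sas
/* Continuous covariate: compare mean age between the two groups */
proc ttest data=assigned;
  class GroupID;
  var AgeAtStart;
run;

/* Categorical covariates: chi-square test of each variable against group */
proc freq data=assigned;
  tables (Sex Smoking_Status) * GroupID / chisq nopercent nocol;
run;
```

Because the groups were formed within strata of these same variables, the tests would be expected to come out non-significant.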
Can you post some sample of your data that represents your actual data and a description of your desired result? Makes it much easier to provide a usable code answer.
Added to my post
"I would like to have the treatment arm to be closely balanced in race (White, Black, Asian, Other), age (continuous variable) and sex (binary)."
Can you post an example to explain the above?
The following code would "randomize the Subject Ids in my dataset into 2 equal groups (1:1 ratio)":
proc surveyselect data=sashelp.class out=want groups=2 seed=123;
run;
Thanks. In this case, how do I account for subject characteristics so that the 2 groups are comparable in terms of their demographics? After the randomization, I want to be able to show that the randomization worked.
Added to my post
It's better to add as text not images please.
Do you want it balanced or proportional?
https://documentation.sas.com/doc/en/statcdc/14.2/statug/statug_surveyselect_examples04.htm
Thanks for your help on this. I've adapted your code, and it looks like it works!
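The adapted code was not posted. As a rough sketch only, an adaptation to the data described in the question might look like the following. The dataset name `subjects`, the variable names `race`, `age`, and `sex`, the 10-year age bands, the seed, and the `arm` labels are all assumptions for illustration, not taken from the thread.

```sas
/* Hypothetical input: work.subjects with SubjectID, race, age, sex */
data subjects2;
  set subjects;
  agegrp = floor(age / 10);   /* 10-year age bands; the width is an arbitrary choice */
run;

/* STRATA needs the data sorted by the stratification variables */
proc sort data=subjects2;
  by race sex agegrp;
run;

/* Split every race x sex x age-band stratum into two random groups */
proc surveyselect data=subjects2 out=randomized groups=2 seed=20250101;
  strata race sex agegrp;
run;

/* Turn the numeric GroupID into a treatment arm label */
data randomized;
  set randomized;
  length arm $7;
  if GroupID = 1 then arm = 'Group A';
  else arm = 'Group B';
run;
```

Strata with an odd number of subjects cannot be split exactly in half, so the overall arm sizes may differ slightly. Subjects with missing covariate values would need to be handled before this step.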
Calling @Rick_SAS