I have a dataset from a survey. The question about gender is:
What is your gender?
1 Male
2 Female
3 Transgender
99 Prefer not to answer
The frequency (cell count) for option 3 (transgender) is less than 10. I would like to randomly reassign the respondents who chose option 3 (transgender) to option 1 or 2. Does anyone have suggestions on how to randomly assign a small category to other categories?
Thanks for your help in advance!
data want;
set have;
if gender=3 then gender = rand('uniform')<0.5 + 1;
run;
I'll add that I am skeptical about this being a statistically valid thing to do. You should be concerned about that and consider (perhaps) more valid ways to handle it, such as just leaving gender=3 out of your analysis. In the end, you (not anyone else) have to vouch for this being an acceptable method.
My mistake. I left out parentheses. Without them, SAS evaluates 0.5 + 1 before the comparison, so the condition is always true and every gender=3 becomes 1. Try it this way:
data want;
set have;
if gender=3 then gender = (rand('uniform')<0.5) + 1;
run;
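If the random assignment needs to be reproducible, one option is to seed the random number stream before the first call to rand. This is only a sketch, assuming the same have and want data sets as above; the seed value 12345 is arbitrary and not from the original post.

```sas
data want;
  set have;
  /* Arbitrary seed (an assumption), set once so reruns give the same assignment */
  if _n_ = 1 then call streaminit(12345);
  /* (rand('uniform') < 0.5) evaluates to 1 or 0, so the result is 2 or 1 */
  if gender = 3 then gender = (rand('uniform') < 0.5) + 1;
run;
```

Writing the result to a new variable instead of overwriting gender would keep the original responses available, which may be worth considering given the concerns raised in this thread.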
Thanks!
That works.
@ting1 wrote:
I would like to randomly reassign the respondents who chose option 3 (transgender) to option 1 or 2. Does anyone have suggestions on how to randomly assign a small category to other categories?
From a representation and inclusion standpoint, this should not be done; it is in fact the opposite of inclusive. Please do not do this. If you do, put a giant warning around your analysis so that people know you did this and that your results are not representative. Also, do not collect information if you don't know how you plan to use it.
I agree with @Reeza. If I were forced to treat your 3 as a different value on that scale, I would be more likely to combine the 99 and 3 codes. That is easily done with a format and loses no information in the data:
Proc format;
value gender_r
1='Male'
2='Female'
3,99='Trans/ Prefer not to Answer' /* or some other text */
;
run;
Then use that format gender_r with your gender variable for any summary statistics, reporting or graphing purpose in SAS procedures.
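As an illustration of applying the format in a procedure, a frequency table might look like the following. This is a minimal sketch, assuming the data set is named have (as in the earlier replies) and that the gender_r format has already been created in the current session.

```sas
/* Counts are grouped by the formatted value, so codes 3 and 99 share one row; */
/* the stored values of gender are not changed.                               */
proc freq data=have;
  tables gender;
  format gender gender_r.;
run;
```

Because the format is applied only within the procedure, dropping the format statement returns the table to the original four categories.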