# How to randomly impute gender values

What statements do I add after the DO statement to randomly impute gender values of 'F' and 'M' (with 60% of people being female)?

data imputed;

set missing;

if gender = ' ' then do;

..........????

..........????

end;

run;

## Re: how do I do this?

The sample code below should guide you through.

```
data have;
  input id sex $;
  datalines;
23 M
25 M
26 F
27 F
28 F
;
run;

data want;
  length Y $10;
  set have;
  /* Expand the one-letter code into a full label */
  if sex = 'M' then Y = 'Male';
  else if sex = 'F' then Y = 'Female';
run;

proc print data=want;
run;
```
## Re: how do I do this?

anoopmohandas7, that didn't help.

I need to replace the missing observations by generating Ms and Fs randomly in a 4 to 6 ratio....

## Re: how do I do this?

Can you show me what your data looks like and what you expect it to be?

## Re: how do I do this?

Use the RAND function. RAND("uniform") returns a random number uniformly distributed between 0 and 1:

```
data imputed;
  set missing;
  /* Only rows with a missing gender are imputed */
  if missing(gender) then do;
    if rand("uniform") < 0.6 then gender = "F";
    else gender = "M";
  end;
run;
```
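If the imputation needs to be reproducible, a seed can be set with CALL STREAMINIT before the first RAND call. This is a minimal sketch, assuming the `missing` data set from the question with a character variable `gender`. The seed value 12345 is an arbitrary choice, and the PROC FREQ step is only a sanity check on the resulting split:

```sas
data imputed;
  /* Arbitrary seed (an assumption) so the same values are drawn on every run */
  call streaminit(12345);
  set missing;
  if missing(gender) then do;
    /* rand("uniform") < 0.6 is true about 60% of the time */
    if rand("uniform") < 0.6 then gender = "F";
    else gender = "M";
  end;
run;

/* Sanity check: overall F/M split after imputation */
proc freq data=imputed;
  tables gender;
run;
```

The share of "F" among the imputed rows will be close to 60% but not exactly 60%, since each row is drawn independently.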
## Re: how do I do this?

Thank you, PG Stats! This does the trick!

## Re: how do I do this?

@PGStats: You're the statistician .. I'm only the Psychologist, but wouldn't LE .5 be the more appropriate cutoff?

## Re: how do I do this?

My question stated 0.6; PGStats just gave me what I asked for.

## Re: how do I do this?

I hadn't seen your 4 to 6 ratio post. @PGStats: my sincere apologies. I should have known better than to question you.

## Re: how do I do this?

@art297, you are welcome to question me any time.

## Re: how do I do this?

@PGStats's idea of using RAND is what I'd use, though with a Bernoulli distribution rather than a uniform one.

```
if rand('bernoulli', 0.6) = 1 then sex='F';
else sex='M';
```

## Re: how do I do this?

Same thing, under the hood. You could also use "table".
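A minimal sketch of the "table" variant, assuming the same `missing` data set and `gender` variable as in the original question. RAND("table", p1, p2) returns the index 1 or 2 with the listed probabilities, so index 1 is mapped to "F" here:

```sas
data imputed;
  set missing;
  if missing(gender) then do;
    /* Returns 1 with probability 0.6 and 2 with probability 0.4 */
    if rand("table", 0.6, 0.4) = 1 then gender = "F";
    else gender = "M";
  end;
run;
```

The table form extends to more than two categories by listing more probabilities.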

## Re: how do I do this?

PGStats wrote:

Same thing, under the hood. You could also use "table".

I was wondering about that... it's been a while since I've studied random number generators.