K_S
Obsidian | Level 7

What statements do I add after the DO statement to randomly impute gender values of 'F' and 'M' (with 60% of people being female)?

 

data imputed;
set missing;
if gender = ' ' then do;
..........????
..........????
end;
run;

 

 

15 REPLIES
anoopmohandas7
Quartz | Level 8

The sample code below should guide you through.

data have ;
input id sex $ ;
datalines ;
23 M
25 M
26 F
27 F
28 F
;
RUN ;

DATA WANT;
length Y $10. ;
set have ;
if sex='M' then Y='Male' ;
else if sex = 'F' then Y='Female' ;
run ;

proc print data=want;
run;
K_S
Obsidian | Level 7

anoopmohandas7, that didn't help. 😞

I need to replace the missing observations by generating Ms and Fs randomly in a 4 to 6 ratio....

anoopmohandas7
Quartz | Level 8
Can you show me what your data looks like and what you expect it to be?
K_S
Obsidian | Level 7

Please see PGStats's reply for what I was looking for.

Thank you so much for your reply though! 🙂

PGStats
Opal | Level 21

Use the random number function RAND. Called with "uniform", it returns a random number uniformly distributed between 0 and 1:

 

data imputed;
set missing;
if missing(gender) then 
	if rand("uniform") < 0.6 then gender = "F"; else gender ="M";
run;
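
If the imputation needs to be reproducible, or you want to check the resulting split, something along these lines may help. This is only a sketch: the seed value 12345 and the flag variable was_imputed are illustrative assumptions, not part of the original question.

```sas
data imputed;
    /* Assumption: an arbitrary fixed seed so reruns give the same imputed values */
    call streaminit(12345);
    set missing;
    if missing(gender) then do;
        was_imputed = 1;  /* hypothetical flag marking the rows that were filled in */
        if rand("uniform") < 0.6 then gender = "F";
        else gender = "M";
    end;
    else was_imputed = 0;
run;

/* Check the F/M split among the imputed rows only */
proc freq data=imputed;
    where was_imputed = 1;
    tables gender;
run;
```

With a small number of missing rows, the observed split will only be roughly 60/40.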
PG
K_S
Obsidian | Level 7

Thank you, PG Stats! This does the trick! 😉

 

art297
Opal | Level 21

@PGStats: You're the statistician .. I'm only the Psychologist, but wouldn't LE .5 be the more appropriate cutoff?

 

Art, CEO, AnalystFinder.com

 

K_S
Obsidian | Level 7

My question stated 0.6; PGStats just gave me what I asked for 😉

art297
Opal | Level 21

I hadn't seen your 4 to 6 ratio post. @PGStats: my sincere apologies. I should have known better than to question you 🙂

 

Art, CEO, AnalystFinder.com

PGStats
Opal | Level 21

@art297, you are welcome to question me any time 🙂

PG
Reeza
Super User

@PGStats's idea of using RAND is what I'd use, though with a Bernoulli distribution rather than a uniform one.

 

if rand('bernoulli', 0.6) = 1 then sex='F';
else sex='M';

 

PGStats
Opal | Level 21

Same thing, under the hood. You could also use "table".
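
A minimal sketch of the "table" variant, assuming the same missing and imputed data sets and the gender variable from the original question:

```sas
data imputed;
    set missing;
    if missing(gender) then do;
        /* rand("table", p1, p2) returns 1 with probability p1 and 2 with probability p2 */
        if rand("table", 0.6, 0.4) = 1 then gender = "F";
        else gender = "M";
    end;
run;
```

The table distribution becomes more useful than the uniform or Bernoulli forms when there are more than two categories to impute, since each category gets its own probability argument.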

PG
Reeza
Super User

@PGStats wrote:

Same thing, under the hood. You could also use "table".


I was wondering about that...it's been a while since I've studied random number generators 🙂

PGStats
Opal | Level 21

Of course, I haven't seen what's under the hood, I can only guess. But as far as I know, all RAND generators are based on transformations of the uniform generator.

PG

