K_S
Obsidian | Level 7

What statements do I add after the DO statement to randomly impute gender values of 'F' and 'M', with 60% of people being female?

 

data imputed;
set missing;
if gender = ' ' then do;
..........????
..........????
end;
run;

 

 

15 REPLIES
anoopmohandas7
Quartz | Level 8

The below sample code should guide you through.

data have ;
input id sex $ ;
datalines ;
23 M
25 M
26 F
27 F
28 F
;
RUN ;

DATA WANT;
length Y $10. ;
set have ;
if sex='M' then Y='Male' ;
else if sex = 'F' then Y='Female' ;
run ;

proc print data=want;
run;
K_S
Obsidian | Level 7

anoopmohandas7, that didn't help. 😞

I need to replace the missing observations by generating Ms and Fs randomly in a 4 to 6 ratio....

anoopmohandas7
Quartz | Level 8
Can you show me what your data looks like and what you expect it to be?
K_S
Obsidian | Level 7

Please see PGStats's reply for what I was looking for.

Thank you so much for your reply though! 🙂

PGStats
Opal | Level 21

Use the RAND function. Called as rand("uniform"), it returns a random number uniformly distributed between 0 and 1:

 

data imputed;
set missing;
if missing(gender) then 
	if rand("uniform") < 0.6 then gender = "F"; else gender ="M";
run;
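
A minimal sketch of the same approach with a fixed seed, so the imputation can be reproduced, followed by a frequency check. The data set names missing_demo and imputed_demo, the id variable, and the seed 12345 are assumptions made up for illustration; they are not from the original question.

```sas
/* Hypothetical test data: 1000 rows, every gender value missing */
data missing_demo;
    length gender $1;
    do id = 1 to 1000;
        gender = ' ';
        output;
    end;
run;

data imputed_demo;
    set missing_demo;
    /* Seed the RAND stream; only the first call in the step takes effect */
    call streaminit(12345);
    if missing(gender) then do;
        /* A uniform draw falls below 0.6 with probability 0.6 */
        if rand("uniform") < 0.6 then gender = "F";
        else gender = "M";
    end;
run;

/* The share of F should be close to, but not exactly, 60% */
proc freq data=imputed_demo;
    tables gender;
run;
```

Without CALL STREAMINIT, the seed is taken from the system clock, so each run would impute different values.
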
PG
K_S
Obsidian | Level 7

Thank you, PGStats! This does the trick! 😉

 

art297
Opal | Level 21

@PGStats: You're the statistician .. I'm only the Psychologist, but wouldn't LE .5 be the more appropriate cutoff?

 

Art, CEO, AnalystFinder.com

 

K_S
Obsidian | Level 7

My question stated 0.6; PGStats just gave me what I asked for 😉

art297
Opal | Level 21

I hadn't seen your 4 to 6 ratio post. @PGStats: my sincere apologies. I should have known better than to question you 🙂

 

Art, CEO, AnalystFinder.com

PGStats
Opal | Level 21

@art297, you are welcome to question me any time 🙂

PG
Reeza
Super User

@PGStats's idea of using RAND is what I'd use too, though with a Bernoulli distribution rather than a uniform one.

 

if rand('bernoulli', 0.6) = 1 then sex='F';
else sex='M';

 

PGStats
Opal | Level 21

Same thing, under the hood. You could also use "table".
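
A rough sketch of the "table" variant, assuming the same 60/40 split. rand("table", 0.6, 0.4) returns 1 with probability 0.6 and 2 with probability 0.4, and that index is used here to pick a letter from the string "FM". The data set names missing_demo and imputed_demo and the seed are hypothetical.

```sas
/* Hypothetical test data: 1000 rows, every gender value missing */
data missing_demo;
    length gender $1;
    do id = 1 to 1000;
        gender = ' ';
        output;
    end;
run;

data imputed_demo;
    set missing_demo;
    call streaminit(12345);   /* arbitrary seed, for reproducibility */
    if missing(gender) then do;
        /* k is 1 (prob 0.6) or 2 (prob 0.4) */
        k = rand("table", 0.6, 0.4);
        /* position 1 of "FM" is F, position 2 is M */
        gender = substr("FM", k, 1);
    end;
    drop k;
run;
```

The table form extends naturally to more than two categories: add one probability per category and one character per category in the lookup string.
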

PG
Reeza
Super User

@PGStats wrote:

Same thing, under the hood. You could also use "table".


I was wondering about that...it's been a while since I've studied random number generators 🙂

PGStats
Opal | Level 21

Of course, I haven't seen what's under the hood, I can only guess. But as far as I know, all RAND generators are based on transformations of the uniform generator.

PG
