K_S
Obsidian | Level 7

Which statements should I add inside the DO block to randomly impute gender values of 'F' and 'M', with 60% of people being female?

 

data imputed;

set missing;

if gender = ' ' then do;

..........????

..........????

end;

run;

 

 

15 REPLIES
anoopmohandas7
Quartz | Level 8

The sample code below should guide you.

data have ;
input id sex $ ;
datalines ;
23 M
25 M
26 F
27 F
28 F
;
RUN ;

DATA WANT;
length Y $10 ;
set have ;
if sex = 'M' then Y = 'Male' ;
else if sex = 'F' then Y = 'Female' ;
run ;

proc print data=want;
run;
K_S
Obsidian | Level 7

anoopmohandas7, that didn't help. 😞

I need to replace the missing values with Ms and Fs generated randomly in a 4-to-6 ratio.

anoopmohandas7
Quartz | Level 8
Can you show me what your data looks like and what you expect it to be?
K_S
Obsidian | Level 7

Please see PGStats's reply for what I was looking for.

Thank you so much for your reply though! 🙂

PGStats
Opal | Level 21

Use the RAND function. Called as rand("uniform"), it returns a random number uniformly distributed between 0 and 1, so a value below 0.6 occurs 60% of the time:

 

data imputed;
set missing;
if missing(gender) then 
	if rand("uniform") < 0.6 then gender = "F"; else gender ="M";
run;
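
To check the 60/40 split, here is a minimal sketch. It assumes a made-up input data set (1000 rows with gender all missing, standing in for the real `missing` data set) and an arbitrary seed passed to CALL STREAMINIT so the results are reproducible. Neither is part of the original question.

```sas
/* Hypothetical test data: 1000 rows, all with gender missing */
data missing;
    length gender $1;
    do id = 1 to 1000;
        gender = ' ';
        output;
    end;
run;

data imputed;
    set missing;
    /* Seed the RAND stream once; the value 12345 is arbitrary */
    if _n_ = 1 then call streaminit(12345);
    if missing(gender) then do;
        if rand("uniform") < 0.6 then gender = "F";
        else gender = "M";
    end;
run;

/* The F share should come out close to 60%, not exactly 60% */
proc freq data=imputed;
    tables gender;
run;
```

Each row is drawn independently, so the observed proportion varies around 60% from run to run unless the seed is fixed.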
PG
K_S
Obsidian | Level 7

Thank you, PGStats! This does the trick! 😉

 

art297
Opal | Level 21

@PGStats: You're the statistician .. I'm only the Psychologist, but wouldn't LE .5 be the more appropriate cutoff?

 

Art, CEO, AnalystFinder.com

 

K_S
Obsidian | Level 7

My question stated 0.6; PGStats just gave me what I asked for 😉

art297
Opal | Level 21

I hadn't seen your 4 to 6 ratio post. @PGStats: my sincere apologies. I should have known better than to question you 🙂

 

Art, CEO, AnalystFinder.com

PGStats
Opal | Level 21

@art297, you are welcome to question me any time 🙂

PG
Reeza
Super User

@PGStats's idea of using RAND is what I'd use, though with a Bernoulli distribution rather than a uniform one.

 

if rand('bernoulli', 0.6) = 1 then sex='F';
else sex='M';
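
Dropped into the DATA step from the question, that might look like the sketch below. It assumes the variable is named `gender` and the input data set is `missing`, as in the original post; the seed is arbitrary and optional.

```sas
data imputed;
    set missing;
    /* Optional: fix the seed so reruns give the same imputation */
    if _n_ = 1 then call streaminit(2024);
    if missing(gender) then do;
        /* rand('bernoulli', p) returns 1 with probability p, otherwise 0 */
        if rand('bernoulli', 0.6) = 1 then gender = 'F';
        else gender = 'M';
    end;
run;
```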

 

PGStats
Opal | Level 21

Same thing, under the hood. You could also use "table".
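
A rough sketch of the "table" variant, assuming the same `missing` data set and character `gender` variable as in the question. With the probabilities 0.6 and 0.4, rand('table', ...) returns 1 or 2, which is used here as a position in the string 'FM':

```sas
data imputed;
    set missing;
    if missing(gender) then do;
        /* Returns 1 with probability 0.6 and 2 with probability 0.4 */
        pick = rand('table', 0.6, 0.4);
        /* Position 1 of 'FM' is F, position 2 is M */
        gender = substr('FM', pick, 1);
    end;
    drop pick;
run;
```

This form extends to more than two categories by adding probabilities and lengthening the lookup string.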

PG
Reeza
Super User

@PGStats wrote:

Same thing, under the hood. You could also use "table".


I was wondering about that...it's been a while since I've studied random number generators 🙂

PGStats
Opal | Level 21

Of course, I haven't seen what's under the hood, I can only guess. But as far as I know, all RAND generators are based on transformations of the uniform generator.

PG

Discussion stats
  • 15 replies
  • 1750 views
  • 1 like
  • 6 in conversation