Reader587
Calcite | Level 5

Hi Everyone,

I was wondering how to create dummy variables with "white" being the baseline. I have 5 categories (American Indian, White, Black, Pacific Islanders, Asian).

The variable race is coded (0=Black, 1=White, 2=American Indian, 3=Asian, 4=Native Hawaiian).

My code is: 

if race=0 then Black=1;
else Black=0;
if race=2 then AmericanIndian=1;
else AmericanIndian=0;
if race=3 then Asian=1;
else Asian=0;
if race=4 then Pacific_Islander=1;
else Pacific_Islander=0;
Here, White is currently the baseline. However, the variable race has missing values, and I'm assuming those observations are given a value of zero on every dummy too.

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisHemedinger
Community Manager

For a good article on using a SAS procedure, see this topic from @Rick_SAS :

 

The best way to generate dummy variables in SAS


12 REPLIES
WarrenKuhfeld
Ammonite | Level 13

There are procedures like TRANSREG and GLMMOD that will code for you.

 

If you want to code yourself, recognize that the result of a Boolean (logical) expression is 1 if true and 0 if false. So each IF/ELSE pair could be replaced by a simple assignment statement.

 

Black = (Race = 0);


The parens are not required, but you might want them for clarity.
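
Applied to all four non-White levels, the same idea might look like the sketch below. The dataset names have and want are placeholders, not names from this thread. Note that a comparison such as (race = 0) evaluates to 0 when race is missing, so rows with missing race end up with 0 on every dummy, exactly like White.

```sas
/* Sketch: White (race=1) is the baseline, so it gets no dummy of its own. */
/* HAVE and WANT are placeholder dataset names (assumptions).              */
data want;
   set have;
   Black            = (race = 0);   /* 1 if Black, else 0            */
   AmericanIndian   = (race = 2);   /* 1 if American Indian, else 0  */
   Asian            = (race = 3);   /* 1 if Asian, else 0            */
   Pacific_Islander = (race = 4);   /* 1 if Native Hawaiian, else 0  */
run;
```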

Reader587
Calcite | Level 5
Thanks. How do I get "White" to be the baseline when there is missing data? It appears that "White" and "Missing" would both be coded zero if I code everything else 1.
ChrisHemedinger
Community Manager

I think you might need an "unknown" category in that case:

 

Unknown = missing(race);
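
As a rough sketch of how this could fit together, here are two possible ways to keep missing values apart from the White baseline. The first adds the Unknown indicator, so White is the only group with zeros everywhere. The second leaves the dummies missing when race is missing, so that modeling procedures would normally drop those rows. The dataset names have, want_unknown, and want_missing are placeholders, not names from this thread.

```sas
/* Option 1 (sketch): add an Unknown dummy; White is then all zeros. */
data want_unknown;
   set have;                        /* HAVE is a placeholder name */
   Unknown          = missing(race);
   Black            = (race = 0);
   AmericanIndian   = (race = 2);
   Asian            = (race = 3);
   Pacific_Islander = (race = 4);
run;

/* Option 2 (sketch): propagate the missing value into every dummy. */
data want_missing;
   set have;
   Black            = (race = 0);
   AmericanIndian   = (race = 2);
   Asian            = (race = 3);
   Pacific_Islander = (race = 4);
   if missing(race) then
      call missing(Black, AmericanIndian, Asian, Pacific_Islander);
run;
```

Which option is appropriate depends on whether "unknown race" should be modeled as its own group or excluded from the analysis.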
subhashmantha
Fluorite | Level 6

data abc;
   set abc;
   Black = (race=0)*1;
   White = (race=1)*1;
   AmericanIndian = (race=2)*1;
   Asian = (race=3)*1;
   NativeHawaiian = (race=4)*1;
run;

 

Reader587
Calcite | Level 5
But "White" is not the baseline here.
PaigeMiller
Diamond | Level 26

Why do you need dummy variables at all? Almost every SAS modeling procedure creates the dummy variables for you behind the scenes so you don't have to, avoiding all of these pitfalls and potential errors. And you can specify which level is the reference (or baseline) level. This is one of the great advantages of using SAS for modeling.
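
For illustration only, a CLASS statement with White as the reference level might look like the sketch below. PROC LOGISTIC is just one example of a modeling procedure; the dataset name mydata and the binary response outcome are hypothetical, since the thread does not say what is being modeled.

```sas
/* Sketch with hypothetical names: dataset MYDATA, binary response OUTCOME. */
proc logistic data=mydata;
   class race (ref='1') / param=ref;   /* race=1 (White) is the reference level */
   model outcome(event='1') = race;
run;
```

By default, observations with a missing value for a CLASS variable are excluded from the analysis, so missing race is not silently merged into the White baseline.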

--
Paige Miller
data_null__
Jade | Level 19

PROC TRANSREG can also be useful here.

 

/* 0=Black, 1=White, 2=American Indian, 3=Asian, 4=Native Hawaiian */
data race;
   do race = 0,.,1 to 4;
      output;
      end;
   run;
proc transreg data=race;
   model class(race/zero='1');
   id race;
   output out=design(drop=intercept) design;
   run;


 

Or perhaps this one, which adds the DEV option to request deviations-from-means coding instead of reference coding.

/* 0=Black, 1=White, 2=American Indian, 3=Asian, 4=Native Hawaiian */
data race;
   do race = 0,.,1 to 4;
      output;
      end;
   run;
proc transreg data=race;
   model class(race/dev zero='1');
   id race;
   output out=design(drop=intercept) design;
   run;


subhashmantha
Fluorite | Level 6

white = (missing(race) or (race=1))*1;

PaigeMiller
Diamond | Level 26

@subhashmantha wrote:

white = (missing(race) or (race=1))*1;


Or don't even try to create the dummy variables yourself. SAS can create them for you from your data when you use the CLASS statement in a modeling procedure.

--
Paige Miller
ballardw
Super User

@PaigeMiller wrote:

@subhashmantha wrote:

white = (missing(race) or (race=1))*1;


Or don't even try to create the dummy variables yourself. SAS can create them for you from your data when you use the CLASS statement in a modeling procedure.


@PaigeMiller 

Wouldn't a missing value for a class variable like Race either remove the observation or, with the MISSING option, be treated as a different level than Race=1? Perhaps this special case could be handled by using the MISSING option plus a custom format, so that missing and 1 are treated as a single class?

PaigeMiller
Diamond | Level 26

@ballardw 

 

Use a custom format or custom informat, and then stop writing your own dummy variables and let SAS PROCs compute them behind the scenes.
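
A minimal, untested sketch of the custom-format idea follows. It assumes the procedure forms CLASS levels from formatted values, so that a format mapping both missing and 1 to the same label, combined with the MISSING option, yields one combined level. The format name racegrp, the dataset mydata, and the response outcome are all hypothetical.

```sas
/* Sketch: group missing and race=1 under one formatted value. */
proc format;
   value racegrp
      ., 1 = 'White or missing'
      0    = 'Black'
      2    = 'American Indian'
      3    = 'Asian'
      4    = 'Native Hawaiian';
run;

/* Hypothetical names: dataset MYDATA, binary response OUTCOME. */
proc logistic data=mydata;
   format race racegrp.;
   /* MISSING keeps rows with missing race; REF= names the formatted level */
   class race (ref='White or missing') / param=ref missing;
   model outcome(event='1') = race;
run;
```

Whether treating missing race as White is defensible is a separate question; this only shows how the grouping could be expressed without hand-written dummy variables.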

--
Paige Miller
