BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
RohiniJ
Calcite | Level 5

Hi, 

I have a data set that gives me Race as as R1, R2, R3, R4, and R5. How do I format it to: R1 = American Indian/Alaskan Native, R2 = Asian, R3 = Black or African American, or R4 = Native Hawaiian or Other Pacific Islander, and R5 = White? 

When I used a format code is on variables R1, R2, R3, R4, and R5 either get uninitialized or all the columns comes up as American Indian/Alaskan Native. 

 

This is the code that I use: 

 

data merge_dataset2;
set merge_dataset;
format RACE1-RACE5 $100.;
if RACE1-RACE5 = R1 then RACE = "American Indian/Alaskan Native";
else if RACE1-RACE5 = R2 then RACE = "Asian";
else if RACE1-RACE5 = R3 then RACE = "Black or African American";
else if RACE1-RACE5 = R4 then RACE = "Native Hawaiian or Other Pacific Islander";
else if RACE1-RACE5 = R5 then RACE = "White"; run;

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

A possibly critical discussion first: Do any individuals have values for more than one of the R1 through R5 variables?

Such as R1 and R5 which would indicate the person is both American/Indian and White (mixed or multiracial )?

If so you need to provide a hierarchy or set of rules to determine how all the possible combinations are to be treated.

 

Regardless you should provide at least some dummy data with expected results. From your code I cannot tell if R1 is numeric or character or what the value(s) might look like to suggest working code.

If you looked at your log (ALWAYS look at the log) you would likely see some Conversion to numeric message because this means the variables Race1, Race2, Race3, Race4 and Race5 should be character:

format RACE1-RACE5 $100.;

So when you use " If Race1-Race5 = "  then SAS attempts to convert the variables Race1 and Race5 to numeric to perform arithmetic and do subtraction. Then compare the  result of that same subtraction to 5 other variables.

 

IF the variables R1 through R5 are valued as numeric 1, meaning true or is that race, or 0, meaning not that race and only one of R1 through R5 has a value of 1 then you may be looking for this code:

data merge_dataset2;
   set merge_dataset;
   if      R1 then RACE = "American Indian/Alaskan Native";
   else if R2 then RACE = "Asian";
   else if R3 then RACE = "Black or African American";
   else if R4 then RACE = "Native Hawaiian or Other Pacific Islander";
   else if R5 then RACE = "White"; 
run;

SAS will treat 1 (actually any non-zero) value as true and 0 as false.

 

But if you have other than numeric 1 / 0 values or more than one 1 then you have a (possibly very slightly) more complex problem and we need rules on how to deal with the more than one race and possibly the "none" are indicated.

 

 

 

View solution in original post

2 REPLIES 2
ballardw
Super User

A possibly critical discussion first: Do any individuals have values for more than one of the R1 through R5 variables?

Such as R1 and R5 which would indicate the person is both American/Indian and White (mixed or multiracial )?

If so you need to provide a hierarchy or set of rules to determine how all the possible combinations are to be treated.

 

Regardless you should provide at least some dummy data with expected results. From your code I cannot tell if R1 is numeric or character or what the value(s) might look like to suggest working code.

If you looked at your log (ALWAYS look at the log) you would likely see some Conversion to numeric message because this means the variables Race1, Race2, Race3, Race4 and Race5 should be character:

format RACE1-RACE5 $100.;

So when you use " If Race1-Race5 = "  then SAS attempts to convert the variables Race1 and Race5 to numeric to perform arithmetic and do subtraction. Then compare the  result of that same subtraction to 5 other variables.

 

IF the variables R1 through R5 are valued as numeric 1, meaning true or is that race, or 0, meaning not that race and only one of R1 through R5 has a value of 1 then you may be looking for this code:

data merge_dataset2;
   set merge_dataset;
   if      R1 then RACE = "American Indian/Alaskan Native";
   else if R2 then RACE = "Asian";
   else if R3 then RACE = "Black or African American";
   else if R4 then RACE = "Native Hawaiian or Other Pacific Islander";
   else if R5 then RACE = "White"; 
run;

SAS will treat 1 (actually any non-zero) value as true and 0 as false.

 

But if you have other than numeric 1 / 0 values or more than one 1 then you have a (possibly very slightly) more complex problem and we need rules on how to deal with the more than one race and possibly the "none" are indicated.

 

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 295 views
  • 0 likes
  • 3 in conversation