Hi,
I have a data set that gives me Race as as R1, R2, R3, R4, and R5. How do I format it to: R1 = American Indian/Alaskan Native, R2 = Asian, R3 = Black or African American, or R4 = Native Hawaiian or Other Pacific Islander, and R5 = White?
When I used a format code is on variables R1, R2, R3, R4, and R5 either get uninitialized or all the columns comes up as American Indian/Alaskan Native.
This is the code that I use:
data merge_dataset2;
set merge_dataset;
format RACE1-RACE5 $100.;
if RACE1-RACE5 = R1 then RACE = "American Indian/Alaskan Native";
else if RACE1-RACE5 = R2 then RACE = "Asian";
else if RACE1-RACE5 = R3 then RACE = "Black or African American";
else if RACE1-RACE5 = R4 then RACE = "Native Hawaiian or Other Pacific Islander";
else if RACE1-RACE5 = R5 then RACE = "White"; run;
Thanks
A possibly critical discussion first: Do any individuals have values for more than one of the R1 through R5 variables?
Such as R1 and R5 which would indicate the person is both American/Indian and White (mixed or multiracial )?
If so you need to provide a hierarchy or set of rules to determine how all the possible combinations are to be treated.
Regardless you should provide at least some dummy data with expected results. From your code I cannot tell if R1 is numeric or character or what the value(s) might look like to suggest working code.
If you looked at your log (ALWAYS look at the log) you would likely see some Conversion to numeric message because this means the variables Race1, Race2, Race3, Race4 and Race5 should be character:
format RACE1-RACE5 $100.;
So when you use " If Race1-Race5 = " then SAS attempts to convert the variables Race1 and Race5 to numeric to perform arithmetic and do subtraction. Then compare the result of that same subtraction to 5 other variables.
IF the variables R1 through R5 are valued as numeric 1, meaning true or is that race, or 0, meaning not that race and only one of R1 through R5 has a value of 1 then you may be looking for this code:
data merge_dataset2; set merge_dataset; if R1 then RACE = "American Indian/Alaskan Native"; else if R2 then RACE = "Asian"; else if R3 then RACE = "Black or African American"; else if R4 then RACE = "Native Hawaiian or Other Pacific Islander"; else if R5 then RACE = "White"; run;
SAS will treat 1 (actually any non-zero) value as true and 0 as false.
But if you have other than numeric 1 / 0 values or more than one 1 then you have a (possibly very slightly) more complex problem and we need rules on how to deal with the more than one race and possibly the "none" are indicated.
A possibly critical discussion first: Do any individuals have values for more than one of the R1 through R5 variables?
Such as R1 and R5 which would indicate the person is both American/Indian and White (mixed or multiracial )?
If so you need to provide a hierarchy or set of rules to determine how all the possible combinations are to be treated.
Regardless you should provide at least some dummy data with expected results. From your code I cannot tell if R1 is numeric or character or what the value(s) might look like to suggest working code.
If you looked at your log (ALWAYS look at the log) you would likely see some Conversion to numeric message because this means the variables Race1, Race2, Race3, Race4 and Race5 should be character:
format RACE1-RACE5 $100.;
So when you use " If Race1-Race5 = " then SAS attempts to convert the variables Race1 and Race5 to numeric to perform arithmetic and do subtraction. Then compare the result of that same subtraction to 5 other variables.
IF the variables R1 through R5 are valued as numeric 1, meaning true or is that race, or 0, meaning not that race and only one of R1 through R5 has a value of 1 then you may be looking for this code:
data merge_dataset2; set merge_dataset; if R1 then RACE = "American Indian/Alaskan Native"; else if R2 then RACE = "Asian"; else if R3 then RACE = "Black or African American"; else if R4 then RACE = "Native Hawaiian or Other Pacific Islander"; else if R5 then RACE = "White"; run;
SAS will treat 1 (actually any non-zero) value as true and 0 as false.
But if you have other than numeric 1 / 0 values or more than one 1 then you have a (possibly very slightly) more complex problem and we need rules on how to deal with the more than one race and possibly the "none" are indicated.
Which variables are present in your dataset merge_dataset?
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.