I have been working on this code for three hours for a school assignment. Can you please help? I have a large dataset where I need to combine ethnicity and race to create a variable called group with the following values: 1 (black non-Hispanic), 2 (white non-Hispanic), and 3 (Hispanic).
The following is the "key"
"1"= "American Indian or Alaska Native"
"2"= "Asian"
"3"= "Black or African American"
"4"= "Native Hawaiian or Other Pacific Islander"
"5"= "White"
"6"= "Other"
"7"= "Unknown";
"E1"= "Hispanic or Latino"
"E2"= "Non-Hispanic or Latino"
"E7"= "Unknown";
I think I might be on to something with this code, but I think I need to make a variable named "group":
/*Create a new variable called group*/
DATA=APPENDICITIS.PROJECT3;
if ETHNICITY = "E2" AND RACE = "3" THEN GROUP = "1";
if ETHNICITY = "E2" AND RACE = "5" THEN GROUP = "2";
IF ETHNICITY = "E1" THEN GROUP = "3";
RUN;
Thanks so much for your expertise and time.
There's no equals sign in a DATA statement and LIBREFs are limited to 8 characters so I've shortened APPENDICITIS to APP. This logic is a bit better:
DATA APP.PROJECT3;
set ???; * Need to read your input dataset;
if ETHNICITY = "E2" AND RACE = "3" THEN GROUP = "1";
else if ETHNICITY = "E2" AND RACE = "5" THEN GROUP = "2";
else IF ETHNICITY = "E1" THEN GROUP = "3";
RUN;
You need to use a valid data step. Can you fix that?
Your code assigns a value to the new variable under only 3 conditions, which cover 10 of the 32 possible combinations of your two variables.
RACE has 7 valid values and I assume could also have missing values, so 8 possible values.
ETHNICITY has 3 valid values and again I assume could also have missing values, so 4 possible values.
8*4 = 32 possible combinations.
The E1 test matches every RACE value (8 combinations) and each E2 test matches 1 combination. Do you just want GROUP to be missing for the other 22 combinations?
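One way to see which combinations actually occur in the data, and which of them end up with a GROUP value, is a quick cross-tabulation. This is only a sketch: WANT is a placeholder for whatever dataset ends up holding GROUP, not a name taken from the post.

```sas
/* WANT is a placeholder name for the dataset that contains GROUP */
proc freq data=want;
  /* LIST prints one row per observed combination.        */
  /* MISSING counts missing values as a level of their own */
  tables ethnicity*race*group / list missing;
run;
```

Rows where GROUP is blank in this listing are the combinations the IF/ELSE logic did not cover.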
Thanks for responding. What do you mean by using a valid data step? I have bitten off more than I can chew in this class but really want to finish, so please dumb it down for me. My instructions only want me to analyze these 3 values in relationship to mean age.
A data step starts with a DATA statement that names the dataset(s) it is creating.
It needs a source of data. If that is an existing dataset, then you need a SET statement.
Example:
data want;
set have;
run;
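Putting the pieces together, a complete step might look like the sketch below. HAVE and WANT are placeholder dataset names, AGE is an assumed name for the age variable, and RACE and ETHNICITY are assumed to be character variables (as the quoted comparisons in the original code suggest). None of this is confirmed by the post, so adjust the names to match your data.

```sas
data want;                      /* dataset being created (placeholder name)  */
  set have;                     /* existing input dataset (placeholder name) */
  length group $1;              /* new character variable, one character wide */
  if ethnicity = "E2" and race = "3" then group = "1";       /* black non-Hispanic */
  else if ethnicity = "E2" and race = "5" then group = "2";  /* white non-Hispanic */
  else if ethnicity = "E1" then group = "3";                 /* Hispanic */
run;

/* Mean age within each group. AGE is an assumed variable name */
proc means data=want mean;
  class group;
  var age;
run;
```

Rows that match none of the three conditions keep a missing GROUP. PROC MEANS leaves observations with a missing CLASS value out of the analysis by default, so only groups 1, 2, and 3 appear in the output.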