Hi everyone!
I'm very new to SAS, and I'm confusing myself when it comes to renaming, formatting, and labeling data. I have a data set with the variable RACE, and I need to rename one variable, and then rename + combine 3 into 1. Would I need to use a rename statement, or a format statement?
RACE:
Black or African American- I need to rename this to African American
Asian Indian- I need this to be "Asian/Pacific Islander"
Asian or Pacific Islander, no specific (individual) race- I need this to be "Asian/Pacific Islander"
Other Asian or Pacific Islander- I need this to be "Asian/Pacific Islander"
Chinese
Guamanian or Chamorro
Japanese
White
Unknown
You can rename a variable, but you cannot rename a value.
You can attach a label to a variable, but you cannot attach a label to a value (this is not SPSS).
You can use a format to make values display in a different way. (Formats are used to determine how values are displayed as text.)
You can use code to create new values, either into new variables or back into the original variable. You can even use a format in combination with code to make creating new values easier.
Do you have one variable named RACE with values like "Black or African American" and "Asian Indian"?
Or do you have multiple variables?
Why don't you start by creating a format for you RACE values that will display them the way you want.
proc format;
value $racefmt
'Black or African American' = 'African American'
'Asian Indian'
,'Asian or Pacific Islander, no specific (individual) race'
,'Other Asian or Pacific Islander'
= 'Asian/Pacific Islander'
;
run;
You could try using the format with your existing data (without making any changes to the data). Usually that is all you need to do. For example by including a FORMAT statement in the procedure you want to run against the data.
proc freq data=have;
tables race;
format race $racefmt.;
run;
Hi Tom,
When I create a format statement for race, they are all working except the "Asian or Pacific Islander, no specific (individual) race". For some reason, it will not format? I copied and pasted directly from the excel spreadsheet to make sure I had the spaces and spelling correct, but it still won't format. I also tried creating IF statements, and they all work except the one previously mentioned. There are no errors in my log. Any suggestions?
value $racegrp
'Black or African American' = 'African American'
'Asian Indian'
,'Asian or Pacific Islander, no specific (individual) race'
,'Other Asian or Pacific Islander'
= 'Asian/Pacific Islander'
;
run;
How would I do this using a FORMAT statement?
Are you only looking at HTML (or other "fancy" output) instead of plain text listing outputs?
Then most likely you have leading spaces in the values that are not matching.
Just remove the leading spaces.
racename=left(racename);
Another possibility is some of the spaces are not actually spaces. Instead that might be null character or tab or line feed or carriage return or "non-breaking space".
racename=left(translate(racename,' ','00090A0DA0'x));
Use IF/THEN statements is the easiest for a beginner.
Note the usage of IN to check for a series of values, not just one value at a time.
data want; *want is the name of output data set;
set have; *have is the name of the input data set;
if race = 'Black or African American' then race = 'African American';
else if race in ( 'Asian Indian', 'Asian or Pacific Islander, no specific (individual) race', 'Other Asian or Pacific Islander' ) then race = 'Asian/Pacific Islander';
run;
@Pixydust12 wrote:
Hi everyone!
I'm very new to SAS, and I'm confusing myself when it comes to renaming, formatting, and labeling data. I have a data set with the variable RACE, and I need to rename one variable, and then rename + combine 3 into 1. Would I need to use a rename statement, or a format statement?
RACE:
Black or African American- I need to rename this to African American
Asian Indian- I need this to be "Asian/Pacific Islander"
Asian or Pacific Islander, no specific (individual) race- I need this to be "Asian/Pacific Islander"
Other Asian or Pacific Islander- I need this to be "Asian/Pacific Islander"
Chinese
Guamanian or Chamorro
Japanese
White
Unknown
Thank you! If I have other "IF" statements, do those go after those?
What order do these statements go below? When I run this I get the error "Statement is not valid or is used out of proper order"
Example:
if gender eq 'Female';
if income= 'High';
if race= 'Unknown' then delete;
keep race income gender occupation;
run;
You cannot have statements like IF outside of a DATA step. That is probably the cause of the error message.
The order the statements appear in the data step will make a difference in the order that they will be executed.
Note there is a big difference between the IF/THEN statements in @Reeza 's answer and the subsetting IF statements in your example here. A subsetting IF will stop the current iteration of the data step when the condition is false, so any other statements after it will not execute for this observation. The same thing will happen when you execute the DELETE statement.
Note that IF/THEN DELETE is very much like a subsetting IF statement, only with the logic reversed.
So these two statements do the same thing.
if gender eq 'Female';
if gender ne 'Female' then delete;
RUN constitutes a step boundary and terminates the data step, causing the following to be "open code" where these statements are invalid.
@Pixydust12 wrote:
I should note that Gender is my observation while race,income, and occupation are my variables
What do you mean by gender is my observation?
Typically an 'observation' is a row of data.
For your code, the RUN terminates your data step, removing it fixes your issues.
@Pixydust12 wrote:
data Female;
set Original;
if race= 'Black or African American' then race= 'African American';
else if race in ( 'Asian Indian', 'Asian or Pacific Islander, no specific (individual) race', 'Other Asian or Pacific Islander' ) then race= 'Asian/Pacific Islander';
run;
if gender eq 'Female';
if income eq 'High’;
if race= 'Unknown’ then delete;
keep race income gender occupation;
run;
proc print data=Female (obs=20);
run;
I should note that Gender is my observation while race,income, and occupation are my variables
data Female;
set Original;
if race= 'Black or African American' then race= 'African American';
else if race in ( 'Asian Indian', 'Asian or Pacific Islander, no specific (individual) race', 'Other Asian or Pacific Islander' ) then race= 'Asian/Pacific Islander';
*inclusion criteria;
if gender eq 'Female' and income eq 'High’ and race ne 'Unknown’;
keep race income gender occupation;
run;
proc print data=Female (obs=20);
run;
Removing the RUN should fix your code. Note that you can combine your conditions to one using AND, no need for multiple filtering statements.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.