Hi All,
I am a fairly new SAS programmer! Below is my code. I am getting no errors in my log but yet my new variable is blank. Can someone help me? Not sure what is going on. I have formatted the variable. I have a series of these types of if-then statements creating a few different variables and none of them are working. I believe if I can figure out the issue here I can fix the rest of them quickly. Thank you in advance!
if race=";2028-9;" or race="Asian" then racenew="A";
if race=";2054-5;" or race="Black/African American" then racenew="B";
if race=";2106-3;" or race="White" then racenew="W";
if race=";2131-1;" or race="Other" then racenew="O";
if race="Unknown" then racenew=" ";
run;
@Blndbrm727 wrote:
Race ;2106-3; ;2106-3; White White White ;2106-3; ;2106-3; ;2106-3; ;2106-3; ;2106-3; White ;2106-3; White White ;2106-3; ;2106-3; This is straight from the Excel spreadsheet.Race Char 22 $15. $22. Race This is from Proc Contents
1) Show the entire code of your data step. For instance if you have an OUTPUT statement prior to the lines that assign the value then the values are not populated when the data is written to the output data set.
2) It would be helpful to provide example data that mapped to ALL of the expected codes instead of only W.
The code does work for W given that text as read.
data example; input race $ ; if race=";2028-9;" or race="Asian" then racenew="A"; if race=";2054-5;" or race="Black/African American" then racenew="B"; if race=";2106-3;" or race="White" then racenew="W"; if race=";2131-1;" or race="Other" then racenew="O"; if race="Unknown" then racenew=" "; datalines4; ;2106-3; ;2106-3; White White White ;2106-3; ;2106-3; ;2106-3; ;2106-3; ;2106-3; White ;2106-3; White White ;2106-3; ;2106-3; ;;;;
I would add that I am a little concerned about have a code value like ";2106-3;" as the presence of two typical delimiter values, the semicolons, for some values but not others, "White" makes me suspect a possible data reading issue in the history.
We'd need to see your data (or a portion of it) in order to give an answer. Some common errors are not having the same capitalization in the data as in the data, and the code not matching the exact spelling and punctuation that exists in the data.
I had to put examples in this Word file.
Sorry, can't open attachments. You can embed pictures using the mountain type icon/picture on the editor.
Sorry, please don't post pictures, they hide the issues so posting as text or showing proc contents and proc freq output is likely more helpful. You can post that text directly into the window here.
EDITED.
Race ;2106-3; ;2106-3; White White White ;2106-3; ;2106-3; ;2106-3; ;2106-3; ;2106-3; White ;2106-3; White White ;2106-3; ;2106-3; This is straight from the Excel spreadsheet.
Race Char 22 $15. $22. Race This is from Proc Contents
@Blndbrm727 wrote:
Race ;2106-3; ;2106-3; White White White ;2106-3; ;2106-3; ;2106-3; ;2106-3; ;2106-3; White ;2106-3; White White ;2106-3; ;2106-3; This is straight from the Excel spreadsheet.Race Char 22 $15. $22. Race This is from Proc Contents
1) Show the entire code of your data step. For instance if you have an OUTPUT statement prior to the lines that assign the value then the values are not populated when the data is written to the output data set.
2) It would be helpful to provide example data that mapped to ALL of the expected codes instead of only W.
The code does work for W given that text as read.
data example; input race $ ; if race=";2028-9;" or race="Asian" then racenew="A"; if race=";2054-5;" or race="Black/African American" then racenew="B"; if race=";2106-3;" or race="White" then racenew="W"; if race=";2131-1;" or race="Other" then racenew="O"; if race="Unknown" then racenew=" "; datalines4; ;2106-3; ;2106-3; White White White ;2106-3; ;2106-3; ;2106-3; ;2106-3; ;2106-3; White ;2106-3; White White ;2106-3; ;2106-3; ;;;;
I would add that I am a little concerned about have a code value like ";2106-3;" as the presence of two typical delimiter values, the semicolons, for some values but not others, "White" makes me suspect a possible data reading issue in the history.
Thank you for your insight and guidance. I wish I had control over those fields. They are coming from two separate healthcare systems, hence why one is numeric and the other is simply by race. I agree it makes it very challenging, especially for coding.
In addition to @ballardw request to show the whole data step because of an OUTPUT statement before the IF statements, another possibility is that you are not reading the data into SAS properly, and so even though it appears as
;2106-3;
in Excel, it was not read properly in SAS. So it is irrelevant how it appears in Excel, what is relevant is how it appears in SAS.
So we need to see the entire data step, for that reason.
Furthermore, you need to actually LOOK at the SAS data set that gets created WITH YOUR OWN EYES (this is always a helpful debugging step) to see what is in there for the values of variable RACE. IF the data hasn't been read properly, then this looking at the SAS data set will pinpoint the problem.
Many of us will not download and open MS Office documents as they can be a security threat.
Just click on the {i} icon and paste the sample of your data into that window.
Your code is correct so the issue is either logical or data related.
Posting an example of your data is the easiest solution, if you can't do that, please show the output from proc contents and/or proc freq for the race variable.
@Blndbrm727 wrote:
Hi All,
I am a fairly new SAS programmer! Below is my code. I am getting no errors in my log but yet my new variable is blank. Can someone help me? Not sure what is going on. I have formatted the variable. I have a series of these types of if-then statements creating a few different variables and none of them are working. I believe if I can figure out the issue here I can fix the rest of them quickly. Thank you in advance!
if race=";2028-9;" or race="Asian" then racenew="A";
if race=";2054-5;" or race="Black/African American" then racenew="B";
if race=";2106-3;" or race="White" then racenew="W";
if race=";2131-1;" or race="Other" then racenew="O";
if race="Unknown" then racenew=" ";
run;
The files were imported into SAS as Excel files originally (Jan-Mar) and output as Excel files as a quarterly file. Importing Quarterly file along with April, concatenating, and creating a Jan-Apr file for output.
@Blndbrm727 wrote:
The files were imported into SAS as Excel files originally (Jan-Mar) and output as Excel files as a quarterly file. Importing Quarterly file along with April, concatenating, and creating a Jan-Apr file for output.
Since it sounds like you will be "importing" data repeatedly it is a good idea to take control of the process and NOT rely on Proc Import.
Save Excel files as CSV and then you can use a data step to read the data. (Hint: a proc import on the csv will give you data step code in the log you can copy, edit and save.)
What you gain with a data step is the ability to READ data with custom formats into a consistent value and create error messages when encountering unexpected values.
Example:
proc format library=work; invalue $rnew (upcase default=1) ";2028-9;" ,"ASIAN" ="A" ";2054-5;","BLACK/AFRICAN AMERICAN" ="B" ";2106-3;","WHITE" ="W" ";2131-1;","OTHER" ="O" "Unknown"," " =" " other = _error_; data example; informat race $rnew.; infile datalines dlm=',' truncover; input race; datalines4; ;2028-9; Asian ;2054-5; Black/African American ;2106-3; White ;2131-1; Other Unknown NA Multi ;;;;
Note that the LOG shows errors for the "unexpected" values of NA and Multi that appear in the example data. So you could either modify the INVALUE to properly code the meaning of NA or report to the source that you are getting unexpected values.
The UPCASE on the INVALUE says to convert the text to uppercase before comparing the values to the (notice upcased) text for assignment. This addresses possible data entry or extraction issue that might have Asian, asian, ASIAN or other case combinations. Which you might have to consider in an IF/THEN/ELSE construct.
Another benefit is if you have multiple race fields in your data then you can use the same the informat to read all of them to get the A,B,W etc coding. You would need a separate If/Then/Else block for each variable.
From a certain amount of experience I suggest that you also research the codes / text before they appear of the American Indian and/or Native Hawaiian Pacific Islander potential race codes as well.
Hi Reeza,
No the file is an Excel file. Prior to this coding I did a proc transpose.
Thanks!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.