BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Blndbrm727
Calcite | Level 5

Hi All,

 

I am a fairly new SAS programmer! Below is my code. I am getting no errors in my log but yet my new variable is blank. Can someone help me? Not sure what is going on. I have formatted the variable. I have a series of these types of if-then statements creating a few different variables and none of them are working. I believe if I can figure out the issue here I can fix the rest of them quickly. Thank you in advance! 

 

if race=";2028-9;" or race="Asian" then racenew="A";
if race=";2054-5;" or race="Black/African American" then racenew="B";
if race=";2106-3;" or race="White" then racenew="W";
if race=";2131-1;" or race="Other" then racenew="O";
if race="Unknown" then racenew=" ";
run;

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@Blndbrm727 wrote:
Race
;2106-3;
;2106-3;
White
White
White
;2106-3;
;2106-3;
;2106-3;
;2106-3;
;2106-3;
White
;2106-3;
White
White
;2106-3;
;2106-3;
 This is straight from the Excel spreadsheet.

 Race Char 22 $15. $22. Race 
This is from Proc Contents

1) Show the entire code of your data step. For instance if you have an OUTPUT statement prior to the lines that assign the value then the values are not populated when the data is written to the output data set.

2) It would be helpful to provide example data that mapped to ALL of the expected codes instead of only W.

 

The code does work for W given that text as read.

data example;
    input race $  ;
   if race=";2028-9;" or race="Asian" then racenew="A";
   if race=";2054-5;" or race="Black/African American" then racenew="B";
   if race=";2106-3;" or race="White" then racenew="W";
   if race=";2131-1;" or race="Other" then racenew="O";
   if race="Unknown" then racenew=" ";
datalines4;
;2106-3;
;2106-3;
White
White
White
;2106-3;
;2106-3;
;2106-3;
;2106-3;
;2106-3;
White
;2106-3;
White
White
;2106-3;
;2106-3;
;;;;

I would add that I am a little concerned about have a code value like ";2106-3;" as the presence of two typical delimiter values, the semicolons, for some values but not others, "White" makes me suspect a possible data reading issue in the history.

View solution in original post

13 REPLIES 13
PaigeMiller
Diamond | Level 26

We'd need to see your data (or a portion of it) in order to give an answer. Some common errors are not having the same capitalization in the data as in the data, and the code not matching the exact spelling and punctuation that exists in the data.

--
Paige Miller
Blndbrm727
Calcite | Level 5

I had to put examples in this Word file. 

Reeza
Super User

Sorry, can't open attachments. You can embed pictures using the mountain type icon/picture on the editor.

Sorry, please don't post pictures, they hide the issues so posting as text or showing proc contents and proc freq output is likely more helpful. You can post that text directly into the window here.

EDITED.

Blndbrm727
Calcite | Level 5
Race
;2106-3;
;2106-3;
White
White
White
;2106-3;
;2106-3;
;2106-3;
;2106-3;
;2106-3;
White
;2106-3;
White
White
;2106-3;
;2106-3;
 This is straight from the Excel spreadsheet.

 Race Char 22 $15. $22. Race 
This is from Proc Contents
ballardw
Super User

@Blndbrm727 wrote:
Race
;2106-3;
;2106-3;
White
White
White
;2106-3;
;2106-3;
;2106-3;
;2106-3;
;2106-3;
White
;2106-3;
White
White
;2106-3;
;2106-3;
 This is straight from the Excel spreadsheet.

 Race Char 22 $15. $22. Race 
This is from Proc Contents

1) Show the entire code of your data step. For instance if you have an OUTPUT statement prior to the lines that assign the value then the values are not populated when the data is written to the output data set.

2) It would be helpful to provide example data that mapped to ALL of the expected codes instead of only W.

 

The code does work for W given that text as read.

data example;
    input race $  ;
   if race=";2028-9;" or race="Asian" then racenew="A";
   if race=";2054-5;" or race="Black/African American" then racenew="B";
   if race=";2106-3;" or race="White" then racenew="W";
   if race=";2131-1;" or race="Other" then racenew="O";
   if race="Unknown" then racenew=" ";
datalines4;
;2106-3;
;2106-3;
White
White
White
;2106-3;
;2106-3;
;2106-3;
;2106-3;
;2106-3;
White
;2106-3;
White
White
;2106-3;
;2106-3;
;;;;

I would add that I am a little concerned about have a code value like ";2106-3;" as the presence of two typical delimiter values, the semicolons, for some values but not others, "White" makes me suspect a possible data reading issue in the history.

Blndbrm727
Calcite | Level 5

Thank you for your insight and guidance. I wish I had control over those fields. They are coming from two separate healthcare systems, hence why one is numeric and the other is simply by race. I agree it makes it very challenging, especially for coding. 

PaigeMiller
Diamond | Level 26

In addition to @ballardw request to show the whole data step because of an OUTPUT statement before the IF statements, another possibility is that you are not reading the data into SAS properly, and so even though it appears as 

;2106-3;

in Excel, it was not read properly in SAS. So it is irrelevant how it appears in Excel, what is relevant is how it appears in SAS.

 

So we need to see the entire data step, for that reason.

 

Furthermore, you need to actually LOOK at the SAS data set that gets created WITH YOUR OWN EYES (this is always a helpful debugging step) to see what is in there for the values of variable RACE. IF the data hasn't been read properly, then this looking at the SAS data set will pinpoint the problem.

--
Paige Miller
PaigeMiller
Diamond | Level 26

Many of us will not download and open MS Office documents as they can be a security threat.

 

Just click on the {i} icon and paste the sample of your data into that window.

--
Paige Miller
Reeza
Super User

Your code is correct so the issue is either logical or data related.

 

Posting an example of your data is the easiest solution, if you can't do that, please show the output from proc contents and/or proc freq for the race variable. 

 


@Blndbrm727 wrote:

Hi All,

 

I am a fairly new SAS programmer! Below is my code. I am getting no errors in my log but yet my new variable is blank. Can someone help me? Not sure what is going on. I have formatted the variable. I have a series of these types of if-then statements creating a few different variables and none of them are working. I believe if I can figure out the issue here I can fix the rest of them quickly. Thank you in advance! 

 

if race=";2028-9;" or race="Asian" then racenew="A";
if race=";2054-5;" or race="Black/African American" then racenew="B";
if race=";2106-3;" or race="White" then racenew="W";
if race=";2131-1;" or race="Other" then racenew="O";
if race="Unknown" then racenew=" ";
run;


 

Reeza
Super User
By any chance did you get a CSV file and import that into SAS using PROC IMPORT? And opened the file in Excel to view it?
Blndbrm727
Calcite | Level 5

The files were imported into SAS as Excel files originally (Jan-Mar) and output as Excel files as a quarterly file. Importing Quarterly file along with April, concatenating, and creating a Jan-Apr file for output.

ballardw
Super User

@Blndbrm727 wrote:

The files were imported into SAS as Excel files originally (Jan-Mar) and output as Excel files as a quarterly file. Importing Quarterly file along with April, concatenating, and creating a Jan-Apr file for output.


Since it sounds like you will be "importing" data repeatedly it is a good idea to take control of the process and NOT rely on Proc Import.

Save Excel files as CSV and then you can use a data step to read the data. (Hint: a proc import on the  csv will give you data step code in the log you can copy, edit and save.)

 

What you gain with a data step is the ability to READ data with custom formats into a consistent value and create error messages when encountering unexpected values.

Example:

proc format library=work;
invalue $rnew  (upcase default=1)
";2028-9;" ,"ASIAN" ="A"
";2054-5;","BLACK/AFRICAN AMERICAN" ="B"
";2106-3;","WHITE" ="W"
";2131-1;","OTHER" ="O"
"Unknown"," " =" "
other = _error_;

data example;
   informat race $rnew.;
   infile datalines dlm=',' truncover;
   input race;
datalines4;
;2028-9; 
Asian
;2054-5;
Black/African American 
;2106-3;
White
;2131-1;
Other 
Unknown
  
NA
Multi
;;;;

 

Note that the LOG shows errors for the "unexpected" values of NA and Multi that appear in the example data. So you could either modify the INVALUE to properly code the meaning of NA or report to the source that you are getting unexpected values.

The UPCASE on the INVALUE says to convert the text to uppercase before comparing the values to the (notice upcased) text for assignment. This addresses possible data entry or extraction issue that might have Asian, asian, ASIAN or other case combinations. Which you might have to consider in an IF/THEN/ELSE construct.

Another benefit is if you have multiple race fields in your data then you can use the same the informat to read all of them to get the A,B,W etc coding. You would need a separate If/Then/Else block for each variable.

 

From a certain amount of experience I suggest that you also research the codes / text before they appear of the American Indian and/or Native Hawaiian Pacific Islander potential race codes as well.

Blndbrm727
Calcite | Level 5

Hi Reeza,

 

No the file is an Excel file. Prior to this coding I did a proc transpose.

 

Thanks!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2502 views
  • 2 likes
  • 4 in conversation