BookmarkSubscribeRSS Feed
turcay
Lapis Lazuli | Level 10

Hello everyone,

 

I have a sample code as below, which includes if statements . As we know, if the variable has character type we need to put " " after if statement, if it has numeric type we need to put .(period) after if statement.

 

For character type variable, In the following code, "if "statement works fine when I put " " in the if statement. However, if I put .(period) in the "if" statement ,it changes the whole values .

 

I want to understand the reason why .(period) changes the all values if variable is character type.

 

I'm asking this question because I pull the whole if statements from SQL and this variables can be numeric type or character type, if I replace whole .(period) values with " " then some of my character type will be affected wrong.

 

Here is my sample code;

 

Data Have;
Length Variable $ 20 Value 8;
Infile Datalines Missover;
Input Variable Value;
Datalines;
. 1000
Istanbul 1000
Istanbul 1000
. 1000
;
Run;

Data Want;
Set Have;
If Variable=" " Then Value=1600;
Run;
/*Change Whole Values*/
Data Want;
Set Have;
If Variable=. Then Value=1600;
Run;

 May I have breif information, please?

 

Thank you

2 REPLIES 2
RW9
Diamond | Level 26 RW9
Diamond | Level 26

It is because you are mixing types.  Your first datastep:

data have;
  length variable $ 20 value 8;
  infile datalines missover;
  input variable value;
datalines;
. 1000
Istanbul 1000
Istanbul 1000
. 1000
;
Run;

Creates a dataset with 1 character datatype and one numeric datatype.  The "." when read in becomes "" as that is the empty character equivalent.

 

The your next datastep:

data want;
  set have;
  if variable="" then value=1600;
  /* or: if missing(variable) then... */
run;

I checking is the character string in variable equal to the character string on the right of the =.  You can use no space, a space, or the missing function they all do the same thing.  This works as ""="" then 1600, but "Istanbul"="" not true.  This is fine logic.

 

Your next datastep:

data want;
  set have;
  if variable=. then value=1600;
run;

Is invalid code.  What is happening is that at the if statement it is looking at the logic, = numeric value, and that to the left is a character string.  So it starts to implicitly convert your string into numeric, however "Istanbul" cannot be a number, so gets set to . or missing.  Therefore, all rows equate to missing empty=missing, and "Istanbul" cannot convert so missing.  

Its a good example of why you should always explicitly convert datatypes - know your data, program for your data.

 

 

 

 

Kurt_Bremser
Super User

The dot denotes the default MISSING value for numeric variables only. With character variables, the dot is simpy a 1-byte string containing the dot.

 

The log of your second data step is very verbose about your mistake:

23         Data Want;
24         Set Have;
25         If Variable=. Then Value=1600;
26         Run;

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).
      25:4   
NOTE: Invalid numeric data, Variable='Istanbul' , at line 25 column 4.
Variable=Istanbul Value=1600 _ERROR_=1 _N_=2
NOTE: Invalid numeric data, Variable='Istanbul' , at line 25 column 4.
Variable=Istanbul Value=1600 _ERROR_=1 _N_=3
NOTE: There were 4 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 4 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds

SAS tries to convert your Variable to numeric, fails in observations 2 and 3, which causes a missing numeric value, represented by the dot; therefore the condition is also true. In observation 1 and 4, the empty string is correctly converted to a missing value, causing the same result, but without ERROR.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 2863 views
  • 0 likes
  • 3 in conversation