
Between Operator for Char Variable

Occasional Contributor Pra
Posts: 6


Dear All,

Here is a scenario where the BETWEEN operator is not working as I expected. Please help me understand the possible reason for this.

DATA VS;
  SET STUDY.VS;
  KEEP USUBJID VISIT VSTESTCD VSTEST VSORRES VSORRESU;
  WHERE VSTESTCD = 'WEIGHT' AND VSORRES BETWEEN '35' AND '150';
  /* This version works: */
  /* WHERE (VSTESTCD='WEIGHT') AND (INPUT(VSORRES,BEST.) BETWEEN 35 AND 150); */
RUN;

NOTE: There were 6 observations read from the data set STUDY.VS.
      WHERE (VSTESTCD='WEIGHT') and (VSORRES>='150' and VSORRES<='35');
NOTE: The data set WORK.VS has 6 observations and 6 variables.
NOTE: Compressing data set WORK.VS increased size by 100.00 percent.
      Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.01 seconds

Please help me understand why the log shows the bounds of my condition reversed.


Note: VSORRES is a character variable that holds numeric values.



Super User
Posts: 7,942

Re: Between Operator for Char Variable

To be honest, I am not going to dig into exactly why this is happening, for a simple reason: I would never recommend using character data where numeric data is expected. It is bad and lazy coding practice, and it leads to exactly this kind of issue. I would suggest writing the WHERE explicitly, so that it states what you expect it to do:

where vstestcd="WEIGHT" and 35 <= input(vsorres,best.) <= 150;

From the above statement we can clearly see that we are doing a numeric test, and the upper and lower boundaries (plus whether to include them) are shown explicitly, so no "guessing" needs to take place.
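
If some values of VSORRES might not be numeric at all, one way to keep the explicit numeric test is to convert in the DATA step and filter with a subsetting IF. This is only a sketch: the data set VS_SAMPLE, its values, and the variable VSORRES_N are made up for illustration and stand in for STUDY.VS.

```sas
/* Hypothetical sample data standing in for STUDY.VS */
data vs_sample;
  length usubjid $8 vstestcd $8 vsorres $8;
  input usubjid $ vstestcd $ vsorres $;
datalines;
S001 WEIGHT 35
S002 WEIGHT 72.5
S003 WEIGHT 150
S004 WEIGHT 200
S005 WEIGHT ND
S006 HEIGHT 120
;
run;

data vs_weight;
  set vs_sample;
  where vstestcd = 'WEIGHT';
  /* ?? suppresses the invalid-data note for values such as 'ND', */
  /* which become missing instead                                  */
  vsorres_n = input(vsorres, ?? 8.);
  /* Numeric range test; a missing value fails it */
  if 35 <= vsorres_n <= 150;
run;
```

With this sample data, only S001, S002 and S003 are kept. The 200 is out of range, 'ND' converts to missing, and the HEIGHT record is dropped by the WHERE.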

As for why it is doing that: VSORRES and both literals are character, so nothing is converted to numeric and the values are compared as text. As text, "150" sorts before "35", so BETWEEN takes "150" as the lower bound and "35" as the upper one, which is what the log shows.

Contributor
Posts: 37

Re: Between Operator for Char Variable

Character comparisons are done character by character, from left to right, when you use comparison operators such as < or >. So '150' is treated as smaller than '35', and you may not get what you want. It is better to convert the values to numeric before doing anything with them.
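
A quick way to check this for yourself is to run the same comparison once on the character values and once on the converted numbers. This is a minimal sketch; the variable names are only for illustration.

```sas
data _null_;
  /* Character comparison: '1' is less than '3', so '150' < '35' is true */
  char_result = ('150' < '35');
  /* Numeric comparison after conversion: 150 < 35 is false */
  num_result = (input('150', 8.) < input('35', 8.));
  put char_result= num_result=;
run;
```

The log should show char_result=1 num_result=0.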

Thanks

Karthik

Occasional Contributor Pra
Posts: 6

Re: Between Operator for Char Variable

Posted in reply to Karthikeyan

I too believe it's the character comparison taking place.

Thanks for sharing your thoughts.

Respected Advisor
Posts: 3,799

Re: Between Operator for Char Variable

Think about it.

data _null_;
  x = '35' min '150';
  put x=;
run;

x=150

Super User
Posts: 7,039

Re: Between Operator for Char Variable

What is the question here?

Are you surprised that SAS lets you specify the higher value first when using the BETWEEN operator? That is a nice convenience in SAS that some other languages do not offer. It makes the BETWEEN operator work more like natural language. It works the same way with numeric variables.

data x; set sashelp.class;
  where age between 12 and 9;
  where also name between 'Sara' and 'Adam';
run;

NOTE: WHERE clause has been augmented.
NOTE: There were 6 observations read from the data set SASHELP.CLASS.
      WHERE (age>=9 and age<=12) and (name>='Adam' and name<='Sara');
NOTE: The data set WORK.X has 6 observations and 5 variables.

Or are you confused about the lexical ordering of character variables? '150' is less than '35' because '1' is less than '3'.
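
Putting the two points together, here is a small sketch of which values a character BETWEEN '35' AND '150' actually keeps. The data sets CHARS and PICKED and the values in them are made up for illustration.

```sas
/* Character values that look like numbers */
data chars;
  input x $;
datalines;
2
10
35
72
150
220
3487
;
run;

data picked;
  set chars;
  /* Character bounds, so this is treated as x >= '150' and x <= '35' */
  where x between '35' and '150';
run;

proc print data=picked; run;
```

The values kept are 2, 35, 150, 220 and 3487, because each of them falls between '150' and '35' when compared as text. The values 10 and 72 are dropped, even though 72 is a perfectly plausible weight.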

Super User
Posts: 11,343

Re: Between Operator for Char Variable

Run this program and look at the results. It may help you see how character ordering differs from numeric ordering. It creates a character variable whose values look like numbers, sorts it, and prints the result.

data test;
   input x $;
datalines;
1
2
3
10
25
38
150
220
3487
;
run;

proc sort data=test; by x;run;
proc print data=test;run;
