Dear All,
Here is a scenario where BETWEEN Operator is not working as expected.Please help me in understanding what could be the possible reason for this.
751 DATA VS;
752 SET STUDY.VS;
753 KEEP USUBJID VISIT VSTESTCD VSTEST VSORRES VSORRESU ;
754 WHERE VSTESTCD ='WEIGHT' AND VSORRES BETWEEN '35' AND '150';
755
756 /*WHERE (VSTESTCD='WEIGHT') AND (INPUT(VSORRES,BEST.) BETWEEN 35 AND 150);*/ This is working ;
757 RUN;
NOTE: There were 6 observations read from the data set STUDY.VS.
WHERE (VSTESTCD='WEIGHT') and (VSORRES>='150' and VSORRES<='35'); - Please help in comprehending how the condition was negated.
NOTE: The data set WORK.VS has 6 observations and 6 variables.
NOTE: Compressing data set WORK.VS increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
VSORRES : CHAR VARIABLE HAVING NUMERIC VALUES
Well, to be honest I am not really going to look at the why this is happening and there is a simple reason, I would never recommend to use character data in a place where numeric data is expected. Its simply bad and lazy coding practice and leads to these kind of issues. I would suggest explicitly writing the where as you expect it to perform:
where vstestcd="WEIGHT" and 35 <= input(vsorres,best.) <= 150;
From the above statement we can clearly see that we are doing a numeric test and the upper and lower boundary (plus whether ton include the boundaries) is clearly shown and no "guessing" needs to take place.
As for the why it is doing that, well, could be any number of reasons, perhaps when it converts the character data to numeric (which is what it is doing, your just leaving it to fate to decide what to do), it is sorting the information it has, hence "150" as text appears before "35".
Usually Character comparisons happens character by character from left to right, if you use any comparison operators like < , > etc . '150' might appear to be a smaller value when compared to '35', so you may not get what you want. Better convert them to integer before you do anything on them.
Thanks
Karthik
I too believe its the character conversation taking place.
Thanks for giving your thought
Think about it.
What is the question here?
Are you surprised that SAS let you specify the higher value first when using the BETWEEN operator? That is a nice enhancement that SAS allows that some other languages do not allow. Makes the BETWEEN operator work more like natural language. Works the same way with numeric variables.
4 data x; set sashelp.class ;
5 where age between 12 and 9 ;
6 where also name between 'Sara' and 'Adam';
NOTE: WHERE clause has been augmented.
7 run;
NOTE: There were 6 observations read from the data set SASHELP.CLASS.
WHERE (age>=9 and age<=12) and (name>='Adam' and name<='Sara');
NOTE: The data set WORK.X has 6 observations and 5 variables.
Or are you confused about the lexical ordering of character variables? '150' is less than '35' because '1' is less than '3' .
Run this program an look at the results. It may help you see the difference in character variable order compared with numeric. It creates some character variables that look like numbers, sorts them and prints the result.
data test;
input x $;
datalines;
1
2
3
10
25
38
150
220
3487
;
run;
proc sort data=test; by x;run;
proc print data=test;run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.