DATA Step, Macro, Functions and more

BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

[ Edited ]

 

QUESTION: Is there any systemic reason in the SAS Base 9.4 programming model, why the logical expression BETWEEN-AND cannot be used in IF-THEN and in SELECT-WHEN statements?

 

BETWEEN-AND seem to be working only in WHERE statements and when used in IF-THEN or SELECT-WHEN there are errors reported. Does it mean that BETWEEN-AND expression has some kind of a special status within SAS Base 9.4?

 

Example

 

First we create a sample data set WORK.test where we have only one numeric variable (column) Order_ID.

 

 

data WORK.test;
   INFILE DATALINES;
   INPUT Order_ID : 12.;
DATALINES;
1
2
3
4
5
6
7
8
9
10
;;;
run;

 

 

Then we would like to output to a new data set work.test2 only observations which have value ORDER_ID between 5 and 7.

 

We can approach this in two ways:

  • Scenario #1: filter input data using WHERE clause
  • Scenario #2: control output using either IF-THEN statement or SELECT-WHEN statement

Let's start with Scenario #1

 

Using 5 <= Order_ID <=7

 

DATA work.test2;
	SET work.test;
        WHERE (5 <= Order_ID <= 7);
RUN;

or using Order_ID BETWEEN 5 AND 7

 

DATA work.test2;
	SET work.test;
        WHERE (Order_ID BETWEEN 5 AND 7);
RUN;

generates exactly the same result:

 

 

scenario-1.png

 

 

But when we follow the Scenario #2 (controlling the output with IF-THEN or with SELECT-WHEN), we will get the following results:

 

DATA work.test2;
	SET work.test;
        IF (5 <= Order_ID <= 7) THEN OUTPUT;
RUN;

and

DATA work.test2;
    SET work.test;
    SELECT; 
WHEN (5 <= Order_ID <= 7) OUTPUT;
OTHERWISE;
END;
RUN;

both generate the expected data set:

scenario-1.png

 

but using the logical expression BETWEEN-AND will generate errors when used in IF-THEN and in SELECT-WHEN statements:

 

 

DATA work.test2;
    SET work.test;
    IF (Order_ID BETWEEN 5 AND 7) THEN OUTPUT;
RUN;
we get error messages: 
 
ERROR 388-185: Expecting an arithmetic operator.
ERROR 202-322: The option or parameter is not recognized and will be ignored.

 

and

  

DATA work.test2;
    SET work.test;
    SELECT; 
WHEN (Order_ID BETWEEN 5 AND 7) OUTPUT;
OTHERWISE;
END;
RUN;

we get error messages: 

ERROR 388-185: Expecting an arithmetic operator.
ERROR 202-322: The option or parameter is not recognized and will be ignored.

 

The above examples have been tested in SAS Studio 9.4.

 

I am looking forward to feedback from more experienced SAS Programmers.

 

Is it a feature or a bug? ;-)

 

Thank You.

 


Accepted Solutions
Solution
a week ago
Super User
Posts: 7,868

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

The between-and operator works only in a "where-expression", the use of which is limited to the where statement in data steps, the where= dataset option, and the where clause in proc sql.

See (for instance) https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a001116114.htm

The where startement as part of the select-block is different from the where statement used to limit observations being read, and only supports syntax allowed in standard conditions.

 

IMHO, the between-and syntax is a leftover from COBOL-influenced times. If I were SAS, I'd have dropped it long ago in favor of "x < value < y", or even stayed only with the "x < value and value < y" used in all other languages.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers

View solution in original post


All Replies
Super Contributor
Posts: 345

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

I know, reading documentation is boring, but often enlightening. The bad news: you haven't discovered a bug. Between can't be used in if-statements.

Occasional Contributor
Posts: 5

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

Posted in reply to andreas_lds

 

 

 

Thank You for your response as a more experience SAS programmer.. I truly appreciate it.

 

I'm sorry that I didn't include information that I read the manuals. As a new user of SAS 9.4, I first had read several documents before preparing an example and posting my question.

 

In particular I went through:

  1. Step-by-Step Programming with Base SAS 9.4
  2. SAS 9.4 DATA Step Statements: Reference
  3. SAS 9.4 Language Reference: Concepts

There is no specific information in [1].

 

In [2] there is a section "WHERE Expression Only" with BETWEEN-AND operator. But there is no explanation why a logical expression with BETWEEN-AND operator as such can be only used in WHERE statements and not in IF-THEN or SELECT-WHEN. It is something unusual in programming languages.

 

In [3] there is a section "Deciding Whether to Use a WHERE Expression or a Subsetting IF Statement", but there is still no explanation why this particular operator expression cannot be used in in IF-THEN and SELECT-WHEN statements.

 

For me it was quite strange that there are two types of operators, ones that can be used everywhere and there are specific operators:
BETWEEN-AND, CONTAINS, IS MISSING or IS NULL, LIKE, SAME-AND, and Sounds-Like
which can be only used in the WHERE statement.

 

That is why, I asked my question and prepared my detailed example. I'm sorry if I made an impression that I didn't RTFM :-)

 

Once again, Thank You for Your comment.

Solution
a week ago
Super User
Posts: 7,868

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

The between-and operator works only in a "where-expression", the use of which is limited to the where statement in data steps, the where= dataset option, and the where clause in proc sql.

See (for instance) https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a001116114.htm

The where startement as part of the select-block is different from the where statement used to limit observations being read, and only supports syntax allowed in standard conditions.

 

IMHO, the between-and syntax is a leftover from COBOL-influenced times. If I were SAS, I'd have dropped it long ago in favor of "x < value < y", or even stayed only with the "x < value and value < y" used in all other languages.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 5

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

Posted in reply to KurtBremser

Thank You for providing a detailed answer and confirmation that BETWEEN-AND and also other operators: CONTAINS, IS MISSING or IS NULL, LIKE, SAME-AND, and Sounds-Like  can only be used in WHERE statements. 

 

I am still wondering, why such a strange dichotomy between operators was introduced into the SAS Base Language. That is why, I asked the question with a smile: is it a feature or a bug? :-)

 

Thank You again for your detailed answer and providing the link - it is always very helpful.

Super User
Posts: 7,868

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

Thinking through it again, I came to the conclusion that the between-and exists because it is part of ANSI SQL, and SAS decided to make all where-statements (SQL and elsewhere) to use that as standard. It's therefore only available when applied to subsetting datasets on input, but not in any other conditions. See it as part of the read mechanism, but not of programming logic.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 5

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

Posted in reply to KurtBremser

At this stage, I know, BETWEEN-AND can only be used in WHERE statements.

 

After Your indication of SQL, I have just investigated the subject further. There is a chapter Understanding SAS Indexes in SAS 9.4 Language Reference: Concepts, Sixth Edition. It provides detailed explanation about indexing of data by SAS.

 

It can be possible that specific operators i.e. BETWEEN-AND, CONTAINS, IS MISSING or IS NULL, LIKE, SAME-AND, and Sounds-Like are only allowed in the WHERE statement, so the User is somehow forced to use them at the stage of filtering the input before the data is delivered into the PDV (Program Data Vector) for further processing in a cursor-like fashion (in a loop).

 

In case of operators which are applied to strings (text): CONTAINS, LIKE, Sound-Like the performance gain is quite obvious. But in case of BETWEEN-AND it is not so obvious. For instance: is it always better to use (x BETWEEN a AND b) over (a<=x<=b) expressions in the WHERE statements?

 

Maybe I will learn something more in the future.

 

Super User
Posts: 11,343

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

Some of those tasks are available using other functions. SAS data step code existed before the incorporation of Proc SQL to SAS I believe.

All of those statements you mention appear to come from the SQL world and the uses are defined there. SAS has maintained the same limits. I think it makes sense that you don't want to use a code element from something like SQL that behaves differently as much as practical so you don't have to switch as many gears when going from data step to Proc SQL and back.

 

Instead of "is missing" use Missing(expression). "Is null" is equivalent to missing as far as SAS is concerned.

I have no idea what a "Same And" code element may be so I can't provide an example.

The function Soundex is used to encode strings to match for sounds similar searching. Note that the documentation for the WHERE expression says:

The sounds-like ( =*) operator selects observations that contain a spelling variation of a specified word or words. The operator uses the Soundex algorithm to compare the variable value and the operand. For more information, see the SOUNDEX function in SAS Functions and CALL Routines: Reference. 

Note: Note that the SOUNDEX algorithm is English-biased, and is less useful for languages other than English. 

These other functions are usable in any comparison or calculation you need.

 

 

If you are attempting to transfer existing knowledge of SQL (of some flavor) to data step programming then you want to look at references for data step

SAS Super FREQ
Posts: 8,869

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

Hi:
My .02 ...

The DATA step processing and IF logic and IF operators existed long before SAS implemented SQL capability. And, yes, those features of SQL that are allowed in WHERE were implemented in PROC SQL when it was introduced.

There are some Tech Support notes that discuss how you would achieve the equivalent of LIKE or CONTAINS, for example, in DATA step: http://support.sas.com/kb/43/303.html

However, I learned =: and INDEX before I learned CONTAINS, so I never thought it was a bug when WHERE operators only worked in WHERE statements and clauses.

You will find the documentation for the WHERE statement operators here:
http://go.documentation.sas.com/?docsetId=lestmtsref&docsetTarget=n1xbr9r0s9veq0n137iftzxq4g7e.htm&d...

You will find the documentation for the operators you can use in SAS expressions here: http://go.documentation.sas.com/?docsetId=lrcon&docsetTarget=p00iah2thp63bmn1lt20esag14lh.htm&docset... (look at the comparison operators).

Programming 1 is a free elearning class and it will outline both the use of WHERE and the use of IF and discuss the operators available. You can activate Programming 1 by clicking on the link at the top of this page: https://support.sas.com/edu/elearning.html?ctry=us&productType=library

Cynthia
Super User
Super User
Posts: 7,083

Re: BETWEEN-AND in logical expressions: WHERE, IF-THEN and SELECT-WHEN

[ Edited ]

It is not a feature or a bug. It was a design decision. The BETWEEN and other SQL syntax features are supported by the WHERE statement, but not in other normal SAS statements.

 

I suspect that it was done as part of supporting external databases with libname engines. So as part of pushing WHERE clauses into the external database they decided to except the SQL like syntax so that it would be easier for SAS to push more WHERE conditions down without having to re-write them. 

 

But modifying how all of the SAS statements work would have been a much large task and probably would have had the potential for breaking customer's existing code bases.

 

Note that they are also many SAS features like the : modifier on comparison operators or variables lists that are not supported by PROC SQL.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 148 views
  • 6 likes
  • 6 in conversation