03-30-2018 04:39 PM
The two codes below subsets data differently, but I'm not sure why.
if not (10<=X<=12 & Y>=50) then
if not (X in (10-12) & Y>=50) then
I don't understand why 10<=X<=12 is performing differently from X in(10-12).
03-30-2018 05:25 PM
Please run this code and look at the results.
data _null_; input x; file print; If x in (10-12) then put x= +1 "is in (10-12)"; If x in (10 11 12) then put x= +1 "is in (10 11 12)"; If x in (10:12) then put x= +1 "is in (10:12)"; If 10 le x le 12 then put x= +1 "is in 10 le x le 12"; datalines; 1 1.5 2 3 4 10 10.1 11 -12 11.7 12 ; run;
IN is used for a LIST of values so the main difference is the treatment of decimal values. A second consideration is that for numeric values to provide a list of sequential integer values the syntax is (10:12). When you use (10-12) SAS examines the values and treats the - as part of the 12 or a negative 12 in the list.
03-30-2018 07:55 PM
I think that @ballardw already answered your question, but here is another way to look at it. I was surprised to find that in (10-12) is treated as two numbers:
data have; input x; datalines; 1 1.5 2 3 4 -2 10 10.1 11 -12 11.7 12 ; run; data incomma; set have (where=(x in (10,12))); run; data indash; set have (where=(x in (10-12))); run; data indashPLUS; set have (where=(x in (10-12,12))); run; data between; set have (where=(x between 10 and 12)); run; data ltetc; set have (where=(10 le x le 12)); run;
Art, CEO, AnalystFinder.com
03-30-2018 10:53 PM
Note that the answer to your question is in the SAS log, as the created code is explicited there.
Also note the code in the log (and cry when you look at the elapse time) when using
where X in (1:4)
if X in (1:4)
The first one generates :
and runs much slower.
03-30-2018 11:40 PM
@ChrisNZ: If you're pointing out that the use of the in operator in a where statement was poorly implemented (as compared with its implementation in an if statement), I agree. Both the where option and the where statement run between 2 and 3 times slower than using an if statement. Rick Langston pointed that out, years ago, in a SUGI paper (see: www2.sas.com/proceedings/sugi30/002-30.pdf ).
However, both the do the same thing and the point of my post was simply to show @jaeahn some additional examples of how the in operator works.
Art, CEO, AnalystFinder.com
03-31-2018 04:04 AM
I was just pointing out to @jaeahn that:
1- The answer to his question was in the sas log -where one should always look first- since the code
X in (10-12)
shows in the log as
X in (-12,10)
2- There can be surprises for some syntaxes.
I don't think this is do do with poor implementation of the in() clause (though when there are few values -like here- a list could be created to speed things up).
Rather, WHERE clauses always execute SAS functions slower than IF statements.
As soon as a function is introduced (even without the user explicitly doing so, such as here), IF statements are faster.
When only in/equalities are used, IF statements are slower.
Something to do with functions accessing data faster from the PDV than from the read buffer I suppose, though I was never given a clear explanation.
That's all I was trying to point out. Sorry about the confusion. My reply should have been to the OP.