Hi,
The two codes below subsets data differently, but I'm not sure why.
code #1
if not (10<=X<=12 & Y>=50) then
----;
code #2
if not (X in (10-12) & Y>=50) then
----;
I don't understand why 10<=X<=12 is performing differently from X in(10-12).
Best,
Jae
Please run this code and look at the results.
data _null_; input x; file print; If x in (10-12) then put x= +1 "is in (10-12)"; If x in (10 11 12) then put x= +1 "is in (10 11 12)"; If x in (10:12) then put x= +1 "is in (10:12)"; If 10 le x le 12 then put x= +1 "is in 10 le x le 12"; datalines; 1 1.5 2 3 4 10 10.1 11 -12 11.7 12 ; run;
IN is used for a LIST of values so the main difference is the treatment of decimal values. A second consideration is that for numeric values to provide a list of sequential integer values the syntax is (10:12). When you use (10-12) SAS examines the values and treats the - as part of the 12 or a negative 12 in the list.
I think that @ballardw already answered your question, but here is another way to look at it. I was surprised to find that in (10-12) is treated as two numbers:
data have; input x; datalines; 1 1.5 2 3 4 -2 10 10.1 11 -12 11.7 12 ; run; data incomma; set have (where=(x in (10,12))); run; data indash; set have (where=(x in (10-12))); run; data indashPLUS; set have (where=(x in (10-12,12))); run; data between; set have (where=(x between 10 and 12)); run; data ltetc; set have (where=(10 le x le 12)); run;
Art, CEO, AnalystFinder.com
Note that the answer to your question is in the SAS log, as the created code is explicited there.
Also note the code in the log (and cry when you look at the elapse time) when using
where X in (1:4)
versus
if X in (1:4)
The first one generates :
WHERE (X=INT(X)) and (X>=1 and X<=4);
and runs much slower.
@ChrisNZ: If you're pointing out that the use of the in operator in a where statement was poorly implemented (as compared with its implementation in an if statement), I agree. Both the where option and the where statement run between 2 and 3 times slower than using an if statement. Rick Langston pointed that out, years ago, in a SUGI paper (see: www2.sas.com/proceedings/sugi30/002-30.pdf ).
However, both the do the same thing and the point of my post was simply to show @jaeahn some additional examples of how the in operator works.
Art, CEO, AnalystFinder.com
I was just pointing out to @jaeahn that:
1- The answer to his question was in the sas log -where one should always look first- since the code
X in (10-12)
shows in the log as
X in (-12,10)
2- There can be surprises for some syntaxes.
I don't think this is do do with poor implementation of the in() clause (though when there are few values -like here- a list could be created to speed things up).
Rather, WHERE clauses always execute SAS functions slower than IF statements.
As soon as a function is introduced (even without the user explicitly doing so, such as here), IF statements are faster.
When only in/equalities are used, IF statements are slower.
Something to do with functions accessing data faster from the PDV than from the read buffer I suppose, though I was never given a clear explanation.
That's all I was trying to point out. Sorry about the confusion. My reply should have been to the OP.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.