08-04-2014 10:30 PM
Tonight, I'm perplexed.
Running SAS 9.3 on Linux server, via EG. Have SAS/ACCESS to Oracle libref, and reading from a view defined in the Oracle database. When I read from this view using implicit pass-through and a WHERE statement, the step returns 3 times as many records as when I use an IF statement. With the WHERE statement, every record is output three times.
The view is not much special (joins a few tables). The WHERE statement gets passed through to Oracle, the IF statement does not.
Step with WHERE statement:
data want; *n=300,000 ; set oracle.MyView (readbuff=40000); where site IN ("A" "B"); run;
Step with IF statement:
data want; *n=100,000 ; set oracle.MyView (readbuff=40000); if site IN ("A" "B"); run;
Feels bugish to me. Will approach with fresher eyes tomorrow, but wanted to reach out for ideas....
08-05-2014 09:49 AM
Thanks for suggestions. I'm still working on it. Going to try some brute force debugging. The code generated by the implicit pass-through looks good to me (not that I know Oracle SQL well).
And if I copy and paste that code into an explicit pass-through block, I get the same result (every record output 3 times).
SELECT "VarA", "VarB", "VarC" FROM MyView WHERE ( ("SITE" IN ( 'A' , 'B' ) ) )
Note SAS added the comma for the IN() operator for me when it generated the code. I thought it was odd that every variable name was in double quotes, but Oracle is fine with it (and removing them didn't change result).
There is also an extra pair of parentheses on the WHERE clause, but shouldn't matter.
Also noticed that when I do below explicit pass-through query using this as a subquery:
select count(*) from ( SELECT "VarA", "VarB", "VarC" FROM MyView WHERE ( ("SITE" IN ( 'A' , 'B' ) ) ) )
I get the right answer (100,000 not 300,000).
Like I said, going to try some brute force and chat with the DBA, and if I can't get far in an hour or so will open a ticket with Tech Support.
08-05-2014 01:50 PM
Good point @stat@sas That's a typo in my initial post (and test code) should be readbuff=4000 not 40000.
But changing it didn't change the results.
I think the limits are dependent on OS, memory, etc.
08-05-2014 02:47 PM
Try this just added not null in where clause.
data want; *n=300,000 ;
set oracle.MyView (readbuff=40000);
where site IN ("A" "B") and site is not null;
08-05-2014 05:10 AM
I am not seeing how sas translates the "in" functions to something within Oracle:
SAS/ACCESS(R) 9.3 for Relational Databases: Reference, Second Edition (sas/access functions/options) SQL Functions (oracle)
Within oracle it is an operation. Operators, Functions, Expressions, Conditions.
Could you have hit the famous 3 logic approach of a dbms? The null is a missing record always evaluates to false.
The missings in SAS are a different concept. but the null-values are translated to "".
IF there is some join/concatenation that can give nice surprising effects.
08-05-2014 05:13 AM
Unload all tables into SAS and do the view there, for reference.
08-05-2014 06:07 AM
In your real code do you separate the different values to select from with a comma?
So instead of:
....site IN ("A" "B");
It should be:
08-05-2014 02:51 PM
Quentin, your post at 3:51 is having two SQL statements.
The first one is, as far I have noticed, still going through the SAS SQL parser. (you could trace that).
The second with that additional block is a real explicit pass through and that one will build up a new session.
SAS/ACCESS(R) 9.3 for Relational Databases: Reference, Second Edition see the connect option for that.ay
Anyway you have proved the used SQL syntax is correct.
Something is happening in a unexpected strange way. There are a lot of specific options but it should be not possible returning wrong results.
08-05-2014 04:11 PM
Thanks Jaap (et al),
In my second post (#6 above) it's all explicit pass through, I just didn't show that it's wrapped in a select * from connection to oracle block. Basically I took the oracle SQL code generated by SAS and pasted it into and explicit pass-through block and started playing with it there.
Oracle DBA agreed to set up a time that he can trace the session and send me the logs from running these queries, so will see if that shows anything. If nothing else, I'll have more materials to send to tech support.
Figured you in particular would be pleased to hear that I'm happy to collaborate with my DBA on this (and my SAS admin).
Latest bizarre finding is if I limit the variables listed on the SELECT statement, I get the correct number of records returned. Can't think why changing variables listed on SELECT statement should impact the number of records returned. There are plenty of TS notes about SAS/ACCESS to Oracle doing some odd things over the years. Wondering if there is something SAS doesn't like about this particular VIEW.
08-05-2014 04:24 PM
I suggest you get your DBA to run your SQL queries in SQL Plus or Toad or his SQL tool of choice. If you get the same results as SAS then it is an Oracle VIEW problem and not caused by SAS.
08-05-2014 05:42 PM