BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
KevinViel
Pyrite | Level 9

I encountered this (stunning) issue today:

 

171  proc sql ;
172    create table ds_datadiff as
173    select *
174    from sasdata.ds
175    where recordid in
176             ( select recordid
177               from delta.datadiff
178               where dataset = "DS"
179             )
180    ;
NOTE: Table WORK.DS_DATADIFF created, with 45 rows and 55 columns.

181  quit ;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds


182
183  proc sql ;
184    select b.recordid
185    from delta.datadiff as b
186    where dataset = "DS"
187    ;
ERROR: Column recordid could not be found in the table/view identified with
       the correlation name B.
188  quit ;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

189


190  proc sql ;
191    create table ds_datadiff as
192    select *
193    from sasdata.ds
194    where recordid in
195             ( select b.recordid
196               from delta.datadiff as b
197               where dataset = "DS"
198             )
199    ;
ERROR: Column recordid could not be found in the table/view identified with
       the correlation name B.
ERROR: Unresolved reference to table/correlation name b.
ERROR: Expression using IN has components that are of different data types.
NOTE: The IN referred to may have been transformed from an OR to an IN at some
      point during PROC SQL WHERE clause optimization.
200  quit ;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds

 

 

The issue is that RECORDID is not is DELTA.DATADIFF as illustrated by the second two SQL procedure above.  I find this highly unexpected and worrisome.  Should the subquery be "protected", meaning that the SELECT clause in it should not pull from other parts of the SQL?  I may begin defensive programming like the second procedure to protect against such errors in coding.

 

Thank you,


Kevin

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

SAS is just being helpful.  You asked for RECORDID and it found a variable RECORDID and used it.

View solution in original post

4 REPLIES 4
Reeza
Super User

Not super helpful but this is something that's been around for a long time and is the default behaviour. 

 

https://communities.sas.com/t5/SAS-Procedures/Case-when-used-with-a-subquery-strange-result/td-p/210...

Quentin
Super User

Hey Kevin,

 

You're not alone in being stunned by this.  I was tipped off to it a few months ago, and started a SAS-L thread that got some good feedback from the 'L: https://listserv.uga.edu/cgi-bin/wa?A2=SAS-L;11810443.1907b .

 

This code runs without errors:

 

proc sql ;
  select *
  from sashelp.class
  where name in
    (select name    /*n.b.: sashelp.iris does not have a variable NAME*/
     from sashelp.iris
     where species is not missing
    )
  ;
quit;

It turns out the SQL compiler happily turns what the author intended as in inner query into a correlated subquery.  In the SAS-L thread @yabwon kindly pointed out that if you add the FEEDBACK option, the log will show you that the reference to NAME in the inner query is interpreted as class.name rather than iris.name:

11   proc sql feedback;
12     select *
13     from sashelp.class
14     where name in
15       (select name    /*n.b.: sashelp.iris does not have a variable NAME*/
16        from sashelp.iris
17        where species is not missing
18       )
19     ;
NOTE: Statement transforms to:

        select CLASS.Name, CLASS.Sex, CLASS.Age, CLASS.Height, CLASS.Weight
          from SASHELP.CLASS
         where CLASS.Name in
               ( select CLASS.Name
                   from SASHELP.IRIS
                  where IRIS.Species is not null
               );

20   quit;

As you showed and Kurt mentioned, the offensive programming to catch this error would be to always use two level names when you have two tables  in a query:

 

21   proc sql ;
22     select *
23     from sashelp.class as a
24     where a.name in
25       (select b.name    /*n.b.: sashelp.iris does not have a variable NAME*/
26        from sashelp.iris as b
27        where species is not missing
28       )
29     ;
ERROR: Column name could not be found in the table/view identified with the correlation name B.
ERROR: Unresolved reference to table/correlation name b.
30   quit;
NOTE: The SAS System stopped processing this step because of errors.

I haven't been in that habit, but seeing this scared me enough to try to learn this new habit.

 

--Q.

BASUG is hosting free webinars Next up: Don Henderson presenting on using hash functions (not hash tables!) to segment data on June 12. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
Tom
Super User Tom
Super User

SAS is just being helpful.  You asked for RECORDID and it found a variable RECORDID and used it.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1085 views
  • 5 likes
  • 5 in conversation