BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ukar441
Fluorite | Level 6

Hi

When I run the below proc sql code it returns results and I did expect it to give an error. Can anyone explain why?

Running SAS 9.4 on Unix. I expected it to fail since variable sex2 is not in sashelp.class but instead work.class with 10 obs was returned. Have entered my log notes as well as the code. 

 

*-- Creating subset of data renaming variable sex to sex2 --*;
data work.class(rename=(sex=sex2));
  set sashelp.class(where=(sex='M'));
run;

NOTE: There were 10 observations read from the data set SASHELP.CLASS.
      WHERE sex='M';
NOTE: The data set WORK.CLASS has 10 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds 

*-- Selecting from subset using sex2 in (select sex2) --*;
*-- Expected this to fail instead class2 is returned  --*;
*-- as result                                         --*;
proc sql;
  create table class2 as
  select * from class
  where sex2 in(select sex2 from sashelp.class); /* sex2 is not in sashelp.class */
quit;
NOTE: Table WORK.CLASS2 created, with 10 rows and 5 columns. NOTE: PROCEDURE SQL used (Total process time): real time 0.00 seconds
1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @Ukar441 and welcome to the SAS Support Communities!

 

You created a correlated subquery, i.e., the sql-expression sex2 in the subquery is recognized as that variable from WORK.CLASS in the outer query. As such it is syntactically correct.

 

Equivalently, you could have written:

proc sql;
  create table class2 as
  select * from class a
  where sex2 in (select a.sex2 from sashelp.class);
quit;

or (since sex2='M' for all observations in WORK.CLASS)

proc sql;
  create table class2 as
  select * from class
  where sex2 in (select 'M' from sashelp.class);
quit;

Like variable sex2 (but unlike, say, sex3, which would cause an error) the character constant 'M' (or any other character constant for that matter) is a valid sql-expression which can be compared to values of character variable sex2 from the outer query.

View solution in original post

6 REPLIES 6
FreelanceReinh
Jade | Level 19

Hi @Ukar441 and welcome to the SAS Support Communities!

 

You created a correlated subquery, i.e., the sql-expression sex2 in the subquery is recognized as that variable from WORK.CLASS in the outer query. As such it is syntactically correct.

 

Equivalently, you could have written:

proc sql;
  create table class2 as
  select * from class a
  where sex2 in (select a.sex2 from sashelp.class);
quit;

or (since sex2='M' for all observations in WORK.CLASS)

proc sql;
  create table class2 as
  select * from class
  where sex2 in (select 'M' from sashelp.class);
quit;

Like variable sex2 (but unlike, say, sex3, which would cause an error) the character constant 'M' (or any other character constant for that matter) is a valid sql-expression which can be compared to values of character variable sex2 from the outer query.

ChrisNZ
Tourmaline | Level 20

@FreelanceReinh Can you tell us more how a correlated subquery can replace and swap variable names, and the the name from the outer query and replace it in the subquery?

Any other cases where this name-swapping takes place?

FreelanceReinh
Jade | Level 19

@ChrisNZ wrote:

@FreelanceReinh Can you tell us more how a correlated subquery can replace and swap variable names, and the the name from the outer query and replace it in the subquery?

Any other cases where this name-swapping takes place?


I don't think anything special is happening here. It's just that variables from all contributing tables are valid in an sql-expression in a subquery (cf. the wording of the error message obtained with sex3). The expression sex2 in the subquery does not select values from variable sex in SASHELP.CLASS (using a "wrong" or "swapped" name). Rather it's a constant (like the 'M' in my earlier example) w.r.t. the subquery for each record from WORK.CLASS. As a consequence, the subquery results in a temporary table consisting of as many identical values (equal to the current value of sex2 from WORK.CLASS) as there are observations in SASHELP.CLASS, i.e. 19. Trivially, the resulting WHERE condition is always met, unless the table in the FROM clause of the subquery was empty. So, any non-empty dataset, even if totally unrelated to WORK.CLASS in terms of variable names, types and values, could be used as well in place of SASHELP.CLASS in the subquery.

Ukar441
Fluorite | Level 6

Hi @FreelanceReinh 

Thanks for your answer. It was really helpful and explained it well. Following your reasoning gives the following conclusion. If we change the where clause somewhat I get these results.

where sex2 in(select sex2 from sashelp.class);
NOTE: Table WORK.CLASS2 created, with 10 rows and 5 columns.

where sex2 in(select 'F' from sashelp.class);
where sex2 in(select 'Hello' from sashelp.class);
where sex2 in(select name from sashelp.class);
NOTE: Table WORK.CLASS2 created, with 0 rows and 5 columns.

where sex2 in(select a.sex2 from sashelp.class as a);
ERROR: Column sex2 could not be found in the table/view identified with the correlation name A.
ERROR: Unresolved reference to table/correlation name a.

First example is the original and there sex2 is correlated with sex2 in work.class.

Next three examples evaluates the character expression and gives expected = rows.

Last example when I explicitly point to sashelp.class with a.sex2 I get the error I expected in the first place.

 

However what was new to me was that in example 1,2 and 3 the "from sashelp.class" in a sense is ignored. Example 1 the select uses sex2 from another table and in 2 and 3 from sashelp.class is sort of ignored.

 

It is not quite a variable swap as @ChrisNZ suggest but more of an order for searching.

Totally new to me and makes me wonder how many bugs I've created over the years.

 

Thanks for your help.  

FreelanceReinh
Jade | Level 19

@Ukar441 wrote:

Hi @FreelanceReinh 

Thanks for your answer. (...)

However what was new to me was that in example 1,2 and 3 the "from sashelp.class" in a sense is ignored. (...)


You're welcome. In these examples the only impact of "sashelp.class" is the number of (identical) observations resulting from the subquery (i.e. 19). Interestingly, PROC SQL seems to be clever enough not to waste resources to create a huge trivial table if that number is large. [Edit: The run time does not depend on that number.] It uses the same methods whether a constant (like 'M') is selected or a variable from the outer query like sex2, but the set of methods is different in the situation where a variable like name or sex is selected in the subquery. This can be seen by using the undocumented option _method of the PROC SQL statement:

130  proc sql _method;
131    create table class2 as
132    select * from class
133    where sex2 in (select sex from sashelp.class);

NOTE: SQL execution methods chosen are:

      sqxcrta
          sqxfil
              sqxsrc( WORK.CLASS )

NOTE: SQL subquery execution methods chosen are:

          sqxsubq
              sqxsrc( SASHELP.CLASS )
NOTE: Table WORK.CLASS2 created, with 10 rows and 5 columns.

134  quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


135
136  proc sql _method;
137    create table class2 as
138    select * from class
139    where sex2 in (select sex2 from sashelp.class);

NOTE: SQL execution methods chosen are:

      sqxcrta
          sqxfil
              sqxsrc( WORK.CLASS )

NOTE: SQL subquery execution methods chosen are:

          sqxsubq
              sqxreps
                  sqxsrc( SASHELP.CLASS )
NOTE: Table WORK.CLASS2 created, with 10 rows and 5 columns.

140  quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds


141
142  proc sql _method;
143    create table class2 as
144    select * from class
145    where sex2 in (select 'X' from sashelp.class);

NOTE: SQL execution methods chosen are:

      sqxcrta
          sqxfil
              sqxsrc( WORK.CLASS )

NOTE: SQL subquery execution methods chosen are:

          sqxsubq
              sqxreps
                  sqxsrc( SASHELP.CLASS )
NOTE: Table WORK.CLASS2 created, with 0 rows and 5 columns.

 

Ukar441
Fluorite | Level 6

Hi @FreelanceReinh 

I-m not the first one puzzled by this. Looked at your answer and googled the method sqxreps and one of the first links was this one sas - Proc sql subquery based on nonexisitng column returns not null - Stack Overflow . A discussion on the same topic. Right or wrong can probably be discussed. My own takeaway will be that it might be a good idea to be more strict in referencing tables/views i.e to use tablename.variablename notation in the sql queries.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 709 views
  • 7 likes
  • 3 in conversation