BookmarkSubscribeRSS Feed

SQL Allows Multiple Columns with Same Name

Started ‎07-06-2018 by
Modified ‎07-06-2018 by
Views 10,465

In a DATA step, the program data vector does not allow two variables to have the same name. SQL is different. The namespace for a query can have multiple instances of the same column name.

 

Usually, the SQL processor will generate an error message if there is an ambiguous reference to a column name. There are still situations, however, where it is quite possible to get into trouble.

Here is an example:

 

proc sql;
 
create table one as
 select name, height, round(weight) as Size
  from sashelp.class
  where sex='M';
 
title 'Boys Taller Than 66 Inches';
 
create table two as
 select name, round(height) as Size
  from one
  where size > 66;
 
select * from two;

The result is

Boys Taller Than 66 Inches     
Name Size ------------------ Alfred 69 Henry 64 James 57 Jeffrey 63 John 59 Philip 72 Robert 65 Ronald 67 Thomas 58 William 67

There are lots of values of 66 or less. Why? It has to do with the order in which the parts of the query are processed. The WHERE filtering is done first. In fact, it is passed up to the data engine, which sees only the columns in the header of data set ONE. So it is not aware of the ambiguity and works with the SIZE values it finds in ONE, which of course reflect weights, not heights.

One way to fix this is to change the WHERE clause and specify (by means of the keyword CALCULATED) that SIZE refers to the new column derived from heights, as in:

 

create table three as
 select name, round(height) as Size
  from one
  where calculated size > 66;
 
select * from three;
 
quit;

That tells the processor to use the new SIZE columns and forces the WHERE filtering to be done after that column is derived. It produces, as expected:

 

Boys Taller Than 66 Inches     
Name Size ------------------ Alfred 69 Philip 72 Ronald 67 William 67

While multiple like-named columns are tolerated in SQL's workspace, they cannot be stored in SAS data sets. So if query results are feeding a CREATE TABLE or CREATE VIEW statement, only the first one will be saved, and warnings will be issued for subsequent columns with duplicative names. Of course unambiguous names can be specified to avoid this problem.

 

This article was originally posted by Howard Schreier on sasCommunity.org.

Version history
Last update:
‎07-06-2018 03:41 PM
Updated by:
Contributors

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Labels
Article Tags