Re: sql tables and ambiguous reference error message

sophia_SAS · Posted 04-17-2012 10:38 AM

Sas users,

Can you please recommend a way to fix the belwo code? I'm currently receiving the error message; "Ambiguous reference, column new_ID is in more than one table."

In a nutshell I would like to create a new dataset "want" based on dataset "have_1" except I don't want to include in "want" any that are also in "have_2". The variable in common is "new_ID". There are duplicate "new_IDs" in "have_1" which may be causing the error message.

proc sql;

create table want as

select new_ID from have_1

except

select new_ID from have_2;

quit;

Thanks.

shivas · Posted 04-17-2012 10:59 AM

Hi,

data one;

input no;

cards;

1

2

3

;

run;

data two;

input no;

cards;

3

4

5

;

run;

data want;

merge one(in=a) two(in=b);

by no;

if a ne b;

run;

Thanks,

Shiva

PGStats · Posted 04-17-2012 11:05 AM

That query, written as it is, should work, unless new_ID doesn't have the same type or size in both tables, or have_1 or have_2 are views instead of tables, in which case the root of the problem could be inside one of those views.

PG

ballardw · Posted 04-17-2012 11:05 AM

Try:

select distinct new_id from have_1

Hima · Posted 04-17-2012 11:07 AM

Working fine for me even with duplicates.

data have_1;
input new_id name $1. ;
cards;
1 a
2 b
3 c
4 d
1 a
1 b
run;

data have_2;
input new_id;
cards;
5
6
7
8
9
;
run;

proc sql;
create table want as
select new_ID from have_1
except
select new_ID from have_2;
quit;

Log:

1455 data have_1;
1456 input new_id name $1. ;
1457 cards;

NOTE: The data set WORK.HAVE_1 has 6 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

1464 run;
1465
1466 data have_2;
1467 input new_id;
1468 cards;

NOTE: The data set WORK.HAVE_2 has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds

1474 ;
1475 run;
1476
1477 proc sql;
1478 create table want as
1479 select new_ID from have_1
1480 except
1481 select new_ID from have_2;
NOTE: Table WORK.WANT created, with 4 rows and 1 columns.

1482 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds

rsanchez87 · Posted 01-26-2019 11:36 PM

Hi,

Here are two methods:

METHOD 1: Using a where clause in both directions combined by a union all.

Method 1: 
PROC SQL;
CREATE TABLE WANT AS 
SELECT UNIQUE NEW_ID FROM HAVE_1
WHERE NEW_ID NOT IN (SELECT UNIQUE NEW_ID FROM HAVE_2)
UNION ALL 
SELECT UNIQUE NEW_ID FROM HAVE_2
WHERE NEW_ID NOT IN (SELECT UNIQUE NEW_ID FROM HAVE_2)
;QUIT;

Method 2:

Leveraging subqueries to create a union all NEW_ID's. Both tables have only 1 column, so it isn't ambiguous. The outermost query deduplicates.

METHOD 2: 
PROC SQL;
CREATE TABLE WANT AS 
SELECT UNIQUE *
FROM (SELECT * FROM (SELECT UNIQUE NEW_ID FROM HAVE_1)
                 UNION ALL
             SELECT * FROM (SELECT UNIQUE NEW_ID FROM HAVE_1)
            )
;QUIT;

sql tables and ambiguous reference error message