BookmarkSubscribeRSS Feed
sophia_SAS
Obsidian | Level 7

Sas users,

Can you please recommend a way to fix the belwo code?  I'm currently receiving the error message; "Ambiguous reference, column new_ID is in more than one table."

In a nutshell I would like to create a new dataset "want" based on dataset "have_1" except I don't want to include in "want" any that are also in "have_2".  The variable in common is "new_ID".  There are duplicate "new_IDs" in "have_1" which may be causing the error message.

proc sql;

create table want as

select new_ID from have_1

except

select new_ID from have_2;

quit;

Thanks.

5 REPLIES 5
shivas
Pyrite | Level 9

Hi,

data one;

input no;

cards;

1

2

3

;

run;

data two;

input no;

cards;

3

4

5

;

run;

data want;

merge one(in=a) two(in=b);

by no;

if a ne b;

run;

Thanks,

Shiva

PGStats
Opal | Level 21

That query, written as it is, should work, unless new_ID doesn't have the same type or size in both tables, or have_1 or have_2 are views instead of tables, in which case the root of the problem could be inside one of those views.

PG

PG
ballardw
Super User

Try:

select distinct new_id from have_1

Hima
Obsidian | Level 7

Working fine for me even with duplicates.

data have_1;
input new_id name $1. ;
cards;
1 a
2 b
3 c
4 d
1 a
1 b
run;

data have_2;
input new_id;
cards;
5
6
7
8
9
;
run;

proc sql;
create table want as
select new_ID from have_1
except
select new_ID from have_2;
quit;

Log:

1455  data have_1;
1456  input new_id name $1. ;
1457  cards;

NOTE: The data set WORK.HAVE_1 has 6 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


1464  run;
1465
1466  data have_2;
1467  input new_id;
1468  cards;

NOTE: The data set WORK.HAVE_2 has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


1474  ;
1475  run;
1476
1477  proc sql;
1478  create table want as
1479  select new_ID from have_1
1480  except
1481  select new_ID from have_2;
NOTE: Table WORK.WANT created, with 4 rows and 1 columns.

1482  quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

rsanchez87
Obsidian | Level 7

Hi, 

 

Here are two methods: 

 

METHOD 1: Using a where clause in both directions combined by a union all. 

 

Method 1: 
PROC SQL; CREATE TABLE WANT AS SELECT UNIQUE NEW_ID FROM HAVE_1 WHERE NEW_ID NOT IN (SELECT UNIQUE NEW_ID FROM HAVE_2)
UNION ALL
SELECT UNIQUE NEW_ID FROM HAVE_2
WHERE NEW_ID NOT IN (SELECT UNIQUE NEW_ID FROM HAVE_2) ;QUIT;

Method 2: 

Leveraging subqueries to create a union all NEW_ID's. Both tables have only 1 column, so it isn't ambiguous. The outermost query deduplicates. 

 

METHOD 2: 
PROC SQL;
CREATE TABLE WANT AS 
SELECT UNIQUE *
FROM (SELECT * FROM (SELECT UNIQUE NEW_ID FROM HAVE_1)
                 UNION ALL
             SELECT * FROM (SELECT UNIQUE NEW_ID FROM HAVE_1)
            )
;QUIT;

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 12581 views
  • 3 likes
  • 6 in conversation