BookmarkSubscribeRSS Feed
sophia_SAS
Obsidian | Level 7

Sas users,

Can you please recommend a way to fix the belwo code?  I'm currently receiving the error message; "Ambiguous reference, column new_ID is in more than one table."

In a nutshell I would like to create a new dataset "want" based on dataset "have_1" except I don't want to include in "want" any that are also in "have_2".  The variable in common is "new_ID".  There are duplicate "new_IDs" in "have_1" which may be causing the error message.

proc sql;

create table want as

select new_ID from have_1

except

select new_ID from have_2;

quit;

Thanks.

5 REPLIES 5
shivas
Pyrite | Level 9

Hi,

data one;

input no;

cards;

1

2

3

;

run;

data two;

input no;

cards;

3

4

5

;

run;

data want;

merge one(in=a) two(in=b);

by no;

if a ne b;

run;

Thanks,

Shiva

PGStats
Opal | Level 21

That query, written as it is, should work, unless new_ID doesn't have the same type or size in both tables, or have_1 or have_2 are views instead of tables, in which case the root of the problem could be inside one of those views.

PG

PG
ballardw
Super User

Try:

select distinct new_id from have_1

Hima
Obsidian | Level 7

Working fine for me even with duplicates.

data have_1;
input new_id name $1. ;
cards;
1 a
2 b
3 c
4 d
1 a
1 b
run;

data have_2;
input new_id;
cards;
5
6
7
8
9
;
run;

proc sql;
create table want as
select new_ID from have_1
except
select new_ID from have_2;
quit;

Log:

1455  data have_1;
1456  input new_id name $1. ;
1457  cards;

NOTE: The data set WORK.HAVE_1 has 6 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


1464  run;
1465
1466  data have_2;
1467  input new_id;
1468  cards;

NOTE: The data set WORK.HAVE_2 has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


1474  ;
1475  run;
1476
1477  proc sql;
1478  create table want as
1479  select new_ID from have_1
1480  except
1481  select new_ID from have_2;
NOTE: Table WORK.WANT created, with 4 rows and 1 columns.

1482  quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

rsanchez87
Obsidian | Level 7

Hi, 

 

Here are two methods: 

 

METHOD 1: Using a where clause in both directions combined by a union all. 

 

Method 1: 
PROC SQL; CREATE TABLE WANT AS SELECT UNIQUE NEW_ID FROM HAVE_1 WHERE NEW_ID NOT IN (SELECT UNIQUE NEW_ID FROM HAVE_2)
UNION ALL
SELECT UNIQUE NEW_ID FROM HAVE_2
WHERE NEW_ID NOT IN (SELECT UNIQUE NEW_ID FROM HAVE_2) ;QUIT;

Method 2: 

Leveraging subqueries to create a union all NEW_ID's. Both tables have only 1 column, so it isn't ambiguous. The outermost query deduplicates. 

 

METHOD 2: 
PROC SQL;
CREATE TABLE WANT AS 
SELECT UNIQUE *
FROM (SELECT * FROM (SELECT UNIQUE NEW_ID FROM HAVE_1)
                 UNION ALL
             SELECT * FROM (SELECT UNIQUE NEW_ID FROM HAVE_1)
            )
;QUIT;

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 11954 views
  • 3 likes
  • 6 in conversation