BookmarkSubscribeRSS Feed
sophia_SAS
Obsidian | Level 7

Sas users,

Can you please recommend a way to fix the belwo code?  I'm currently receiving the error message; "Ambiguous reference, column new_ID is in more than one table."

In a nutshell I would like to create a new dataset "want" based on dataset "have_1" except I don't want to include in "want" any that are also in "have_2".  The variable in common is "new_ID".  There are duplicate "new_IDs" in "have_1" which may be causing the error message.

proc sql;

create table want as

select new_ID from have_1

except

select new_ID from have_2;

quit;

Thanks.

5 REPLIES 5
shivas
Pyrite | Level 9

Hi,

data one;

input no;

cards;

1

2

3

;

run;

data two;

input no;

cards;

3

4

5

;

run;

data want;

merge one(in=a) two(in=b);

by no;

if a ne b;

run;

Thanks,

Shiva

PGStats
Opal | Level 21

That query, written as it is, should work, unless new_ID doesn't have the same type or size in both tables, or have_1 or have_2 are views instead of tables, in which case the root of the problem could be inside one of those views.

PG

PG
ballardw
Super User

Try:

select distinct new_id from have_1

Hima
Obsidian | Level 7

Working fine for me even with duplicates.

data have_1;
input new_id name $1. ;
cards;
1 a
2 b
3 c
4 d
1 a
1 b
run;

data have_2;
input new_id;
cards;
5
6
7
8
9
;
run;

proc sql;
create table want as
select new_ID from have_1
except
select new_ID from have_2;
quit;

Log:

1455  data have_1;
1456  input new_id name $1. ;
1457  cards;

NOTE: The data set WORK.HAVE_1 has 6 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


1464  run;
1465
1466  data have_2;
1467  input new_id;
1468  cards;

NOTE: The data set WORK.HAVE_2 has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


1474  ;
1475  run;
1476
1477  proc sql;
1478  create table want as
1479  select new_ID from have_1
1480  except
1481  select new_ID from have_2;
NOTE: Table WORK.WANT created, with 4 rows and 1 columns.

1482  quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

rsanchez87
Obsidian | Level 7

Hi, 

 

Here are two methods: 

 

METHOD 1: Using a where clause in both directions combined by a union all. 

 

Method 1: 
PROC SQL; CREATE TABLE WANT AS SELECT UNIQUE NEW_ID FROM HAVE_1 WHERE NEW_ID NOT IN (SELECT UNIQUE NEW_ID FROM HAVE_2)
UNION ALL
SELECT UNIQUE NEW_ID FROM HAVE_2
WHERE NEW_ID NOT IN (SELECT UNIQUE NEW_ID FROM HAVE_2) ;QUIT;

Method 2: 

Leveraging subqueries to create a union all NEW_ID's. Both tables have only 1 column, so it isn't ambiguous. The outermost query deduplicates. 

 

METHOD 2: 
PROC SQL;
CREATE TABLE WANT AS 
SELECT UNIQUE *
FROM (SELECT * FROM (SELECT UNIQUE NEW_ID FROM HAVE_1)
                 UNION ALL
             SELECT * FROM (SELECT UNIQUE NEW_ID FROM HAVE_1)
            )
;QUIT;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 11909 views
  • 3 likes
  • 6 in conversation