BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
FP12
Obsidian | Level 7

Hi,

 

I have two tables A and B, both with the same columns:

Z1 character7
Z2 character2
Z3 numeric8
Z4 Numeric8
Z5 character200
Z6 numeric8

In A there are 1 121 776 rows

In B there are 114 028 rows.

 

I do this in a PROC SQL:

		CREATE TABLE &output_table. AS
			SELECT * FROM A
			UNION SELECT * FROM B;

My output table has... 333 456 rows...

Can someone explain me how it is possible???

 

Thanks by advance

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisBrooks
Ammonite | Level 13

By default an SQL UNION only selects distinct values. If you want duplicates (i.e all rows from both tables) you need a UNION ALL.

 

Example from doc:

 

proc sql;
   title 'A UNION ALL B';
   select * from sql.a
   union all
   select * from sql.b;

View solution in original post

4 REPLIES 4
ChrisBrooks
Ammonite | Level 13

By default an SQL UNION only selects distinct values. If you want duplicates (i.e all rows from both tables) you need a UNION ALL.

 

Example from doc:

 

proc sql;
   title 'A UNION ALL B';
   select * from sql.a
   union all
   select * from sql.b;
mkeintz
PROC Star

You have duplicates in both data sets.  The example below produces one observation even though A has 3 and B has 2:

 

data a b;
  length z1 $7 z2 $2 z3 z4 8 z5 $200 z6 8;
z1='1234567';
z2='12';
z3=1;
z4=2;
z5=repeat('abcde',40);
z6=3;

output a;output a; output a;
output b;output b;
run;

proc sql;
CREATE TABLE _table AS
			SELECT * FROM A
			UNION SELECT * FROM B;
quit;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User

UNION doesn't insert duplicate rows. 

UNION ALL does insert duplicate rows.

 

UNION

The UNION operator produces a table that contains all the unique rows that result from both table expressions
ballardw
Super User

Or

 

data &output_table;

   set a b;

run;

 

Which may run faster than the SQL.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 104602 views
  • 5 likes
  • 5 in conversation