BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
FP12
Obsidian | Level 7

Hi,

 

I have two tables A and B, both with the same columns:

Z1 character7
Z2 character2
Z3 numeric8
Z4 Numeric8
Z5 character200
Z6 numeric8

In A there are 1 121 776 rows

In B there are 114 028 rows.

 

I do this in a PROC SQL:

		CREATE TABLE &output_table. AS
			SELECT * FROM A
			UNION SELECT * FROM B;

My output table has... 333 456 rows...

Can someone explain me how it is possible???

 

Thanks by advance

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisBrooks
Ammonite | Level 13

By default an SQL UNION only selects distinct values. If you want duplicates (i.e all rows from both tables) you need a UNION ALL.

 

Example from doc:

 

proc sql;
   title 'A UNION ALL B';
   select * from sql.a
   union all
   select * from sql.b;

View solution in original post

4 REPLIES 4
ChrisBrooks
Ammonite | Level 13

By default an SQL UNION only selects distinct values. If you want duplicates (i.e all rows from both tables) you need a UNION ALL.

 

Example from doc:

 

proc sql;
   title 'A UNION ALL B';
   select * from sql.a
   union all
   select * from sql.b;
mkeintz
PROC Star

You have duplicates in both data sets.  The example below produces one observation even though A has 3 and B has 2:

 

data a b;
  length z1 $7 z2 $2 z3 z4 8 z5 $200 z6 8;
z1='1234567';
z2='12';
z3=1;
z4=2;
z5=repeat('abcde',40);
z6=3;

output a;output a; output a;
output b;output b;
run;

proc sql;
CREATE TABLE _table AS
			SELECT * FROM A
			UNION SELECT * FROM B;
quit;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Reeza
Super User

UNION doesn't insert duplicate rows. 

UNION ALL does insert duplicate rows.

 

UNION

The UNION operator produces a table that contains all the unique rows that result from both table expressions
ballardw
Super User

Or

 

data &output_table;

   set a b;

run;

 

Which may run faster than the SQL.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 104249 views
  • 5 likes
  • 5 in conversation