Hi,
I have two tables A and B, both with the same columns:
Z1 character7 Z2 character2 Z3 numeric8 Z4 Numeric8 Z5 character200 Z6 numeric8
In A there are 1 121 776 rows
In B there are 114 028 rows.
I do this in a PROC SQL:
CREATE TABLE &output_table. AS SELECT * FROM A UNION SELECT * FROM B;
My output table has... 333 456 rows...
Can someone explain me how it is possible???
Thanks by advance
By default an SQL UNION only selects distinct values. If you want duplicates (i.e all rows from both tables) you need a UNION ALL.
Example from doc:
proc sql;
title 'A UNION ALL B';
select * from sql.a
union all
select * from sql.b;
By default an SQL UNION only selects distinct values. If you want duplicates (i.e all rows from both tables) you need a UNION ALL.
Example from doc:
proc sql;
title 'A UNION ALL B';
select * from sql.a
union all
select * from sql.b;
You have duplicates in both data sets. The example below produces one observation even though A has 3 and B has 2:
data a b;
length z1 $7 z2 $2 z3 z4 8 z5 $200 z6 8;
z1='1234567';
z2='12';
z3=1;
z4=2;
z5=repeat('abcde',40);
z6=3;
output a;output a; output a;
output b;output b;
run;
proc sql;
CREATE TABLE _table AS
SELECT * FROM A
UNION SELECT * FROM B;
quit;
UNION doesn't insert duplicate rows.
UNION ALL does insert duplicate rows.
Or
data &output_table;
set a b;
run;
Which may run faster than the SQL.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.