When working with Oracle SQL I will often use a tiny virtual dataset to test some idea or function out.
Ex:
WITH TEMP AS (
SELECT 1 AS NUM, 'A' AS ID FROM DUAL
UNION
SELECT 2 AS NUM, 'B' AS ID FROM DUAL
UNION
SELECT 3 AS NUM, 'C' AS ID FROM DUAL
UNION
SELECT 4 AS NUM, 'D' AS ID FROM DUAL
)
SELECT ID, SUM(NUM) AS SUMNUM FROM TEMP GROUP BY ID;
I've recently been learning about proc sql's remerge feature. I wanted to test my knowledge of how it works, so I built and ran this small test case.
PROC SQL;
SELECT
ID,
SUM(NUM)
FROM
(
SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.AIR
UNION
SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.AIR
UNION
SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.AIR
UNION
SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.AIR
);
QUIT;
Output:
This result was unexpected. I guessed that the cause of this unexpected output was the 'fake' table.
To test this theory I wrote the virtual table to a dataset and used that dataset to rerun the proc sql query.
PROC SQL;
CREATE TABLE TEMP AS
SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.AIR
UNION
SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.AIR
UNION
SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.AIR
UNION
SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.AIR;
SELECT
ID,
SUM(NUM)
FROM
TEMP;
QUIT;
Output:
This is the output I expected from the first query.
I've exhausted my google-fu and can't find anything that explains why this occurs. This is an edge case that I can't see ever coming up in my development work, but it still bugs me.
Does anyone know why this happens or where I could find more information about the way SAS handles remerging?
SAS system information:
SAS Enterprise Guide 7.12 (7.100.2.3350) (64-bit)
Base SAS Software
Custom version information: 9.4_M3
Image version information: 9.04.01M3P060315
it is answered about this here
https://communities.sas.com/t5/Base-SAS-Programming/PROC-SQL-remerging/td-p/332784
please have a look
Another variation that produces the correct output:
PROC SQL;
CREATE TABLE TEMP1 AS
SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.AIR;
CREATE TABLE TEMP2 AS
SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.AIR;
CREATE TABLE TEMP3 AS
SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.AIR;
CREATE TABLE TEMP4 AS
SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.AIR;
SELECT
ID,
SUM(NUM)
FROM
(
SELECT * FROM TEMP1
UNION
SELECT * FROM TEMP2
UNION
SELECT * FROM TEMP3
UNION
SELECT * FROM TEMP4
)
;
QUIT;
Output:
it is answered about this here
https://communities.sas.com/t5/Base-SAS-Programming/PROC-SQL-remerging/td-p/332784
please have a look
Now consider this:
PROC SQL;
CREATE view TEMP AS
SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.AIR
UNION
SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.AIR
UNION
SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.AIR
UNION
SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.AIR;
SELECT
ID,
SUM(NUM)
FROM
TEMP;
QUIT;
ID ------------ D 10
At least in SAS 9.4 (TS1M1)
You forgot the GROUP BY clause. Your Oracle example had that clause.
SELECT ID, sum(NUM) as TOT
FROM (SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.class(obs=1)
UNION SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.class(obs=1)
UNION SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.class(obs=1)
UNION SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.class(obs=1)
)
group by 1
;
ID TOT ------------ A 1 B 2 C 3 D 4
Without the GROUP BY SAS should have re-merged the SUM() with all of the input records, but there are some bugs in its ability to properly detect that in this case. That is what led to it returning only one record.
@BenNafziger wrote:
I've recently been learning about proc sql's remerge feature. I wanted to test my knowledge of how it works, so I built and ran this small test case.
So, the above discussion should convince you that the quirky behaviour you exposed is not a good example of how remerging is supposed to work. It is a bug which, fortunately, is not met too often in the real world and hopefully will be fixed, someday.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.