DATA Step, Macro, Functions and more

Proc sql remerge of inline virtual view behaves unexpectedly

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Proc sql remerge of inline virtual view behaves unexpectedly

When working with Oracle SQL I will often use a tiny virtual dataset to test some idea or function out.
Ex:
WITH TEMP AS (
    SELECT 1 AS NUM, 'A' AS ID FROM DUAL
        UNION
    SELECT 2 AS NUM, 'B' AS ID FROM DUAL
        UNION
    SELECT 3 AS NUM, 'C' AS ID FROM DUAL
        UNION
    SELECT 4 AS NUM, 'D' AS ID FROM DUAL
)
SELECT ID, SUM(NUM) AS SUMNUM FROM TEMP GROUP BY ID;

I've recently been learning about proc sql's remerge feature. I wanted to test my knowledge of how it works, so I built and ran this small test case.

 

PROC SQL;
SELECT 
	ID,
	SUM(NUM)
FROM
(
	SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.AIR
		UNION
	SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.AIR
		UNION
	SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.AIR
		UNION
	SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.AIR
);
QUIT; 

 

Output:

image.png

 

 

 

 

This result was unexpected. I guessed that the cause of this unexpected output was the 'fake' table.
To test this theory I wrote the virtual table to a dataset and used that dataset to rerun the proc sql query.

PROC SQL;
	CREATE TABLE TEMP AS
		SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.AIR
			UNION
		SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.AIR
			UNION
		SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.AIR
			UNION
		SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.AIR;

	SELECT 
		ID,
		SUM(NUM)
	FROM
		TEMP;
QUIT;

 Output:

image.png

 

 

 

 

 

 

This is the output I expected from the first query.

I've exhausted my google-fu and can't find anything that explains why this occurs. This is an edge case that I can't see ever coming up in my development work, but it still bugs me.

Does anyone know why this happens or where I could find more information about the way SAS handles remerging?

SAS system information:
SAS Enterprise Guide 7.12 (7.100.2.3350) (64-bit)
Base SAS Software
Custom version information: 9.4_M3
Image version information: 9.04.01M3P060315 


Accepted Solutions
Solution
‎11-13-2017 10:47 AM
PROC Star
Posts: 549

Re: Proc sql remerge of inline virtual view behaves unexpectedly

Posted in reply to BenNafziger

All Replies
New Contributor
Posts: 3

Re: Proc sql remerge of inline virtual view behaves unexpectedly

[ Edited ]
Posted in reply to BenNafziger

Another variation that produces the correct output:

PROC SQL;
	CREATE TABLE TEMP1 AS
		SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.AIR;
CREATE TABLE TEMP2 AS SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.AIR;
CREATE TABLE TEMP3 AS SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.AIR;
CREATE TABLE TEMP4 AS SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.AIR; SELECT ID, SUM(NUM) FROM ( SELECT * FROM TEMP1 UNION SELECT * FROM TEMP2 UNION SELECT * FROM TEMP3 UNION SELECT * FROM TEMP4 ) ; QUIT;

Output:

image.png 

 

Solution
‎11-13-2017 10:47 AM
PROC Star
Posts: 549

Re: Proc sql remerge of inline virtual view behaves unexpectedly

Posted in reply to BenNafziger

it is answered about this here

 

https://communities.sas.com/t5/Base-SAS-Programming/PROC-SQL-remerging/td-p/332784

 

please have a look

Esteemed Advisor
Posts: 5,626

Re: Proc sql remerge of inline virtual view behaves unexpectedly

[ Edited ]
Posted in reply to BenNafziger

Now consider this:

 

PROC SQL;
	CREATE view TEMP AS
		SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.AIR
			UNION
		SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.AIR
			UNION
		SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.AIR
			UNION
		SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.AIR;

	SELECT 
		ID,
		SUM(NUM)
	FROM
		TEMP;
QUIT;
                                    ID
                                    ------------
                                    D         10

 

At least in SAS 9.4 (TS1M1)

 

PG
Super User
Super User
Posts: 8,287

Re: Proc sql remerge of inline virtual view behaves unexpectedly

Posted in reply to BenNafziger

You forgot the GROUP BY clause.  Your Oracle example had that clause.

SELECT ID, sum(NUM) as TOT
FROM (SELECT 1 AS NUM, 'A' AS ID FROM SASHELP.class(obs=1)
UNION SELECT 2 AS NUM, 'B' AS ID FROM SASHELP.class(obs=1)
UNION SELECT 3 AS NUM, 'C' AS ID FROM SASHELP.class(obs=1)
UNION SELECT 4 AS NUM, 'D' AS ID FROM SASHELP.class(obs=1)
     )
group by 1
;

 

ID       TOT
------------
A          1
B          2
C          3
D          4

Without the GROUP BY SAS should have re-merged the SUM() with all of the input records, but there are some bugs in its ability to properly detect that in this case. That is what led to it returning only one record.

Esteemed Advisor
Posts: 5,626

Re: Proc sql remerge of inline virtual view behaves unexpectedly

Posted in reply to BenNafziger

BenNafziger wrote:


I've recently been learning about proc sql's remerge feature. I wanted to test my knowledge of how it works, so I built and ran this small test case.


So, the above discussion should convince you that the quirky behaviour you exposed is not a good example of how remerging is supposed to work. It is a bug which, fortunately, is not met too often in the real world and hopefully will be fixed, someday.

PG
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 163 views
  • 6 likes
  • 4 in conversation