<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PROC SQL remerging in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332868#M74962</link>
    <description>&lt;PRE&gt;
Did you try :

,sum(a.max_x) as blah
   , b.max_y 

Better feedback it to sas support.


&lt;/PRE&gt;</description>
    <pubDate>Wed, 15 Feb 2017 02:23:06 GMT</pubDate>
    <dc:creator>Ksharp</dc:creator>
    <dc:date>2017-02-15T02:23:06Z</dc:date>
    <item>
      <title>PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332784#M74943</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Below is a PROC SQL query that is surprising me.&amp;nbsp; It has a many-to-one join of two subqueries a and b, and then uses a group by clause to aggregate the result.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;
  input id1 id2 x y ;
  cards ;
1 1 10 100
1 2 20 200
2 1 30 300
2 2 40 400
 ;
quit ;

proc sql ;
  select
    a.id1
   ,sum(max_x) as blah
   ,max_y  /*this causes remerging in 9.4 but not 9.3! */  /*if change to max_y as max_y in 9.4, it won't remerge! */
  from
    (select id1,max(x) as max_x
     from have
     group by id1,id2
     ) as a
  ,(select id1,max(y) as max_y
     from have
     group by id1
     ) as b
  where a.id1=b.id1
  group by a.id1
   ;
quit ;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Observations:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;When I run the code in 9.3, it does not trigger remerging, and the output is two rows.&lt;/LI&gt;
&lt;LI&gt;When I run the code in 9.4M3, it does trigger remerging, and the output is four rows.&lt;/LI&gt;
&lt;LI&gt;When I run the code in 9.4M3, if I change the first select statement from ",max_y" to ",max_y as max_y" it does NOT trigger remerging (and output is two rows, like 9.3).&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;I think I can convince myself that returning four rows is correct (since there is no excplicit aggregate function, even though the data were already aggregated on the second subquery).&amp;nbsp; Would that suggest that this was a bug in 9.3 not to remerge?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I can't imagine why adding "as max_y" would cause 9.4M3 to decide NOT to remerge.&amp;nbsp; But it makes me wonder if in SAS SQL not remerging is actually the "correct" behavior.&amp;nbsp; i.e., is the SAS SQL compiler supposed to be smart enough to see that the max_y has already been aggregated to the appropriate level?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;--Q.&lt;/P&gt;</description>
      <pubDate>Tue, 14 Feb 2017 20:37:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332784#M74943</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2017-02-14T20:37:53Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332802#M74945</link>
      <description>&lt;P&gt;Interesting. I think it should always remerge. Anyhow, these two operations should give identical results:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;
  input id1 id2 x y ;
  cards ;
1 1 10 100
1 2 20 200
2 1 30 300
2 2 40 400
 ;

proc sql;

title "Views";
create view a as
select id1, max(x) as max_x
        from have
        group by id1, id2;
create view b as
select id1, max(y) as max_y
        from have
        group by id1;
select
    a.id1
   ,sum(max_x) as blah
   ,max_y as max_y
from a inner join b on a.id1 = b.id1
group by a.id1;

title "Tables";
create table c as
select id1, max(x) as max_x
        from have
        group by id1, id2;
create table d as
select id1, max(y) as max_y
        from have
        group by id1;
select
    c.id1
   ,sum(max_x) as blah
   ,max_y as max_y
from c inner join d on c.id1 = d.id1
group by c.id1;

quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;they don't (in SAS/SQL version 9.4)&lt;/P&gt;</description>
      <pubDate>Tue, 14 Feb 2017 21:29:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332802#M74945</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2017-02-14T21:29:42Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332868#M74962</link>
      <description>&lt;PRE&gt;
Did you try :

,sum(a.max_x) as blah
   , b.max_y 

Better feedback it to sas support.


&lt;/PRE&gt;</description>
      <pubDate>Wed, 15 Feb 2017 02:23:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332868#M74962</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-02-15T02:23:06Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332928#M74981</link>
      <description>I think the _TREE procedure option might reveal the different query structures.&lt;BR /&gt;</description>
      <pubDate>Wed, 15 Feb 2017 09:22:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/332928#M74981</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2017-02-15T09:22:32Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/333090#M75013</link>
      <description>&lt;P&gt;Thanks all.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Yes &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15174"&gt;@Peter_C&lt;/a&gt;&amp;nbsp;_TREE showed differences (in particular the value of FLAG was different for max_y) but unfortunately I'm not adept at interpretting tree. In any case, it was helpful evidence that the SQL optimizer is making different decisisosn.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18408"&gt;@Ksharp&lt;/a&gt;&amp;nbsp;, changing to b.max_y did not change results.&amp;nbsp; I will definitely send it in to tech support.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&amp;nbsp;thanks for adding your very helpful examples of using views vs tables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Below I have 6 queries.&amp;nbsp; Three approaches (subqueries, views, tables ) each with "select Max_Y" and "select Max_Y as Max_Y".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In 9.3 the results are consistent (though not what &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&amp;nbsp;expects).&amp;nbsp; With the subquery and view approach, there is no remerging.&amp;nbsp; Witht the tables approach&amp;nbsp;there is remerging.&amp;nbsp; This is plausible to me, if the SQL optimizer is "smart" enough to look into the subqueries and views and see that Max_Y has already been aggregated to the appropriate level, so does not need to be remerged.&amp;nbsp; In 9.3, this is true regardless of using "select Max_Y or "select Max_Y as Max_Y".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In 9.4M3, if you use &amp;nbsp;"select Max_Y as Max_Y" the results match 9.3 (i.e. subquery approach and view approach do NOT remerge).&amp;nbsp; But in 9.4M3, if you use "select Max_Y" the subquery approach and view approach do remerge.&amp;nbsp; I can't see an argument for why using "as" should change the results of the query, so I feel like this is a bug in 9.4.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My questions for tech support will be&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1.&amp;nbsp; "Which of these queries *should* remerge?"&amp;nbsp; Here I expect they will say use of subqueries and views should not remerge, but tables should remerge.&amp;nbsp; Even though this may be undesired from perspective of subqueries/views vs tables behaving differently, I think maybe it's fair because the whole remerging concept is non-standard SQL&amp;nbsp;anyway, so SAS gets to decide the rules.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2.&amp;nbsp; "In 9.4M3, why would use of 'as' change the behavior?" Here I hope their answer will be that this is a bug in 9.4&amp;nbsp; That is, I hope they wil say that 9.4 behavior should be same as 9.3.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Test code:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql  ; *_tree;

title1 "1. Subqueries with max_y as max_y";
title2 "9.4M3 No Remerging" ;
select
  a.id1
 ,sum(max_x) as blah
 ,b.max_y as max_y
from
  (select id1,max(x) as max_x
   from have
   group by id1,id2
   ) as a
inner join (select id1,max(y) as max_y
   from have
   group by id1
   ) as b
on a.id1=b.id1
group by a.id1
 ;

title1 "2. Views with max_y as max_y";
title2 "9.4M3 No Remerging" ;
create view a as
select id1, max(x) as max_x
        from have
        group by id1, id2;
create view b as
select id1, max(y) as max_y
        from have
        group by id1;
select
    a.id1
   ,sum(max_x) as blah
   ,b.max_y as max_y
from a inner join b on a.id1 = b.id1
group by a.id1;

title1 "3. Tables with max_y as max_y";
title2 "9.4M3 Remerging" ;
create table c as
select id1, max(x) as max_x
        from have
        group by id1, id2;
create table d as
select id1, max(y) as max_y
        from have
        group by id1;
select
    c.id1
   ,sum(max_x) as blah
   ,max_y as max_y
from c inner join d on c.id1 = d.id1
group by c.id1;


title1 "4. Subqueries with max_y (no as)";
title2 "9.4M3 Remerging" ;
select
  a.id1
 ,sum(max_x) as blah
 ,b.max_y
from
  (select id1,max(x) as max_x
   from have
   group by id1,id2
   ) as a
inner join (select id1,max(y) as max_y
   from have
   group by id1
   ) as b
on a.id1=b.id1
group by a.id1
 ;

title1 "5. Views with max_y (no as)";
title2 "9.4M3 Remerging" ;
create view a as
select id1, max(x) as max_x
        from have
        group by id1, id2;
create view b as
select id1, max(y) as max_y
        from have
        group by id1;
select
    a.id1
   ,sum(max_x) as blah
   ,b.max_y 
from a inner join b on a.id1 = b.id1
group by a.id1;

title1 "6. Tables with max_y (no as)";
title2 "9.4M3 Remerging" ;
create table c as
select id1, max(x) as max_x
        from have
        group by id1, id2;
create table d as
select id1, max(y) as max_y
        from have
        group by id1;
select
    c.id1
   ,sum(max_x) as blah
   ,max_y as max_y
from c inner join d on c.id1 = d.id1
group by c.id1;


quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 15 Feb 2017 17:31:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/333090#M75013</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2017-02-15T17:31:24Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/333100#M75017</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/19879"&gt;@Quentin&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please keep us up to date on SAS's response to your questions.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Feb 2017 18:12:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/333100#M75017</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-02-15T18:12:01Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/334419#M75521</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I heard back from tech support.&amp;nbsp; My expectation/hope was incorrect; they agreed with the expectations of &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Per tech support, all six of the scenarios I posted should remerge.&amp;nbsp; So regardless of whether using subqueries / views / interim tables, and regardless of whether or not you use an alias, it should remerge.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The remerging rule is straight-forward: "This note is generated when a column, listed on the select clause, is not being used in a summary function and the column is not listed on the GROUP BY clause."&amp;nbsp; &lt;A href="http://support.sas.com/techsup/notes/v8/4/308.html" target="_blank"&gt;http://support.sas.com/techsup/notes/v8/4/308.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So there&amp;nbsp;is a bug&amp;nbsp; in 9.3 that the subquery and view approach do not remerge.&amp;nbsp; This bug was fixed in 9.4, so that they do remerge.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is a different bug in 9.4 such that if&amp;nbsp;you use an alias ("as") on the SELECT clause, or a format, it will not remerge.&amp;nbsp; They will add a TS note about this bug.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks all for the help in understanding what is happening.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;--Q.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Feb 2017 18:39:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/334419#M75521</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2017-02-20T18:39:27Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/334446#M75528</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/19879"&gt;@Quentin&lt;/a&gt;&amp;nbsp;for carrying this through. Remerging is a very useful feature of SAS/SQL, it should be done flawlessly.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 20 Feb 2017 19:33:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/334446#M75528</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2017-02-20T19:33:41Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL remerging</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/334489#M75538</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/19879"&gt;@Quentin&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I too appreciate you pasing along SAS's&amp;nbsp; answer to this question.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Feb 2017 22:27:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-remerging/m-p/334489#M75538</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-02-20T22:27:23Z</dc:date>
    </item>
  </channel>
</rss>

