Help using Base SAS procedures

Data Remerging in PROC SQL

Reply
Frequent Contributor
Posts: 88

Data Remerging in PROC SQL

Hello Folks,

  I have a quick question on  Data Remerging  in PROC SQL.

I was going through the Book SAS Certification Prep Guide: Advanced Programming for SAS 9, Fourth Edition    and  in   Chapter 2 (Performing Advanced Queries with PROC SQL)  under the topic Understanding Data Remerging


The book says one of the condition under which Data Remerging will occur in PROC SQL is the following :

The HAVING clause specifies one or more columns or column expressions that are not included in a subquery or a GROUP BY clause.


The same Condition is also provided here under the section : Remerging Summary Statistics

SAS(R) 9.3 SQL Procedure User's Guide



I am trying to understand the above  condition.  I understand the Data remerging  when the HAVING Clause has a Column/Expression which is not in the  GROUP BY  Clause

For eg  , the below PROC SQL will show Data Remerging because the HAVING Clause has a column Salary which is NOT in the GROUP BY Clause  assuming the dataset  sasuser.payrollmaster  has columns  JobCode  and Salary

proc sql;

       select jobcode, avg( salary) as AvgSalary format = dollar11.2

       from sasuser.payrollmaster

       group by JobCode

      having salary > avg( salary);

Quit;

But unable to understand how  the Data remerging occurs when the Column/Expression is not present in a Subquery.

Can someone  help me with a PROC SQL  example which shows Data remerging occuring if the HAVING clause has a Column/Expression which is NOT present in a Sub query.

Thanks,

Prashant

Super User
Posts: 18,589

Re: Data Remerging in PROC SQL

Although I think this illustrates the issue, in practice it usually occurs when the variable is in the SELECT statement and not summarized or used in a HAVING/GROUP BY that you'll get the remerge warning.

proc sql;

title 'Remerging example';

select sex,  mean(weight) as weight

from sashelp.class

group by sex

having age>12;

title 'Values from original data, no HAVE clause';

select sex, age, mean(weight) as weight

from sashelp.class

group by sex;

title 'Not equivalent to WHERE';

select sex, age, mean(weight) as weight

from sashelp.class

where age>12

group by sex;

quit;

Frequent Contributor
Posts: 88

Re: Data Remerging in PROC SQL

Thanks Reeza. 

According to the Advanced SAS  Prep Guide   Data  Remerging occurs  in PROC SQL whenever any of the following conditions exist:

  1. The values returned by a summary function are used in a calculation.
  2. The SELECT clause specifies a column that contains a summary function and other column( s) that are not listed in a GROUP BY clause.
  3. The HAVING clause specifies one or more columns or column expressions that are not included in a subquery or a GROUP BY clause.

The Examples you  provided do not involve any Sub queries  but  includes  Scenario  2 above  3.

Is there a scenario where the Subquery in HAVING Clause causes Data Remerging  as mentioned by scenario 3 above?

Super User
Posts: 18,589

Re: Data Remerging in PROC SQL

Where does the first example fall under?

Frequent Contributor
Posts: 88

Re: Data Remerging in PROC SQL

Reeza,

Your First Example falls under Scenario  3  above where the HAVING Clause has a Column or Expression not in the GROUP BY Clause.

I was looking for a  PROC SQL  Example which involves a Sub query  and HAVING Clause  in Scenario 3  which causes  Data Remerging.

Ask a Question
Discussion stats
  • 4 replies
  • 596 views
  • 0 likes
  • 2 in conversation