<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help with Grouping in PROC SQL in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/644965#M192730</link>
    <description>&lt;P&gt;Thank you.&amp;nbsp; This code has gone through numerous changes based on feedback from various SMEs, particularly with regard to the WHERE statement.&amp;nbsp; I tested your solution, and with some minor tweaks, I was able to get it to provide the results I needed.&amp;nbsp; I had never performed joins on subsets, so this was something I would never have thought of on my own.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 04 May 2020 12:26:21 GMT</pubDate>
    <dc:creator>RandoDando</dc:creator>
    <dc:date>2020-05-04T12:26:21Z</dc:date>
    <item>
      <title>Help with Grouping in PROC SQL</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/638900#M190006</link>
      <description>&lt;P&gt;Hello&lt;/P&gt;&lt;P&gt;I need some help writing a SQL query which will group by a variable while taking the max of something where something is true.&amp;nbsp; Sounds much simpler than it really is, so I'll jump right to my query...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table INSPN_FY as
select distinct
a.HUB_Name,
b.PROP_ID,
b.PROP_NAME,
c.INSPN_FISC_YR,
e.FISC_YR as ASMT_FISC_YR,
c.INSPECTION_ID,
c.INSPECTION_CD,
c.INSPECTION_SCORE
from HUB_TBL a INNER JOIN PROPERTY_TBL b ON a.HUB_ID  = b.HUB_ID 
INNER JOIN 
INSPECTION_TBL c ON c.PROP_ID = b.PROP_ID
LEFT JOIN INSPECTION_ASSESSMENT d ON c.INSPECTION_ID = d.INSPECTION_ID
LEFT JOIN ASSESSMENT_TBL e on d.ASMT_ID  = e.ASMT_ID and d.GRP_ID = e.GRP_ID and d.VER_ID = e.VER_ID
where  e.FISC_YR &amp;lt;&amp;gt; . and 
c.INSPECTION_ID &amp;gt;=500000 and c.INSPECTION_ID &amp;lt;1000000 and
c.INSPECTION_CD in('RTN','IRA', 'VUR', 'AWP', 'VUI')
and c.PROP_ID is not null and
(c.INSPECTION_SCORE is not null OR c.INSPECTION_CD in( 'VUI'))  
and b.PROPERTY_UNIT_CNT &amp;gt; 0 
ORDER BY a.HUB_Name,
b.PROP_ID,
b.PROP_NAME,
c.INSPN_FISC_YR,
e.FISC_YR ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I want to select inspections for properties by assessment FY, where the conditions in the WHERE clause are met.&amp;nbsp; However, some cases have 2 inspections in the same assessment year, one with a VUI status and one released, such as:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;PROPERTY_ID&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;PROPERTY_UNIT_CNT&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;INSPN_FISC_YR&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;ASMT_FISC_YR&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;INSPECTION_ID&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;INSPECTION_CODE&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;XX999999&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;28&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;2016&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;2017&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;550505&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;RTN&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;XX999999&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;28&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2017&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2017&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;690908&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;VUI&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And some have two released, such as:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;PROPERTY_ID&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;PROPERTY_UNIT_CNT&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;INSPN_FISC_YR&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;ASMT_FISC_YR&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;INSPECTION_ID&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;CODE class=" 
language-sas"&gt;INSPECTION_CD&lt;/CODE&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;XX999999&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;74&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2016&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;2017&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;570707&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;RTN&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;XX999999&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;74&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;2017&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;2017&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;680808&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;TD&gt;&lt;P&gt;&lt;STRONG&gt;RTN&lt;/STRONG&gt;&lt;/P&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In the first case, I want the RTN, while in the second case I want the MAX inspection ID.&amp;nbsp; However, if the only inspection for a FY is VUI, I still want it (only omit them if there is a duplicative released).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How can I do this in 1 SQL step?&amp;nbsp; I need it in 1 step because I plan to have this table automated in an ETL.&amp;nbsp; SQL only please.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Apr 2020 11:51:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/638900#M190006</guid>
      <dc:creator>RandoDando</dc:creator>
      <dc:date>2020-04-10T11:51:45Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Grouping in PROC SQL</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/639114#M190061</link>
      <description>&lt;P&gt;A 5-table join is a lot to look at without any data.&lt;/P&gt;
&lt;P&gt;The first angle of attack is probably to change the join to the table&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;INNER JOIN 
INSPECTION_TBL c &lt;/LI-CODE&gt;
&lt;P&gt;to a join to subselect&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;INNER JOIN 
 ( SELECT ... stuff ...
   from INSPECTION_TBL 
   where ... correlation criteria ...
   group by ... 
   having ... criteria for selecting desired row ...
 ) as c&lt;/LI-CODE&gt;
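&lt;P&gt;As a rough sketch only, here is that pattern on made-up data. The two toy tables, their rows, and the rule "keep the highest INSPECTION_ID per property and inspection fiscal year" are assumptions for illustration, not your real selection logic. PROC SQL remerges the MAX back onto the detail rows, so the HAVING clause can compare each row with its group maximum.&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;/* Hypothetical toy data standing in for PROPERTY_TBL and INSPECTION_TBL */
data PROPERTY_TBL;
  input PROP_ID $ PROP_NAME $;
  datalines;
P1 Alpha
P2 Beta
;
data INSPECTION_TBL;
  input PROP_ID $ INSPN_FISC_YR INSPECTION_ID INSPECTION_CD $;
  datalines;
P1 2017 570707 RTN
P1 2017 680808 RTN
P2 2017 690908 VUI
;
proc sql;
  select b.PROP_ID, b.PROP_NAME, c.INSPN_FISC_YR, c.INSPECTION_ID, c.INSPECTION_CD
  from PROPERTY_TBL b
  INNER JOIN
   ( /* assumed rule: one row per property and year, the one with the highest ID */
     select PROP_ID, INSPN_FISC_YR, INSPECTION_ID, INSPECTION_CD
     from INSPECTION_TBL
     group by PROP_ID, INSPN_FISC_YR
     having INSPECTION_ID = max(INSPECTION_ID)
   ) as c ON c.PROP_ID = b.PROP_ID;
quit;&lt;/LI-CODE&gt;
&lt;P&gt;With these rows the subselect keeps 680808 for P1 and 690908 for P2 before the join happens.&lt;/P&gt;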
&lt;P&gt;Complex ordering and selection rules may require the subselect have additional subselects itself.&lt;/P&gt;</description>
      <pubDate>Sat, 11 Apr 2020 04:22:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/639114#M190061</guid>
      <dc:creator>RichardDeVen</dc:creator>
      <dc:date>2020-04-11T04:22:56Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Grouping in PROC SQL</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/639159#M190080</link>
      <description>&lt;P&gt;The first thing I do when I see code like yours is to make it readable by&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Getting rid of the aliases. With a query as large as that, referring to tables as "a", "b", "c" etc. only obfuscates the whole thing. This is easily accomplished by a search and replace in any text editor.&lt;/LI&gt;
&lt;LI&gt;Doing a little bit of indentation.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;After that, your code looks like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table INSPN_FY as select distinct
  HUB_TBL.HUB_Name,
  PROPERTY_TBL.PROP_ID,
  PROPERTY_TBL.PROP_NAME,
  INSPECTION_TBL.INSPN_FISC_YR,
  ASSESSMENT_TBL.FISC_YR as ASMT_FISC_YR,
  INSPECTION_TBL.INSPECTION_ID,
  INSPECTION_TBL.INSPECTION_CD,
  INSPECTION_TBL.INSPECTION_SCORE
from HUB_TBL INNER JOIN 
     PROPERTY_TBL ON HUB_TBL.HUB_ID  = PROPERTY_TBL.HUB_ID INNER JOIN 
     INSPECTION_TBL ON INSPECTION_TBL.PROP_ID = PROPERTY_TBL.PROP_ID LEFT JOIN 
     INSPECTION_ASSESSMENT ON INSPECTION_TBL.INSPECTION_ID = INSPECTION_ASSESSMENT.INSPECTION_ID LEFT JOIN 
     ASSESSMENT_TBL on INSPECTION_ASSESSMENT.ASMT_ID  = ASSESSMENT_TBL.ASMT_ID 
      and INSPECTION_ASSESSMENT.GRP_ID = ASSESSMENT_TBL.GRP_ID 
      and INSPECTION_ASSESSMENT.VER_ID = ASSESSMENT_TBL.VER_ID
where ASSESSMENT_TBL.FISC_YR &amp;lt;&amp;gt; . 
  and INSPECTION_TBL.INSPECTION_ID &amp;gt;=500000 and INSPECTION_TBL.INSPECTION_ID &amp;lt;1000000 
  and INSPECTION_TBL.INSPECTION_CD in('RTN','IRA', 'VUR', 'AWP', 'VUI')
  and INSPECTION_TBL.PROP_ID is not null 
  and (INSPECTION_TBL.INSPECTION_SCORE is not null OR INSPECTION_TBL.INSPECTION_CD in( 'VUI'))  
  and PROPERTY_TBL.PROPERTY_UNIT_CNT &amp;gt; 0 
ORDER BY 
  HUB_TBL.HUB_Name,
  PROPERTY_TBL.PROP_ID,
  PROPERTY_TBL.PROP_NAME,
  INSPECTION_TBL.INSPN_FISC_YR,
  ASSESSMENT_TBL.FISC_YR ;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first thing that becomes obvious is that the LEFT joins are silly: they suggest that you may want rows that have no match in ASSESSMENT_TBL, but your WHERE clause explicitly requires that the assessment fiscal year is not missing. So really, it is all inner joins.&lt;/P&gt;
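&lt;P&gt;A minimal sketch of that effect, using two made-up tables (INSP and ASMT are hypothetical names, not your real tables): the LEFT JOIN on its own keeps the unmatched inspection, and the WHERE condition on the right-hand column throws it away again.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical toy tables, for illustration only */
data INSP;
  input INSPECTION_ID;
  datalines;
1
2
;
data ASMT;
  input INSPECTION_ID FISC_YR;
  datalines;
1 2017
;
proc sql;
  /* LEFT JOIN alone: inspection 2 is kept, with a missing FISC_YR */
  select INSP.INSPECTION_ID, ASMT.FISC_YR
  from INSP LEFT JOIN
       ASMT on INSP.INSPECTION_ID = ASMT.INSPECTION_ID;

  /* Filtering on the right-hand column drops inspection 2 again,
     so the result is the same as with an INNER JOIN */
  select INSP.INSPECTION_ID, ASMT.FISC_YR
  from INSP LEFT JOIN
       ASMT on INSP.INSPECTION_ID = ASMT.INSPECTION_ID
  where ASMT.FISC_YR is not missing;
quit;&lt;/CODE&gt;&lt;/PRE&gt;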
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As you are not using any data from the link table, my next suggestion would be to put the assessment stuff into a subquery:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table INSPN_FY as select distinct
  HUB_TBL.HUB_Name,
  PROPERTY_TBL.PROP_ID,
  PROPERTY_TBL.PROP_NAME,
  INSPECTION_TBL.INSPN_FISC_YR,
  ASSESSMENTS.FISC_YR as ASMT_FISC_YR,
  INSPECTION_TBL.INSPECTION_ID,
  INSPECTION_TBL.INSPECTION_CD,
  INSPECTION_TBL.INSPECTION_SCORE
from HUB_TBL INNER JOIN 
     PROPERTY_TBL ON HUB_TBL.HUB_ID  = PROPERTY_TBL.HUB_ID INNER JOIN 
     INSPECTION_TBL ON INSPECTION_TBL.PROP_ID = PROPERTY_TBL.PROP_ID JOIN 
     (select 
        INSPECTION_ASSESSMENT.INSPECTION_ID,
        ASSESSMENT_TBL.FISC_YR
      from INSPECTION_ASSESSMENT JOIN 
           ASSESSMENT_TBL on INSPECTION_ASSESSMENT.ASMT_ID  = ASSESSMENT_TBL.ASMT_ID 
             and INSPECTION_ASSESSMENT.GRP_ID = ASSESSMENT_TBL.GRP_ID 
             and INSPECTION_ASSESSMENT.VER_ID = ASSESSMENT_TBL.VER_ID
      where ASSESSMENT_TBL.FISC_YR &amp;lt;&amp;gt; . ) ASSESSMENTS on ASSESSMENTS.INSPECTION_ID=INSPECTION_TBL.INSPECTION_ID 
where INSPECTION_TBL.INSPECTION_ID &amp;gt;=500000 and INSPECTION_TBL.INSPECTION_ID &amp;lt;1000000 
  and INSPECTION_TBL.INSPECTION_CD in('RTN','IRA', 'VUR', 'AWP', 'VUI')
  and INSPECTION_TBL.PROP_ID is not null 
  and (INSPECTION_TBL.INSPECTION_SCORE is not null OR INSPECTION_TBL.INSPECTION_CD in( 'VUI'))  
  and PROPERTY_TBL.PROPERTY_UNIT_CNT &amp;gt; 0 
ORDER BY 
  HUB_TBL.HUB_Name,
  PROPERTY_TBL.PROP_ID,
  PROPERTY_TBL.PROP_NAME,
  INSPECTION_TBL.INSPN_FISC_YR,
  ASSESSMENTS.FISC_YR ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now, what you want is the maximum INSPECTION_ID for each property and ASMT_FISC_YR, except that INSPECTION_CD='RTN' takes precedence. If you want to calculate that in a single query, the easiest way is probably to create a dummy variable, which is the inspection ID plus a large number for 'RTN', and just the ID for the others. The offset only has to exceed the largest possible ID; your WHERE clause keeps INSPECTION_ID below 1000000, so 1E8 is ample:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;case INSPECTION_TBL.INSPECTION_CD
  when 'RTN' then INSPECTION_TBL.INSPECTION_ID+1E8
  else INSPECTION_TBL.INSPECTION_ID
end&lt;/CODE&gt;&lt;/PRE&gt;
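&lt;P&gt;To see what the dummy variable does, here is a small sketch on made-up rows modelled on the examples in the question. The table name INSP, the property IDs A, B and C, and the column name SELECT_KEY are assumptions for illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical toy data modelled on the examples in the question */
data INSP;
  input PROP_ID $ ASMT_FISC_YR INSPECTION_ID INSPECTION_CD $;
  datalines;
A 2017 550505 RTN
A 2017 690908 VUI
B 2017 570707 RTN
B 2017 680808 RTN
C 2017 600000 VUI
;
proc sql;
  /* List every row with its selection key, lowest key first within each property */
  select PROP_ID, ASMT_FISC_YR, INSPECTION_ID, INSPECTION_CD,
         case INSPECTION_CD
           when 'RTN' then INSPECTION_ID+1E8
           else INSPECTION_ID
         end as SELECT_KEY format=12.
  from INSP
  order by PROP_ID, SELECT_KEY;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For property A, the RTN row gets the key 100550505 and so ranks above the VUI row (690908), even though its inspection ID is lower. For property B, both rows are RTN, so the higher ID still wins.&lt;/P&gt;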
&lt;P&gt;You can then use that as your criterion for selecting, e.g.:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table INSPN_FY as select distinct
  HUB_TBL.HUB_Name,
  PROPERTY_TBL.PROP_ID,
  PROPERTY_TBL.PROP_NAME,
  INSPECTION_TBL.INSPN_FISC_YR,
  ASSESSMENTS.FISC_YR as ASMT_FISC_YR,
  INSPECTION_TBL.INSPECTION_ID,
  INSPECTION_TBL.INSPECTION_CD,
  INSPECTION_TBL.INSPECTION_SCORE
from HUB_TBL INNER JOIN 
     PROPERTY_TBL ON HUB_TBL.HUB_ID  = PROPERTY_TBL.HUB_ID INNER JOIN 
     INSPECTION_TBL ON INSPECTION_TBL.PROP_ID = PROPERTY_TBL.PROP_ID JOIN 
     (select 
        INSPECTION_ASSESSMENT.INSPECTION_ID,
        ASSESSMENT_TBL.FISC_YR
      from INSPECTION_ASSESSMENT JOIN 
           ASSESSMENT_TBL on INSPECTION_ASSESSMENT.ASMT_ID  = ASSESSMENT_TBL.ASMT_ID 
             and INSPECTION_ASSESSMENT.GRP_ID = ASSESSMENT_TBL.GRP_ID 
             and INSPECTION_ASSESSMENT.VER_ID = ASSESSMENT_TBL.VER_ID
      where ASSESSMENT_TBL.FISC_YR &amp;lt;&amp;gt; . ) ASSESSMENTS on ASSESSMENTS.INSPECTION_ID=INSPECTION_TBL.INSPECTION_ID 
where INSPECTION_TBL.INSPECTION_ID &amp;gt;=500000 and INSPECTION_TBL.INSPECTION_ID &amp;lt;1000000 
  and INSPECTION_TBL.INSPECTION_CD in('RTN','IRA', 'VUR', 'AWP', 'VUI')
  and INSPECTION_TBL.PROP_ID is not null 
  and (INSPECTION_TBL.INSPECTION_SCORE is not null OR INSPECTION_TBL.INSPECTION_CD in( 'VUI'))  
  and PROPERTY_TBL.PROPERTY_UNIT_CNT &amp;gt; 0 
group by 
  PROPERTY_TBL.PROP_ID,
  ASSESSMENTS.FISC_YR
having case INSPECTION_TBL.INSPECTION_CD
  when 'RTN' then INSPECTION_TBL.INSPECTION_ID+1E8
  else INSPECTION_TBL.INSPECTION_ID
  end 
  =max(case INSPECTION_TBL.INSPECTION_CD
        when 'RTN' then INSPECTION_TBL.INSPECTION_ID+1E8
        else INSPECTION_TBL.INSPECTION_ID
       end)
ORDER BY 
  HUB_TBL.HUB_Name,
  PROPERTY_TBL.PROP_ID,
  PROPERTY_TBL.PROP_NAME,
  INSPECTION_TBL.INSPN_FISC_YR,
  ASSESSMENTS.FISC_YR ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
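&lt;P&gt;A reduced sketch of the same selection on made-up data, with the key calculated in an inline view first. The table names INSP and PICKED, the property IDs, and the column name SELECT_KEY are assumptions for illustration; grouping by property and assessment fiscal year is my reading of what you want.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical toy data modelled on the examples in the question */
data INSP;
  input PROP_ID $ ASMT_FISC_YR INSPECTION_ID INSPECTION_CD $;
  datalines;
A 2017 550505 RTN
A 2017 690908 VUI
B 2017 570707 RTN
B 2017 680808 RTN
C 2017 600000 VUI
;
proc sql;
  create table PICKED as
  select PROP_ID, ASMT_FISC_YR, INSPECTION_ID, INSPECTION_CD
  from (/* inner query: calculate the selection key once per row */
        select PROP_ID, ASMT_FISC_YR, INSPECTION_ID, INSPECTION_CD,
               case INSPECTION_CD
                 when 'RTN' then INSPECTION_ID+1E8
                 else INSPECTION_ID
               end as SELECT_KEY
        from INSP)
  /* outer query: keep the row with the highest key in each group */
  group by PROP_ID, ASMT_FISC_YR
  having SELECT_KEY = max(SELECT_KEY)
  order by PROP_ID, ASMT_FISC_YR;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;On these rows PICKED ends up with 550505 (RTN) for A, 680808 (RTN) for B, and 600000 (VUI) for C, since a lone VUI is still the maximum of its group.&lt;/P&gt;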
&lt;P&gt;It may be better, both logically and performance-wise, to calculate the criterion (and everything else) in a subquery, and then apply the GROUP BY, HAVING and ORDER BY to the results of that query. But I leave that as an exercise for the reader. And you may have to add some of the other columns to your GROUP BY.&lt;/P&gt;</description>
      <pubDate>Sat, 11 Apr 2020 16:04:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/639159#M190080</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2020-04-11T16:04:36Z</dc:date>
    </item>
    <item>
      <title>Re: Help with Grouping in PROC SQL</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/644965#M192730</link>
      <description>&lt;P&gt;Thank you.&amp;nbsp; This code has gone through numerous changes based on feedback from various SMEs, particularly with regard to the WHERE statement.&amp;nbsp; I tested your solution, and with some minor tweaks, I was able to get it to provide the results I needed.&amp;nbsp; I had never performed joins on subsets, so this was something I would never have thought of on my own.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 04 May 2020 12:26:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-Grouping-in-PROC-SQL/m-p/644965#M192730</guid>
      <dc:creator>RandoDando</dc:creator>
      <dc:date>2020-05-04T12:26:21Z</dc:date>
    </item>
  </channel>
</rss>

