<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic What's the better way to flag highest and lowest observations using proc sql in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321843#M61976</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to find a way to use proc sql to flag highest or lowest observations similar to the way we do it in data step:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=sashelp.shoes out=shoes;
by region product sales;
run;

data shoes2;
set shoes;
by region product sales;
if first.product then flag = 1;
else if last.product then flag = 2;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I have come up with this solution:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
select *, min(sales) as min, max(sales) as max,
case 
	when sales = calculated min then 1
	when sales = calculated max then 2
	else .
end as flag
from sashelp.shoes
group by region, product
order by region, product, sales;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;It does the job. I'm just wondering if there is a more efficient way to do these kind of flagging?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The proc sql code I have above requires reemerging and I'm not sure if this is the best way to do this task.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone offer any suggestions? thanks.&lt;/P&gt;</description>
    <pubDate>Sat, 31 Dec 2016 19:00:02 GMT</pubDate>
    <dc:creator>kisumsam</dc:creator>
    <dc:date>2016-12-31T19:00:02Z</dc:date>
    <item>
      <title>What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321843#M61976</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to find a way to use proc sql to flag highest or lowest observations similar to the way we do it in data step:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=sashelp.shoes out=shoes;
by region product sales;
run;

data shoes2;
set shoes;
by region product sales;
if first.product then flag = 1;
else if last.product then flag = 2;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I have come up with this solution:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
select *, min(sales) as min, max(sales) as max,
case 
	when sales = calculated min then 1
	when sales = calculated max then 2
	else .
end as flag
from sashelp.shoes
group by region, product
order by region, product, sales;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;It does the job. I'm just wondering if there is a more efficient way to do these kind of flagging?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The proc sql code I have above requires reemerging and I'm not sure if this is the best way to do this task.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can anyone offer any suggestions? thanks.&lt;/P&gt;</description>
      <pubDate>Sat, 31 Dec 2016 19:00:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321843#M61976</guid>
      <dc:creator>kisumsam</dc:creator>
      <dc:date>2016-12-31T19:00:02Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321846#M61977</link>
      <description>&lt;P&gt;This is one of those areas, as you are obviously aware, where a data step is superior: data steps think globally but acts locally; SQL only acts globally.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;By extending your code to use a subjoin (or creating a summarised view) and effectively doing a manual re-merge, the optimiser will&amp;nbsp;&lt;EM&gt;probably&lt;/EM&gt; do a better job. By using the&amp;nbsp;&lt;EM&gt;mag&lt;/EM&gt;&lt;EM&gt;ic&amp;nbsp;&lt;/EM&gt;keyword and value of 103, I've forced it to do a hash join which, on a really big dataset&amp;nbsp;&lt;EM&gt;may&lt;/EM&gt; make things run faster.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql magic=103;
create table firstlast as
   select shoes.*,
          sum.min, 
          sum.max,
          case
             when sales = sum.min 
                then 1
             when sales = sum.max
                then 2
             else .
             end as flag
     from sashelp.shoes as shoes
    inner join (select region,
                       product,
                       min(sales) as min,
                       max(sales) as max
                  from sashelp.shoes
                 group by region,
                          product) as sum
        on shoes.region = sum.region
       and shoes.product = sum.product
     order by shoes.region,
              shoes.product,
              shoes.sales;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 31 Dec 2016 20:39:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321846#M61977</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2016-12-31T20:39:31Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321849#M61978</link>
      <description>&lt;P&gt;Thanks LaurieF. I tried using a similar join as well but the code seems to be overly complicated while not reducing much of the CPU/Real time.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Good to know data step is superior is this case. Thanks!&lt;/P&gt;</description>
      <pubDate>Sat, 31 Dec 2016 21:11:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321849#M61978</guid>
      <dc:creator>kisumsam</dc:creator>
      <dc:date>2016-12-31T21:11:31Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321850#M61979</link>
      <description>&lt;P&gt;Yeah - I thought as much. SAS SQL&amp;nbsp;just&amp;nbsp;doesn't have the facility to do it. I think you can do something&amp;nbsp;&lt;EM&gt;similar&lt;/EM&gt; in Oracle SQL, but it still requires a fair amount of code.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 31 Dec 2016 21:15:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321850#M61979</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2016-12-31T21:15:34Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321862#M61980</link>
      <description>&lt;P&gt;If you have pass through access to your database, the latest versions of SQL (ORACLE/DB2) have either cursor logic or WINDOW/PIVOT technology that helps with this somewhat. It's still more code and I don't find it as clear as a data step.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 01 Jan 2017 00:54:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321862#M61980</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2017-01-01T00:54:23Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321888#M61981</link>
      <description>&lt;P&gt;If it's a big flle and you use the data step approach, I would PROC SUMMARY instead of PROC SORT - much less input/output, ... and it preserves the original record order.&amp;nbsp; Then use a hash lookup to set flag values:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=sashelp.shoes (keep=region product sales) noprint nway;
  class region product;
  output out=need (drop=_type_ _freq_) min(sales)=minsale max(sales)=maxsale;
run;

data shoes2 (drop=rc minsale maxsale);
  set sashelp.shoes end=eod;
  flag=.;
  if _n_=1 then do;
    if 0 then set need;
    declare hash h (dataset:'need' );
      h.definekey('region','product');
      h.definedata('minsale','maxsale');
      h.definedone();
  end;
  rc=h.find();
  if sales=minsale then flag=1;
  else if sales=maxsale then flag=2;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 01 Jan 2017 22:49:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321888#M61981</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-01-01T22:49:23Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321889#M61982</link>
      <description>&lt;P&gt;Oh very good - I do like a good hash table lookup.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bear in mind for this solution, and the SQL method: if two products have the same sales value, the flag will be set incorrectly. The only &lt;EM&gt;easy&lt;/EM&gt; way to be absolutely sure is to use first/last.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And also note: the original point was to find out if there was an SQL method for first/last processing, not whether there was&amp;nbsp;&lt;EM&gt;another&lt;/EM&gt; way of doing it.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jan 2017 02:18:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/321889#M61982</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2017-01-02T02:18:34Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322010#M61989</link>
      <description>&lt;P&gt;I'd like to revise my original suggestion, in light of the possibility of there being multiple sales values for a region/product, by using monotonic() (after a sort!).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql undo_policy=none magic=103;
create table shoes as
   select *
     from sashelp.shoes
    order by region,
             product,
             sales;
create table shoes as
   select *,
          monotonic() as identifier
     from shoes;
create table shoes(drop=identifier) as
   select shoes.*,
          case
             when shoes.identifier = sum.min_id 
                then 1
             when shoes.identifier = sum.max_id
                then 2
             else .
             end as flag length=3
     from shoes
    inner join (select region,
                       product,
                       min(identifier) as min_id,
                       max(identifier) as max_id
                  from shoes
                 group by region,
                          product) as sum
       on shoes.region = sum.region
      and shoes.product = sum.product;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Again, this requires multiple leaps through hoops, and is ultimately unsatisfactory. Data steps FTW, as the young things say.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jan 2017 21:12:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322010#M61989</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2017-01-02T21:12:48Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322027#M61991</link>
      <description>&lt;P&gt;Thanks all for the wonderful suggestions! The hash table looks good but as LaurieF said it might not get the correct results. Looks like the first/last data step method is indeed the best.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 02 Jan 2017 23:48:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322027#M61991</guid>
      <dc:creator>kisumsam</dc:creator>
      <dc:date>2017-01-02T23:48:38Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322064#M61994</link>
      <description>&lt;P&gt;Is the first./last. method as submitted REALLY the correct result?&amp;nbsp;&amp;nbsp;Do you&amp;nbsp;mean you want to ignore ties at the minimum or maximum, and flag only one record for each?&amp;nbsp; If so, what rule should be used to select which of the tied records gets flagged?&amp;nbsp; The usual&amp;nbsp; PROC SORT (with default option EQUALS) result would preserve original order among ties.&amp;nbsp; So the first MIN and last MAX would be flagged.&amp;nbsp; Is that the desired behavior?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, to be more general, you might want&amp;nbsp;two flag variables instead of one.&amp;nbsp; That would allow you to accomoate cases in which there is only one record in a group&amp;nbsp; (i.e it's both min and max).&amp;nbsp; The&amp;nbsp;first./last. program assigns a flag=1 and never assigns a flag=2&amp;nbsp;for&amp;nbsp;such a&amp;nbsp;case.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2017 05:37:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322064#M61994</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-01-03T05:37:19Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322238#M61996</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;mkeintz. Yes, I was aware that the program isn't ideal when there is only 1 observation in the BY group. I just wanted to find out if there is a more efficient way to achieve&amp;nbsp;the same goal with proc sql.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2017 20:17:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322238#M61996</guid>
      <dc:creator>kisumsam</dc:creator>
      <dc:date>2017-01-03T20:17:55Z</dc:date>
    </item>
    <item>
      <title>Re: What's the better way to flag highest and lowest observations using proc sql</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322270#M61997</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/17429"&gt;@LaurieF&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm confused.&amp;nbsp; I do not understand why it is that "if two products have the same sales value, the flag will be set incorrectly".&amp;nbsp; The hash object proposal generates independent min's and max's for each product, regardless of whether they are identically distributed.&amp;nbsp; Aside from the issue of whether to flag ties at max or at min WITHIN a region/product, I don't see how it would be different than the PROC-SORT-FOLLOWED-BY FIRST./LAST. technique.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2017 22:39:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/What-s-the-better-way-to-flag-highest-and-lowest-observations/m-p/322270#M61997</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-01-03T22:39:24Z</dc:date>
    </item>
  </channel>
</rss>

