<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Top N by sum in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569620#M160549</link>
    <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have nway;
    class region;
    var sales;
    output out=total_sales sum=total_sales;
run;
proc sort data=have;
    by region descending sales;
run;
data have2;
    merge have total_sales;
    by region;
    if first.region then cumulative_percent=0;
    percent_of_sales = sales/total_sales;
    cumulative_percent + percent_of_sales;
    start_of_grouping = cumulative_percent - percent_of_sales;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 27 Jun 2019 20:21:24 GMT</pubDate>
    <dc:creator>PaigeMiller</dc:creator>
    <dc:date>2019-06-27T20:21:24Z</dc:date>
    <item>
      <title>Top N by sum</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569602#M160541</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am trying to create a flag for the top N values in my dataset by region.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Right now I have the total amount of sales for each account and I need to find the number of accounts that make up the top 25% and top 50% of sales for the region they are in. The output I am looking for would be the dataset with a flag on the end that says whether or no this account is part of the top 25% of sales, top 50% of sales or neither.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When I say this I do not mean that this account accounts for 25% of the sales itself, rather this account and the subsequent accounts make up 25%.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried sorting by region and sales which gives me a list that looks like this:&lt;/P&gt;&lt;P&gt;Region1 Account1 100&lt;/P&gt;&lt;P&gt;Region1 Account2 75&lt;/P&gt;&lt;P&gt;Region1 Account3 12&lt;/P&gt;&lt;P&gt;...&lt;/P&gt;&lt;P&gt;I then found the % of sales each account is responsible for in their region. Now I am stuck on aggregating the percents from top to bottom&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2019 19:39:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569602#M160541</guid>
      <dc:creator>alexgouv</dc:creator>
      <dc:date>2019-06-27T19:39:14Z</dc:date>
    </item>
    <item>
      <title>Re: Top N by sum</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569606#M160543</link>
      <description>&lt;P&gt;I'm trying to think my way through this idea of "Top N by Sum" and its not really clear to me exactly how this would work.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I suppose you want to sort by descending sales, as you have shown, with the largest sales account first, but then what happens if the top account sales are &amp;gt;25%? What happens if the top account sales is greater than 50%? What happens if an account's sales moves the sum from 48% to 54%, what group does it fit in?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can you give a specific example, showing inputs and the desired outputs?&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2019 19:44:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569606#M160543</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-27T19:44:38Z</dc:date>
    </item>
    <item>
      <title>Re: Top N by sum</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569608#M160544</link>
      <description>&lt;P&gt;Hi basically whatever account puts it over the top would be classified in that group.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So if account 1 is 76% of the sales this would be the only account in top 25%, top 50% and top 75%.&amp;nbsp;&lt;/P&gt;&lt;P&gt;If the top account is 24% and the next one is 3% they would both be considered to the top 25%.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2019 19:47:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569608#M160544</guid>
      <dc:creator>alexgouv</dc:creator>
      <dc:date>2019-06-27T19:47:13Z</dc:date>
    </item>
    <item>
      <title>Re: Top N by sum</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569613#M160546</link>
      <description>&lt;P&gt;The first step would be to get the cumulative percents ... you would sort by descending sales, divide by total sales to get non-cumulative percent, then get the cumulative percents and do a grouping.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Something like this UNTESTED CODE:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have;
    var sales;
    output out=total_sales sum=total_sales;
run;
proc sort data=have;
    by descending sales;
run;
data have2;
    if _n_=1 then set total_sales;
    set have;
    percent_of_sales = sales/total_sales;
    cumulative_percent + percent_of_sales;
    start_of_grouping = cumulative_percent - percent_of_sales;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;So, if &lt;FONT face="courier new,courier"&gt;start_of_grouping&lt;/FONT&gt; is less than 25, it is in the &amp;lt;25 group. If &lt;FONT face="courier new,courier"&gt;start_of_grouping&lt;/FONT&gt; is less than 50, it is in the &amp;lt;50 group. And so on.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2019 20:01:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569613#M160546</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-27T20:01:52Z</dc:date>
    </item>
    <item>
      <title>Re: Top N by sum</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569617#M160548</link>
      <description>&lt;P&gt;Only issue is that its not doing it by region. This is the same issue I was having&lt;/P&gt;</description>
      <pubDate>Thu, 27 Jun 2019 20:19:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569617#M160548</guid>
      <dc:creator>alexgouv</dc:creator>
      <dc:date>2019-06-27T20:19:00Z</dc:date>
    </item>
    <item>
      <title>Re: Top N by sum</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569620#M160549</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have nway;
    class region;
    var sales;
    output out=total_sales sum=total_sales;
run;
proc sort data=have;
    by region descending sales;
run;
data have2;
    merge have total_sales;
    by region;
    if first.region then cumulative_percent=0;
    percent_of_sales = sales/total_sales;
    cumulative_percent + percent_of_sales;
    start_of_grouping = cumulative_percent - percent_of_sales;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 27 Jun 2019 20:21:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Top-N-by-sum/m-p/569620#M160549</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-06-27T20:21:24Z</dc:date>
    </item>
  </channel>
</rss>

