<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Help with grouped results in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669713#M200929</link>
    <description>&lt;P&gt;I thought I was clear.&amp;nbsp; The group is the field Countries.&amp;nbsp; I want the output dataset to be filtered by the field called Countries, using the other field Bluneconomic to determine the highest and lowest values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the supplied data set, above, Countries has two values 19 and 20.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let's say I want the top and bottom 2 results.&amp;nbsp; So the output dataset would have 8 rows: the top 2 Bluneconomic values where Countries = 19, the top 2 Bluneconomic values where Countries = 20, the bottom 2 Bluneconomic values where Countries = 19, and&amp;nbsp;the bottom 2 Bluneconomic values where Countries = 20.&lt;/P&gt;</description>
    <pubDate>Wed, 15 Jul 2020 20:41:00 GMT</pubDate>
    <dc:creator>texasmfp</dc:creator>
    <dc:date>2020-07-15T20:41:00Z</dc:date>
    <item>
      <title>Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669686#M200920</link>
      <description>&lt;P&gt;I have a large dataset that is too big to open and filter within SAS or SAS Enterprise Guide.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My current code spits out the top 25k results, previously sorted by Field 1:&lt;/P&gt;
&lt;P&gt;data RESULTS.&amp;amp;top_set;&lt;BR /&gt;set top_set (obs=25000);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is repeated after sorting Field 1 in descending order:&lt;/P&gt;
&lt;P&gt;data RESULTS.&amp;amp;bottom_set;&lt;BR /&gt;set bottom_set (obs=25000);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Combined, yields the top/bottom results:&lt;/P&gt;
&lt;P&gt;data top_bottom;&lt;BR /&gt;set RESULTS.&amp;amp;top_set RESULTS.&amp;amp;bottom_set;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What I want to generate is the top 1,000 and bottom 1,000 results, but for each group.&lt;/P&gt;
&lt;P&gt;There is another field "Countries", which captures the # of countries used to create the result in FIELD 1.&lt;/P&gt;
&lt;P&gt;So, what I really want in the output is the top and bottom results from within each unique Countries value.&lt;/P&gt;
&lt;P&gt;So, the top and bottom for Countries = 57, plus the top and bottom for Countries = 56, the top and bottom for Countries = 55, etc.&lt;/P&gt;
&lt;P&gt;For each generation of runs, the range of Countries varies.&amp;nbsp; So, it needs to be scalable rather than hard-coded for a specific list of Countries values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jul 2020 19:57:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669686#M200920</guid>
      <dc:creator>texasmfp</dc:creator>
      <dc:date>2020-07-15T19:57:19Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669689#M200921</link>
      <description>&lt;P&gt;How big are the data sets?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can you show an example of your input and expected output?&lt;BR /&gt;Or can you illustrate what you want as output using sashelp.heart or sashelp.cars as your input data set.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jul 2020 20:00:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669689#M200921</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-07-15T20:00:03Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669696#M200923</link>
      <description>&lt;P&gt;The input and output are SAS datasets and have identical structure with just two fields.&amp;nbsp; What I want is results that have been filtered for the top x and bottom x within each Countries value.&amp;nbsp; Here is an example of the data.&amp;nbsp; In this limited dataset, there are only 2 values in the Countries field (19, 20).&amp;nbsp; The full database has tens of millions of rows and has Countries values that range from 5 to 100:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;bluneconomic	Countries
-0.695725305	19
-0.695386876	19
-0.695117345	19
-0.694779503	20
-0.693231578	19
-0.692606254	20
-0.692243075	20
-0.691876134	20
-0.691733435	19
-0.691376988	19
-0.69006756	20
-0.690016635	19
-0.689842044	20
-0.689760414	19
-0.689543252	19
-0.689053336	20
-0.689001225	19
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 15 Jul 2020 20:11:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669696#M200923</guid>
      <dc:creator>texasmfp</dc:creator>
      <dc:date>2020-07-15T20:11:12Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669704#M200925</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ods select none;
ods output ExtremeValues=want;
proc univariate data=yourData nextrval=10;
class countries;
   var bluneconomic;
run;
ods select all;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;H3 class="xis-title"&gt;&lt;FONT size="4"&gt;Note nextrval is your 'x'&lt;/FONT&gt;&lt;/H3&gt;
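&lt;P&gt;As a quick check that the ODS OUTPUT statement above leaves an ordinary SAS data set behind, the following sketch lists its variables and prints a few rows. It assumes the step above has already run and created WORK.WANT; the obs=10 limit is only an illustrative choice.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* assumes WORK.WANT was created by the ODS OUTPUT statement above */
proc contents data=want;
run;

/* print the first 10 rows; 10 is an arbitrary illustrative limit */
proc print data=want(obs=10);
run;&lt;/CODE&gt;&lt;/PRE&gt;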
&lt;H3 class="xis-title"&gt;&lt;FONT size="4"&gt;Example 4.3 Identifying Extreme Observations and Extreme Values&lt;/FONT&gt;&lt;/H3&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/?docsetId=procstat&amp;amp;docsetTarget=procstat_univariate_examples03.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en"&gt;https://documentation.sas.com/?docsetId=procstat&amp;amp;docsetTarget=procstat_univariate_examples03.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jul 2020 20:17:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669704#M200925</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-07-15T20:17:28Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669706#M200926</link>
      <description>&lt;P&gt;Thanks Reeza, but I do not want ODS output.&amp;nbsp; I want a SAS datafile.&amp;nbsp; Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jul 2020 20:22:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669706#M200926</guid>
      <dc:creator>texasmfp</dc:creator>
      <dc:date>2020-07-15T20:22:47Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669710#M200928</link>
      <description>&lt;P&gt;So, how do we know what constitutes a "group"? A variable? A combination of variables? Telepathy? Please provide a clearer definition, ideally in terms of actual variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Will you state that every group has at least 2000 records? If not, you need to describe what you want for smaller groups, as there will be overlaps between upper and lower. And if you ever have fewer than 1000 records in a group, then you have yet another case whose desired behavior needs to be described.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;For each generation of runs, the range of Countries varies. So, it needs to be scalable rather than hard-coded for a specific list of Countries values.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Likely the easier part of this.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is one approach, but it does not attempt to handle groups with too few records. It assumes that each level of the variable Sex defines a "group".&lt;/P&gt;
&lt;PRE&gt;proc sort data=sashelp.class out=work.class;
   by sex;
run;

proc summary data=work.class nway;
   class sex ;
   output out=work.sexcount(drop=_type_);
run;

data temp;
  merge work.class
        work.sexcount
  ;
  by sex;
  /* restart the counter at the start of each group; the sum statement treats missing as 0 */
  if first.sex then counter=.;
  counter+1;
  /* picking top/bottom 3 when _freq_ is at least 6 */
  if counter le 3 or (_freq_-counter) le 2;
run;&lt;/PRE&gt;
&lt;P&gt;If you really know that you have at least 2000 records in each "group" that last line would be&lt;/P&gt;
&lt;P&gt;if counter le 1000 or (_freq_-counter) le 999 ;&lt;/P&gt;
</description>
      <pubDate>Wed, 15 Jul 2020 20:32:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669710#M200928</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-07-15T20:32:21Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669713#M200929</link>
      <description>&lt;P&gt;I thought I was clear.&amp;nbsp; The group is the field Countries.&amp;nbsp; I want the output dataset to be filtered by the field called Countries, using the other field Bluneconomic to determine the highest and lowest values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In the supplied data set, above, Countries has two values 19 and 20.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let's say I want the top and bottom 2 results.&amp;nbsp; So the output dataset would have 8 rows: the top 2 Bluneconomic values where Countries = 19, the top 2 Bluneconomic values where Countries = 20, the bottom 2 Bluneconomic values where Countries = 19, and&amp;nbsp;the bottom 2 Bluneconomic values where Countries = 20.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Jul 2020 20:41:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669713#M200929</guid>
      <dc:creator>texasmfp</dc:creator>
      <dc:date>2020-07-15T20:41:00Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669719#M200932</link>
      <description>You're aware that ODS OUTPUT creates a SAS data set? Did you try it and it didn't work?&lt;BR /&gt;&lt;BR /&gt;If so, what is a data file?</description>
      <pubDate>Wed, 15 Jul 2020 20:58:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669719#M200932</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-07-15T20:58:43Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669758#M200950</link>
      <description>&lt;P&gt;Your original post does not include the word "Bluneconomic".&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;I have a large dataset that is too big to open and filter within SAS or SAS Enterprise Guide.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My current code spits out the top 25k results, previously sorted by Field 1:&lt;/P&gt;
&lt;P&gt;data RESULTS.&amp;amp;top_set;&lt;BR /&gt;set top_set (obs=25000);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is repeated after sorting Field 1 in descending order:&lt;/P&gt;
&lt;P&gt;data RESULTS.&amp;amp;bottom_set;&lt;BR /&gt;set bottom_set (obs=25000);&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Combined, yields the top/bottom results:&lt;/P&gt;
&lt;P&gt;data top_bottom;&lt;BR /&gt;set RESULTS.&amp;amp;top_set RESULTS.&amp;amp;bottom_set;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What I want to generate is the top 1,000 and bottom 1,000 results, but for each group.&lt;/P&gt;
&lt;P&gt;There is another field "Countries", which captures the # of countries used to create the result in FIELD 1.&lt;/P&gt;
&lt;P&gt;So, what I really want in the output is the top and bottom results from within each unique Countries value.&lt;/P&gt;
&lt;P&gt;So, the top and bottom for Countries = 57, plus the top and bottom for Countries = 56, the top and bottom for Countries = 55, etc.&lt;/P&gt;
&lt;P&gt;For each generation of runs, the range of Countries varies. So, it needs to be scalable rather than hard-coded for a specific list of Countries values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;So you would sort BY Countries Bluneconomic to get increasing values within each Countries value, or BY Countries DESCENDING Bluneconomic to get decreasing values within each Countries value.&lt;/P&gt;
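&lt;P&gt;A minimal sketch of how that sort could be combined with the counter approach posted earlier in the thread, here keeping the 2 lowest and 2 highest values per group. The data set names work.have, work.sorted, work.counts and work.top_bottom are hypothetical placeholders, and the variable names are assumed to match the sample data (Countries, bluneconomic).&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* work.have is a hypothetical input with variables Countries and bluneconomic */
proc sort data=work.have out=work.sorted;
   by countries bluneconomic;   /* ascending values within each group */
run;

/* one row per Countries value, with the group size in _freq_ */
proc summary data=work.sorted nway;
   class countries;
   output out=work.counts(drop=_type_);
run;

data work.top_bottom;
   merge work.sorted
         work.counts;
   by countries;
   if first.countries then counter = 0;   /* restart the row counter for each group */
   counter + 1;
   /* keep the first 2 rows (lowest values) and the last 2 rows (highest values) */
   if counter le 2 or (_freq_ - counter) lt 2;
   drop counter _freq_;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Because the data are sorted in ascending order, keeping only the second condition would retain just the highest values in each group, and keeping only the first would retain just the lowest. Groups with fewer than 4 rows are kept whole, since every row satisfies at least one condition.&lt;/P&gt;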
</description>
      <pubDate>Thu, 16 Jul 2020 00:29:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669758#M200950</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-07-16T00:29:43Z</dc:date>
    </item>
    <item>
      <title>Re: Help with grouped results</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669765#M200955</link>
      <description>&lt;P&gt;Thanks.&amp;nbsp; Your suggested code works well, even when there are fewer results than the counter setting.&amp;nbsp; If I only wanted the top results rather than both the top and bottom, how would the code be modded?&amp;nbsp; Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 16 Jul 2020 01:35:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Help-with-grouped-results/m-p/669765#M200955</guid>
      <dc:creator>texasmfp</dc:creator>
      <dc:date>2020-07-16T01:35:35Z</dc:date>
    </item>
  </channel>
</rss>

