<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: how to select top 100 observation a variable by rank in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125470#M260171</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Besides not meeting the requirement, as PG pointed out, using system options for this kind of task doesn't seem like good practice: the programmer has to remember to set the option back to its default, and the log gives no error to help when the intended output is not achieved.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Sun, 03 Nov 2013 03:44:45 GMT</pubDate>
    <dc:creator>Haikuo</dc:creator>
    <dc:date>2013-11-03T03:44:45Z</dc:date>
    <item>
      <title>how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125463#M260164</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Dear all:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I hope to select the observations with top 100 assets in each year.&amp;nbsp; May I ask how to do it? &lt;/P&gt;&lt;P&gt;Either by sql or proc rank or proc univariate?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The data looks like: &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;TABLE border="1" class="jiveBorder" style="border: 1px solid #000000; width: 100%;"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TH style="text-align: center; background-color: #6690bc; color: #ffffff; padding: 2px;" valign="middle"&gt;&lt;STRONG&gt;firm&lt;/STRONG&gt;&lt;/TH&gt;&lt;TH style="text-align: center; background-color: #6690bc; color: #ffffff; padding: 2px;" valign="middle"&gt;&lt;STRONG&gt;year&lt;/STRONG&gt;&lt;/TH&gt;&lt;TH style="text-align: center; background-color: #6690bc; color: #ffffff; padding: 2px;" valign="middle"&gt;&lt;STRONG&gt;asset&lt;/STRONG&gt;&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;1&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;1997&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;535&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;1&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;1998&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;453&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;1&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;1999&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;7856&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;2&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;1997&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;87&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;2&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;1998&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;87687&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;2&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;1999&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;452&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;2&lt;/TD&gt;&lt;TD style="padding: 
2px;"&gt;2000&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;78&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;2&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;1997&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;78&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD style="padding: 2px;"&gt;3&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;1998&lt;/TD&gt;&lt;TD style="padding: 2px;"&gt;986&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I tried the following, but it fails with a sort execution failure:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;&amp;nbsp; create table r.want as&lt;/P&gt;&lt;P&gt;&amp;nbsp; select a.*&lt;/P&gt;&lt;P&gt;&amp;nbsp; from r.have as a&lt;/P&gt;&lt;P&gt;&amp;nbsp; left join r.have as b on (a.year=b.year and a.assets &amp;lt;= b.assets)&lt;/P&gt;&lt;P&gt;&amp;nbsp; group by a.year, a.assets&lt;/P&gt;&lt;P&gt;&amp;nbsp; having count(*) &amp;lt;= 100&lt;/P&gt;&lt;P&gt;&amp;nbsp; order by a.year, a.assets desc;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Message was edited by: Eric Wayne&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 27 Oct 2013 23:24:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125463#M260164</guid>
      <dc:creator>caveman529</dc:creator>
      <dc:date>2013-10-27T23:24:00Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125464#M260165</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;How about PROC SORT and a DATA step? Here, for example, is how to get the top 2 assets for each year:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;data have;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;input firm year asset;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;datalines;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;1 1997 535 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;1 1998 453 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;1 1999 7856 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;2 1997 87 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;2 1998 87687 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;2 1999 452 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;2 2000 78 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;2 1997 78 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;3 1998 986 &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc sort data=have; by year descending asset; run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;data want;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;order = 0;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;do until (last.year);&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; order + 1;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; set have; by year;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; if order &amp;lt;= 2 then output;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 28 Oct 2013 00:45:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125464#M260165</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2013-10-28T00:45:36Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125465#M260166</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I believe you should just be able to sort:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc sort data=dataset out=newdataset(where=(_n_ le 100));&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; by descending asset;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;_n_ is an automatic variable SAS creates as a record ID, so proc sort would put the top 100 at the beginning, and then on the new dataset you would only keep the first 100 records.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 01 Nov 2013 18:52:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125465#M260166</guid>
      <dc:creator>cau83</dc:creator>
      <dc:date>2013-11-01T18:52:47Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125466#M260167</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;That can't work, for two reasons:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;450&amp;nbsp; proc sort data=sashelp.cars(where=(_n_&amp;lt;=3));&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="color: #ff0000;"&gt;&lt;STRONG&gt;ERROR: Variable _n_ is not on file SASHELP.CARS.&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;451&amp;nbsp; by make horsepower;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;452&amp;nbsp; run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;First, the variable _n_ exists only within a DATA step; it is not created by other SAS procedures and it is not saved with datasets.&lt;/P&gt;&lt;P&gt;Second, the variable _n_ is not reset at the beginning of each BY group, so it would only identify the first observations of the dataset, not those of each BY group.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 01 Nov 2013 21:22:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125466#M260167</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2013-11-01T21:22:17Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125467#M260168</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you so much !!! &lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 02 Nov 2013 02:32:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125467#M260168</guid>
      <dc:creator>caveman529</dc:creator>
      <dc:date>2013-11-02T02:32:58Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125468#M260169</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;You are correct.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I should have used the other suggestion I was thinking of at the time: an obs= option on the output data set. I actually tested that and it doesn't work either, but it is not as obviously wrong and does not produce an error.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Nevertheless, what I just validated to work with only a proc sort is this:&lt;/P&gt;&lt;P&gt;proc sort data=dataset out=newdataset;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; options obs=#ofObsYouWant;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; by variable;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sat, 02 Nov 2013 21:42:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125468#M260169</guid>
      <dc:creator>cau83</dc:creator>
      <dc:date>2013-11-02T21:42:40Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125469#M260170</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It doesn't validate for me.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;data test;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;do grp = 1, 2;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; do id = 1 to 10;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; output;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; end;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;options obs=3;&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;proc sort data=test out=top3perGrp; &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;by grp descending id; &lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;run;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;dataset top3perGrp does &lt;SPAN style="text-decoration: underline;"&gt;not&lt;/SPAN&gt; contain top 3 id from each grp, as required.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PG&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 03 Nov 2013 03:08:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125469#M260170</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2013-11-03T03:08:54Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125470#M260171</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Besides not meeting the requirement, as PG pointed out, using system options for this kind of task doesn't seem like good practice: the programmer has to remember to set the option back to its default, and the log gives no error to help when the intended output is not achieved.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Sun, 03 Nov 2013 03:44:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125470#M260171</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2013-11-03T03:44:45Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125471#M260172</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Ahh, I missed the implied requirement in the original post of outputting the top 'x' of each group, my bad. That makes both the obs and the _n_ approach invalid regardless of whether it's a data step or not.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;hai.kuo, I agree, but if one uses it they always need to write in the options obs=max at the same time.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Sorry for the goose chase, everyone.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If I were doing this from scratch I'd do a proc sort and then a data step, but instead of PG Stats' do loop I'd use the by statement in the data step, retain a counter variable, and then output only when the counter was below your threshold. I tend to use the data/by/counter pattern a lot for other things, so that's what first pops into mind.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Mon, 04 Nov 2013 00:55:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/125471#M260172</guid>
      <dc:creator>cau83</dc:creator>
      <dc:date>2013-11-04T00:55:41Z</dc:date>
    </item>
    <item>
      <title>Re: how to select top 100 observation a variable by rank</title>
      <link>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/323595#M260173</link>
      <description>&lt;P&gt;data have;&lt;BR /&gt;input firm year asset;&lt;BR /&gt;datalines;&lt;BR /&gt;1 1997 535&lt;BR /&gt;1 1998 453&lt;BR /&gt;1 1999 7856&lt;BR /&gt;2 1997 87&lt;BR /&gt;2 1998 87687&lt;BR /&gt;2 1999 452&lt;BR /&gt;2 2000 78&lt;BR /&gt;2 1997 78&lt;BR /&gt;3 1998 986&lt;BR /&gt;;&lt;BR /&gt;/* sort so the largest asset comes first within each year */&lt;BR /&gt;proc sort data=have;&lt;BR /&gt;by year descending asset;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;data mak;&lt;BR /&gt;set have;&lt;BR /&gt;by year;&lt;BR /&gt;if first.year then Top=0;&lt;BR /&gt;Top+1;&lt;BR /&gt;/* keep the top 2 per year; use Top &amp;lt;= 100 for the top 100 */&lt;BR /&gt;if Top &amp;lt; 3 ;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jan 2017 12:10:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/how-to-select-top-100-observation-a-variable-by-rank/m-p/323595#M260173</guid>
      <dc:creator>Prashant_Ph</dc:creator>
      <dc:date>2017-01-10T12:10:47Z</dc:date>
    </item>
  </channel>
</rss>

