<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Use only the top xxx rows of the data set in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281864#M59136</link>
    <description>&lt;P&gt;Example, please? &amp;nbsp;All Greek to me....&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sat, 02 Jul 2016 21:45:49 GMT</pubDate>
    <dc:creator>NKormanik</dc:creator>
    <dc:date>2016-07-02T21:45:49Z</dc:date>
    <item>
      <title>Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281833#M59133</link>
      <description>&lt;P&gt;The data set is sorted in descending order by the second column and has over 100,000 rows.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;_X	_50501
_22001_1	1.51880
_23005_1	1.15927
_23403_1	1.12800
_23401_1	1.12679
_20104_1	1.09546
_20104_1	1.08488
_20204_0	1.06033
_21105_0	1.05820
_21506_0	1.05118
_21801a_0	1.04543
_20104_1	1.04470
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I would like to use Proc Freq on just the first xxx rows.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How can I do that? &amp;nbsp;Create a new data set (subset)? &amp;nbsp;Use a particular IF or WHERE statement?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My objective is to get a 'tally' for the first column, but only for the top so many rows.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;My preference would be to use the following:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;-- top 1%&lt;/P&gt;
&lt;P&gt;-- top 2%&lt;/P&gt;
&lt;P&gt;-- top 5%&lt;/P&gt;
&lt;P&gt;-- top 10%&lt;/P&gt;
&lt;P&gt;-- etc.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is this somehow possible?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Help greatly appreciated.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Nicholas Kormanik&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 02 Jul 2016 09:54:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281833#M59133</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2016-07-02T09:54:04Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281835#M59134</link>
      <description>&lt;P&gt;Use the OBS= option on the SET or INFILE statement:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data want; &amp;nbsp; set have (obs=xxx);&lt;/P&gt;
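&lt;P&gt;A minimal sketch of that idea feeding PROC FREQ, assuming an arbitrary cutoff of 1000 rows and the column _X from the question:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* keep only the first 1000 rows of the sorted data set (1000 is illustrative) */
data want;
  set have (obs=1000);
run;

/* tally the first column within that subset */
proc freq data=want;
  tables _X;
run;&lt;/CODE&gt;&lt;/PRE&gt;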
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is probably a way to get the number of observations in the data set (something like _obs_), but I don't know how to get it into a SAS variable to do a calculation on it.&lt;/P&gt;</description>
      <pubDate>Sat, 02 Jul 2016 10:26:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281835#M59134</guid>
      <dc:creator>Jim_G</dc:creator>
      <dc:date>2016-07-02T10:26:57Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281839#M59135</link>
      <description>&lt;P&gt;To quickly get the total number of observations:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
call symput('total_obs',put(numobs,best.));
set have nobs=numobs;
stop;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You could already calculate your percentage(s) in the same step.&lt;/P&gt;
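&lt;P&gt;For instance, a minimal sketch of such a combined step, assuming a hypothetical macro variable pct that holds the desired percentage and storing the resulting row count in wanted_obs:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let pct = 5;  /* assumption: keep the top 5% of rows */

data _null_;
  /* NOBS= is filled in at compile time, so SET never has to execute */
  if 0 then set have nobs=numobs;
  /* row count for the requested percentage, rounded down, at least 1 */
  wanted = max(1, floor(numobs * &amp;pct / 100));
  /* CALL SYMPUTX strips blanks before storing the macro variable */
  call symputx('wanted_obs', wanted);
  stop;
run;&lt;/CODE&gt;&lt;/PRE&gt;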
&lt;P&gt;Then use the OBS= data set option:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc freq data=have (obs=&amp;amp;wanted_obs) ......&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 02 Jul 2016 12:46:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281839#M59135</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2016-07-02T12:46:28Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281864#M59136</link>
      <description>&lt;P&gt;Example, please? &amp;nbsp;All Greek to me....&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 02 Jul 2016 21:45:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281864#M59136</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2016-07-02T21:45:49Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281867#M59137</link>
      <description>&lt;P&gt;Expanding Kurt's code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%let percent=20;   /* percentage of rows to keep */

data _null_;
  set have nobs=numobs;
  xxx=int(&amp;percent*numobs/100);          /* number of rows in the requested top share */
  call symput('topxxx',put(xxx,best.));   /* store the count in macro variable topxxx */
  put xxx;                                /* write the count to the log */
  stop;
run;

proc print data=have(obs=&amp;topxxx); run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sun, 03 Jul 2016 10:17:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281867#M59137</guid>
      <dc:creator>Jim_G</dc:creator>
      <dc:date>2016-07-03T10:17:35Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281870#M59140</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can also use the score variable to create a rank variable, then use that in PROC FREQ with BY processing to get counts within the top 5%, 10%, etc.&lt;/P&gt;</description>
      <pubDate>Sun, 03 Jul 2016 00:49:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281870#M59140</guid>
      <dc:creator>stat_sas</dc:creator>
      <dc:date>2016-07-03T00:49:17Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281876#M59143</link>
      <description>&lt;PRE&gt;
Make a macro, I guess:

data have;
infile cards expandtabs truncover;
input _X : $20. _50501;
cards;
_22001_1	1.51880
_23005_1	1.15927
_23403_1	1.12800
_23401_1	1.12679
_20104_1	1.09546
_20104_1	1.08488
_20204_0	1.06033
_21105_0	1.05820
_21506_0	1.05118
_21801a_0	1.04543
_20104_1	1.04470
;
run;


%let top=0.1;  /* fraction of rows to keep (0.1 = top 10%); change as needed */

%let dsid=%sysfunc(open(have));
%let nobs=%sysevalf(%sysfunc(attrn(&amp;amp;dsid,nlobs))*&amp;amp;top,i);
%let dsid=%sysfunc(close(&amp;amp;dsid));
proc freq data=have(obs=&amp;amp;nobs);
table _x;
run;
&lt;/PRE&gt;</description>
      <pubDate>Sun, 03 Jul 2016 02:56:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281876#M59143</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-07-03T02:56:39Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281894#M59144</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Xia Keshan, your code looks terrific. &amp;nbsp;Problem, though, SAS freezes. &amp;nbsp;Error message in Output title bar:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;PROC FREQ suspended.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Never completes, for some reason.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Any ideas?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 03 Jul 2016 08:38:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281894#M59144</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2016-07-03T08:38:10Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281940#M59147</link>
      <description>&lt;P&gt;?? Looks right to me.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 51         
 52         
 53         
 54         data have;
 55         infile cards expandtabs truncover;;
 56         input _X : $20. _50501;
 57         cards;
 
 NOTE: The data set WORK.HAVE has 11 observations and 2 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       cpu time            0.01 seconds
       
 69         ;
 
 70         run;
 71         
 72         
 73         %let top=0.1 ;  /*&amp;lt;---- Change it */
 74         
 75         
 76         
 77         %let dsid=%sysfunc(open(have));
 78         %let nobs=%sysevalf(%sysfunc(attrn(&amp;amp;dsid,nlobs))*&amp;amp;top,i);
 79         %let dsid=%sysfunc(close(&amp;amp;dsid));
 80         proc freq data=have(obs=&amp;amp;nobs);
 81         table _x;
 82         run;
 
 NOTE: There were 1 observations read from the data set WORK.HAVE.
 NOTE: PROCEDURE FREQ used (Total process time):
       real time           0.06 seconds
       cpu time            0.02 seconds
       
 
 83         
 84         
 85         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 95         &lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 04 Jul 2016 03:45:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281940#M59147</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-07-04T03:45:18Z</dc:date>
    </item>
    <item>
      <title>Re: Use only the top xxx rows of the data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281961#M59149</link>
      <description>&lt;P&gt;My bad. &amp;nbsp;Sorry&amp;nbsp;&lt;SPAN&gt;Xia Keshan. &amp;nbsp;For some unknown reason my SAS was waiting. &amp;nbsp;Had to type END at command prompt to keep it going.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Thank you for rechecking your code. &amp;nbsp;And writing it.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Nicholas&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 04 Jul 2016 08:00:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Use-only-the-top-xxx-rows-of-the-data-set/m-p/281961#M59149</guid>
      <dc:creator>NKormanik</dc:creator>
      <dc:date>2016-07-04T08:00:45Z</dc:date>
    </item>
  </channel>
</rss>

