<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do I split data into equal-sized groups by ranking variable? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453698#M284055</link>
    <description>Yes, I know that ideally group sizes should differ by at most one, but when I check, the difference is 4 or 5 in some cases for groups in the same month.</description>
    <pubDate>Thu, 12 Apr 2018 18:28:47 GMT</pubDate>
    <dc:creator>lezgin</dc:creator>
    <dc:date>2018-04-12T18:28:47Z</dc:date>
    <item>
      <title>How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453428#M284040</link>
      <description>&lt;P&gt;Hello,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am using the code below to split my attached data into 20 equal groups each month based on excess_vwretd. I need an equal number of observations in each group each month. However, the code below only groups my data based on the distribution of my ranking variable. I want to&amp;nbsp;rank the data based on&amp;nbsp;&lt;SPAN&gt;excess_vwretd and create groups of equal size each month.&lt;/SPAN&gt;&amp;nbsp;Any help is appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc rank data=a5 out=a6 groups=20;&lt;BR /&gt;by date2; var excess_vwretd; ranks betarank; run;&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 06:12:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453428#M284040</guid>
      <dc:creator>lezgin</dc:creator>
      <dc:date>2018-04-12T06:12:46Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453444#M284041</link>
      <description>&lt;P&gt;Something like this (three groups here)?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data WANT;    
  set SASHELP.CLASS nobs=NOBS;
  GROUP=ceil(_N_*3/NOBS);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;GROUP NAME&lt;BR /&gt;1 Alfred&lt;BR /&gt;1 Alice&lt;BR /&gt;1 Barbara&lt;BR /&gt;1 Carol&lt;BR /&gt;1 Henry&lt;BR /&gt;1 James&lt;BR /&gt;2 Jane&lt;BR /&gt;2 Janet&lt;BR /&gt;2 Jeffrey&lt;BR /&gt;2 John&lt;BR /&gt;2 Joyce&lt;BR /&gt;2 Judy&lt;BR /&gt;3 Louise&lt;BR /&gt;3 Mary&lt;BR /&gt;3 Philip&lt;BR /&gt;3 Robert&lt;BR /&gt;3 Ronald&lt;BR /&gt;3 Thomas&lt;BR /&gt;3 William&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 05:31:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453444#M284041</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-04-12T05:31:18Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453446#M284042</link>
      <description>&lt;P&gt;No, I need to divide each month's observations into 20 groups with an equal number of observations. I also have more than 700,000 observations, and your code seems to take a lot of time. I appreciate your help.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 05:55:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453446#M284042</guid>
      <dc:creator>lezgin</dc:creator>
      <dc:date>2018-04-12T05:55:49Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453452#M284043</link>
      <description>&lt;P&gt;I can't think of a faster way than my code. It should run much faster than proc rank. Drop unneeded variables.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 05:57:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453452#M284043</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-04-12T05:57:32Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453453#M284044</link>
      <description>&lt;P&gt;This takes 0.15s on my laptop for 20 groups with one million records.&lt;/P&gt;
&lt;P&gt;Don't just say no without looking at the solution please.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HAVE;
 do I=1 to 1e6;
   output; 
  end; 
run;
data WANT;    
  set HAVE nobs=NOBS;
  GROUP=ceil(_N_*20/NOBS);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 06:00:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453453#M284044</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-04-12T06:00:52Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453454#M284045</link>
      <description>&lt;P&gt;I tried your code. It took time to process and I had to terminate. It should do it for each date. Your code doesn't take this into account. I attached a small sample, it might give a better idea.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 06:13:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453454#M284045</guid>
      <dc:creator>lezgin</dc:creator>
      <dc:date>2018-04-12T06:13:39Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453468#M284046</link>
      <description>&lt;P&gt;Assuming data is sorted by date, try next code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
   set have;
    by date;
         if first.date then group=1;
         else group+1;
         if group &amp;gt; 20 then group=1;
run;

/* check distribution */
proc freq data=want;
    table date * group;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 12 Apr 2018 08:38:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453468#M284046</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2018-04-12T08:38:43Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453498#M284047</link>
      <description>&lt;P&gt;Like this (2.8s for 10 million rows)?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data HAVE;
 do DATE2=1 to 10;
  do VAR=1 to 1e6;
   output; 
  end; 
 end; 
run;
data WANT; 
  do NOBS=1 to 1e9 until (last.DATE2);
    set HAVE;
    by DATE2;
  end;
  do I=1 to NOBS;
    set HAVE;
    GROUP=ceil(I*20/NOBS);
    output;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
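&lt;P&gt;The group number above follows the row order within each DATE2, so ranking by a variable depends on sorting first. A minimal sketch, assuming the ranking variable is VAR as in the test data (SORTED and RANKED are illustrative names, and descending order is an assumption about which end should be group 1):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Order rows within each date by the ranking variable */
proc sort data=HAVE out=SORTED;
  by DATE2 descending VAR;
run;

data RANKED;
  /* First pass: count the rows in this DATE2 group */
  do NOBS=1 to 1e9 until (last.DATE2);
    set SORTED;
    by DATE2;
  end;
  /* Second pass: re-read the same rows and assign 20 near-equal groups */
  do I=1 to NOBS;
    set SORTED;
    GROUP=ceil(I*20/NOBS);
    output;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Within each date the group sizes should then differ by at most one.&lt;/P&gt;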
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 11:20:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453498#M284047</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-04-12T11:20:37Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453499#M284048</link>
      <description>&lt;P&gt;Just checking that I understand the problem first.&amp;nbsp; Please confirm:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;For each DATE2, the highest values for EXCESS_VWRETD belong in group 1.&amp;nbsp; The next highest in group 2.&lt;/LI&gt;
&lt;LI&gt;If two observations have the same EXCESS_VWRETD, they may need to be placed in different groups, in order to keep the group sizes equal.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;To achieve that, sort the data if necessary:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;proc sort data=have;&lt;/P&gt;
&lt;P&gt;by date2 descending excess_vwretd;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then process each DATE2 separately:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;data want;&lt;/P&gt;
&lt;P&gt;date_counter = 0;&lt;/P&gt;
&lt;P&gt;do until (last.date2);&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;set have;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;by date2;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;date_counter + 1;&lt;/P&gt;
&lt;P&gt;end;&lt;/P&gt;
&lt;P&gt;group_counter = 0;&lt;/P&gt;
&lt;P&gt;do until (last.date2);&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;set have;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;by date2;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;group_counter + 1;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;group = ceil(group_counter / date_counter * 20);&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#FF0000"&gt;**&lt;/FONT&gt;end;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;output;&lt;/P&gt;
&lt;P&gt;end;&lt;/P&gt;
&lt;P&gt;drop group_counter &lt;FONT color="#FF0000"&gt;date&lt;/FONT&gt;_counter;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's untested code, and might need some correction.&amp;nbsp; But first things first.&amp;nbsp; Is it attempting to solve the right problem?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#FF0000"&gt;********** EDITED to comment out an extraneous END statement.&lt;/FONT&gt;&lt;/P&gt;
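&lt;P&gt;To check the result, the group sizes can be counted per DATE2. A minimal sketch, assuming the output data set is WANT as above (GROUP_SIZES is an illustrative name):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Count observations in each DATE2*GROUP cell */
proc freq data=want noprint;
  tables date2*group / out=group_sizes;
run;

/* Smallest and largest group per date; the two should differ by at most 1 */
proc means data=group_sizes min max;
  class date2;
  var count;
run;&lt;/CODE&gt;&lt;/PRE&gt;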
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 16:10:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453499#M284048</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2018-04-12T16:10:05Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453516#M284049</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/77029"&gt;@lezgin&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;If you just want to split up the data into groups with equal numbers of observations per date and there is no need to also have some sort of random selection, then I don't see how anything could be faster than a simple data step with a function as&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;&amp;nbsp;suggests.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is another coding variant&amp;nbsp;for this approach:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have(drop=_:);
  format date date9.;

  do date='01jan2018'd, '02jan2018'd;
    do _i=1 to 150;
      output;
    end;
  end;

  do date='03jan2018'd, '04jan2018'd;
    do _i=1 to 10;
      output;
    end;
  end;

  stop;
run;

%let n_groups=5;

data want;
  set have;
  by date;
  group=mod(_n_-1,&amp;amp;n_groups)+1;
run;
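
/* Variant sketch (an assumption about the intent, not from the original post):
   restart the sequence at each DATE so the cycle 1..n_groups begins anew
   per date. _SEQ and WANT_BY_DATE are illustrative names. */
data want_by_date(drop=_seq);
  set have;
  by date;
  if first.date then _seq=0;
  _seq+1;
  group=mod(_seq-1,&amp;amp;n_groups)+1;
run;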
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 12 Apr 2018 12:21:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453516#M284049</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2018-04-12T12:21:42Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453527#M284050</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=sashelp.heart out=have;
by status;
run;
proc surveyselect data=have out=want groups=10;
strata status;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 12 Apr 2018 12:52:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453527#M284050</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2018-04-12T12:52:14Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453625#M284051</link>
      <description>&lt;P&gt;hello astounding, that is exactly what I need. I get this error when I run it&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;5775 /*then process each DATE2 separately:*/&lt;BR /&gt;5776 data yedek;&lt;BR /&gt;5777 date_counter = 0;&lt;BR /&gt;5778 do until (last.date2);&lt;BR /&gt;5779 set a5;&lt;BR /&gt;5780 by date2;&lt;BR /&gt;5781 date_counter + 1;&lt;BR /&gt;5782 end;&lt;BR /&gt;5783 group_counter = 0;&lt;BR /&gt;5784 do until (last.date2);&lt;BR /&gt;5785 set a5;&lt;BR /&gt;5786 by date2;&lt;BR /&gt;5787 group_counter + 1;&lt;BR /&gt;5788 group = ceil(group_counter / date_counter * 20);&lt;BR /&gt;5789 end;&lt;BR /&gt;5790 output;&lt;BR /&gt;5791 end;&lt;BR /&gt;---&lt;BR /&gt;161&lt;BR /&gt;ERROR 161-185: No matching DO/SELECT statement.&lt;/P&gt;&lt;P&gt;5792&lt;BR /&gt;5793 drop group_counter record_counter;&lt;BR /&gt;5794&lt;BR /&gt;5795 run;&lt;/P&gt;&lt;P&gt;WARNING: The variable record_counter in the DROP, KEEP, or RENAME list has never been referenced.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;chris, thanks for taking your time to do this, but I don't think you read my previous posts. you are ignoring the ranking variable. I want&amp;nbsp;to rank each month into 20 equal sized groups.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 16:07:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453625#M284051</guid>
      <dc:creator>lezgin</dc:creator>
      <dc:date>2018-04-12T16:07:10Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453639#M284052</link>
      <description>&lt;P&gt;astounding, thank you but it doesn't produce groups with equal no of obs.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 16:25:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453639#M284052</guid>
      <dc:creator>lezgin</dc:creator>
      <dc:date>2018-04-12T16:25:17Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453692#M284054</link>
      <description>&lt;P&gt;It worked for me.&amp;nbsp; Of course, your groups can't be exactly of equal size unless the number of observations is a multiple of 20.&amp;nbsp; So some group sizes can differ by 1.&amp;nbsp; But other than that it should be perfect.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 18:21:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453692#M284054</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2018-04-12T18:21:20Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453698#M284055</link>
      <description>Yes, I know that ideally group sizes should differ by at most one, but when I check, the difference is 4 or 5 in some cases for groups in the same month.</description>
      <pubDate>Thu, 12 Apr 2018 18:28:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453698#M284055</guid>
      <dc:creator>lezgin</dc:creator>
      <dc:date>2018-04-12T18:28:47Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453708#M284056</link>
      <description>&lt;P&gt;I can look at the log if you post it.&amp;nbsp; (Wouldn't hurt to post a small piece of the output ... an example of what goes wrong.)&amp;nbsp; Otherwise there's not much I can do on this end.&lt;/P&gt;</description>
      <pubDate>Thu, 12 Apr 2018 18:48:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453708#M284056</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2018-04-12T18:48:26Z</dc:date>
    </item>
    <item>
      <title>Re: How do I split data into equal-sized groups by ranking variable?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453713#M284057</link>
      <description>I figured it out. Thank you very much, it works as you said.</description>
      <pubDate>Thu, 12 Apr 2018 19:01:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-do-I-split-data-into-equal-sized-groups-by-ranking-variable/m-p/453713#M284057</guid>
      <dc:creator>lezgin</dc:creator>
      <dc:date>2018-04-12T19:01:15Z</dc:date>
    </item>
  </channel>
</rss>

