<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Subset of data with desired number of records having specific categories in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Subset-of-data-with-desired-number-of-records-having-specific/m-p/745528#M233708</link>
    <description>&lt;P&gt;Hi there,&lt;/P&gt;&lt;P&gt;I need your help taking a subset of my dataset with a desired number of records in specific categories. My dataset has 25 records: 5 apple, 4 banana, 6 grape, 6 orange and 4 pears.&lt;/P&gt;&lt;P&gt;I want to take a random subset with 2 apple, 2 banana, 3 grape, 3 orange and 3 pears.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input id $ catgeory $;
datalines;
101 apple
102 orange
103 grape
104 grape
105 pears
106 apple
107 orange
108 banana
109 grape
110 pears
111 apple
112 orange
113 banana
114 banana
115 pears
116 apple
117 orange
118 grape
119 banana
120 orange
121 pears
122 apple
123 orange
124 grape
125 grape
;
run;

&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thank you in advance for your kind help.&lt;/P&gt;</description>
    <pubDate>Thu, 03 Jun 2021 16:13:48 GMT</pubDate>
    <dc:creator>DeepakSwain</dc:creator>
    <dc:date>2021-06-03T16:13:48Z</dc:date>
    <item>
      <title>Subset of data with desired number of records having specific categories</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Subset-of-data-with-desired-number-of-records-having-specific/m-p/745528#M233708</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;&lt;P&gt;I need your help taking a subset of my dataset with a desired number of records in specific categories. My dataset has 25 records: 5 apple, 4 banana, 6 grape, 6 orange and 4 pears.&lt;/P&gt;&lt;P&gt;I want to take a random subset with 2 apple, 2 banana, 3 grape, 3 orange and 3 pears.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input id $ catgeory $;
datalines;
101 apple
102 orange
103 grape
104 grape
105 pears
106 apple
107 orange
108 banana
109 grape
110 pears
111 apple
112 orange
113 banana
114 banana
115 pears
116 apple
117 orange
118 grape
119 banana
120 orange
121 pears
122 apple
123 orange
124 grape
125 grape
;
run;

&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thank you in advance for your kind help.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Jun 2021 16:13:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Subset-of-data-with-desired-number-of-records-having-specific/m-p/745528#M233708</guid>
      <dc:creator>DeepakSwain</dc:creator>
      <dc:date>2021-06-03T16:13:48Z</dc:date>
    </item>
    <item>
      <title>Re: Subset of data with desired number of records having specific categories</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Subset-of-data-with-desired-number-of-records-having-specific/m-p/745530#M233709</link>
      <description>&lt;P&gt;One way:&lt;/P&gt;
&lt;PRE&gt;proc sort data=have;
   by catgeory;
run;

proc surveyselect data=have out=want
   sampsize=(2 2 3 3 3);
   /* SAMPSIZE values must follow the sorted order of the STRATA variable: apple, banana, grape, orange, pears */
   strata catgeory;
run;&lt;/PRE&gt;
&lt;P&gt;Strata are defined by the combinations of values of the STRATA variable(s), and the input data must be sorted by those variables.&lt;/P&gt;
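&lt;P&gt;As a quick check that the SAMPSIZE order lines up with the sorted strata, here is a minimal sketch. It assumes HAVE is already sorted by catgeory as above; the output name want_check and the seed value 12345 are arbitrary choices made for illustration, not something from the original question.&lt;/P&gt;
&lt;PRE&gt;/* Sketch only: want_check and the seed value are illustrative assumptions */
proc surveyselect data=have out=want_check
   method=srs             /* simple random sampling without replacement within each stratum */
   seed=12345             /* arbitrary fixed seed so the draw can be repeated */
   sampsize=(2 2 3 3 3);  /* apple, banana, grape, orange, pears */
   strata catgeory;
run;

/* Count the selected records per category */
proc freq data=want_check;
   tables catgeory / nocum nopercent;
run;&lt;/PRE&gt;
&lt;P&gt;The frequency table should show 2 apple, 2 banana, 3 grape, 3 orange and 3 pears. With SEED= fixed, rerunning the step on the same sorted data should select the same records.&lt;/P&gt;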
&lt;P&gt;The output data set contains the selected records along with the selection probability and sampling weight, which you may need for a later analysis. If you don't want them, drop the variables SelectionProb and SamplingWeight.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Jun 2021 16:20:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Subset-of-data-with-desired-number-of-records-having-specific/m-p/745530#M233709</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-06-03T16:20:41Z</dc:date>
    </item>
    <item>
      <title>Re: Subset of data with desired number of records having specific categories</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Subset-of-data-with-desired-number-of-records-having-specific/m-p/745540#M233712</link>
      <description>Are those numbers fixed, or do you want roughly 50% of the cases in each category? If it is the latter, you can still use SURVEYSELECT, but specify a sampling rate (SAMPRATE=) instead of the hardcoded sample sizes.</description>
      <pubDate>Thu, 03 Jun 2021 16:40:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Subset-of-data-with-desired-number-of-records-having-specific/m-p/745540#M233712</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2021-06-03T16:40:03Z</dc:date>
    </item>
  </channel>
</rss>

