<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Divide dataset into 3 unequal smaller datasets based on ID in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/787438#M32291</link>
    <description>&lt;P&gt;I'm not sure that I understood your starting position correctly. You want to divide a dataset into three new datasets using the variable ID, and that variable has only three different values. Correct? That sounds strange.&lt;/P&gt;</description>
    <pubDate>Mon, 27 Dec 2021 10:28:22 GMT</pubDate>
    <dc:creator>andreas_lds</dc:creator>
    <dc:date>2021-12-27T10:28:22Z</dc:date>
    <item>
      <title>Divide dataset into 3 unequal smaller datasets based on ID</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/786278#M32206</link>
      <description>&lt;P&gt;I'm trying to divide a large dataset into 3 smaller, unequal-sized groups. I used the following code:&lt;/P&gt;
&lt;PRE&gt;PROC SURVEYSELECT data = Original
		out = sample9
			method = SRS
			seed = 12345678
			sampsize = (1000 52175 13044);
	strata ID notsorted;
	*title;
RUN;&lt;/PRE&gt;
&lt;P&gt;However, I'm getting the following error:&lt;/P&gt;
&lt;P&gt;ERROR: The sample size, 1000, is greater than the number of sampling units, 1.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;IID=TV20_2018.&lt;BR /&gt;ERROR: The sample size, 52175, is greater than the number of sampling units, 1.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;IID=TV20_2018.&lt;BR /&gt;ERROR: The sample size, 13044, is greater than the number of sampling units, 1.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;IID=TV20_2018.&lt;BR /&gt;ERROR: The number of values in the SAMPSIZE= list must equal the number of strata. There are more strata than SAMPSIZE=&lt;BR /&gt;values.&lt;BR /&gt;NOTE: The SAS System stopped processing this step because of errors.&lt;BR /&gt;WARNING: The data set WORK.SAMPLE9 may be incomplete. When this step was stopped there were 0 observations and 75 variables.&lt;BR /&gt;&lt;BR /&gt;Can you please suggest, how I can address this? TIA&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Dec 2021 11:49:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/786278#M32206</guid>
      <dc:creator>mantubiradar19</dc:creator>
      <dc:date>2021-12-16T11:49:19Z</dc:date>
    </item>
    <item>
      <title>Re: Divide dataset into 3 unequal smaller datasets based on ID</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/786287#M32207</link>
      <description>Add the SELECTALL option:&lt;BR /&gt;&lt;BR /&gt;sampsize = (1000 52175 13044) SELECTALL;</description>
      <pubDate>Thu, 16 Dec 2021 12:27:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/786287#M32207</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2021-12-16T12:27:07Z</dc:date>
    </item>
    <item>
      <title>Re: Divide dataset into 3 unequal smaller datasets based on ID</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/786310#M32211</link>
      <description>&lt;P&gt;Given how the errors read, I wonder whether your data is GROUPED properly. With the NOTSORTED option, all like values must be adjacent in the data set; otherwise, each time a repeated value appears, it is treated as a "new" stratum. The clue is the repeated mention of the same stratum value:&lt;/P&gt;
&lt;P&gt;IID=TV20_2018. (It appears the code you posted uses a different variable than the one used when the LOG was created.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think this code should demonstrate the grouping issue (unless you have somehow sorted the SASHELP.CLASS data set by age):&lt;/P&gt;
&lt;PRE&gt;proc surveyselect data=sashelp.class &lt;BR /&gt;out=work.sel&lt;BR /&gt;sampsize = (2 3 2 2 2 1)&lt;BR /&gt;;&lt;BR /&gt;strata age notsorted;&lt;BR /&gt;run;&lt;/PRE&gt;
&lt;P&gt;There are 6 different ages in the data set, with counts of 2, 5, 3, 4, 4 and 1 for ages 11 to 16, but the data is sorted by name by default and the age groups are not adjacent, so you get a log similar to yours:&lt;/P&gt;
&lt;PRE&gt;&lt;BR /&gt;ERROR: The sample size, 2, is greater than the number of sampling units, 1.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;Age=14.&lt;BR /&gt;ERROR: The sample size, 3, is greater than the number of sampling units, 2.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;Age=13.&lt;BR /&gt;NOTE: The sample size equals the number of sampling units. All units are included in the sample.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;Age=14.&lt;BR /&gt;NOTE: The sample size equals the number of sampling units. All units are included in the sample.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;Age=12.&lt;BR /&gt;ERROR: The sample size, 2, is greater than the number of sampling units, 1.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;Age=15.&lt;BR /&gt;NOTE: The sample size equals the number of sampling units. All units are included in the sample.&lt;BR /&gt;NOTE: The above message was for the following stratum:&lt;BR /&gt;Age=13.&lt;BR /&gt;ERROR: The number of values in the SAMPSIZE= list must equal the number of strata. There are more&lt;BR /&gt;strata than SAMPSIZE= values.&lt;BR /&gt;
&lt;/PRE&gt;
&lt;P&gt;So either sort your data by the stratum variable (probably best), or re-examine the order of the values and provide a matching number of strata definitions, with appropriate sizes for each of the existing strata.&lt;/P&gt;
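&lt;P&gt;As a minimal sketch of the first option, assuming the same SASHELP.CLASS example: sort a copy by age so that each age forms a single stratum, then drop NOTSORTED from the STRATA statement. The data set name work.class_sorted is made up for illustration, and the seed is simply the one from the original question.&lt;/P&gt;
&lt;PRE&gt;/* Sort a copy so that like ages are adjacent (work.class_sorted is a hypothetical name) */
proc sort data=sashelp.class out=work.class_sorted;
   by age;
run;

/* Six strata (ages 11 to 16), so six SAMPSIZE= values, each no larger than its stratum count */
proc surveyselect data=work.class_sorted
      out=work.sel
      method=srs
      seed=12345678
      sampsize=(2 3 2 2 2 1);
   strata age;
run;&lt;/PRE&gt;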
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If there is a serious reason that your source data set Original should not be sorted, then sort it into a different data set and use that one in PROC SURVEYSELECT.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;Proc sort data=original out=toselectfrom;
   by id;
run;&lt;/PRE&gt;</description>
      <pubDate>Thu, 16 Dec 2021 16:19:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/786310#M32211</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-12-16T16:19:10Z</dc:date>
    </item>
    <item>
      <title>Re: Divide dataset into 3 unequal smaller datasets based on ID</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/786319#M32213</link>
      <description>You're trying to pick 1000 samples from your first ID, 52175 from the second ID and 13044 from your third ID. &lt;BR /&gt;Is that what you're trying to do? I suspect your STRATA statement is wrong somehow.</description>
      <pubDate>Thu, 16 Dec 2021 17:02:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/786319#M32213</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2021-12-16T17:02:13Z</dc:date>
    </item>
    <item>
      <title>Re: Divide dataset into 3 unequal smaller datasets based on ID</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/787438#M32291</link>
      <description>&lt;P&gt;I'm not sure that I understood your starting position correctly. You want to divide a dataset into three new datasets using the variable ID, and that variable has only three different values. Correct? That sounds strange.&lt;/P&gt;</description>
      <pubDate>Mon, 27 Dec 2021 10:28:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Divide-dataset-into-3-unequal-smaller-datasets-based-on-ID/m-p/787438#M32291</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2021-12-27T10:28:22Z</dc:date>
    </item>
  </channel>
</rss>

