<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Create training, validation, and test datasets while maintaining target variable ratios in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Create-training-validation-and-test-datasets-while-maintaining/m-p/688963#M209399</link>
    <description>&lt;P&gt;I have a large dataset with a binary target variable (0s and 1s). I'm looking to randomly split the data into a training, validation, and test set while maintaining the ratio of 0s and 1s across all datasets.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How would I do this, or what procedures should I be looking into? I tried proc partition, but I don't have a CAS engine library set up (I don't know how to check whether one has been set up or how to set up a session myself).&lt;/P&gt;</description>
    <pubDate>Mon, 05 Oct 2020 18:12:00 GMT</pubDate>
    <dc:creator>jlsagisi</dc:creator>
    <dc:date>2020-10-05T18:12:00Z</dc:date>
    <item>
      <title>Create training, validation, and test datasets while maintaining target variable ratios</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-training-validation-and-test-datasets-while-maintaining/m-p/688963#M209399</link>
      <description>&lt;P&gt;I have a large dataset with a binary target variable (0s and 1s). I'm looking to randomly split the data into a training, validation, and test set while maintaining the ratio of 0s and 1s across all datasets.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How would I do this, or what procedures should I be looking into? I tried proc partition, but I don't have a CAS engine library set up (I don't know how to check whether one has been set up or how to set up a session myself).&lt;/P&gt;</description>
      <pubDate>Mon, 05 Oct 2020 18:12:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-training-validation-and-test-datasets-while-maintaining/m-p/688963#M209399</guid>
      <dc:creator>jlsagisi</dc:creator>
      <dc:date>2020-10-05T18:12:00Z</dc:date>
    </item>
    <item>
      <title>Re: Create training, validation, and test datasets while maintaining target variable ratios</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-training-validation-and-test-datasets-while-maintaining/m-p/688964#M209400</link>
      <description>&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;I have a large dataset with a binary target variable (0s and 1s). I'm looking to randomly split the data into a training, validation, and test set while maintaining the ratio of 0s and 1s across all datasets.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;I am not aware of this requirement for most modeling. Normally, the data is split at random, so the ratio of 0s to 1s in each data set is also random. Why is it needed?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;How would I do this or what procedures should I be looking into? I tried proc partition, but I don't have a CAS engine library set up&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;What parts of SAS do you have?&lt;/SPAN&gt;&lt;/P&gt;
</description>
      <pubDate>Mon, 05 Oct 2020 18:16:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-training-validation-and-test-datasets-while-maintaining/m-p/688964#M209400</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-10-05T18:16:53Z</dc:date>
    </item>
    <item>
      <title>Re: Create training, validation, and test datasets while maintaining target variable ratios</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Create-training-validation-and-test-datasets-while-maintaining/m-p/688965#M209401</link>
      <description>&lt;P&gt;Many model-selection routines in SAS enable you to split data by using the PARTITION statement. Examples include the "SELECT" procedures (GLMSELECT, QUANTSELECT, HPGENSELECT...) and the ADAPTIVEREG procedure.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want to create the data sets yourself, you can use the DATA step to split the data randomly (which approximately preserves the proportion of 0s and 1s), or you can use the GROUPS= option in the SURVEYSELECT procedure to specify the exact number of observations in each group.&lt;/P&gt;
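&lt;P&gt;The DATA step approach might look roughly like the following. This is a minimal sketch rather than a definitive implementation: the input data set name Have, the target variable name Target, the 60/20/20 fractions, and the seed are all assumptions made for illustration, not details from the original question.&lt;/P&gt;

```sas
/* Sketch only: Have, Target, the 60/20/20 fractions, and the seed are assumptions. */
data Train Validate Test;
   call streaminit(12345);                  /* fix the seed so the split is reproducible */
   set Have;
   u = rand("Uniform");                     /* one uniform random draw per observation */
   if u lt 0.6 then output Train;           /* about 60% of the observations */
   else if u lt 0.8 then output Validate;   /* about 20% of the observations */
   else output Test;                        /* the remaining observations, about 20% */
   drop u;
run;

/* Compare the proportion of 0s and 1s across the three data sets. */
proc freq data=Train;
   tables Target;
run;
proc freq data=Validate;
   tables Target;
run;
proc freq data=Test;
   tables Target;
run;
```

&lt;P&gt;Because every observation is assigned independently, the 0/1 ratio in each output data set matches the original only approximately. For a large data set the difference is usually small, and the PROC FREQ output shows how close the three proportions are.&lt;/P&gt;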
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Additional discussion and completely worked examples are available at &lt;A href="https://blogs.sas.com/content/iml/2019/01/21/training-validation-test-data-sas.html" target="_self"&gt;"Create training, validation, and test data sets in SAS."&lt;/A&gt;&lt;/P&gt;
</description>
      <pubDate>Mon, 05 Oct 2020 18:22:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Create-training-validation-and-test-datasets-while-maintaining/m-p/688965#M209401</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2020-10-05T18:22:12Z</dc:date>
    </item>
  </channel>
</rss>

