<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic SAS Viya Model Studio and Data Partitioning in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649684#M22286</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using SAS Viya Model Studio ML and DM for educational purposes. I partition my data set into 70% training and 30% validation. I assume that if I build the project from scratch, the software randomly chooses different training and validation sets each time, so the results of, e.g., the Decision Tree will be slightly different every time. Is that right?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If yes, is there a way to set a seed so that every time I create the project the training and validation sets are the same?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One solution I have found is to add a binary partition variable to the data set so that the sets are the same every time, but I was wondering whether I can do this without the extra variable, via a seed. Setting a seed was possible in SAS EM.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Andreas&lt;/P&gt;</description>
    <pubDate>Thu, 21 May 2020 19:31:14 GMT</pubDate>
    <dc:creator>andreas_zaras</dc:creator>
    <dc:date>2020-05-21T19:31:14Z</dc:date>
    <item>
      <title>SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649684#M22286</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using SAS Viya Model Studio ML and DM for educational purposes. I partition my data set into 70% training and 30% validation. I assume that if I build the project from scratch, the software randomly chooses different training and validation sets each time, so the results of, e.g., the Decision Tree will be slightly different every time. Is that right?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If yes, is there a way to set a seed so that every time I create the project the training and validation sets are the same?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One solution I have found is to add a binary partition variable to the data set so that the sets are the same every time, but I was wondering whether I can do this without the extra variable, via a seed. Setting a seed was possible in SAS EM.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks in advance,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Andreas&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2020 19:31:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649684#M22286</guid>
      <dc:creator>andreas_zaras</dc:creator>
      <dc:date>2020-05-21T19:31:14Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649710#M22289</link>
      <description>Hi Andreas,&lt;BR /&gt;&lt;BR /&gt;Might Proc Partition help: &lt;A href="https://go.documentation.sas.com/?docsetId=casstat&amp;amp;docsetTarget=casstat_partition_examples02.htm&amp;amp;docsetVersion=8.5&amp;amp;locale=en" target="_blank"&gt;https://go.documentation.sas.com/?docsetId=casstat&amp;amp;docsetTarget=casstat_partition_examples02.htm&amp;amp;docsetVersion=8.5&amp;amp;locale=en&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Cheers, Simon&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Example 29.2 Stratified Sampling&lt;BR /&gt;This example demonstrates how to use PROC PARTITION to perform stratified sampling to partition the data; it uses the same data table as is used in Example 29.1.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;You can load the sampsio.hmeq data set into your CAS session by naming your CAS engine libref in the first statement of the following DATA step. This DATA step assumes that your CAS engine libref is named mycas, but you can substitute any appropriately defined CAS engine libref.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;data mycas.hmeq;&lt;BR /&gt;   set sampsio.hmeq;&lt;BR /&gt;run;&lt;BR /&gt;The following statements perform the partitioning:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;proc partition data=mycas.hmeq samppct=10 samppct2=20 seed=10 partind nthreads=3;&lt;BR /&gt;   by BAD;&lt;BR /&gt;   output out=mycas.out3 copyvars=(job reason loan value delinq derog);&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;proc print data=mycas.out3(obs=20);&lt;BR /&gt;run;&lt;BR /&gt;The SAMPPCT=10 option requests that 10% of the input data be included in the training partition, and the SAMPPCT2=20 option requests that 20% of the input data be included in the testing partition. The SEED= option specifies 10 as the random seed to be used in the partitioning process. The PARTIND option requests that the output data table, mycas.out3, include an indicator that shows whether each observation is selected to a partition (1 for training or 2 for testing) or not (0). 
The OUTPUT statement requests that the sampled data be stored in a table named mycas.out3, and the COPYVARS= option lists the variables to be copied from mycas.hmeq to mycas.out3.</description>
      <pubDate>Thu, 21 May 2020 21:12:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649710#M22289</guid>
      <dc:creator>SimonWilliams</dc:creator>
      <dc:date>2020-05-21T21:12:05Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649723#M22293</link>
      <description>&lt;P&gt;You may also find this recently published article helpful:&amp;nbsp;&lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/SAS-Model-Studio-8-5-projects-and-considerations-when-needing-to/ta-p/649717" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/SAS-Model-Studio-8-5-projects-and-considerations-when-needing-to/ta-p/649717&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2020 22:03:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649723#M22293</guid>
      <dc:creator>SimonWilliams</dc:creator>
      <dc:date>2020-05-21T22:03:00Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649788#M22305</link>
      <description>&lt;P&gt;Hello Simon,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for your answer!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So do you agree that if I build the project from scratch, the software randomly chooses different training and validation sets each time, so the results of, e.g., the Decision Tree will be slightly different every time?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If yes, is there a way to set a seed so that every time I create the project the training and validation sets are the same &lt;STRONG&gt;by using the Model Studio GUI&lt;/STRONG&gt;?&lt;/P&gt;</description>
      <pubDate>Fri, 22 May 2020 06:40:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649788#M22305</guid>
      <dc:creator>andreas_zaras</dc:creator>
      <dc:date>2020-05-22T06:40:06Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649875#M22308</link>
      <description>&lt;P&gt;Hi Andreas,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;OK, I spoke with a colleague in R&amp;amp;D, and they confirmed that the seed for partitioning data within Model Studio 8.5 is a fixed value. It is the same value for each project you create and for each run of the data node.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You should be able to verify this by looking at summary statistics for each of the partitioned tables. They should be the same. If you are seeing behaviour which suggests that the partitioning of data is not consistent, then please contact Technical Support and provide some examples.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you are seeing slightly different results for each run of the model, then the algorithms that underpin each modelling technique may have seed or starting values that are chosen at random or can be specified by the user. The VDMML documentation may help.&amp;nbsp;&lt;A href="https://go.documentation.sas.com/?docsetId=casml&amp;amp;docsetTarget=casml_whatsnew_sect003.htm&amp;amp;docsetVersion=8.5&amp;amp;locale=en"&gt;https://go.documentation.sas.com/?docsetId=casml&amp;amp;docsetTarget=casml_whatsnew_sect003.htm&amp;amp;docsetVersion=8.5&amp;amp;locale=en&lt;/A&gt;&lt;/P&gt;
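&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a rough illustration of an algorithm-level seed when a model is fitted in code rather than in the GUI, the sketch below fixes the seed of a random forest with PROC FOREST. It assumes an active CAS session, a CAS engine libref named mycas, and the hmeq table loaded as in the documentation example quoted earlier in this thread; the seed value 12345 and the choice of a forest are arbitrary assumptions, and this is not a description of what Model Studio does internally.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;/* Assumption: mycas.hmeq has already been loaded into CAS. */
/* SEED= fixes the random number stream used to grow the forest, */
/* so repeated runs on the same data should give the same model. */
proc forest data=mycas.hmeq seed=12345;
   input loan mortdue value yoj derog delinq clage ninq clno debtinc / level=interval;
   input reason job / level=nominal;
   target bad / level=nominal;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Running the step twice with the same seed and comparing the fit statistics is one way to check whether run-to-run differences come from the algorithm or from the data.&lt;/P&gt;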
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You are correct that there is no option within the Model Studio GUI for the user to set the seed value. Your feedback has been passed on to R&amp;amp;D.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Cheers, Simon&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 22 May 2020 11:37:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649875#M22308</guid>
      <dc:creator>SimonWilliams</dc:creator>
      <dc:date>2020-05-22T11:37:18Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649877#M22309</link>
      <description>Thanks, Simon!</description>
      <pubDate>Fri, 22 May 2020 11:54:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/649877#M22309</guid>
      <dc:creator>andreas_zaras</dc:creator>
      <dc:date>2020-05-22T11:54:23Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/651112#M22358</link>
      <description>&lt;P&gt;Hi Andreas,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I received an update from my colleagues on this topic.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In essence, once you have created a Model Studio project that uses data 'x', your partitions will remain the same every time the data node is run.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, if you create multiple projects which use the same set of data 'x', the partitions will look different across the projects.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you are teaching students, each student has their own project, and you really want them to have identical partitions for data 'x', then use the program method I outline in the &lt;A title="Model Studio and partitioning" href="https://communities.sas.com/t5/SAS-Communities-Library/SAS-Model-Studio-8-5-projects-and-considerations-when-needing-to/ta-p/649717" target="_self"&gt;communities article&lt;/A&gt; to create identical partitions by having each student run PROC PARTITION with the same seed.&lt;/P&gt;
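&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A minimal sketch of that idea follows; it is not the code from the article. It adapts the PROC PARTITION documentation example quoted earlier in this thread to a 70% training split. The libref mycas, the output table name hmeq_part, and the seed value 12345 are assumptions chosen for illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;/* Assumption: an active CAS session with a CAS engine libref named mycas. */
data mycas.hmeq;
   set sampsio.hmeq;
run;

/* SAMPPCT=70 selects 70% of the rows within each level of BAD.      */
/* PARTIND adds the indicator _PartInd_: 1 = selected, 0 = the rest. */
/* Everyone who runs this with the same SEED= on the same data       */
/* should end up with the same split.                                */
proc partition data=mycas.hmeq samppct=70 seed=12345 partind;
   by BAD;
   output out=mycas.hmeq_part
          copyvars=(loan mortdue value reason job yoj derog delinq clage ninq clno debtinc);
run;

/* Quick check of the partition sizes per target level. */
proc freq data=mycas.hmeq_part;
   tables _PartInd_ * BAD;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The resulting _PartInd_ column can then serve as the binary partition variable mentioned at the start of this thread, with 1 treated as training and 0 as validation.&lt;/P&gt;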
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sorry for any confusion, and as mentioned before we've provided feedback for Model Studio users to be able to set the seed in the Model Studio GUI.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks, Simon&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 27 May 2020 15:22:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/651112#M22358</guid>
      <dc:creator>SimonWilliams</dc:creator>
      <dc:date>2020-05-27T15:22:50Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/651190#M22364</link>
      <description>&lt;P&gt;Hi Simon!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for the update.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Another good idea to pass on to R&amp;amp;D is to make a seed available for the event-based sampling facility. I think that currently, every time you create a project, it samples the events and non-events with a new seed, so the results won't be the same.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Andreas&lt;/P&gt;</description>
      <pubDate>Wed, 27 May 2020 19:43:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/651190#M22364</guid>
      <dc:creator>andreas_zaras</dc:creator>
      <dc:date>2020-05-27T19:43:02Z</dc:date>
    </item>
    <item>
      <title>Re: SAS Viya Model Studio and Data Partitioning</title>
      <link>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/651401#M22377</link>
      <description>&lt;P&gt;Hi Andreas,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I will pass your feedback regarding the seed for event-based sampling back to R&amp;amp;D.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Cheers, Simon&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 28 May 2020 14:49:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/SAS-Viya-Model-Studio-and-Data-Partitioning/m-p/651401#M22377</guid>
      <dc:creator>SimonWilliams</dc:creator>
      <dc:date>2020-05-28T14:49:21Z</dc:date>
    </item>
  </channel>
</rss>

