<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Split the data in Dataflux in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/842261#M20602</link>
    <description>&lt;P&gt;Data splitting is when data is divided into two or more subsets. Typically, with a two-part split, one part is used to train the model and the other to evaluate or test it.&lt;/P&gt;&lt;P&gt;Data splitting is an important aspect of data science, particularly for creating models based on data. This technique helps ensure that data models, and processes that use them such as machine learning, are accurate.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This may help you,&lt;/P&gt;&lt;P&gt;Rachel Gomez&lt;/P&gt;</description>
    <pubDate>Thu, 03 Nov 2022 08:45:30 GMT</pubDate>
    <dc:creator>RacheLGomez123</dc:creator>
    <dc:date>2022-11-03T08:45:30Z</dc:date>
    <item>
      <title>Split the data in Dataflux</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/815171#M20347</link>
      <description>&lt;P&gt;Dear Team,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;I want to split my DataFlux data into chunks. For example, how can I split 60 million rows into four tables of 15 million rows each in Data Management Studio?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Thank You&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Shakti&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 26 May 2022 05:56:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/815171#M20347</guid>
      <dc:creator>Shakti_Sourav</dc:creator>
      <dc:date>2022-05-26T05:56:30Z</dc:date>
    </item>
    <item>
      <title>Re: Split the data in Dataflux</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/817561#M20371</link>
      <description>&lt;P&gt;Hey Shakti!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Have you tried using the Data Validation node? You can use that to filter your data. For example, you could filter on an expression like Profit &amp;gt; 1000. Let me know if that's the sort of thing you're trying to do.&lt;/P&gt;</description>
      <pubDate>Sat, 11 Jun 2022 01:39:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/817561#M20371</guid>
      <dc:creator>ErinW</dc:creator>
      <dc:date>2022-06-11T01:39:11Z</dc:date>
    </item>
    <item>
      <title>Re: Split the data in Dataflux</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/817714#M20372</link>
      <description>&lt;P&gt;It depends on whether the split should be done randomly or according to the order in which rows are read. If it is the latter, add a sequencer node to number each row, and then an expression with something like&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;integer mygroup&lt;/P&gt;
&lt;P&gt;integer groups&lt;/P&gt;
&lt;P&gt;groups = 4&lt;/P&gt;
&lt;P&gt;mygroup = mysequencer % groups&lt;/P&gt;
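&lt;P&gt;Note that the modulo above assigns rows round-robin, so each group takes every fourth row. If contiguous blocks are wanted instead, integer division by the chunk size is one option. Below is a minimal Python sketch of both, purely to illustrate the arithmetic. It is not DataFlux expression syntax, the function names are hypothetical, and it assumes the sequencer starts at 1.&lt;/P&gt;

```python
def round_robin_group(seq, groups):
    """Group number in 0..groups-1; consecutive rows land in different groups."""
    return seq % groups


def contiguous_chunk(seq, chunk_size):
    """Chunk number starting at 0; assumes seq is a 1-based row number."""
    return (seq - 1) // chunk_size


if __name__ == "__main__":
    # Small demonstration: 8 rows, 4 groups, chunks of 2 rows.
    for seq in range(1, 9):
        print(seq, round_robin_group(seq, 4), contiguous_chunk(seq, 2))
```

&lt;P&gt;With 60 million rows and a chunk size of 15 million, the second function yields chunk numbers 0 through 3, and each chunk can then be routed to its own output table.&lt;/P&gt;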
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jun 2022 06:56:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/817714#M20372</guid>
      <dc:creator>VincentRejany</dc:creator>
      <dc:date>2022-06-13T06:56:22Z</dc:date>
    </item>
    <item>
      <title>Re: Split the data in Dataflux</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/842261#M20602</link>
      <description>&lt;P&gt;Data splitting is when data is divided into two or more subsets. Typically, with a two-part split, one part is used to train the model and the other to evaluate or test it.&lt;/P&gt;&lt;P&gt;Data splitting is an important aspect of data science, particularly for creating models based on data. This technique helps ensure that data models, and processes that use them such as machine learning, are accurate.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This may help you,&lt;/P&gt;&lt;P&gt;Rachel Gomez&lt;/P&gt;</description>
      <pubDate>Thu, 03 Nov 2022 08:45:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Split-the-data-in-Dataflux/m-p/842261#M20602</guid>
      <dc:creator>RacheLGomez123</dc:creator>
      <dc:date>2022-11-03T08:45:30Z</dc:date>
    </item>
  </channel>
</rss>

