Shakti_Sourav
Quartz | Level 8

Dear Team,

 

I want to split my DataFlux data into chunks. For example, how can I split 60 million rows into four tables of 15 million rows each in Data Management Studio?

 

Thank You

Shakti

3 REPLIES
ErinW
SAS Employee

Hey Shakti!

 

Have you tried using the Data Validation node? You can use that to filter your data. For example, you could filter on an expression like Profit > 1000. Let me know if that's the sort of thing you're trying to do.

VincentRejany
SAS Employee

It depends on whether the split should be done randomly or according to the order in which rows are read. If it is the latter, add a sequencer node to number each row, followed by an expression node with something like:

 

integer mygroup
integer groups
groups = 4
mygroup = mysequencer % groups
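Note that the modulo assigns rows to groups in round-robin fashion (rows 1, 5, 9, ... land in the same group) rather than in contiguous blocks. The sketch below illustrates the difference in Python, not in Data Management Studio expression code. The function names are hypothetical, and it assumes the sequencer starts numbering at 1; integer division is shown as one possible way to get contiguous 15-million-row chunks instead.

```python
def round_robin_group(seq: int, groups: int) -> int:
    """Group index from the modulo approach: rows are dealt out in turn."""
    return seq % groups


def contiguous_group(seq: int, total_rows: int, groups: int) -> int:
    """Group index for contiguous chunks, assuming seq starts at 1."""
    chunk_size = -(-total_rows // groups)  # ceiling division
    return (seq - 1) // chunk_size


if __name__ == "__main__":
    total_rows = 60_000_000
    groups = 4
    for seq in (1, 2, 15_000_000, 15_000_001, 60_000_000):
        print(seq,
              round_robin_group(seq, groups),
              contiguous_group(seq, total_rows, groups))
```

Either way, each of the 4 groups receives 15 million of the 60 million rows; only the choice of which rows go together changes.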

 

RacheLGomez123
Fluorite | Level 6

Data splitting is when data is divided into two or more subsets. Typically, with a two-part split, one part is used to evaluate or test the data and the other to train the model.

Data splitting is an important aspect of data science, particularly for creating models based on data. This technique helps ensure that data models, and the processes that use them (such as machine learning), are accurate.

 

 

This may help you,

Rachel Gomez

