Methodology to apply Big Data into SAS Mining

Occasional Contributor
Posts: 18

Hi experts,


I want to apply some data mining techniques with SAS in a Big Data project.

I'm planning my methodology (a Gantt chart for the project) and I have some questions, because I don't want to "kill" the SAS machine with a large amount of data to analyze:

1) Is it a good choice to divide the data into 3 data sets (training, test and validation) in the Big Data tool? I usually do this in SAS Enterprise Miner on the target data.

2) Or should I select only a subset of my data, store it in SAS files, and then use SAS Enterprise Miner to create these 3 data sets?

What is the best option?

Thanks!


All Replies
Super User
Posts: 17,840

Re: Methodology to apply Big Data into SAS Mining

How 'big' is your data?

Partitioning the data sets has nothing to do with data size; it's a methodological consideration.
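
Because the split is methodological, it can be made wherever the data lives, as long as each record always lands in the same partition. A minimal sketch of one way to do that, assuming every record has a stable ID: hash the ID and map it to a partition. The function name `assign_partition` and the 60/20/20 proportions are illustrative assumptions, not something from this thread or from SAS.

```python
import hashlib


def assign_partition(record_id, train=0.6, validation=0.2):
    """Deterministically map a record ID to 'train', 'validation' or 'test'.

    The proportions are illustrative defaults (assumption); the remainder
    after train and validation goes to test.
    """
    digest = hashlib.md5(str(record_id).encode("utf-8")).hexdigest()
    # Use the first 8 hex digits as an integer in [0, 2**32), scaled to [0, 1).
    u = int(digest[:8], 16) / 2**32
    if u < train:
        return "train"
    if u < train + validation:
        return "validation"
    return "test"


if __name__ == "__main__":
    from collections import Counter

    # Hypothetical IDs, just to show the approximate proportions.
    print(Counter(assign_partition(i) for i in range(100_000)))
```

The same ID always gets the same label, so the assignment can be recomputed on any platform without storing it.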

Occasional Contributor
Posts: 18

Re: Methodology to apply Big Data into SAS Mining

About 800 GB.

Yes, but I'm worried about putting all the data into SAS Enterprise Miner.

Super User
Posts: 17,840

Re: Methodology to apply Big Data into SAS Mining

At the end of the day it will depend on your setup.

My guess is that's going to be too big 😩

Occasional Contributor
Posts: 18

Re: Methodology to apply Big Data into SAS Mining

I guess I have to do some segmentation in the Big Data tool before I load the data sets into SAS. If I define rules there that split the data into clusters with a smaller amount of data each, then I can use SAS Enterprise Miner. But in that case I will have multiple diagrams in SAS Enterprise Miner... :(
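
If a single random subset would be enough (option 2 in the original question) instead of rule-based segments, it could be drawn in one pass over the data without holding everything in memory. The sketch below uses reservoir sampling (Algorithm R), a swapped-in technique that is not something proposed in this thread. The name `reservoir_sample` and its parameters are hypothetical.

```python
import random


def reservoir_sample(rows, k, seed=None):
    """Return up to k rows chosen uniformly at random from an iterable.

    Reads `rows` once and keeps only k rows in memory (Algorithm R).
    """
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep the new row with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


if __name__ == "__main__":
    # Hypothetical stream of row numbers standing in for the big table.
    print(reservoir_sample(range(1_000_000), k=5, seed=42))
```

The resulting subset could then be stored as a SAS file and partitioned in Enterprise Miner as usual.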

Solution
05-22-2016 11:14 AM
SAS Employee
Posts: 13

Re: Methodology to apply Big Data into SAS Mining

You could use the HPA (High-Performance Analytics) nodes in Enterprise Miner for your data (800 GB). This also requires a cluster (group) of machines or an MPP (Massively Parallel Processing) setup, so the data can be distributed across the machines for the modeling computations, similar to what you are planning to do manually. To use HPA in an MPP setup in EM, you will need a SAS High-Performance Data Mining license. Here is a tip that introduces HPA and other SAS products that handle large data: SAS High-Performance Analytics tip #1: How it differs from SAS Grid & SAS In-Memory Analytics

If you want additional details about HPA in Enterprise Miner, continue with the subsequent tips in this series:

SAS High-Performance Analytics tip #2: HPDM nodes in SAS Enterprise Miner

SAS High-Performance Analytics tip #3: Example flow diagram in SAS Enterprise Miner

SAS High-Performance Analytics tip #4: Scoring with SAS Enterprise Miner

SAS High-Performance Analytics tip #5: Scoring with Analytic Store files

Hope this helps!
