Data mining in SAS Enterprise Miner? It's easy with the SAS Rapid Predictive Modeler task in SAS Studio

Started 02-29-2016; modified 02-02-2024

The SAS Rapid Predictive Modeler task in SAS Studio not only lets you quickly and easily build predictive models using smart defaults, but it also creates an Enterprise Miner process flow and SAS code behind the scenes. Once the task in SAS Studio runs, you can open the Enterprise Miner process flow or the code and make any desired changes. The SAS Studio Rapid Predictive Modeler task presents results in clear business terms, such as scorecards, lift charts, and variable importance. It automatically handles outliers, missing values, rare target events, skewed data, variable selection and model selection. Machine learning techniques such as neural networks and other data mining methods are used behind the scenes, and the best model is selected automatically.

 

You can further customize and tweak models using the Enterprise Miner GUI to edit the process flow that the task creates, or by editing the code created behind the scenes. Models are registered in metadata to automate the execution of score code and make deployment to other systems easy. The SAS Studio Rapid Predictive Modeler task is useful for:  

 

Business Analysts, who simply want a fast and accurate answer to their business question. The Business Analyst will generally accept the results of the task as is.

 

 

 

Statisticians, who may want to open the Enterprise Miner flow to look under the covers, adjust some of the defaults, and add or remove nodes as they see fit to incrementally improve model accuracy and results.

 

 

 

 

Data Scientists and Coders, who may use the Rapid Predictive Modeler to develop a coding template that they can use as a starting point to edit and amend.

 

 

 

 

Auto safety example

Imagine you are interested in preventing auto accidents by issuing recalls on automobile parts that are likely to fail. In my example below, I start with a historic (notional) data set on auto parts that includes a binary target (dependent) variable TargetPartFailure. TargetPartFailure indicates whether or not the part failed: 1 = failure and 0 = no failure. Other variables include a unique ID variable (PartNumber), and input (independent) variables, such as PartType, PartAge, and NumIssuesReported.

 

Rapid Predictive Modeler finds the best model based on the historic data. That model can then be applied to a completely new data set, which has no information on target part failure, but has the same inputs (independent variables) as the historic data set. This allows the analyst or manager to prioritize which auto parts should be further investigated for potential recalls.  
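The scoring and prioritizing step described above can be sketched in Python. This is only an illustration, not the score code that Rapid Predictive Modeler generates: the logistic form, the coefficient values, the part numbers, and the sample records are all hypothetical. Only the variable names PartNumber, PartAge, and NumIssuesReported come from the example data set.

```python
import math

# Hypothetical coefficients, for illustration only. A real model would be
# fitted to the historic data by Rapid Predictive Modeler.
INTERCEPT = -4.0
COEF_PART_AGE = 0.30      # assumed effect per unit of PartAge
COEF_NUM_ISSUES = 0.50    # assumed effect per reported issue


def failure_probability(part):
    """Return a predicted failure probability from a logistic score."""
    score = (INTERCEPT
             + COEF_PART_AGE * part["PartAge"]
             + COEF_NUM_ISSUES * part["NumIssuesReported"])
    return 1.0 / (1.0 + math.exp(-score))


def prioritize(parts):
    """Sort parts from highest to lowest predicted failure probability."""
    scored = [dict(p, PredictedFailure=failure_probability(p)) for p in parts]
    return sorted(scored, key=lambda p: p["PredictedFailure"], reverse=True)


# New data: same inputs as the historic data, but no TargetPartFailure column.
new_parts = [
    {"PartNumber": "A100", "PartAge": 2, "NumIssuesReported": 0},
    {"PartNumber": "B200", "PartAge": 9, "NumIssuesReported": 6},
    {"PartNumber": "C300", "PartAge": 5, "NumIssuesReported": 2},
]

ranked = prioritize(new_parts)
for p in ranked:
    print(p["PartNumber"], round(p["PredictedFailure"], 3))
```

The parts at the top of the ranked list are the ones an analyst would investigate first for a potential recall.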

 

Let's import the data

The first step is to upload the data so that they are available in SAS Studio. Right-click the folder where you want to store the data, and select Upload Files.

 

Navigate to the physical location where your data sets are stored, and select the files you want to upload.

 

Drag and drop your data from the navigation pane on the left into the work area on the right. The data should automatically load into a _TEMP library.

 

Next, expand the available Tasks, then expand the Data Mining subcategory. Double-click the Rapid Predictive Modeler task.

 

Assign the target variable TargetPartFailure the role of Dependent Variable. Unlike the Enterprise Miner interface, Rapid Predictive Modeler does not automatically assign the Dependent Variable role to a variable whose name starts with “target”; you must assign this role yourself.

 

What are my options?

On the Options tab, under Model, you can select Basic, Intermediate, or Advanced. For this example, I select Advanced.

 

The Basic, Intermediate, and Advanced Model options are described in the SAS Studio 3.4 User’s Guide. I chose the Advanced option, which evaluates the most models and then chooses the best-performing one.

 

Under the Reports Option, you can choose Standard reports or Standard & additional reports. Check the reports you want to see.

  

And outputs?

On the Output tab, check each box and specify the names and folders you would like for your output data sets and save locations. To keep this information handy and avoid typos in a later step, copy and paste (Ctrl+C, Ctrl+V) the project name (e.g., RapidPredictiveModelerAutoSafety20160209) and the folder (e.g., C:\Users\sasdemo\EMProjects) into Notepad.

 

 

The SAS Studio Rapid Predictive Modeler task will automatically create the output you requested. You will recognize this output, because it is Enterprise Miner output! For example, you will see an ROC plot with the K-S statistic.

 

The better the model, the higher and farther to the left the ROC curve will be, maximizing sensitivity and minimizing 1-specificity (that is, maximizing the true positive rate and minimizing the false positive rate). The K-S statistic ranges from 0 to 1, and higher is better. In my example, it is a fairly strong 0.72388 for the validation data and 0.73372 for the training data. It is a good sign that the K-S statistic is similar for both the training and validation data, which suggests that we did not overfit the training data.
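To make the K-S statistic concrete, here is a minimal Python sketch that computes it as the largest gap between the true positive rate and the false positive rate across all score thresholds. This is a common definition for a binary target, and I am assuming it matches what Enterprise Miner reports. The function name and the small label and score lists are invented for illustration and are not taken from the auto parts data.

```python
def ks_statistic(y_true, y_score):
    """Return max |TPR - FPR| over all score thresholds (binary 0/1 target)."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")

    # Sort by descending score, then lower the threshold one score at a time.
    pairs = sorted(zip(y_score, y_true), key=lambda t: t[0], reverse=True)

    tp = fp = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every observation tied at this score before measuring.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        tpr = tp / n_pos   # sensitivity
        fpr = fp / n_neg   # 1 - specificity
        best = max(best, abs(tpr - fpr))
    return best


# Made-up labels (1 = failure) and model scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
ks_example = ks_statistic(labels, scores)

# A model that separates the classes perfectly reaches the maximum of 1.
ks_perfect = ks_statistic([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
print(ks_example, ks_perfect)
```

Running the same function on the training and validation scores separately and comparing the two values is one way to spot overfitting. In the example above, the reported values differ by only about 0.01.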

 

How to open the SAS Enterprise Miner process flow

The SAS Studio Rapid Predictive Modeler task created a SAS Enterprise Miner process flow behind the scenes. You can open that process flow in SAS Enterprise Miner to make any changes or additions to the flow that you want. Start by logging on to Enterprise Miner.

 

Create a new project. This is counterintuitive, because the project already exists on disk, but you definitely want to select New Project.

 

Give the new project exactly the same name that you gave the project in Rapid Predictive Modeler, and browse to the same server directory that you specified there. This is why it is helpful to have copied the project name and server directory path into Notepad: it avoids typos in this step.

When you click Next, a Project Exist dialog appears, reading “The selected project exists on the filesystem. It may have been created by another user. Do you want to continue?” Click Yes. The existing project is the one you created using the Rapid Predictive Modeler task in SAS Studio. Then click Next and Finish.

 

Open the diagram, and voilà! You see the Enterprise Miner process flow that you created with Rapid Predictive Modeler in SAS Studio.

 

You can now use Enterprise Miner to make any changes or additions that you want.

 

ADDITIONAL RESOURCES:

SAS Studio Tutorials

Rapid Predictive Modeler VLE course

SAS Studio 3.4 User's Guide

 

If you'd like the sample data set used in this article, feel free to private message me via the community and I'll send it your way. 

Comments

 

Hi Beth and Anna,

 

I don't see the data mining option in SAS Studio 3.5. Is this because I am using the University Edition of the software?

 

Cheers,

 

Sandesh.

Hi @DataScientist, that is correct. University Edition does not include Enterprise Miner. If you're a university professor or student, SAS OnDemand for Academics is an option and it includes Enterprise Miner.

Thank you for your response @BeverlyBrown

 

Makes sense now.

 

Do you think going forward SAS might have university editions of EG and EM given that these are used a fair bit in the industry and not everyone that aspires to work with these applications will have access to them?

 

Cheers,

 

Sandesh.

Hi  @DataScientist, I checked with University Edition's product manager. She said: "We would have loved to have included SAS Enterprise Guide in the University Edition since it is a great tool for folks learning SAS.  Unfortunately, as a Windows-only client, it didn’t fit since University Edition needs to run on the Mac as well.  The good news is that SAS Studio is continuing to add EG-like functionality.  SAS Enterprise Miner is a bit of a different story – it’s really designed for a different audience. It is available for professors & their students via SAS OnDemand for Academics but there are no plans to include it in the University Edition right now."

Thanks for the information Beverly. The university edition for SAS Studio is a great product as it stands. I look forward to additional functionality being added to it. For now, I think I need to focus on the statistical functionality that SAS Studio university edition offers at the moment and get a handle on SAS/STAT, which is a bit technical but I'm sure is a great asset for people in the industry that are comfortable with using and writing statistical procedures.

 

Cheers,

 

Sandesh.

Hi,

 

I created a Rapid Predictive Modeler task in SAS Studio. When I tried to open the process flow in SAS Enterprise Miner, I received the following error:

This server location is already registered for use, either by another project, or by another SAS Application. Or user does not have permission to create project at this location. Try using a different name for your project, or enter a different path.

 

I am using SAS OnDemand for Academics.

 

Thanks

Hi @mnsen, the best way to ensure a quick response is to post a new question on the Analytics U Community: https://communities.sas.com/t5/SAS-Analytics-U/bd-p/sas_analytics_u Experts familiar with SAS OnDemand for Academics keep a close eye on that forum. Thank you for using SAS communities! 
