
SAS Visual Data Mining and Machine Learning (VDMML): Getting Started

Started ‎08-03-2017 by
Modified ‎05-30-2019 by
Views 11,899

NOTE: Updated to include questions, slides and links from the latest April 12, 2019 Ask the Expert session using SAS Visual Data Mining and Machine Learning (VDMML) 8.3

 

Did you miss the Ask the Expert session on SAS Visual Data Mining and Machine Learning? Not to worry, you can catch it on-demand at your leisure.

 

Watch the webinar

 

 

Innovative algorithms and fast, in-memory processing. That's what you get with SAS Visual Data Mining and Machine Learning.


Designed for data scientists, statisticians and business analysts, it offers advanced programming capabilities coupled with point-and-click ease.


Watch this presentation and demo as we explore:

  • New algorithms - factorization machines and random forests.
  • Creating models - neural network and gradient boosting (including auto tuning).
  • Comparing models - multiple criteria such as lift, ROC and misclassification.
  • Interfaces - interactive features, programming capabilities and integration with open source.
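
To make the comparison criteria in the list above a little more concrete, here is a minimal Python sketch that computes a misclassification rate and an ROC AUC for two candidate models. It uses only the standard library. The function names, labels and scores are invented for illustration, and this is not how VDMML computes these statistics internally.

```python
from typing import Sequence


def misclassification_rate(y_true: Sequence[int], p_event: Sequence[float],
                           cutoff: float = 0.5) -> float:
    """Share of rows whose predicted class (p >= cutoff) differs from the label."""
    wrong = sum(1 for y, p in zip(y_true, p_event) if (p >= cutoff) != (y == 1))
    return wrong / len(y_true)


def roc_auc(y_true: Sequence[int], p_event: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (event, non-event) pairs in which the event
    received the higher score; ties count as half a win.
    """
    events = [p for y, p in zip(y_true, p_event) if y == 1]
    non_events = [p for y, p in zip(y_true, p_event) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                wins += 1.0
            elif pe == pn:
                wins += 0.5
    return wins / (len(events) * len(non_events))


# Hypothetical labels and predicted event probabilities from two models.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1]
model_b = [0.6, 0.5, 0.4, 0.7, 0.3, 0.8, 0.2, 0.6]

for name, scores in (("A", model_a), ("B", model_b)):
    print(name, misclassification_rate(labels, scores), roc_auc(labels, scores))
```

A lower misclassification rate and a higher AUC both favor model A on this toy data. On real data the criteria can disagree, which is why several are reported side by side.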


Here are some highlighted questions from the Q&A segment held at the end of the session.

 

Does SAS Visual Statistics come with SAS Visual Data Mining and Machine Learning?

 

Yes. When you license SAS Visual Data Mining and Machine Learning as part of the SAS Viya platform, you also get SAS Visual Analytics and SAS Visual Statistics.


Can I add SAS Visual Data Mining and Machine Learning to my current SAS install?

 

SAS Visual Data Mining and Machine Learning is part of the new SAS Viya platform. SAS Viya and SAS 9.4 can be integrated and can interact with each other, but they are separate installations and configurations.


Is it possible to download this data?

 

This data is available with SAS Enterprise Miner as a sample data set in the SAMPSIO library. It can also be found in the SAS Sample Library. Information about the data and how to use it can be found here.


Is autotuning available in the Visual Interface?

 

Yes. In the current version (SAS VDMML 8.2), autotuning is available through both the visual and programming interfaces in SAS Viya, in Jupyter Notebooks, and through the task in SAS Studio.

 

Is there integration with R?

 

Yes, Python, Java, R and Lua are supported. You can call SAS VDMML using your preferred language in our world-class, governed environment. And using REST APIs, you can add the power of SAS Analytics to your custom applications. In the latest 8.3 version you can also run R and Python code in Pipelines as part of Model Studio.

 

Do R and Python applications call SAS procedures, or can you install and run R and Python packages?

 

You can do both. For example, you can write native R or Python programs that call CAS actions in SAS to run SAS machine learning algorithms, and then display the results graphically using R packages. In the latest 8.3 version you can also run R and Python code in Pipelines as part of Model Studio.

 

Can you repeat the links to the videos showing the GUI and use of Visual Analytics from Melodie Rush's presentation?

 

Sure! See SAS Visual Data Mining and Machine Learning on GUI Interface, SAS Visual Data Mining and Machine Learning on SAS Studio, and SAS Visual Data Mining and Machine Learning with Python Demo for more information.

Are the algorithms provided in VDMML the same algorithms used in SAS Enterprise Miner?

 

Yes and no. Regression, decision trees, neural networks, gradient boosting and random forest are available in both. The implementation, however, is different. VDMML runs everything in memory, so the algorithms are adjusted to take advantage of that. Also, the default settings are different for some algorithms, and there are different options available.

 

Enterprise Miner has a tried-and-true process for data mining and machine learning and integrates well with R and with SAS VDMML procedures. VDMML has a very open architecture and can integrate with many open source packages. Some algorithms are included in both, and some are unique to each one. The answer becomes "it depends": it is based on what you want to do and what type of architecture you have or are willing to acquire and configure.

 

Here is a link to the documentation for more information.

 

Can we use percentages to partition the dataset for training and testing, instead of using predefined values as shown in the demo?

 

Yes. The tasks in SAS Studio let you specify the percentages. SAS Studio also provides the Partition Data task, which lets you create your own partition variable like the one used in the demo. In this task you can select your desired percentages for the training, validation and test data sets.
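
The idea behind a percentage-based partition variable can be sketched in a few lines of Python. This is a hypothetical illustration using only the standard library, not the code that the SAS Studio Partition Data task generates. The function name, the 60/30/10 split, the partition labels and the seed are all assumptions.

```python
import random
from collections import Counter
from typing import List


def assign_partitions(n_rows: int, train_pct: float, validate_pct: float,
                      test_pct: float, seed: int = 12345) -> List[str]:
    """Return one partition label per row, honoring the requested percentages."""
    if abs(train_pct + validate_pct + test_pct - 100.0) > 1e-9:
        raise ValueError("percentages must sum to 100")
    n_train = round(n_rows * train_pct / 100.0)
    n_validate = min(round(n_rows * validate_pct / 100.0), n_rows - n_train)
    n_test = n_rows - n_train - n_validate  # remainder absorbs any rounding
    labels = ["train"] * n_train + ["validate"] * n_validate + ["test"] * n_test
    random.Random(seed).shuffle(labels)  # fixed seed keeps the split repeatable
    return labels


partition = assign_partitions(1000, train_pct=60, validate_pct=30, test_pct=10)
print(Counter(partition))
```

Fixing the seed means that rerunning the split gives the same assignment, which keeps model comparisons on the same validation rows.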

 

If my data is huge (like 100 GB), is there any problem loading it into memory?

 

No. The SAS platform is distributed and completely scalable. You can run algorithms against terabytes of data as long as your environment is properly sized. SAS is only constrained by your hardware environment.

 

Is there some documentation on Machine Learning?

 

Yes.  http://documentation.sas.com/?cdcId=vdmmlcdc&cdcVersion=8.2&docsetId=vdmmlug&docsetTarget=titlepage....

 

Do you have a website for the public users to try this product?

 

Yes. You can try the visual interface (https://www.sas.com/en_us/trials/software/data-mining-machine-learning/ep-form.html) or the developer interface (https://www.sas.com/en_us/trials/software/viya-developer/form.html).

Does VDMML produce JAVA/C/PMML score code?

 

VDMML generates SAS code (DATA step) or an analytic store to score data. It does not produce C, PMML or Java score code in the current version.

 

Is there a graphical output that compares the lift of various data mining choices so I can choose the one with the best lift?

 

Yes, this is available with SAS Visual Statistics and you can see a longer demo of the Model Comparison feature in the Ask the Expert for SAS Visual Statistics.
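
For readers who want to see what a lift comparison measures, here is a minimal Python sketch using only the standard library. The model names, labels, scores and the 20% depth are made up for illustration. The Model Comparison feature reports this kind of statistic graphically, and this sketch is not its implementation.

```python
from typing import Sequence


def cumulative_lift(y_true: Sequence[int], p_event: Sequence[float],
                    depth: float = 0.2) -> float:
    """Event rate in the top `depth` share of rows (by score) over the overall rate."""
    n_top = max(1, round(len(y_true) * depth))
    ranked = sorted(zip(p_event, y_true), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:n_top]) / n_top
    overall_rate = sum(y_true) / len(y_true)
    if overall_rate == 0:
        raise ValueError("no events in the data")
    return top_rate / overall_rate


# Hypothetical labels and scores for two candidate models.
labels = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
candidates = {
    "gradient_boosting": [0.9, 0.1, 0.3, 0.8, 0.2, 0.4, 0.1, 0.5, 0.3, 0.2],
    "neural_network": [0.7, 0.8, 0.2, 0.6, 0.1, 0.3, 0.2, 0.4, 0.5, 0.1],
}
lifts = {name: cumulative_lift(labels, scores) for name, scores in candidates.items()}
champion = max(lifts, key=lifts.get)  # model with the highest lift at this depth
print(lifts, champion)
```

A lift of about 3.3 at 20% depth means the top-scored fifth of rows contains events at roughly 3.3 times the overall rate.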

 

Could we import other open source graphical packages or should I rely on SAS VA?

You can import graphical packages so you get the best of both worlds. Of course, we like it when you use SAS :)

What is the benefit of using SAS Python functions vs. using packages like scikit-learn?

 

When you use the SAS CAS action sets (Python functions), remember that the data is loaded in memory, so you are using not only the power of SAS but also the full power of your hardware architecture. You also have access to SAS Tech Support and the SAS Community to support you in your projects. Additionally, your users can collaborate on projects even if you have a variety of user types (SAS programmers, open source programmers, or visual interactive users), and all will get the same answer.

Is there an efficient way of working with large, sparse matrices?

SAS offers numerous methods for addressing sparse matrices. In VDMML, the factorization machine (PROC FACTMAC) works well for sparse matrices.
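
To illustrate why factorization machines suit sparse data, here is a minimal Python sketch of the general second-order factorization machine scoring formula, applied to one sparse row stored as a dictionary of nonzero entries. It uses only the standard library. The parameter values are invented, and the sketch does not describe the internals of PROC FACTMAC.

```python
from typing import Dict, List


def fm_predict(x: Dict[int, float], w0: float, w: Dict[int, float],
               v: Dict[int, List[float]]) -> float:
    """Second-order factorization machine score for one sparse row.

    `x` maps column index -> nonzero value, so the cost depends on the number
    of nonzero entries rather than on the full width of the matrix.
    """
    linear = sum(w.get(i, 0.0) * xi for i, xi in x.items())
    n_factors = len(next(iter(v.values())))
    pairwise = 0.0
    for f in range(n_factors):
        s = sum(v[i][f] * xi for i, xi in x.items())
        s_sq = sum((v[i][f] * xi) ** 2 for i, xi in x.items())
        # 0.5 * ((sum)^2 - sum of squares) equals the sum over all pairs i < j
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Hypothetical one-hot row: one user (column 0) paired with one item (column 4).
row = {0: 1.0, 4: 1.0}
bias = 3.0
weights = {0: 0.2, 4: -0.1}
factors = {0: [0.5, 1.0], 4: [0.4, -0.5]}
print(fm_predict(row, bias, weights, factors))
```

Because each interaction weight is a dot product of two short factor vectors, the model can estimate interactions between columns that rarely or never co-occur in the training data.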

 

Can you have it run all the models automatically for you?

You can use pipelines to run many models against the same data and compare and assess the models with each other.

 

Can you do unsupervised learning on SAS Viya?

Yes, k-means clustering is available, as are several algorithms that help with anomaly detection. Here's an article on the communities that covers unsupervised learning in more detail.

 

Does the tool have XGBoost or similar Machine Learning methods?

Yes, and many additional resources. Here is a nice blog post on it.

 

Do you have a benchmark on how long it takes to run models on, let's say, 1 million records?

This would depend on the configuration and resources available to you. I've run a trillion records on one of our installs in less than 30 seconds for a logistic regression with 20 variables.

 

Can you use a node in the pipeline to implement R or Python code?

Yes, there is an open source node that was demonstrated in the presentation.

 

How does the open source code node in model building know to run R and Python? That is, do I need to install something in CAS to get R and Python executed?

Yes, the visual interface of VDMML has a node, demoed today, that lets you include R or Python code in your pipelines. From the drop-down box you specify whether you are using R or Python. And yes, R and Python need to be installed. More information about configuration can be found here.

 

My understanding is that R and Python code is processed within SAS. R packages are updated frequently. How are updates to the R packages in use and to the base R package handled?

Keeping R packages up to date is the responsibility of your administrator; SAS does not oversee this process. Also, R and Python code is not processed within SAS. Rather, it is submitted to the respective platforms to be processed. This article may help.

 

Can this all be done in SAS Enterprise Miner?

There is some overlap between SAS Enterprise Miner and VDMML but at this time they are not exactly the same in functionality.

 

Recommended Resources

Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow-up Q&A, slides and recordings from other SAS Ask the Expert webinars. To subscribe, select Subscribe from the Options drop-down button above the articles.

Comments

What is the actual difference between CAS Actions and the SAS procedures? and when do u use what?

CAS actions are the tools used to interact with data on the CAS server as part of the Viya platform. They are similar to traditional SAS procedures and are in fact the underlying units for SAS procedures in Viya. CAS actions act and behave more like the methods and options used in open source. In Viya you can choose in many instances to use either the SAS procedure or the CAS action set. For example, to create a random forest model I could use PROC FOREST or the decisionTree CAS action set with the forestTrain action. Both will give me the same results. A SAS programmer may feel more comfortable using the PROCs, whereas an open source programmer may feel more comfortable with the CAS actions.

Thanks alot for the reply. Very apt