03-09-2017 04:38 PM
If you’re headed to SAS Global Forum 2017, April 2-5, you may be thinking, “Wow, there are so many data mining and machine learning presentations, where do I begin?” Perhaps the list of sessions below from the SAS Advanced Analytics R&D and Product Management team will help you get started. You should definitely swing by the Data Mining and Machine Learning booth to say hello too.
Check out the SAS Global Forum 2017 community to get updates on various event happenings from how to get there to people’s favorite sessions.
And during the conference, put together your agenda and stay connected through the mobile app.
Machine learning is in high demand. Whether you are a citizen data scientist who wants to work interactively or a hands-on data scientist who wants to code, you have access to the latest analytic techniques with SAS® Visual Data Mining and Machine Learning on SAS Viya. This offering surfaces in-memory machine learning techniques such as gradient boosting, factorization machines, neural networks and much more through its interactive visual interface, SAS Studio tasks, SAS procedures, and APIs for Python. This paper shows you how to take advantage of these techniques and collaborate with your team, no matter your skill set. Learn about this multi-faceted new product and see it in action.
(Patrick Koch, Monday 4:00 - 4:30 PM) Location: Dolphin Level 3 - Oceanic 1
Machine learning predictive modeling algorithms are governed by “hyperparameters” that have no clear defaults agreeable to a wide range of applications. A few examples of quantities that must be prescribed for these algorithms are the depth of a decision tree, number of trees in a random forest, number of hidden layers and neurons in each layer in a neural network, and degree of regularization to prevent overfitting. Not only do ideal settings for the hyperparameters dictate the performance of the training process, but more importantly they govern the quality of the resulting predictive models. Recent efforts to move away from manual or random adjustment of these parameters have included rough grid search and intelligent numerical optimization strategies. This paper presents an automatic tuning implementation that uses SAS/OR® local search optimization for tuning hyperparameters of modeling algorithms in SAS® Visual Data Mining and Machine Learning. The AUTOTUNE statement in the NNET, TREESPLIT, FOREST, and GRADBOOST procedures defines tunable parameters, default ranges, user overrides, and validation schemes to avoid overfitting. Given the inherent expense of training numerous candidate models, the paper addresses efficient distributed and parallel paradigms for training and tuning in SAS® Viya™. It also presents sample tuning results that demonstrate improved model accuracy over default configurations and offers recommendations for efficient and effective model tuning.
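To make the idea of tuning against a validation set concrete, here is a minimal Python sketch. It is not the AUTOTUNE implementation and does not use SAS/OR local search: it is a plain exhaustive search over a handful of candidate values, and the toy k-nearest-neighbor model, the synthetic data, and every function name are assumptions made for illustration only.

```python
import random

def knn_predict(train, x, k):
    """Predict a 0/1 label for x by majority vote of the k nearest training points."""
    neighbors = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if 2 * votes > k else 0

def validation_error(train, valid, k):
    """Fraction of held-out points that the k-NN model misclassifies."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in valid)
    return wrong / len(valid)

def tune(train, valid, candidates):
    """Score every candidate hyperparameter on held-out data and keep the best."""
    scores = {k: validation_error(train, valid, k) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores

def make_data(n, rng):
    """Hypothetical toy data: label is 1 when x > 0, with 20% of labels flipped."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1, 1)
        label = 1 if x > 0 else 0
        if rng.random() < 0.2:
            label = 1 - label
        data.append((x, label))
    return data

rng = random.Random(0)
train = make_data(200, rng)
valid = make_data(100, rng)

# k (the neighborhood size) plays the role of the hyperparameter being tuned.
best_k, scores = tune(train, valid, candidates=[1, 5, 15, 31])
print("validation error by k:", scores)
print("selected k:", best_k)
```

Scoring on data the model was not trained on is what keeps the search from simply rewarding the most flexible setting. Smarter strategies replace the exhaustive loop in `tune` with a search that proposes new candidates based on earlier scores.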
(Udo Sglavo, Monday 5:00 - 5:30 PM) Location: Dolphin Level 3 – Oceanic 3
You've heard that SAS® Viya™ is our new, modern, open, and cloud-ready computing platform. Building on our unique foundation of expertise in delivering the most powerful and widely adopted analytics software in the world, SAS® Visual Data Mining and Machine Learning represents the next leap forward for SAS and our customers. Please join us to see where we are heading with SAS Visual Data Mining and Machine Learning in the next release. You will learn how we are converging large-scale analytics into a fully integrated offering within a unified environment. With the next version of SAS Visual Data Mining and Machine Learning, we combine the main components of a complete analytics lifecycle including data preparation, data exploration and visualization, model development, and deployment into a modern and approachable user experience.
(Funda Gunes, Tues 4:30-5:30) Location: Dolphin Level 3 - Oceanic 2
Ensemble models have become increasingly popular in boosting prediction accuracy over the last several years. Stacked ensemble techniques combine predictions from multiple machine learning algorithms and use these predictions as inputs to a second-level learning algorithm. This paper shows how you can generate a diverse set of models by various methods (such as neural networks, extreme gradient boosting, and matrix factorizations) and then combine them with popular stacking ensemble techniques, including hill-climbing, generalized linear models, gradient boosted decision trees, and neural nets, by using both the SAS® 9.4 and SAS® Visual Data Mining and Machine Learning environments. The paper analyzes the application of these techniques to real-life big data problems and demonstrates how using stacked ensembles produces greater prediction accuracy than individual models and naïve ensembling techniques. In addition to training a large number of models, model stacking requires the proper use of cross validation to avoid overfitting, which makes the process even more computationally expensive. The paper shows how to deal with the computational expense and efficiently manage an ensemble workflow by using parallel computation in a distributed framework.
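The role of cross validation in stacking can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the paper's workflow: the two base learners (a mean predictor and a one-variable least-squares line), the single blend weight used as the second-level learner, the synthetic data, and all names are assumptions chosen to keep the example small.

```python
import random

def fit_mean(rows):
    """Base learner 1: always predict the training mean of y."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y

def fit_line(rows):
    """Base learner 2: ordinary least-squares line through (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)

def out_of_fold(rows, fitters, n_folds=5):
    """For each row, collect base predictions from models that never saw that row."""
    preds = [None] * len(rows)
    for fold in range(n_folds):
        train = [r for i, r in enumerate(rows) if i % n_folds != fold]
        models = [fit(train) for fit in fitters]
        for i, (x, _) in enumerate(rows):
            if i % n_folds == fold:
                preds[i] = [m(x) for m in models]
    return preds

def fit_blend_weight(preds, targets, steps=100):
    """Second-level learner: choose w in [0, 1] minimizing the squared error
    of w * pred_0 + (1 - w) * pred_1 on the out-of-fold predictions."""
    def sse(w):
        return sum((w * p[0] + (1 - w) * p[1] - y) ** 2
                   for p, y in zip(preds, targets))
    return min((s / steps for s in range(steps + 1)), key=sse)

# Hypothetical data: y is roughly 2x plus Gaussian noise.
rng = random.Random(1)
xs = [rng.uniform(0, 5) for _ in range(100)]
rows = [(x, 2.0 * x + rng.gauss(0, 0.5)) for x in xs]

fitters = [fit_mean, fit_line]
oof = out_of_fold(rows, fitters)
w = fit_blend_weight(oof, [y for _, y in rows])

# Refit the base learners on all rows, then combine with the learned weight.
base = [fit(rows) for fit in fitters]

def stacked(x):
    return w * base[0](x) + (1 - w) * base[1](x)

print("weight on the mean predictor:", w)
```

Fitting the second-level learner on out-of-fold predictions is the step that guards against overfitting: if it saw predictions made on the base models' own training rows, it would overrate whichever base model memorizes best. It is also why the cost multiplies, since every base model is trained once per fold.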
(Ray Wright, Tues 4:30 - 5:00) Location: Dolphin Level 5 - The Quad - Upper - Theater 2
Temporal text mining (TTM) is the discovery of temporal patterns in documents that are collected over time. It involves discovery of latent themes, construction of a thematic evolution graph, and analysis of thematic patterns. This paper uses text mining and time series analysis techniques to explore Don Quixote de la Mancha, a two-volume master work of Western literature. First, it uses singular value decomposition in SAS® Text Miner to discover 25 key themes that characterize the two volumes. Then it treats the chapters of the two books as time-ordered documents and creates a semiautomated visual summary of the two volumes. It also explores the trajectory of individual themes over the course of the chapters and identifies episodes, recurring themes, and climaxes. Finally, it uses time series clustering in SAS® Enterprise Miner™ to group chapters that have similar themes and to group themes that have similar trajectories. The TTM methods demonstrated in this paper lend themselves to business applications such as monitoring changes in customer sentiment and summarizing research and legislative trends.
(Jorge Silva, Tues 5:30 - 6:00) Location: Dolphin Level 3 - Asia 2
Factorization machines are a novel type of model that is well suited to very high-cardinality, sparsely observed transactional data. This paper presents the new FACTMAC procedure, which implements factorization machines in SAS® Visual Data Mining and Machine Learning. This powerful and flexible model can be thought of as a low-rank approximation of a matrix or a tensor, and it can be efficiently estimated when most of the elements of that matrix or tensor are unknown. Thanks to a highly parallel stochastic gradient descent optimization solver, PROC FACTMAC can quickly handle data sets that contain tens of millions of rows. The paper includes examples that show you how to use PROC FACTMAC to recommend movies to users based on tens of millions of past ratings, predict whether a wine will be highly rated by connoisseurs, discover shot styles that best fit individual basketball players, and restore heavily damaged high-resolution images.
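For readers new to the model, the following Python sketch shows how a second-order factorization machine scores one observation, using the commonly published formulation (a bias, linear weights, and pairwise interactions given by dot products of low-rank factor vectors). It makes no claim about how PROC FACTMAC is implemented; the function names and the tiny hand-picked parameters are assumptions for illustration, and no training step is shown.

```python
def fm_predict_naive(x, w0, w, v):
    """Score x by summing every pairwise interaction explicitly: O(k * n^2)."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise

def fm_predict_fast(x, w0, w, v):
    """Same score in O(k * n), using the identity
    sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]."""
    n, k = len(x), len(v[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

# Hypothetical one-hot encoding of (user, item): two users followed by two items.
x = [1, 0, 0, 1]                 # user 0 rated item 1
w0 = 3.0                         # global bias
w = [0.1, -0.2, 0.3, 0.0]        # per-user and per-item biases
v = [[1.0, 0.5],                 # rank-2 factor vector for each column of x
     [0.0, 1.0],
     [0.5, 0.5],
     [2.0, -1.0]]

naive_score = fm_predict_naive(x, w0, w, v)
fast_score = fm_predict_fast(x, w0, w, v)
print(naive_score, fast_score)
```

Because each interaction weight is a dot product of learned factor vectors rather than a free parameter, the model can score a user and item pair that never appeared together in the data, which is what makes it suitable for sparse transactional tables.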
(Ralph Abbey, Weds 11:30 - 12:00) Location: Dolphin Level 3 - Oceanic 3
Many practitioners of machine learning are familiar with support vector machines (SVMs) for solving binary classification problems. Two established methods of using SVMs in multinomial classification are the one-versus-all approach and the one-versus-one approach. This paper describes how to use SAS® software to implement these two methods of multinomial classification, with emphasis on both training the model and scoring new data. A variety of data sets are used to illustrate the pros and cons of each method.
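The two decomposition schemes can be sketched independently of any particular binary learner. In the Python example below, `fit_binary` is a nearest-centroid scorer used purely as a stand-in; it is not an SVM and not the SAS implementation, and the data and names are hypothetical. What the sketch does show accurately is how one-versus-all and one-versus-one turn binary models into a multiclass prediction.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def fit_binary(pos, neg):
    """Stand-in binary learner (NOT an SVM): the score is positive when x is
    closer to the positive-class centroid than to the negative-class centroid."""
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: sq_dist(x, cn) - sq_dist(x, cp)

def fit_one_vs_all(data):
    """One model per class (class versus everything else); predict the class
    whose model gives the highest score."""
    models = {}
    for c in data:
        rest = [p for other, pts in data.items() if other != c for p in pts]
        models[c] = fit_binary(data[c], rest)
    return lambda x: max(models, key=lambda c: models[c](x))

def fit_one_vs_one(data):
    """One model per pair of classes; each model casts a vote and the class
    with the most votes wins."""
    models = {(a, b): fit_binary(data[a], data[b])
              for a, b in combinations(sorted(data), 2)}
    def predict(x):
        votes = {c: 0 for c in data}
        for (a, b), score in models.items():
            votes[a if score(x) > 0 else b] += 1
        return max(votes, key=votes.get)
    return predict

# Hypothetical three-class data: class label -> list of 2-D points.
data = {
    "a": [(0, 0), (1, 0), (0, 1)],
    "b": [(5, 0), (6, 0), (5, 1)],
    "c": [(0, 5), (1, 5), (0, 6)],
}

ova = fit_one_vs_all(data)
ovo = fit_one_vs_one(data)
for point in [(0.2, 0.1), (4.8, 0.3), (0.1, 5.2)]:
    print(point, ova(point), ovo(point))
```

With K classes, one-versus-all trains K models on all of the data, while one-versus-one trains K(K-1)/2 models on smaller two-class subsets. That difference in model count and training-set size is one of the trade-offs between the two approaches.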
(Ye Liu, Weds 12:00 - 12:30) Location: Dolphin Level 3 - Oceanic 7
A Bayesian network is a directed acyclic graphical model that represents probability relationships and conditional independence structure between random variables. SAS® Enterprise Miner™ implements a Bayesian network primarily as a classification tool; it supports naïve Bayes, tree-augmented naïve Bayes, Bayesian-network-augmented naïve Bayes, parent-child Bayesian network, and Markov blanket Bayesian network classifiers. The HPBNET procedure uses a score-based approach and a constraint-based approach to model network structures. This paper compares the performance of Bayesian network classifiers to other popular classification methods such as classification tree, neural network, logistic regression, and support vector machines. The paper also shows some real-world applications of the implemented Bayesian network classifiers and a useful visualization of the results.
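Naïve Bayes is the simplest of the structures listed above: the class variable is the only parent of every input, so the inputs are assumed conditionally independent given the class. The Python sketch below illustrates that one case for categorical inputs with add-one (Laplace) smoothing. It is not the HPBNET procedure and does no structure learning; the function name, the smoothing choice, and the toy data are assumptions for illustration.

```python
import math
from collections import Counter

def fit_naive_bayes(rows, labels, alpha=1.0):
    """Return a predict function for categorical naive Bayes.
    alpha is the additive smoothing constant (1.0 = Laplace smoothing)."""
    class_counts = Counter(labels)
    n_features = len(rows[0])
    # value_counts[c][i][v] = number of class-c rows whose feature i equals v
    value_counts = {c: [Counter() for _ in range(n_features)]
                    for c in class_counts}
    values = [set() for _ in range(n_features)]
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[c][i][v] += 1
            values[i].add(v)

    def predict(row):
        scores = {}
        for c, n_c in class_counts.items():
            # log P(c) + sum_i log P(x_i | c), with smoothed estimates
            score = math.log(n_c / len(labels))
            for i, v in enumerate(row):
                num = value_counts[c][i][v] + alpha
                den = n_c + alpha * len(values[i])
                score += math.log(num / den)
            scores[c] = score
        return max(scores, key=scores.get)

    return predict

# Hypothetical toy data: (outlook, temperature) -> whether to play outside.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("overcast", "hot"), ("overcast", "cool")]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

predict = fit_naive_bayes(rows, labels)
print(predict(("sunny", "hot")), predict(("rainy", "cool")))
```

The richer classifiers in the list relax the independence assumption by allowing edges between inputs, which requires learning the network structure in addition to the conditional probability tables.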
Learn how quickly you can get started with the new machine learning procedures using the tasks and snippets in SAS® Studio.
Learn how the new batch code support in SAS Factory Miner enables you to retrain all of your models and register them with SAS® Model Manager without using the Factory Miner user interface.
Learn effective ways to combine hundreds of models to build a strong predictive model on the new SAS® Viya™ platform.
Learn how to use the SAS Enterprise Miner Link Analysis node to gain insights about associations in your data.
Learn about the new SVDD procedure in SAS® Visual Data Mining and Machine Learning for performing one-class classification and outlier detection. This demo applies SVDD to equipment condition monitoring.
Learn how to build better models faster with the latest advancements in automated hyperparameter tuning in SAS® Visual Data Mining and Machine Learning.
Find out what SAS® Visual Data Mining and Machine Learning offers through its interactive visual interface, SAS® Studio tasks, procedures, and a Python client.
Learn the details and options associated with the various tree-based algorithms in SAS® Visual Data Mining and Machine Learning.
03-09-2017 04:58 PM
If you're interested, I am presenting "Using Segmentation to Build More Powerful Models with SAS® Visual Analytics" on Tuesday at 11:30 AM in The Quad - Upper - Theater 3.