Machine learning is all the buzz. What makes it so cool? Two key things: it's automated and it's adaptive. Using iterative processes, machine learning builds models that adjust automatically with little or no human intervention. The models “learn” from previous computations, producing increasingly accurate results as more data are added. Machine learning is used to identify patterns and predict future outcomes.
In the movie Furious 7 co-starring Dwayne Johnson (yes, he is the only reason I went to see the movie), the hacker character Megan Ramsey creates a tool called “God’s Eye.” It hacks into every camera around the world, uses face recognition software to compare camera images to the person targeted, and locates the target individual within 4 minutes. How cool is that? (And how scary is that!)
If you use social media tools like Facebook, you may have noticed that Facebook is now tagging people’s faces of its own accord. It isn’t perfect. I am annoyed to say that Facebook has started tagging my face with my aunt’s name, and she is 13 years older than I am. But the more times my face is tagged, the more the algorithms “learn” to tag me correctly.
Face recognition is just one of the many amazing uses of machine learning.
What else is machine learning used for?
Machine learning is used for everything from banking and credit risk assessment to fraud detection, sentiment analysis for marketing, and sales prediction. A few examples are:
- Object, face, and voice recognition
- Text sentiment analysis
There are many different methods of machine learning. What they all have in common is that they are automated and adaptive. Neural networks are one good example of a machine learning technique. Neural networks are modeled after the human brain.
I like to use a simple example of distinguishing an apple from an orange. If I want to make an apple pie, and I ask my niece to bring home apples from the market, she needs to be able to tell an apple from an orange. If she comes home with oranges, well…best case scenario, there will be no dessert. Luckily, she can do this easily, because over time as a child she learned to differentiate an apple from an orange based on color, size, smell, and texture. Someone would tell her that yes, that is an apple, or no, Sweetie, that is not an apple. This is called “supervised learning.”

Color, size, smell, and texture are all “features” or “attributes,” or as statisticians like to call them, independent variables. These features help determine the outcome. In this case the outcome is binary: it’s an apple, or it’s not an apple. In my niece’s brain, these attributes carry different weights. For example, texture or color might be weighted higher than size, because they are better at differentiating apples from oranges.
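That weighting can be sketched as a toy linear classifier: score each fruit by a weighted sum of its features and compare the score to a threshold. The feature encodings, weights, and threshold below are all invented for illustration; in supervised learning they would be estimated from labeled examples rather than written by hand.

```python
# Toy "is it an apple?" classifier: a weighted sum of features
# compared to a threshold. The weights are invented; supervised
# learning would estimate them from labeled fruit examples.
# Feature encoding (also invented): redness 0-1, smoothness 0-1
# (apples are smoother than oranges), size = diameter in cm.
WEIGHTS = {"redness": 2.0, "smoothness": 1.5, "size": 0.1}
THRESHOLD = 2.0

def is_apple(features):
    """Return True if the weighted feature score clears the threshold."""
    score = sum(WEIGHTS[name] * value for name, value in features.items())
    return score > THRESHOLD

apple = {"redness": 0.9, "smoothness": 0.8, "size": 8.0}
orange = {"redness": 0.2, "smoothness": 0.1, "size": 8.0}

print(is_apple(apple))   # True: high redness and smoothness
print(is_apple(orange))  # False: low redness, rough skin
```

Notice that size contributes little to either score, exactly because its weight is low: both fruits are about the same size, so it is a poor differentiator.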
Machine learning doesn’t have to be supervised. Clustering is an example of unsupervised learning.
In supervised learning, there are historic data where the label (target, outcome, dependent variable) value is known. In the case of loan default risk, there is a dataset of people who have or have not defaulted on their loans in the past. There are also attributes (features, inputs, independent variables), such as income, age, and debt amount. The model is trained on the historic data and can then be applied to new data with the same attributes to determine the likelihood of the outcome, for example, the likelihood of defaulting on a loan.
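To make the train-then-score pattern concrete, here is a minimal sketch using a 1-nearest-neighbor rule, one of the simplest supervised learners: a new applicant gets the label of the most similar historic case. The records and attribute values are invented for illustration; a real model would need far more data and careful scaling of the attributes.

```python
import math

# Historic, labeled data: (income in $1000s, age, debt in $1000s)
# paired with whether the person defaulted. All values invented.
historic = [
    ((30.0, 22, 40.0), True),
    ((35.0, 25, 35.0), True),
    ((80.0, 45, 10.0), False),
    ((95.0, 50, 5.0), False),
]

def predict_default(applicant):
    """Label a new applicant with the label of the closest
    historic case (1-nearest-neighbor)."""
    _, nearest_label = min(
        historic, key=lambda row: math.dist(applicant, row[0])
    )
    return nearest_label

# New applicants described by the same attributes as the training data.
print(predict_default((32.0, 24, 38.0)))  # True: resembles past defaulters
print(predict_default((90.0, 48, 8.0)))   # False: resembles non-defaulters
```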
In some arenas supervised learning is called “predictive modeling” or “predictive analytics.” Business cases where supervised learning is used include marketing (customer response), fraud detection, risk scoring, and recommender systems (e.g., "if you like the movie Furious 7, you might like the movie San Andreas"). Machine learning includes an iterative learning loop. As new information is received (e.g., new loan defaults), the expected results are compared to the modeled results and the model automatically updates and improves.
In unsupervised learning there are no known labels (outcomes), only attributes (inputs). Examples include clustering, association, and segmentation. The algorithm finds high-density regions in multidimensional space, groups observations that are similar to each other, and identifies structures in the data that separate these groups.
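As a sketch of how clustering finds those high-density regions, here is a plain k-means loop in pure Python. The points are invented, and a real implementation would add random restarts and a convergence test; this version just seeds the centroids with the first k points.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means: repeatedly assign each point to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    centroids = list(points[:k])  # naive seeding; real code randomizes
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(
                    sum(v) / len(cluster) for v in zip(*cluster)
                )
    return centroids, clusters

# Invented 2-D points with two obvious high-density regions, e.g.
# customers plotted by two behavioral attributes.
points = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
          (8.0, 8.2), (7.9, 8.0), (8.2, 7.8)]
centroids, clusters = kmeans(points, k=2)
print(sorted(len(c) for c in clusters))  # [3, 3]: the two groups found
```

No labels were supplied anywhere: the two groups fall out of the geometry of the data alone.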
Business cases where clustering is used include customer segmentation, text topic detection, and recommendations. Business cases where outlier detection is used include fraud detection, insider threat, and cybersecurity.
Machine learning shares many algorithms and approaches with statistical modeling. In both classical statistics and machine learning, outputs are predicted from inputs. However, while classical statistics is inferential, machine learning is results-driven. Classical statistics typically predicts outcomes with an interpretable “white box” model, whereas machine learning often relies on a “black box.”
In the words of Breiman (2001): “There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model [classical statistics]. The other uses algorithmic models and treats the data mechanism as unknown [machine learning].”
Machine learning techniques are particularly useful for prediction problems that involve complex calculations on large data sets.
Two things have made machine learning much more feasible today than it was 50 years ago: faster, cheaper computer processing and inexpensive data storage. Early statisticians had to do their calculations by hand. Computing power and data storage were still expensive at the end of the last millennium, but they are cheap now. Just think about how much computing power is in a smartphone today versus a whole room of equipment in the 1950s. For that reason, although many machine learning algorithms were developed in the 20th century, practitioners simply did not have the computing power to iterate rapidly through complex mathematical calculations on large data sets in a reasonable amount of time.
Richard Bellman coined the phrase the “curse of dimensionality” in the 1950s to describe the problem that complexity grows exponentially as the number of scalar inputs grows linearly. This exacerbated the need for computing power. One approach to overcome the curse of dimensionality is feature extraction, i.e., pre-processing the data to reduce the number of inputs. Nowadays, data scientists have much more computing power available to them and feature extraction techniques to help ameliorate the curse.
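The curse is easy to make concrete with a quick count: split each input into just 10 bins, and the number of cells needed to cover the input space is 10 raised to the number of inputs, so each additional input multiplies the work tenfold.

```python
# Split each of d inputs into 10 bins; counting the cells needed to
# cover the input space shows why complexity grows exponentially
# while the number of inputs grows only linearly.
BINS = 10

def cells_to_cover(dimensions):
    return BINS ** dimensions

for d in (1, 2, 3, 10):
    print(f"{d:2} inputs -> {cells_to_cover(d):,} cells")
# 10 inputs already require 10,000,000,000 cells.
```

Feature extraction attacks exactly this count, by shrinking `dimensions` before the expensive modeling begins.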
SAS has included machine learning algorithms for decades. For example, machine learning techniques such as neural networks, decision trees, random forests, gradient boosting, and text analytics have been readily available to you for quite some time.
With the introduction of SAS Viya Data Mining and Machine Learning in May 2016, these machine learning methods became readily available through SAS Studio with the extra power of the new CAS architecture behind the scenes. More recently, SAS® Viya™ Data Mining and Machine Learning was redesigned to run natively on Azure, taking advantage of containers, Azure Kubernetes Service (AKS), Azure Synapse and other Microsoft services so you get the most out of your cloud investments. Learn more about SAS Viya Data Mining and Machine Learning.
Machine learning is invaluable for everything from combatting fraud, to marketing effectively, to assessing risk.
So the real question is, can a machine distinguish the Furious 7 movie character Megan Ramsey from the Flashdance movie character Alex Owens? Of course! Although the hair and clothing are similar, the facial features are actually very different.