Watch this Ask the Expert session to dive deeper into your data by bridging the gap between data exploration and advanced analytics.
Watch the Webinar
You will learn how to:
Use the Automated Explanation object.
Trust the results of SAS’ modeling objects in SAS Visual Analytics.
Take the next step.
The questions from the Q&A segment held at the end of the webinar are listed below and the slides from the webinar are attached.
Q&A
Can you have an option for both?
Yes, you can always do both!
How do you add multiple items?
An explanation has only one response. Technically, a category contains multiple items, but it still counts as one response. Predictors are added for you automatically when you add a response, so it's as easy as dragging and dropping them yourself or letting the object add everything for you. The other way to read your question is that you want to add multiple Automated Explanation objects. You can do that in several ways: duplicate the object on the canvas, or drag in a new Automated Explanation object and assign it another response. You can do the same thing with Automated Prediction. In general, it's one category or one response, and then as many predictors, or underlying factors, as you would like to add from your data.
How do you choose which variables to include?
The answer here is always that it depends on the industry, the data, and your knowledge of both. SAS will take all the variables you have that could either predict or explain the response and throw them into the model. It will use what sticks. You've got to look at what those variables are. For example, if you're predicting MSRP and you have invoice in there, ask yourself what is going to be important for you. What do you want an underlying factor to be? If you're predicting something, think about what data you are going to get in the future. If it's going to be car data, are you going to get those invoices? It's important to know what data you have, what data you might have in the future, and what logically makes sense to predict or explain a variable.
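One way to spot a variable like invoice before it goes into the model is to check how closely each candidate tracks the response. The sketch below is plain Python, not anything SAS Visual Analytics runs internally. The rows are made up for illustration (they are not taken from SASHELP.CARS), and the helper name flag_possible_proxies and the 0.98 threshold are assumptions chosen for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def flag_possible_proxies(target, candidates, threshold=0.98):
    """Return names of candidates that are almost a copy of the target.

    The threshold is an arbitrary illustrative cutoff, not a SAS default.
    """
    return [
        name
        for name, values in candidates.items()
        if abs(pearson(target, values)) >= threshold
    ]


# Made-up illustrative rows (hypothetical, not real SASHELP.CARS values).
msrp = [20000, 25000, 30000, 40000, 55000]
invoice = [18500, 23000, 27800, 36900, 50500]
horsepower = [140, 200, 170, 230, 300]

flagged = flag_possible_proxies(
    msrp, {"Invoice": invoice, "Horsepower": horsepower}
)
print(flagged)
```

A flagged variable is not automatically wrong to include. It is a prompt to ask the question from the answer above: will you actually have that value when you need to make the prediction?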
How can I trust what Automated Explanation tells me?
Automated Explanation has been designed from the ground up to be trustworthy in and of itself. It gives you information about what it's doing in the background, so it's not a complete black box. You can also verify those results. We did that with SASHELP.CARS: we looked at the modeling results and saw that they all made a lot of sense. The important thing to understand about Automated Explanation is that it gives you some insight into the relationships within your data. It's always important to go in and try to understand those relationships better. But you can trust the results because it tells you what it's doing and how it's doing it. Not to mention, these are based on the same types of CAS actions and models in the background that you would use yourself. In theory, you could create those same types of models in SAS and produce the same results as Automated Explanation using decision trees and the like. In a nutshell, it's using that stuff and presenting it to you in a different, easy-to-understand way. Remember, if you feel more comfortable modeling yourself, go for it. That's why those model objects are there. That's why SAS Studio is always there. Automated Explanation is a good place to start. If you ever want more control over things, there are many ways to get there.
Can you see the model fit stats when using Automated Prediction?
Automated Prediction is not an explanation object, so you don't see many of these statistics. You just see the mean square error. The better answer to that question lies in the more advanced features of SAS Visual Statistics and Visual Machine Learning, where you can see more statistics for a prediction object. This one is just the basics to get you started, so it won't show you much. You can see the underlying factors, what matters and what changes, but you should always take one more step in thinking about it. If you want to know those underlying statistics, you can use Automated Prediction as a baseline and then create your own model.
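For readers who have not worked with mean square error before, here is a minimal sketch of how it is computed and how it can serve as a baseline comparison. The numbers are made up, and predicting the average for every row is just one common choice of naive baseline, assumed here for illustration. It is not a description of what Automated Prediction does internally.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up illustrative values.
actual = [10.0, 12.0, 14.0, 16.0]
model_predictions = [11.0, 12.0, 13.0, 17.0]

# Naive baseline: predict the overall average for every row.
average = sum(actual) / len(actual)
baseline_predictions = [average] * len(actual)

mse_model = mean_squared_error(actual, model_predictions)
mse_baseline = mean_squared_error(actual, baseline_predictions)

print("model MSE:", mse_model)
print("baseline MSE:", mse_baseline)
```

A lower value means the predictions sit closer to the actual values, so a model you build yourself can be judged against the number the automated object reports.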
Stu mentioned a blog where VA and Studio are combined. Could you share the name of this blog?
You will find it in the SAS Communities Library. It's written by Ted Stolarczyk. It's a great article that will take what you can do in VA to the next level.
How much complexity can datasets have to work on automated explanation and prediction?
Honestly, any complexity. The more data, the better. Visual Analytics is designed to aggregate things for you, parse through your data, and crunch the numbers. That's what Cloud Analytic Services (CAS), the underlying engine behind all of this, does best. I think the best approach is to feed it raw data and let it sort everything out as best it can. If you take your data as is, without any pre-aggregation, and let it handle the rest, you tend to get some interesting results. I wouldn't worry too much about complexity or size. Overall, it's designed to handle that.
I tried to do the automated prediction using cars, but mine doesn't auto fill with average data.
To my knowledge it should, but it's OK if it doesn't. You can type in any reasonable values you want: 200 horsepower, 6 cylinders, even the details of your own car, and it will still come up with a prediction. The auto-filled median and mode values are just a place to start.
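If the fields do not fill in on their own, you can work out sensible starting values yourself. The sketch below assumes the defaults are the median for numeric items and the mode for category items, as the answer above suggests. The rows and the helper name default_inputs are hypothetical, made up for this example.

```python
from statistics import median, mode


def default_inputs(rows):
    """Median for numeric columns, most common value for everything else."""
    defaults = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        if all(isinstance(value, (int, float)) for value in values):
            defaults[column] = median(values)
        else:
            defaults[column] = mode(values)
    return defaults


# Made-up illustrative rows (hypothetical, not real SASHELP.CARS values).
rows = [
    {"Horsepower": 140, "Cylinders": 4, "Type": "Sedan"},
    {"Horsepower": 200, "Cylinders": 6, "Type": "SUV"},
    {"Horsepower": 170, "Cylinders": 4, "Type": "Sedan"},
    {"Horsepower": 300, "Cylinders": 8, "Type": "Sports"},
    {"Horsepower": 230, "Cylinders": 6, "Type": "Sedan"},
]

defaults = default_inputs(rows)
print(defaults)
```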
How would you do visual analytics if your data comes from multiple relational tables?
This happens a lot. Not all your data is going to come in from one clean CSV or other single source. Sometimes it comes from different relational databases. Sometimes it arrives once a month or once a day. Sometimes there are hundreds of these databases that you want combined into one. There are a couple of options. One is SAS Data Studio, which lets you easily combine these data sets straight from the relational databases. Whatever kind of database you have, or even files on your local computer, you can combine them. You can also schedule jobs so that the data keeps being combined as it comes in, and then do all your automated prediction and explanation. If you don't have, or don't want to use, SAS Data Studio, you can use SAS Studio or whatever SAS coding environment you have. There you can also combine these databases and create jobs and tasks to regenerate the combined data for you continuously.
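To make "combining" concrete, here is a minimal sketch of joining two related tables into the single flat table that a report would consume. It uses Python's built-in sqlite3 module with an in-memory database purely for illustration. It is not SAS Data Studio or SAS code, and the table and column names are made up.

```python
import sqlite3

# In-memory database standing in for two separate relational sources.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cars (car_id INTEGER PRIMARY KEY, model TEXT, horsepower INTEGER);
    CREATE TABLE prices (car_id INTEGER, msrp REAL);
    INSERT INTO cars VALUES (1, 'Alpha', 140), (2, 'Beta', 200), (3, 'Gamma', 300);
    INSERT INTO prices VALUES (1, 20000), (2, 25000);
    """
)

# LEFT JOIN keeps every car, even one that has no price row yet.
combined = conn.execute(
    """
    SELECT c.model, c.horsepower, p.msrp
    FROM cars AS c
    LEFT JOIN prices AS p ON p.car_id = c.car_id
    ORDER BY c.car_id
    """
).fetchall()
conn.close()

for row in combined:
    print(row)
```

The left join is a deliberate choice here: rows with a missing price show up with an empty value instead of silently disappearing, which makes gaps in the source data visible before any modeling.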
Is it possible to include open source models in VA and compare it with the built-in models?
Yeah, SAS works well with open source. You can use Python or R and link it up with the powerful parts of SAS and Cloud Analytic Services (CAS) to look at those models. To my knowledge, Model Pipeline, which is a machine learning component of Visual Analytics, lets you put in your code and compare it with the models that Visual Analytics makes. Within Visual Analytics itself, you are using the built-in models, and anything under the Visual Statistics/Visual Machine Learning group can be compared. If you want to do an open source comparison, the best way would be to use Model Studio or SAS Studio. Within Visual Analytics, though, you have groups of semi-on-rails models that you can compare with each other.
It looks like VA prediction uses one field/item at a time. Can one tell VA to make interaction terms?
Absolutely! The best way to do this would be to add calculated data items from the data pane. This allows you to make whatever types of interactions you’d like, which will automatically get passed through the model.
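As a rough picture of what such a calculated item holds, the sketch below builds an interaction term as the product of two existing items. This is plain Python for illustration, not the Visual Analytics expression editor. The rows and the helper name add_interaction are hypothetical, and a product is only one possible form of interaction.

```python
def add_interaction(rows, left, right, name=None):
    """Return new rows with an extra column holding left * right."""
    name = name or f"{left}_x_{right}"
    return [{**row, name: row[left] * row[right]} for row in rows]


# Made-up illustrative rows.
rows = [
    {"Horsepower": 140, "Cylinders": 4},
    {"Horsepower": 200, "Cylinders": 6},
    {"Horsepower": 300, "Cylinders": 8},
]

with_interaction = add_interaction(rows, "Horsepower", "Cylinders")
for row in with_interaction:
    print(row)
```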
Is there the equivalent of a log file in VA?
There unfortunately is not. However, in SAS Environment Manager, there are some logs you can refer to.
If we have NoSQL and SQL DB, how would the Visual Analytics engine work together?
SQL and SAS are powerful by themselves but even more powerful when coupled together. That coupling, though, happens mainly in SAS programming rather than in Visual Analytics. One thing you can do in Visual Analytics is link a database.
Recommended Resources
A Survey of Methods in Variable Selection and Penalized Regression
Modernizing Scenario Analysis with SAS® Viya and SAS® Visual Analytics
AI is Coming for Your BI: Automated Analysis in SAS® Visual Analytics
Please see additional resources in the attached slide deck.
Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow-up Q&A, slides, and recordings from other SAS Ask the Expert webinars.