Is this your first time using statistical procedures within SAS software? Are you new to statistics in general? Has it been a while since your last statistics course? Need a review of the multitude of statistical procedures found in SAS? If you answer yes to any of these questions, then this series is for you. In part 1, we discussed aspects of exploring and describing continuous variables. We investigated PROC SGPLOT, MEANS, UNIVARIATE, and CORR. In part 2, our discussion turned to the modeling aspects of continuous variables. Our focus was on PROC REG, GLM, GLMSELECT, and PLM. In part 3, we took our analysis to categorical variables. Specifically, we discussed procedures that allow us to investigate and explore any categorical variables in our data. In part 4, we discussed modeling categorical variables using the popular procedure PROC LOGISTIC. In part 5, we moved away from the classical programming aspects of statistical procedures and explored utilizing graphical interfaces to access descriptive and graphical procedures.
Let’s continue our use of graphical user interfaces. This time we will use SAS Visual Analytics to perform automated explanations and then transform these explanations into linear and logistic regressions.
Within the Explore and Visualize area (SAS Visual Analytics) of SAS Viya, there are multiple objects that can be placed on your report canvas allowing you to graphically view the data.
Previously in this series, we discussed how to explore the variables to find potential relationships between predictors and responses as well as among the predictors themselves. Graphics and summary statistics are great assistants with this exploration. There is another way to perform this exploration: Automated Explanation.
Before you can utilize this object within SAS Visual Analytics, you must reduce the variables visible from your data set to only those that are to be used within this automation. For example, say your data set contains an Account ID variable. We know that we would not likely use this variable in any model. Right clicking on this variable will open a menu that contains the option Hide.
This does not remove the variable from the data set but simply removes the variable from the GUI display.
We will need to do this for each variable that is not part of the possible regression analysis. But this will take a while if the number of variables is large. Is there another way to expedite this process? Why of course there is!
Clicking on the vertical ellipsis to the right of the Data label will open a menu that contains Show or hide data items.
Clicking this option will open a window allowing you to move variables from hidden to displayed and back more quickly and efficiently.
The arrow buttons between the two columns move the variables from one state to the other. Note that there are two arrows: the top arrow has one pointer, and the bottom arrow has two. The top (single pointer) button moves only the variables that you have selected, in the direction of the arrow. Select an item from Displayed and the arrow will point to Hidden, and vice versa. The bottom (double pointer) button moves all variables in one column to the other.
Here is a tip on how to use the arrows most efficiently. Use them like Keep and Drop in a data step. Determine which column will have the fewest variables when you are done. Move all variables to the other column. Then, holding the CTRL key on your keyboard, select the variables to move back and use the top arrow. This will save time and prevent mistakes. The variables that should end up in the Displayed column are the response variable of interest and all possible predictor variables from the data set.
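The Keep and Drop tip above can be sketched as a small decision rule. This is only an illustration written in Python, not something SAS Visual Analytics runs or exposes; the function name plan_moves and the column names are hypothetical, and it assumes every variable starts in the Displayed column.

```python
def plan_moves(all_vars, keep):
    """Plan the fewest manual selections in a Displayed/Hidden picker.

    Assumption: every variable starts in the Displayed column.
    """
    keep = set(keep)
    to_hide = [v for v in all_vars if v not in keep]
    to_show = [v for v in all_vars if v in keep]
    if len(to_hide) <= len(to_show):
        # Few variables to hide: select just those and use the single arrow.
        return {"bulk_move_all_to_hidden": False, "select_and_move": to_hide}
    # Many variables to hide: the double arrow sends everything to Hidden,
    # then the single arrow brings back only the variables to keep.
    return {"bulk_move_all_to_hidden": True, "select_and_move": to_show}


# Hypothetical data set: ten columns, four of which are needed for modeling.
columns = ["account_id", "name", "phone", "email", "branch", "open_date",
           "default_flag", "income", "age", "region"]
plan = plan_moves(columns, keep=["default_flag", "income", "age", "region"])
print(plan)
```

Here six columns would have to be hidden one selection at a time, so the sketch suggests the double arrow first and then moving the four modeling variables back.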
With this organization complete, return to your report and right click your response variable. In the menu that appears, select Explain then Explain on current page. This activates the Automated Explanation object and places it on the current page of the report.
It is important to note that if your response variable is categorical, you will have to make sure it is in the Category section of the variable list. If your response variable is binary and represented using 0s and 1s, this response variable will likely be found in the Measure area. To move this categorical variable, right click on the variable and select Convert to category on the menu.
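If you want to check ahead of time which numeric columns are really 0/1 flags, and so are likely to land in the Measure area, a quick scan of the data can help. The following is a minimal Python sketch for illustration only; the helper looks_binary, the sample rows, and the column names are all hypothetical and are not part of SAS Visual Analytics.

```python
def looks_binary(values):
    """Return True when the non-missing values are exactly 0 and 1."""
    observed = {v for v in values if v is not None}
    return observed == {0, 1}


# Hypothetical rows; None stands in for a missing value.
rows = [
    {"default_flag": 0, "income": 52000, "age": 34},
    {"default_flag": 1, "income": 31000, "age": 51},
    {"default_flag": 0, "income": 78000, "age": None},
]
numeric_columns = ["default_flag", "income", "age"]

# Columns that are stored as numbers but behave like categories.
candidates = [c for c in numeric_columns
              if looks_binary(r[c] for r in rows)]
print(candidates)
```

Any column flagged this way is a candidate for Convert to category before you start the automated explanation.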
From the resulting output, you will have a multitude of information describing the relationship between the predictor variables that were still displayed and the response variable you used to activate the automated explanation object.
You have now explored your data set and even utilized the automated explanation object. You are ready to move to modeling, either Linear Regression or Logistic Regression. Did you know that you can move directly from the automated explanation to a regression object?
In the upper right corner of the automated explanation object is a vertical ellipsis. Clicking this will display a menu that contains the choice, Change Automated explanation to. This menu expands to display other objects that SAS Visual Analytics can display. In this list you will see Linear Regression and Logistic Regression.
Your decision of the next object will depend on the type of response variable that you are interested in modeling and what individual situation your question presents. In my example, we have a binary response, so I selected Logistic Regression.
Immediately upon clicking Logistic Regression, SAS Visual Analytics replaces my automated explanation object with a logistic regression object. It appropriately sets the roles for the response and predictor variables from my data set and performs an analysis based on the default settings of the object. Of course, you have the option to change these settings as needed.
You can also perform these regression analyses without going through the automated explanation object. In the object pane, SAS Visual Analytics offers many different analyses that can be performed. Under the Statistics area you will find Linear Regression, Logistic Regression, and several others.
Drag and drop one of these objects to your canvas area. On the right side, you will find the Options and Roles areas. Complete the Roles area first. This is where you indicate the role of each variable within the analysis (Response, Continuous effects, Classification effects, etc.).
The Options area provides you with the opportunity to customize the analysis by selecting items from various check boxes and drop-down menus. Note that not every option for a particular analysis will be available in the GUI. To access all the options for an analysis, you will need to use the actual procedure code.
Interested in using SAS code to perform any of these statistical procedures? Please go back to the previous parts of this series. If you are not a coder, SAS has ways for you to access our statistical procedures using several graphical user interfaces. Try out SAS Visual Analytics and SAS Information Catalog and see what aspects you enjoy. See you in the next installment of this series.
Find more articles from SAS Global Enablement and Learning here.