
SAS Enterprise Miner shortcut: Use the Advanced option when profiling data

Started ‎07-28-2017 by
Modified ‎07-28-2017 by

Thank you to Philip Easterling, SAS Principal Systems Engineer, for contributing this tip as part of the SAS Enterprise Miner shortcut series. In his words:

 

Here’s a simple one which is sometimes overlooked because it’s not the SAS Enterprise Miner (EM) default. 

 

When you go through the “Create Data Source” wizard, you encounter the properties screen, which has two options (Basic and Advanced). Basic is the default, and in my experience most people simply accept it. However, I always switch to the Advanced option because it provides a convenient way to profile the data.

 

After completing the “Create Data Source” wizard steps and incorporating this data as an Input Data Source node in an EM diagram, I then select the “Edit Variables” option for the node. When the variable list opens in a window, I then check the “Statistics” box at the top right. This provides a lot of information about both the categorical and numeric variables, including the percent of rows with missing values for each variable. 

 

You can click on a column heading (a statistic) to sort by ascending or descending values of that statistic. So, I can sort the missing values column to get an idea of which of my variables have a large percentage of missing values. Similarly, I can “profile” any other variable by looking at its values for any of the reported statistics. This minimizes the need to switch to something like SAS Enterprise Guide to profile the data.
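
For readers who want to see what the sorted missing-values view amounts to, here is a minimal Python sketch of the same idea: compute the percent of rows with a missing value for each variable, then sort from highest to lowest. It does not show how EM computes its statistics. The function name `percent_missing` and the sample rows are hypothetical, and a missing value is assumed to be represented as `None`.

```python
from typing import Dict, List, Optional, Tuple


def percent_missing(rows: List[Dict[str, Optional[object]]]) -> List[Tuple[str, float]]:
    """Return (variable, percent of rows missing) pairs, highest percent first."""
    if not rows:
        return []

    # Collect variable names in the order they first appear.
    names: List[str] = []
    for row in rows:
        for name in row:
            if name not in names:
                names.append(name)

    total = len(rows)
    result: List[Tuple[str, float]] = []
    for name in names:
        # A value counts as missing if it is None or the key is absent.
        missing = sum(1 for row in rows if row.get(name) is None)
        result.append((name, 100.0 * missing / total))

    # Sort descending by percent missing, like clicking the column heading.
    result.sort(key=lambda pair: pair[1], reverse=True)
    return result


# Hypothetical sample data, for illustration only.
rows = [
    {"age": 34, "income": None, "region": "West"},
    {"age": None, "income": None, "region": "East"},
    {"age": 51, "income": 72000, "region": "East"},
    {"age": 28, "income": None, "region": None},
]

for name, pct in percent_missing(rows):
    print(f"{name:<8}{pct:6.1f}%")
```

In this sample, `income` sorts to the top because it is missing in three of the four rows, which is the kind of variable the sorted Statistics view helps you spot quickly.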

 

You can also profile the data during the “Create Data Source” process itself by checking the “Statistics” box at the top right of the variable list screen, which is the next screen after the Basic/Advanced selection step. If you stay with the default Basic option, the “Statistics” box is always grayed out and unavailable in EM when you choose “Edit Variables,” so you miss out on this convenient data profiling method.

 

 

Have questions related to this tip? Ask them on the SAS Data Mining and Machine Learning Community to get perspective from a large pool of SAS Enterprise Miner experts. Simply click "New Message" (must be logged in!) and ask away.

Version history
Last update:
‎07-28-2017 04:19 PM
