Thank you to Philip Easterling, SAS Principal Systems Engineer, for contributing this tip as part of the SAS Enterprise Miner shortcut series. In his words:
Here’s a simple one which is sometimes overlooked because it’s not the SAS Enterprise Miner (EM) default.
When you go through the “Create Data Source” wizard, you encounter the properties screen, which has two options (Basic and Advanced). Basic is the default, and, from my observations, most people just go with the default Basic option. However, I always switch to the Advanced option because it provides a convenient way to profile the data.
After completing the “Create Data Source” wizard steps and incorporating this data as an Input Data Source node in an EM diagram, I then select the “Edit Variables” option for the node. When the variable list opens in a window, I then check the “Statistics” box at the top right. This provides a lot of information about both the categorical and numeric variables, including the percent of rows with missing values for each variable.
You can click on a column heading (a statistic) to sort by ascending or descending values of that statistic. So, I can sort the missing values column to get an idea of which of my variables have a large percentage of missing values. Similarly, I can “profile” any other variable by looking at its values for any of the reported statistics. This minimizes the need to switch to something like SAS Enterprise Guide to profile the data.
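If it helps to see what the missing-values statistic and the column sort amount to, here is a minimal sketch in plain Python. It is not Enterprise Miner code and not how EM computes its statistics. The function names (`is_missing`, `percent_missing`, `sort_by_missing`) and the sample rows are hypothetical, and it assumes that `None`, NaN, and blank strings all count as missing.

```python
import math
from typing import Any, Dict, List, Tuple


def is_missing(value: Any) -> bool:
    """Assumption for this sketch: None, NaN, and blank strings are missing."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return False


def percent_missing(rows: List[Dict[str, Any]]) -> Dict[str, float]:
    """Return, for each variable, the percent of rows with a missing value."""
    if not rows:
        return {}
    # Collect variable names in first-seen order across all rows.
    variables: List[str] = []
    for row in rows:
        for name in row:
            if name not in variables:
                variables.append(name)
    total = len(rows)
    stats: Dict[str, float] = {}
    for name in variables:
        # A variable absent from a row is treated as missing for that row.
        missing = sum(1 for row in rows if is_missing(row.get(name)))
        stats[name] = 100.0 * missing / total
    return stats


def sort_by_missing(
    stats: Dict[str, float], descending: bool = True
) -> List[Tuple[str, float]]:
    """Mimic clicking the column heading: order variables by percent missing."""
    return sorted(stats.items(), key=lambda item: item[1], reverse=descending)


# Hypothetical sample data, for illustration only.
sample_rows = [
    {"age": 34, "income": 52000.0, "region": "East"},
    {"age": None, "income": 61000.0, "region": ""},
    {"age": 45, "income": None, "region": "West"},
    {"age": None, "income": 48000.0, "region": "South"},
]

for variable, pct in sort_by_missing(percent_missing(sample_rows)):
    print(f"{variable}: {pct:.1f}% missing")
```

Sorting in descending order puts the variables with the most missing values at the top, which is the same view you get by clicking the missing-values column heading in the variable list.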
You can also profile the data during the “Create Data Source” process itself: the variable list screen comes right after the Basic/Advanced property selection step, and it has the same “Statistics” box at the top right. If you go with the default “Basic” option, the “Statistics” box is always grayed out and unavailable when you choose “Edit Variables” in EM, so you miss out on this convenient data profiling method.
Have questions related to this tip? Ask them on the SAS Data Mining and Machine Learning Community to get perspective from a large pool of SAS Enterprise Miner experts. Simply click "New Message" (must be logged in!) and ask away.