BookmarkSubscribeRSS Feed

Rule-Based Codebook Generation for Exploratory Data Analysis

Started ‎03-29-2022 by
Modified ‎03-29-2022 by
Views 1,193

A codebook is a summary of a collection of data that reports significant features of the assembled data. It can be used to provide insights into the data collection. Survey data are often assembled into codebooks for rapid understanding of the questions asked, and typically are limited to variable name, variable label, categorical variable values and their frequency counts, and simple descriptive statistics for continuous variables.

 

Ideally, a codebook provides more value to a statistician or data miner when it presents variables in a format suitable for exploratory data analysis. The %CODEBOOK macro described in this paper uses the SAS® Enterprise Miner heuristic rules for producing metadata, and classifies variables according to the Enterprise Miner rules into nominal, ordinal, or interval measurement scales. It creates presentations of the variables appropriate to their measurement scale and may be used as an exploratory data analysis (EDA) tool for a first look at a set of data.

 

I am sharing the %CODEBOOK macro with the SAS community to assist with the EDA phase of a project. I hope that you will find it useful in your work.

Version history
Last update:
‎03-29-2022 04:10 PM
Updated by:

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags