Generating data has a number of use cases, for example:
For our book, Data Management Solutions Using SAS® Hash Table Operations: A Business Intelligence Case Study, we (@hashman and @DonH) needed to generate sample data. Choosing sample data can be challenging. If you use data that is specific to an industry or subject area, users in other industries have trouble relating to it (or occasionally dismiss it out of hand). For that reason we decided to use sports-related data and chose baseball, in part because @DonH is a baseball geek. A great deal of data is collected about baseball games, and baseball fans are very focused on the analytics of the game (referred to as sabermetrics).
We were unable to use the XML data for Major League Baseball, so we decided to generate data for a complete season of a game we came to call Bizarro Ball. Bizarro Ball is similar to baseball, but some of its rules are bizarrely different, thus the name.
We used the hash object in many of the programs to generate the data. During the technical review of the book, we got feedback that the description of how we generated the data was interesting but did not seem to fit the Data Management and Business Intelligence theme of the book. So we decided not to include those details in the book and to document them externally instead.
Given that generating data is of broader interest than just what we needed for our sample data, we decided that the series of articles we had planned to write might be of interest to SAS users beyond those who are interested in the book and want to generate different sample data.
This article will be updated as we write the additional articles that talk about our general approaches (use of random numbers, random selection, parameter files, what to parameterize vs. what to hard-code, and so on).
So please follow this article if you are interested in being notified about the follow-on articles that address these topics in more detail.