If it sounds bizarre - that's because it is! Bizarro was a comic book supervillain, a 'negative' clone of Superman who lived on the planet Htrae (Earth, backwards). On Htrae, everything is backwards - wrong is right, good is bad.
Well - as it turns out - authors @DonH and @hashman have previously published, alongside their book, a set of SAS programs that auto-generate random data for an entire baseball season. This came to be known as Bizarro Ball.
After becoming aware of this project, and recognising that the code / data is so obviously useful for training exercises, product demos, testing etc., I loaded it onto GitHub with the authors' permission.
The reasons for this were severalfold:
The original package involved downloading, unzipping, and saving / configuring the code on a file system. On GitHub, there is now a build script which concatenates all the files into a single program - meaning that SAS Studio / University Edition users can simply click here and copy and paste the code directly into their editor. If you have internet access from SAS, it's even easier - simply run:
filename bizarro url
"https://raw.githubusercontent.com/allanbowe/BizarroBall/master/bizarroball.sas";
%inc bizarro;
Each SAS file was marked up with appropriate Doxygen header tags, enabling a documentation site to be generated. The great thing about the Doxygen approach is that code becomes 'self documenting' - in the sense that the documentation for each file lives inside the file itself. An example can be found here.
As is normal for a git repository, every change to the project is visible in the commit history (i.e. what changed, by whom, and when).
Perhaps the greatest thing about having the project on GitHub is that the entire world can contribute to make it better! Got ideas or suggestions? Raise an issue, or even a pull request (see CONTRIBUTIONS.md).
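As a rough illustration of the idea (not copied from the repository), a Doxygen-style header at the top of a SAS program might look something like the sketch below. The file name, descriptions and author are hypothetical; `@file`, `@brief`, `@details` and `@author` are standard Doxygen commands, but the exact set of tags used in Bizarro Ball may differ.

```sas
/**
  @file example_program.sas
  @brief One-line summary of what the program does (hypothetical)
  @details A longer description: inputs, outputs, and anything a
    reader should know before running the code.
  @author Your Name
**/

/* The program itself follows the header. To SAS, the header is just */
/* a block comment; Doxygen reads it to build the documentation.     */
```

Because the header is an ordinary SAS comment, it has no effect on how the program runs.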
The sascommunity.org site is being decommissioned, so it had to move at some point.
Apart from the randomness and the baseball domain interest, this data can also be modified (unlike the SASHELP datasets) - which can be useful in situations like class exercises. It is also interesting to follow and understand the hashing techniques used to create the data, described in more detail in this paper.
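For readers new to hashing in SAS, here is a minimal sketch of a hash object lookup in a DATA step. It is not taken from the Bizarro Ball programs: the datasets `work.teams` and `work.games`, their variables and their values are all made up purely for illustration.

```sas
/* Hypothetical lookup table (made-up values) */
data work.teams;
  length team_sk 8 team_name $20;
  input team_sk team_name $;
  datalines;
1 Aardvarks
2 Badgers
;
run;

/* Hypothetical fact table; team_sk 3 has no match on purpose */
data work.games;
  input game_sk team_sk;
  datalines;
101 2
102 1
103 3
;
run;

data work.games_named;
  if _n_ = 1 then do;
    /* never executed, but defines team_name in the PDV */
    if 0 then set work.teams;
    /* load the lookup table into memory, keyed on team_sk */
    declare hash h(dataset: 'work.teams');
    h.defineKey('team_sk');
    h.defineData('team_name');
    h.defineDone();
  end;
  set work.games;
  /* find() returns 0 on a match and copies team_name into the PDV */
  if h.find() ne 0 then call missing(team_name);
run;
```

The `call missing` matters because `team_name` is retained between iterations; without it, an unmatched row would silently carry over the name from the previous match.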
* The book (obviously)
* The aforementioned paper
* The DMHashBook tag
* The DMHashBookData tag
Now that all this code / data is documented and so easily accessible, what will YOU build with it? Do let us know!