Who wants to be a billionaire? - SAS Voices

As you may have heard, billionaire philanthropist Warren Buffett and Cleveland Cavaliers owner Dan Gilbert have teamed up to offer $1 billion to anyone who can create a perfect NCAA March Madness bracket.

“Wow,” you might say. “How hard can it be to create a perfect bracket? I could really use a billion bucks!”

Well, the answer is “really, really, unbelievably hard.” So hard that in the history of March Madness, no one has ever done it. For you math lovers out there, the odds are supposedly 1 in 9.2 quintillion.

And what if someone is actually able to create this magical winning bracket?

"I will invite him or her to be my guest at the final game and be there with a check in my pocket, but I will not be cheering for him or her to win," Buffett said, jokingly. "I may even give them a little investment advice."

I wanted to share with the Data Mining community how I used SAS Enterprise Miner in connection with the mania around March Madness. I relied heavily on SAS Enterprise Miner and SAS Rapid Predictive Modeler, using March Madness as a way to get customers comfortable with data mining techniques.

These are the steps I took to pull in the data to be analyzed.

Through my research, I’ve also compiled a list of some helpful (and some not-so-helpful) factors for selection.

Here’s what’s been successful in the past:

- RPI (Rating Percentage Index), based upon wins, losses, and strength of schedule
- Jeff Sagarin rankings from USA Today
- Wins against top 25 teams (per RPI rankings)
- Wins against teams ranked 26-50
- Neutral court wins (Note: conference tournaments matter!)
- Record and rank in-conference (regular season championships matter!)
- Strength of conference (conferences do matter!)

And here’s what doesn’t work:

- A team’s record in the last 10 games, i.e. the "hot team" myth. A strong finish is not important compared with a team’s overall performance. Those "hot teams" are often doing some of the other things (winning on neutral courts, and beating teams in the top 25 or top 50) that do help boost their chances according to the Dance Card.
- A team’s record against teams ranked 50-100. Winning against good teams helps, and the Dance Card model shows there's little downside in losing to good teams.

The lesson for athletic directors? Schedule fewer cupcakes and more top 50 RPI teams.

In addition to SAS products, here are the other sources I tapped:

- Georgia Tech LRMC Bayesian results Basketball Rankings (School of Industrial and Systems Engineering)
- Kenpom.com Advanced Analysis of College Basketball
- NCAA Dance Card – University of South Florida (powered by SAS)
- Jeff Sagarin (USA Today Rankings)
- NCAA Basketball

Let's submit a community bracket to prove that SAS has the best modelers around! Entries must be complete prior to the start of the NCAA tournament.
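
For the math lovers: the 1 in 9.2 quintillion figure quoted earlier is most likely the number of ways to fill out a 64-team bracket if every game is treated as a coin flip. Here is a minimal Python sketch of that arithmetic. The 64-team field and the 50/50 assumption are mine, not part of the contest rules, and the helper name `perfect_bracket_odds` is purely illustrative.

```python
# Assumption: a 64-team single-elimination field, so 63 games decide the champion.
TEAMS = 64
GAMES = TEAMS - 1  # every game eliminates exactly one team

# Assumption: each game is a fair coin flip, so every bracket is equally likely.
coin_flip_brackets = 2 ** GAMES  # exact integer count of distinct brackets


def perfect_bracket_odds(games: int = GAMES, p_correct: float = 0.5) -> float:
    """Return N in '1 in N' odds, given a per-game chance of a correct pick.

    p_correct is a hypothetical knob: 0.5 is a pure guess, and higher values
    stand in for a picker (or model) that beats a coin flip.
    """
    return 1.0 / (p_correct ** games)


if __name__ == "__main__":
    print(f"Games to pick: {GAMES}")
    print(f"Coin-flip odds: 1 in {coin_flip_brackets:,}")
    print(f"Same figure via helper: 1 in {perfect_bracket_odds():.3e}")
```

Raising `p_correct` above 0.5 shrinks the odds dramatically, which is the whole point of bringing better predictors to the problem, though even a very good picker is still facing long odds over 63 games.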