This presentation combined three techniques to generate better synthetic data: active sampling, a tabular GAN, and SAS® autotuning. Used together, they can produce synthetic data that improves a model's performance on its prediction task. A variation of query by committee, written in SAS, subsamples the data by drawing cases near the decision boundary for the problem at hand. Sampling this way reduces the data to the observations most relevant to the problem, which improves the signal and reduces the computational burden on the GAN. The tabular GAN then learns representations from the subsampled data only. The presentation showed how autotuning can tune the tabular GAN to produce better synthetic representations: autotuning maximizes the error of a pseudo discriminator that attempts to distinguish real from artificial data. The approach combines key SAS technologies and can be applied to most supervised learning problems.
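As a rough illustration of the subsampling step, here is a minimal query-by-committee sketch. It is written in Python rather than SAS and is not the presenter's variation. The committee members (decision stumps trained on bootstrap resamples), the vote-entropy disagreement score, the toy data, and all function names are assumptions made for this example.

```python
import math
import random
from collections import Counter


def train_stump(rows, labels):
    """Fit a one-feature threshold classifier for 0/1 labels by exhaustive search."""
    best = None
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            for polarity in (0, 1):
                errors = sum(
                    (polarity if r[j] <= t else 1 - polarity) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    _, j, t, polarity = best
    return lambda r: polarity if r[j] <= t else 1 - polarity


def build_committee(rows, labels, n_members=15, seed=0):
    """Train each committee member on a different bootstrap resample."""
    rng = random.Random(seed)
    n = len(rows)
    committee = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        committee.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx])
        )
    return committee


def vote_entropy(committee, row):
    """Disagreement score: entropy of the committee's votes for one row."""
    votes = Counter(member(row) for member in committee)
    total = len(committee)
    return -sum((c / total) * math.log(c / total) for c in votes.values())


def select_by_disagreement(committee, rows, n_keep):
    """Return the indices of the n_keep rows the committee disagrees on most."""
    ranked = sorted(
        range(len(rows)),
        key=lambda i: vote_entropy(committee, rows[i]),
        reverse=True,
    )
    return sorted(ranked[:n_keep])


def boundary_sample(rows, labels, n_keep, n_members=15, seed=0):
    """Build a committee, then keep the rows closest to its decision boundary."""
    committee = build_committee(rows, labels, n_members, seed)
    return select_by_disagreement(committee, rows, n_keep)


# Toy data (assumed): two features, noisy class boundary near x0 = 0.5.
_rng = random.Random(1)
toy_rows = [[_rng.random(), _rng.random()] for _ in range(200)]
toy_labels = [int(r[0] + _rng.gauss(0, 0.05) > 0.5) for r in toy_rows]

selected = boundary_sample(toy_rows, toy_labels, n_keep=40)
print(f"kept {len(selected)} of {len(toy_rows)} rows for GAN training")
```

Rows on which every member agrees score zero and are kept only if fewer than `n_keep` rows show any disagreement, in which case ties fall back to index order. In the workflow described above, the selected rows are what the tabular GAN would be trained on.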
Presentation slides are attached to this post.