Here is the Q and A from the February 15th Webinar: The Data Scientist Learning Journey: Experimentation in Data Science. Register to watch on-demand here.
Q) How do you know which covariates and blocks to capture? Do you capture as many as possible and use something like a LASSO to remove unimportant variables, or do you start with a few and add more as the model is trained?
A) Either of those is a reasonable approach for covariates. Blocks are almost always included in the model, because they are part of the experimental design and restrict the random assignment of units to treatments.
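To make "restrict random assignment" concrete, here is a minimal sketch of randomizing treatments separately within each block. It is an illustration only, not code from the webinar: the function name assign_within_blocks, the unit IDs, the block labels, and the treatment names are all hypothetical.

```python
import random
from collections import defaultdict

def assign_within_blocks(units, treatments, seed=0):
    """Randomly assign treatments to units, separately within each block.

    units: list of (unit_id, block_label) pairs (hypothetical data layout).
    treatments: list of treatment names.
    Returns a dict mapping unit_id -> treatment.
    """
    rng = random.Random(seed)

    # Group unit IDs by block so randomization never crosses block boundaries.
    by_block = defaultdict(list)
    for unit_id, block in units:
        by_block[block].append(unit_id)

    assignment = {}
    for members in by_block.values():
        # Shuffle the units in this block, then cycle through the treatments
        # so each treatment appears as evenly as the block size allows.
        rng.shuffle(members)
        for i, unit_id in enumerate(members):
            assignment[unit_id] = treatments[i % len(treatments)]
    return assignment

# Hypothetical example: 8 stores blocked by region, two treatments.
units = [(f"store{i}", "north" if i < 4 else "south") for i in range(8)]
plan = assign_within_blocks(units, ["control", "offer"])
```

Because the block determines how units could have been assigned, the analysis model would normally carry a term for the block as well as for the treatment.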
Q) How do you determine the minimum sample size for each factor combination?
A) Usually you want to determine your sample size based on a statistical power calculation. Power is the probability of correctly rejecting the null hypothesis when the null hypothesis is false. The calculation is typically performed before the experiment, using an estimate of variability, the smallest mean difference you want to detect, and the significance level (alpha) for the test.
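As a rough illustration of such a calculation, the sketch below uses the standard normal-approximation formula for comparing the means of two equal-sized groups, n = 2 * ((z_(1-alpha/2) + z_power) * sigma / delta)^2 per group. This is an assumption made for the example, not the method shown in the webinar: a full factorial design would call for a power analysis matched to that design, and a t-based calculation gives slightly larger sample sizes. The function names and the numbers (sigma = 10, delta = 5) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate units needed per group for a two-sided, two-sample
    comparison of means (normal approximation, equal group sizes).

    sigma: estimated standard deviation of the response.
    delta: smallest mean difference worth detecting.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile for the target power
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)

def approximate_power(n, sigma, delta, alpha=0.05):
    """Approximate power for n units per group (ignores the negligible
    probability of rejecting in the wrong direction)."""
    z = NormalDist()
    se = sigma * math.sqrt(2 / n)       # standard error of the mean difference
    return z.cdf(abs(delta) / se - z.inv_cdf(1 - alpha / 2))

# Hypothetical inputs: standard deviation 10, difference to detect 5.
n = sample_size_per_group(sigma=10, delta=5)  # 63 per group
achieved = approximate_power(n, sigma=10, delta=5)
```

Halving the difference to detect roughly quadruples the required sample size, which is why the choice of delta usually matters more than small changes in alpha.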
Q) Any recommendations for someone who just graduated and is looking to get into experimentation using SAS (such as A/B testing)?
A) Sure! Check out this course: https://support.sas.com/edu/schedules.html?id=5307&ctry=US
Q) Do you have any recommendations for techniques to account for self-selection bias in testing? For instance, when a customer opts into a test treatment.
A) I strongly advise against letting subjects opt into treatments in an experiment. If it does happen, analyze the data as an observational study rather than an experiment, because subjects who choose a treatment may differ systematically from those who do not.