Thank you for your reply and for providing additional details in response to my questions. I was informed this morning that I will have access to the original data at a future date. With that in mind, could you please clarify the following:

1. Can you explain further what you meant by, 'You then simulate from a DISTRIBUTION that has those moments.'?

2. Given that I will have access to the original data, I understand that I could use the bootstrap method in Proc Surveyselect. Can you confirm my understanding that I (a) simulate data (i.e. create a new data set with fake data) based on the moments from the original data set and then (b) run Proc Surveyselect with method=bootstrap on the simulated/fake data? Do I understand correctly that the original data is only used to inform the simulation process?

3. If the simulated data has values that fall outside the range of the original data (and its inclusion criteria), is there a way to restrict this? For example, if the original data included patients >18 and <70 years of age, with age normally distributed around a mean of 55, and the simulated data included patients <18 or >70 years of age, is it appropriate to place these restrictions on the simulated data?

4. My understanding is that I should have the same number of patients in the exposed and unexposed groups as in the original data set. When I simulate the data and then use Proc Surveyselect to draw a sample of 200 (out of 20,000 in the simulated data set), I no longer get the same number of patients in the exposed and unexposed groups. This is causing large variations in the variable distributions when stratifying by exposure group.
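To make points 3 and 4 concrete, here is a rough sketch of what I think I am trying to do, written in plain Python rather than SAS just to show the logic. The standard deviation of 10, the 50/50 exposure split, the 100-per-group sample sizes, and the function names are all made-up values for illustration, not anything from the original data.

```python
import random

random.seed(12345)  # fixed seed so the sketch is reproducible

def simulate_age(mean=55.0, sd=10.0, low=18.0, high=70.0):
    """Draw one age from a normal distribution, redrawing until it
    falls strictly inside the inclusion range (low, high)."""
    while True:
        age = random.gauss(mean, sd)
        if low < age < high:
            return age

# Step 1: simulate 20,000 fake patients.
# Exposure alternates 0/1, i.e. an assumed 50/50 split.
simulated = [
    {"id": i, "exposed": i % 2, "age": simulate_age()}
    for i in range(20000)
]

def stratified_sample(rows, key, n_per_group):
    """Sample within each level of `key` so that the group sizes are fixed.
    random.sample draws without replacement; random.choices(stratum, k=n)
    would draw with replacement instead, as a bootstrap does."""
    sample = []
    for level, n in n_per_group.items():
        stratum = [r for r in rows if r[key] == level]
        sample.extend(random.sample(stratum, n))
    return sample

# Step 2: draw 200 patients, forcing 100 unexposed and 100 exposed.
sample = stratified_sample(simulated, "exposed", {0: 100, 1: 100})
```

Note that redrawing out-of-range ages gives a truncated normal, so the mean and variance of the simulated ages will no longer exactly match the moments I started from. I assume the SAS analogue of the second step would be a STRATA statement on the exposure variable in Proc Surveyselect, but I would appreciate confirmation.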