04-11-2012 03:57 PM
Does anybody know how I could select a random sample subject to a condition? Say I have a loan dataset with 1 MM customers. I want to select two random samples such that the average loan balance in each sample is close to $800. Does anybody know how I could do it?
Thanks in advance.
04-11-2012 04:09 PM
You would probably have to adjust the code to your requirements, but this should help as a starting point.
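/* Assumes TEMP has the columns SAMPLE and LOAN, e.g. read with INPUT SAMPLE LOAN; */
PROC SQL;
SELECT SAMPLE, AVG(LOAN) AS AVG_LOAN FROM TEMP
WHERE RANUNI(111) < 0.55 /* keep roughly 55% of the rows at random */
GROUP BY SAMPLE
HAVING ABS(AVG(LOAN) - 800) <= 10; /* "close to" 800; the tolerance of 10 is an assumption */
QUIT;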
04-11-2012 04:19 PM
Not being a statistician, in my simple world I would expect a RANDOM sample to have the same characteristics as the universe you draw it from. If this is true, then you would first have to "tailor" yourself a universe with the desired characteristics.
I assume you would first have to sub-set your source table and then draw the sample from this sub-set (http://support.sas.com/kb/24/722.html).
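A rough sketch of the sub-set-then-sample idea might look like the following. The table name LOANS, the variable BALANCE, the $600 to $1000 band, and the sample size of 1000 are all assumptions for illustration and are not taken from the original post. A symmetric band around $800 does not guarantee a mean of $800 when balances are skewed, so the sketch checks the mean before and after sampling.

```sas
/* Hypothetical names: LOANS is the source table, BALANCE the loan balance. */

/* Step 1: tailor the universe by keeping a band around the target. */
/* The band limits are an assumption and would need tuning.         */
data loans_subset;
    set loans;
    where balance between 600 and 1000;
run;

/* Step 2: check the mean of the tailored universe. */
proc means data=loans_subset n mean;
    var balance;
run;

/* Step 3: draw two simple random samples from the sub-set.     */
/* REPS=2 adds a Replicate variable that identifies the sample. */
proc surveyselect data=loans_subset out=samples
                  method=srs n=1000 reps=2 seed=20120411;
run;

/* Step 4: confirm the average balance in each sample. */
proc means data=samples n mean;
    class replicate;
    var balance;
run;
```

Note that the two replicates are drawn independently, so the same customer can appear in both samples. If the mean reported in step 2 is not close enough to $800, the band in step 1 would have to be adjusted before sampling.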
04-12-2012 09:26 AM
A similar question was asked at https://communities.sas.com/message/122173
Lots of suggestions there. It's not clear how your $800 requirement relates to the properties of the data. Is $800 the average balance among all customers?
04-12-2012 10:56 AM
Two things to consider that would considerably change the approach taken: