Hi,
I have a simple problem that I am trying to solve with PROC SURVEYSELECT, but I am not able to obtain the desired results.
I have a dataset with 3 variables: Country (takes only two values: EU, NON EU), Segment (takes three values: Large, Medium, Small), and Dollar (amount of exposure).
The dataset has 10000 lines.
I would like to extract a random sample of 40 lines that has the same (or close to the same) distribution as the original dataset in terms of Dollar.
I am using this code:
proc sort data=dataset;
   by Country Segment;
run;

proc surveyselect data=dataset out=samp1 method=pps sampsize=40 seed=9876;
   strata Country Segment;
   size Dollar;
run;
I get a sample of 40 records, but the proportions of Country and Segment weighted by Dollar are not at all the same as in the original dataset.
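For anyone who wants to reproduce the comparison, here is a minimal sketch of one way the dollar-weighted shares could be checked. It assumes that "weighted by Dollar" means using Dollar as the WEIGHT variable in PROC FREQ, and it reuses the dataset names from the code above (dataset and samp1). Using PROC FREQ for the check is an assumption, not necessarily how the figures mentioned here were produced.

```sas
/* Dollar-weighted share of each Country x Segment cell in the full data */
proc freq data=dataset;
   tables Country*Segment / norow nocol;
   weight Dollar;   /* each row counts in proportion to its exposure */
run;

/* Same table for the selected sample, for a side-by-side comparison */
proc freq data=samp1;
   tables Country*Segment / norow nocol;
   weight Dollar;
run;
```

The overall percentages in the two tables are the quantities being compared.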
Where am I going wrong?