Proc Surveyselect weighted by exposure

AndreaBov · Posted 10-30-2019 12:24 PM

Hi,

I have a simple problem that I am trying to solve with PROC SURVEYSELECT, but I'm not able to obtain the desired results.

I have a dataset with 3 variables: Country (takes only two values: EU, NON EU), Segment (takes 3 values: Large, Medium, Small), and Dollar (amount of exposure).

The dataset has 10000 lines.

I would like to extract a random sample of 40 lines that has the same (or close to the same) distribution as the original dataset in terms of Dollar.

I am using this code:

proc sort data=dataset;
  by Country Segment;
run;

proc surveyselect data=dataset out=samp1 method=pps sampsize=40 seed=9876;
  strata Country Segment;
  size Dollar;
run;

I get a sample of 40 records, but the proportions of Country and Segment weighted by Dollar are not at all the same as in the original dataset.

Where am I going wrong?

ballardw · Posted 10-30-2019 01:14 PM

That is what PPS does with a SIZE variable: the larger a record's Dollar value, the more likely it is to be selected.

If you want the proportion of dollar values to approximate the data as a whole, then look at the SRS or SYS methods instead. If you have a wide range of values for your dollar amounts, I might suggest the SYS method.
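As a rough sketch of the SYS suggestion, assuming the dataset and variable names from the question: the ALLOC=PROP option and the CONTROL statement are additions of mine for illustration, not part of the original code, and the names samp_sys and dataset_sorted are hypothetical. With ALLOC=PROP, SAMPSIZE=40 is treated as the total sample size and is split across the strata in proportion to the number of records in each (not in proportion to Dollar). CONTROL Dollar sorts the records by Dollar within each stratum, so the systematic selection is spread across the range of dollar amounts.

```sas
/* The input must be sorted by the STRATA variables */
proc sort data=dataset;
  by Country Segment;
run;

/* Systematic sample, 40 records in total, allocated proportionally
   to the number of records in each Country*Segment stratum.
   OUTSORT= keeps the CONTROL-sorted copy in a separate data set;
   without it the sorted data replaces the input data set. */
proc surveyselect data=dataset out=samp_sys outsort=dataset_sorted
                  method=sys sampsize=40 seed=9876;
  strata Country Segment / alloc=prop;
  control Dollar;   /* sort by Dollar within each stratum */
run;
```

Dropping the CONTROL statement and the OUTSORT= option and changing METHOD=SYS to METHOD=SRS would give the simple random sampling variant instead.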

With 6 groups (2*3) and only 40 records selected, you may have to be flexible about how closely you want those proportions to match.
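One way to see how close a given sample gets is to compare the dollar-weighted shares of each Country and Segment combination in the full data and in the sample. This is only a sketch, assuming the data set names from the question (dataset and samp1) and that Dollar is non-negative. The WEIGHT statement makes PROC FREQ sum Dollar rather than count records, so the Percent column shows each cell's share of total exposure.

```sas
/* Dollar-weighted share of each Country*Segment cell, full data */
proc freq data=dataset;
  tables Country*Segment / norow nocol;
  weight Dollar;
run;

/* Same table for the selected sample, for side-by-side comparison */
proc freq data=samp1;
  tables Country*Segment / norow nocol;
  weight Dollar;
run;
```

With a sample of 40, expect the two Percent columns to differ by several points even when the selection is working as intended.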
