- Proc surveyreg giving different results when switc...


02-02-2018 08:22 PM - edited 02-02-2018 08:24 PM

Hi,

I was wondering if it was normal to get different results when you change the order of the strata variables. For example, let's say we run two models:

proc surveyreg data=mydata; strata A B C; model yvar=xvar; run;

proc surveyreg data=mydata; strata B A C; model yvar=xvar; run;

Is it possible for the two models to give different results? I'm asking because, for the model I ran, simply switching the order of the strata variables changed my standard errors and p-values. This was not universal, however: the results only changed when using a specific xvar.

Thanks.

Accepted Solutions


Posted in reply to pkfamily

02-05-2018 12:40 PM

From the documentation:

The STRATA statement names one or more variables that identify the first-stage strata in a stratified sample design. The combinations of levels of STRATA variables define the strata in the sample, where strata are nonoverlapping subgroups that were sampled independently.

So the order reflects the sampling strategy. If you change the order of the strata variables, you are saying that the sampling strategy changed, and the results can differ, sometimes substantially. STRATA A B C says that we identified records by some characteristic A and sampled them, then within A we sampled by characteristic B, then within B we sampled on C.

Your strata are fixed at sampling. Models should reflect that sample.

You might also post the entire PROC SURVEYREG code, since a CLUSTER statement also affects the calculations.
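
One way to investigate, as a rough sketch only: list the combinations of A, B, and C that actually occur in the data, then save the parameter estimates from both orderings and compare them. The data set and variable names (mydata, A, B, C, yvar, xvar) come from the question; est_abc and est_bac are placeholder names introduced here. The sketch assumes the model has no CLASS statement, as in the question, so the estimates table is produced by default.

```sas
/* List every observed combination of the STRATA variables with its count.
   MISSING keeps combinations that contain missing values in the listing. */
proc freq data=mydata;
  tables A*B*C / list missing;
run;

/* Save the estimated regression coefficients from each ordering.
   est_abc and est_bac are placeholder data set names. */
ods output ParameterEstimates=est_abc;
proc surveyreg data=mydata;
  strata A B C;
  model yvar = xvar;
run;

ods output ParameterEstimates=est_bac;
proc surveyreg data=mydata;
  strata B A C;
  model yvar = xvar;
run;

/* Report any differences between the two sets of estimates,
   standard errors, and p-values. */
proc compare base=est_abc compare=est_bac;
run;
```

Posting the PROC FREQ listing and the PROC COMPARE report along with the full procedure code would make it easier for others to see where the two runs diverge.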

All Replies


Posted in reply to pkfamily

02-05-2018 02:05 PM

In addition to ballardw's comments, I will point out that you need to use

PROC SURVEYSELECT SEED=12345 ...

if you want two runs of the procedure to produce the same results. Since you are using the STRATA statement, you will also need to control the stratum random seeds. See the SEED= option in the doc.
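
A minimal sketch of a seeded stratified selection, assuming a hypothetical sampling frame named frame and an output data set named sample; the method and the 10% sampling rate are illustrative choices, not taken from this thread. It only sets the overall SEED= value; see the documentation for how to control the seeds for individual strata.

```sas
/* PROC SURVEYSELECT expects the input to be sorted by the STRATA variables. */
proc sort data=frame;          /* frame is a hypothetical sampling frame */
  by A B C;
run;

/* A fixed SEED= makes the random selection repeatable across runs. */
proc surveyselect data=frame out=sample
                  method=srs samprate=0.1   /* illustrative design choices */
                  seed=12345;
  strata A B C;
run;
```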