08-15-2012 06:53 PM
I am a first-time user of PROC SURVEYLOGISTIC and am having a little trouble relating it to PROC LOGISTIC, which I normally use.
Do the STRATA statements give similar results in both procedures? If I have done conditional logistic regression in PROC LOGISTIC using the STRATA statement, does the STRATA statement in PROC SURVEYLOGISTIC do the same thing?
Is the WEIGHT statement in PROC SURVEYLOGISTIC designed to take inclusion probabilities from a traditional statistical survey design, or do the inclusion probabilities need to be transformed before they are used in the WEIGHT statement?
Thanks for any assistance.
08-22-2012 03:17 PM
I'll preface my answer by admitting that my expertise is in survey statistics and not case-control studies, so I know the SURVEYLOGISTIC STRATA statement much better than the LOGISTIC one.
The short answer is no: the STRATA statements are designed to do different things in the two PROCs. In PROC LOGISTIC, the STRATA statement is used to specify a conditional logistic regression model, as you say. With PROC SURVEYLOGISTIC (as with SURVEYREG, SURVEYMEANS and SURVEYFREQ), the STRATA statement is used to specify the stratification variable(s) that constitute your sample design; and it is often used along with the CLUSTER and WEIGHT statements to fully specify the sample design.
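To make the contrast concrete, here is a rough sketch of the two usages. All data set and variable names (matched, match_id, is_case, exposure, survey, stratum, psu, samp_wt, outcome, age, sex) are made up for illustration, so substitute your own.

```sas
/* Conditional logistic regression: STRATA names the matched sets. */
/* Data set and variable names are hypothetical.                   */
proc logistic data=matched;
  strata match_id;
  model is_case(event='1') = exposure;
run;

/* Design-based logistic regression: STRATA, CLUSTER and WEIGHT    */
/* describe how the sample was drawn. Names are hypothetical.      */
proc surveylogistic data=survey;
  strata stratum;     /* design strata                   */
  cluster psu;        /* primary sampling units          */
  weight samp_wt;     /* sampling weight, not a probability */
  model outcome(event='1') = age sex;
run;
```

In the first step the strata are conditioned out of the likelihood; in the second they only tell the procedure how the sample was selected so that the variance estimates reflect the design.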
Selection weights are the inverse of selection probabilities, e.g., wt = 1 / prob, so the WEIGHT statement takes the weight rather than the probability itself. Often, though, weights are normalized to the sample size, meaning the weights sum to the sample size n (so the weighted and unweighted sample sizes are equal) and have a mean of 1. To do this, compute wt2 = wt * (n / sum(wt)), where sum(wt) is the total of the unnormalized weights over the sample. I should point out that PROC SURVEYSELECT can output both selection probabilities and sampling weights.
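As a minimal sketch of that calculation, assuming a data set named survey with a selection-probability variable named prob (both hypothetical names) and no missing probabilities, PROC SQL can compute both versions of the weight in one step:

```sas
/* Hypothetical input: data set SURVEY with selection probability PROB. */
proc sql;
  create table survey_wt as
  select *,
         1 / prob                             as wt,      /* inverse-probability weight  */
         (1 / prob) * count(*) / sum(1 / prob) as wt_norm  /* rescaled to sum to n        */
  from survey;
quit;
```

PROC SQL will write a note to the log about remerging summary statistics with the original data; that is expected here. Either wt or wt_norm would then go on the WEIGHT statement.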
Also note that the WEIGHT statement can be used to simply account for (unequal) probabilities of selection, but the weights might also be poststratification weights -- that is, an initial weight that first accounts for probabilities of selection, and then subsequently makes adjustments for departures from known population parameters (e.g., aligning a sample survey's age, sex and region characteristics to census data).
Hope that helps!