About Marko

Marko · ‎07-26-2016

Thanks. I appreciate your enthusiasm.

Marko · ‎07-25-2016

Hi. One of my regression covariates is LIMITK, a categorical variable related to having a work-limiting condition. The variable has four levels: -9 (not applicable), -8 (did not answer), 1 (condition affects type of work undertaken), and 2 (condition does not affect type of work undertaken). I use the indicator variable SUBCLASS to flag members of my subclass of interest, by including the clause DOMAIN=SUBCLASS. Crucially, there are no records in SUBCLASS=1 that are coded -8 for LIMITK. Nevertheless, the SAS output provides an estimate for the coefficient of the LIMITK dummy corresponding to the value -8. My best guess is that SAS finds it more expedient to keep the estimator the same across both domains, but to model on zeros where there are no applicable values. This approach, while computationally expedient, would not affect the estimated parameter vector beta-hat. Does that make sense to you?

Another thing I'm wondering is whether there is much sense in even bothering with DOMAIN analysis if the overall sample size runs into hundreds of thousands. Even if the domain consists of a quarter of the records, the variance in the estimate of its size will be the variance of a sample proportion, p(1-p)/n, where n is very large. If this formula is at the heart of what SAS adds to the process when the DOMAIN statement is used (and I'm assuming it is, based on my reading of Kish (1965)), it seems to suggest that it really isn't worth the bother.
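To put a number on the p(1-p)/n argument, here is a minimal Python sketch. It assumes simple random sampling (so it ignores the weights, strata and clusters of a real survey), and the values p = 0.25 and n = 200,000 are purely illustrative, not taken from the survey discussed here. It is not a description of what SAS computes.

```python
import math


def domain_proportion_se(p: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1.0 - p) / n)


# Illustrative values only: a domain holding a quarter of 200,000 records.
p, n = 0.25, 200_000
se = domain_proportion_se(p, n)

# Coefficient of variation: the standard error relative to the proportion.
cv = se / p

print(f"SE of domain proportion: {se:.6f}")
print(f"Relative SE (CV): {cv:.4%}")
```

Under these assumptions the relative standard error of the domain size is well below one percent, which is the sense in which the extra variability from an estimated domain size becomes small for very large n.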

Marko · ‎07-23-2016

Hello - I am analysing part of a weighted survey sample using the SURVEYLOGISTIC procedure, and have used the DOMAIN statement to identify records that I want to include in the analysis. However, the regression parameter estimates pertaining to the subsample I am analysing: (a) include estimates for dummy variables that don't appear in the subsample even once (how would they be calculated!?); and (b) have variance estimates which are actually smaller than when I simply analyse the subsample using the BY statement. Neither of these things makes sense! I would be extremely grateful for any insights. Thanks, Marko.
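One reason a domain analysis and a subset (BY-style) analysis can report different variances is that subsetting first discards sampled clusters that happen to contain no domain members, whereas a domain analysis keeps them with a contribution of zero. The Python sketch below illustrates this for the simplest case, a weighted total, using the textbook with-replacement variance estimator. The PSU totals are made up, the function name is hypothetical, and this is not SAS's implementation; for regression coefficients the difference can go in either direction.

```python
def wr_variance_of_total(psu_totals_by_stratum):
    """With-replacement variance estimate of a weighted total.

    psu_totals_by_stratum maps a stratum label to the list of weighted
    PSU totals observed in that stratum.
    """
    v = 0.0
    for totals in psu_totals_by_stratum.values():
        n_h = len(totals)
        if n_h < 2:
            continue  # a single-PSU stratum contributes nothing in this sketch
        mean_h = sum(totals) / n_h
        v += n_h / (n_h - 1) * sum((z - mean_h) ** 2 for z in totals)
    return v


# Hypothetical weighted PSU totals of y * 1(record in domain):
# one stratum, four PSUs, two of which contain no domain members.
domain_style = {"h1": [10.0, 0.0, 6.0, 0.0]}

# Subsetting first drops the two PSUs with no domain members entirely.
subset_style = {"h1": [10.0, 6.0]}

v_domain = wr_variance_of_total(domain_style)
v_subset = wr_variance_of_total(subset_style)
print(v_domain, v_subset)
```

Both versions give the same point estimate of the total (16), but the variance estimates differ because the number of PSUs and the between-PSU spread are not the same in the two set-ups.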

Marko · ‎05-06-2016

Thanks very much for taking the time to check that paper out, Steve. I will have a read of it. I might also experiment with different scalings and see if the results change!

Marko · ‎05-05-2016

Hello. I have longitudinal survey data supplied with weights and I am aiming to fit a generalised linear mixed model to it, the response variable being employment status at one of five successive survey interviews, and the predictors being things like qualification level, gender, method of job search, and so on. Everybody in my subsample is unemployed at first interview. My question is as follows. I have a set of survey weights, and I suspect it really doesn't matter to the analysis how they are scaled. However, I was wondering whether the WEIGHT statement in PROC GLIMMIX, which allows me to supply non-response weights, requires the weights to be scaled so that they sum to the sample size. PS: Although the survey sampling scheme is complex, a single set of weights has been provided for use in longitudinal analysis, scaled to the population. Many thanks.
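For concreteness, "scaled to the sample size" can be read as multiplying every weight by n divided by the sum of the weights. The Python sketch below does that on made-up data (the values and function names are hypothetical) and checks that a simple weighted proportion is unchanged by the rescaling. It only demonstrates invariance for that simple estimator; whether the same holds for the estimates and standard errors of a mixed model fitted in PROC GLIMMIX is the open question here, and the sketch does not answer it.

```python
def rescale_to_sample_size(weights):
    """Rescale weights so that they sum to the number of records."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]


def weighted_mean(values, weights):
    """Weighted mean: sum of value * weight divided by the sum of weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up data: 1 = employed, 0 = not employed; weights scaled to a population.
employed = [1, 0, 1, 1, 0]
pop_weights = [1200.0, 800.0, 1500.0, 500.0, 1000.0]

scaled = rescale_to_sample_size(pop_weights)

print(sum(scaled))                           # sums to the sample size, up to rounding
print(weighted_mean(employed, pop_weights))  # weighted proportion, population scaling
print(weighted_mean(employed, scaled))       # same proportion after rescaling
```

Running the same model with both versions of the weights and comparing the output is one practical way to see whether the scaling matters for a given analysis.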

Date Last Visited	‎02-02-2017 01:36 PM

Re: Understanding results from survey subsample analysis

Re: Understanding results from survey subsample analysis

Understanding results from survey subsample analysis

Re: Scaling of weights required in PROC GLIMMIX

Scaling of weights required in PROC GLIMMIX
