11-17-2016 07:09 PM
I am using PROC GENMOD for the first time to analyze complex survey data. The outcome variable is a count, injuries per year, over 14 years, from 2002 to 2015.
I'm seeking some help in understanding the error message below.
I found some generic syntax online and ran it with my variables and data:
proc genmod data=data;
  class cluster_var stratum_var;
  model y = x / dist=negbin link=log;
  weight psu_var;
  repeated subject=cluster_var(stratum_var);
  domain=flag;
run;
I created a subset in a preceding DATA step by defining a variable, flag. I added the DOMAIN statement myself; it was not included in the sample code I found. It did not seem to produce an error, but I'm not sure it worked properly either.
When I run this, the following messages are generated:
NOTE: Class levels for some variables were not printed due to excessive size.
NOTE: Algorithm converged.
ERROR: A missing value was detected in the SUBJECT, WITHINSUBJECT, or LOGORVAR effect. All values of variables in these effects must be non-missing.
Any ideas what this error message means?
11-18-2016 04:47 PM - edited 11-18-2016 04:49 PM
PROC GENMOD should not be used to analyze complex survey data. Only the SURVEY procedures (SURVEYFREQ, SURVEYLOGISTIC, etc.) can provide a proper analysis of survey sample data. Specifying a variable in the WEIGHT statement of other procedures may produce correct parameter estimates, but the variance estimates will not be correct. Special variance estimators are needed in the analysis of survey data, and only the SURVEY procedures have these estimators. There is currently no SURVEY procedure available for fitting count models such as Poisson or negative binomial models.
See this note for more.
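As for the error message itself: it says that at least one observation has a missing value in a variable used in the SUBJECT= effect of the REPEATED statement, which here means cluster_var or stratum_var. A minimal sketch for counting such observations is below. It assumes the data set and variable names from the question, and n_missing is just an illustrative counter name.

```sas
/* Count observations where either SUBJECT= variable is missing.       */
/* MISSING() works for both numeric and character variables.           */
data _null_;
  set data end=last;
  /* Sum statement: n_missing starts at 0 and is retained across rows  */
  if missing(cluster_var) or missing(stratum_var) then n_missing + 1;
  if last then put "Observations with a missing subject variable: " n_missing;
run;
```

A nonzero count in the log would be consistent with the error above.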
11-18-2016 05:45 PM - edited 11-18-2016 05:46 PM
@StatDave_sas Wow. Thanks, I had no idea.
I have an outcome variable that is injuries per year. It's a count, so I was going to use a negative binomial model. What SURVEY procedure should I use to determine if there is a linear relationship between year and injuries per year? Should I treat injuries per year as a continuous variable and use PROC SURVEYREG? Is there another approach?
11-21-2016 10:15 AM
Currently, there is no SURVEY procedure for fitting count models using the Poisson or negative binomial distributions. SURVEYREG will assume that the response is normally distributed.
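For reference, the SURVEYREG approach asked about above might look like the sketch below. It reuses the variable names from the original post and assumes stratum_var identifies strata and cluster_var identifies clusters (PSUs). The name weight_var is hypothetical and stands for the sampling weight, since psu_var in the original code appears to be a PSU identifier rather than a weight. This fits a linear model, so the normality caveat above still applies.

```sas
/* Sketch only: design-based linear regression of y on x.        */
/* weight_var is a hypothetical name for the sampling weight.    */
proc surveyreg data=data;
  strata  stratum_var;   /* assumed stratification variable      */
  cluster cluster_var;   /* assumed cluster (PSU) variable        */
  weight  weight_var;    /* sampling weight, not a PSU id        */
  domain  flag;          /* separate analysis per level of flag  */
  model   y = x;
run;
```

Using the DOMAIN statement here, rather than subsetting the data in an earlier DATA step, keeps the full design information available for variance estimation.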