Hi Sir,
I finally finished running a linear regression model using PROC GENMOD, but it took 25 hours. The dataset has 1 million cases and 40 categorical variables.
What I don't quite understand is that:
The estimated intercept is 1400, which I expected to be the overall mean (all the predictors in the model are categorical and parameterized with effect coding). But the observed mean of the dependent variable is only 250. I don't understand why there is such a big difference. Is it because of poor model fit?
Thanks for any ideas.
What is important is the number of levels (=unique values) in your classification variables. If each classification variable has 10 levels, then the regression involves approximately 400 dummy variables as regressors.
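As a rough illustration of the column count mentioned above, here is a minimal Python sketch. The level counts are hypothetical (the thread never states them); it simply assumes 40 variables with 10 levels each and compares the GLM-style encoding (one column per level) with a full-rank encoding such as effect coding (one column fewer per variable).

```python
# Hypothetical level counts: the thread does not give them, so assume
# 10 levels for each of the 40 classification variables.
levels_per_variable = [10] * 40

# GLM-style (indicator) encoding: one column per level, plus the intercept.
glm_columns = 1 + sum(levels_per_variable)

# Effect or reference encoding: one column fewer per variable, plus the intercept.
full_rank_columns = 1 + sum(k - 1 for k in levels_per_variable)

# The cross-product matrix X'X that the regression works with is square in the column count.
print("GLM-style columns:", glm_columns)
print("Full-rank columns:", full_rank_columns)
print("Entries in X'X (GLM-style):", glm_columns ** 2)
```

Replacing the list with the actual number of levels of each variable gives the real size of the problem.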
If I recall, you are using GENMOD only because you want a parameterization that differs from the GLM encoding. How long does your problem take to run in GLM? GENMOD solves a maximum likelihood problem, which involves an iterative optimization, so it will be slower than GLM on the same problem.
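With effect coding, each main-effect parameter estimates the difference between the effect of a nonreference level and the average effect over all levels of that variable. That average effect gets lumped in with the intercept. That's why your Intercept estimate differs from the observed mean: the intercept averages over the levels with equal weight, whereas the observed mean weights each level by the number of cases in it.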
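To make the point concrete, here is a minimal Python sketch on made-up data (one factor with three levels and very unequal counts, not the poster's dataset). It fits the same one-factor model by ordinary least squares under effect coding and under reference coding, using a small hand-written solver for the normal equations. All names and numbers are assumptions for illustration only.

```python
from statistics import mean

# Hypothetical unbalanced data: level A is common with small values,
# level C is rare with a large value.
data = {
    "A": [90.0, 110.0, 100.0, 100.0, 95.0, 105.0, 100.0, 100.0],
    "B": [380.0, 420.0],
    "C": [1000.0],
}

# Design-matrix rows (intercept, column for A, column for B); C is the reference level.
EFFECT = {"A": [1, 1, 0], "B": [1, 0, 1], "C": [1, -1, -1]}
REFERENCE = {"A": [1, 1, 0], "B": [1, 0, 1], "C": [1, 0, 0]}

def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares estimates from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(coding, beta, level):
    """Predicted value for one level under the given coding."""
    return sum(c * b for c, b in zip(coding[level], beta))

levels = [lvl for lvl, ys in data.items() for _ in ys]
y = [v for ys in data.values() for v in ys]

b_eff = ols([EFFECT[lvl] for lvl in levels], y)
b_ref = ols([REFERENCE[lvl] for lvl in levels], y)
observed_mean = mean(y)

print("Observed mean of y:       ", round(observed_mean, 2))
print("Intercept, effect coding: ", round(b_eff[0], 2))
print("Intercept, reference code:", round(b_ref[0], 2))
for lvl in data:
    print(lvl, round(predict(EFFECT, b_eff, lvl), 2), round(predict(REFERENCE, b_ref, lvl), 2))
```

In this toy example the effect-coded intercept is the unweighted average of the three level means (500), while the observed mean is pulled toward the heavily populated level A (about 236). The predicted value for each level is the same under both codings, so only the interpretation of the parameters changes.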
I assume you know that the predicted values you get from GENMOD are the same as you get from GLM. The only difference is how to INTERPRET the parameters. For an example with continuous variables, see http://blogs.sas.com/content/iml/2010/11/10/regression-coefficients-for-different-polynomial-bases/
It only took 5 minutes to run the model in PROC GLM. What a difference.
Thanks again.