Does anyone know a good nonparametric alternative procedure in SAS for a GLMM?
I was thinking about GAMPL; however, I am not confident about it.
GAMPL and ADAPTIVEREG are both nonparametric procedures. For a GAMPL example, see "Nonparametric regression for binary response data in SAS"
You might also consider defining a spline effect and using GLIMMIX. To learn more about modeling with spline effects, see this example that uses restricted cubic splines.
You can also read about how to interpret the regression coefficients for a spline-effects model.
Thanks for your answer!
Is it also possible to use GLIMMIX with a different distribution for data that are not normally distributed (but should be)? I already tried to transform the data.
The SAS System
The GLM Procedure
Dependent Variable: Pancreas_rel

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3       19.2570463     6.4190154      0.89   0.4518
Error             56      403.6712273     7.2084148
Corrected Total   59      422.9282736

R-Square   Coeff Var   Root MSE   Pancreas_rel Mean
0.045533    76.68989   2.684849            3.500917

Source        DF     Type I SS    Mean Square   F Value   Pr > F
diet           1    0.04273536     0.04273536      0.01   0.9389
strain         1   11.88762207    11.88762207      1.65   0.2044
diet*strain    1    7.32668889     7.32668889      1.02   0.3177

Source        DF   Type III SS    Mean Square   F Value   Pr > F
diet           1    0.02835042     0.02835042      0.00   0.9502
strain         1   10.77579181    10.77579181      1.49   0.2266
diet*strain    1    7.32668889     7.32668889      1.02   0.3177

[Plot: Panel of Fit Diagnostics for Pancreas_rel]
[Plot: Interaction Plot for Pancreas_rel by diet]
Yes, PROC GLIMMIX supports many response distributions, such as binary, binomial, Poisson, lognormal, etc.
Yes, I knew that. However, if something like weight or height (which should be normally distributed) is not normally distributed, you cannot just use a Poisson distribution, right?
Correct. You would use the DIST=NORMAL option and check whether the residuals of the model are approximately normal by looking at a Q-Q plot.
To clarify the difference between the response variable being normally distributed and the RESIDUALS being normally distributed, please see the article "On the assumptions (and misconceptions) of linear regression."
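To make the response-versus-residuals distinction concrete, here is a minimal simulation sketch in Python rather than SAS. Everything in it is an assumption for illustration: the two group means, the error standard deviation, and the helper names `excess_kurtosis` and `simulate` are hypothetical and are not taken from the data in this thread. The pooled response is strongly bimodal (far from normal), yet the residuals around the group means are approximately normal, which is what the model assumes.

```python
import random
import statistics


def excess_kurtosis(values):
    """Sample excess kurtosis; approximately 0 for normally distributed data."""
    m = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def simulate(n_per_group=2000, seed=1):
    """Two groups with different means and normal errors (all values hypothetical)."""
    rng = random.Random(seed)
    group_means = {"A": 0.0, "B": 10.0}  # assumed group effects
    response, residuals = [], []
    for mu in group_means.values():
        y = [rng.gauss(mu, 1.0) for _ in range(n_per_group)]
        ybar = statistics.fmean(y)  # fitted value for a one-way model
        response.extend(y)
        residuals.extend(v - ybar for v in y)
    return response, residuals


response, residuals = simulate()
print("excess kurtosis of response: ", round(excess_kurtosis(response), 2))
print("excess kurtosis of residuals:", round(excess_kurtosis(residuals), 2))
```

The response has a clearly negative excess kurtosis because it is a mixture of two well-separated groups, whereas the residuals have excess kurtosis near zero.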
Yes, but that is exactly my question: I have data that are not normally distributed, even after I transform them. I would like to use GLIMMIX with these as well.
At first glance, these diagnostic plots look reasonable, except for the three outliers in the pancreas_rel variable. Which plots are bothering you?
If you show us the procedure statements that you are using, we might be able to offer additional advice.
Unfortunately, I cannot delete the outliers. However, the two plots on the bottom left bother me the most. They do not follow a straight line and/or the normal distribution curve.
I don't know what to suggest. A Q-Q plot that is curved up like that indicates right-skewed residuals. The LOG and SQRT transformations are normalizing transformations for right-skewed data, and they are also variance-stabilizing (they address heterogeneity of variance). Or the issue could be that the model does not fit the data well and you need to add interactions or nonlinear effects.
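As a rough illustration of why LOG and SQRT are suggested for right-skewed values, the following Python sketch (not SAS) compares sample skewness before and after each transformation. The data are simulated from a lognormal distribution, which is an assumption chosen so that the log transform works well; real residuals, such as those in this thread, may not respond the same way. The helper name `skewness` is hypothetical.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness; positive values indicate a long right tail."""
    m = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(42)
# Assumed right-skewed data: lognormal with log-scale mean 0 and sd 0.8.
raw = [rng.lognormvariate(0.0, 0.8) for _ in range(5000)]
sqrt_t = [math.sqrt(v) for v in raw]
log_t = [math.log(v) for v in raw]

for label, data in (("raw", raw), ("sqrt", sqrt_t), ("log", log_t)):
    print(f"{label:>4}: skewness = {skewness(data):.2f}")
```

In this setup SQRT reduces the skewness and LOG removes most of it, but only because the simulated data are lognormal by construction.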
However, in an experimental design, we sometimes cannot add terms such as interactions or nonlinear effects because the design is fixed. Suppose that, even after a transformation such as LOG or SQRT, the residual plot still does not look good (as for the pancreas_rel variable in this post) and we cannot delete the outliers. Then what can we do? Thanks!
If your goal is prediction, the predictions of OLS are still valid even without the normality assumption.
Inferences are robust to mild deviations from the normality-of-residuals assumption, but you could point out in your report that the normality assumption is dubious. If you want distribution-free inferential statistics, use bootstrap methods.
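For readers unfamiliar with the bootstrap, here is a minimal sketch of one common variant, a percentile bootstrap confidence interval for a difference in two group means, written in Python rather than SAS. The function name `bootstrap_mean_diff_ci`, the number of resamples, and the simulated right-skewed data are all assumptions for illustration; this is not the analysis of the pancreas data in this thread, and a factorial design would need a resampling scheme that respects the design.

```python
import random
import statistics


def bootstrap_mean_diff_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b).

    Each group is resampled with replacement, independently of the other.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical right-skewed samples for two groups (not real data).
data_rng = random.Random(123)
group_a = [data_rng.expovariate(1 / 3.0) for _ in range(30)]
group_b = [data_rng.expovariate(1 / 4.0) for _ in range(30)]

observed = statistics.fmean(group_a) - statistics.fmean(group_b)
lower, upper = bootstrap_mean_diff_ci(group_a, group_b)
print(f"observed difference = {observed:.3f}")
print(f"95% percentile bootstrap CI = ({lower:.3f}, {upper:.3f})")
```

If the interval excludes zero, that is evidence of a difference in means without relying on normal residuals. The percentile interval is the simplest choice; other bootstrap intervals exist and can behave better in small samples.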
Thanks @Rick_SAS! My interest is usually in mean comparisons. However, most nonparametric methods do not provide mean comparison results, especially for factorial designs. Edgar Brunner proposed a nonparametric method for factorial designs (https://link.springer.com/article/10.1007%2Fs003620000039). I tried it once; however, the results were not much different from ANOVA, so I'm not really comfortable with this method. That leaves three options: analyzing the raw data with PROC GLIMMIX despite mild deviations from the normality-of-residuals assumption; analyzing the raw data with a nonparametric method; or analyzing transformed data that give a better residual plot. Which one should we prefer?