Re: Good non parametric alternative procedure for glimmix/GLMM

Paulet · Posted 11-02-2019 09:18 AM

Does anyone know a good nonparametric alternative procedure in SAS for a GLMM?

I was thinking about GAMPL; however, I am not confident about it.

Rick_SAS · Posted 11-04-2019 09:54 AM

GAMPL and ADAPTIVEREG are both nonparametric procedures. For a GAMPL example, see "Nonparametric regression for binary response data in SAS."

You might also consider defining a spline effect and using GLIMMIX. To learn more about modeling with spline effects, see this example that uses restricted cubic splines.

You can also read about how to interpret the regression coefficients for a spline-effects model.

Paulet · Posted 11-05-2019 04:52 AM

Thanks for your answer!

Is it also possible to use GLIMMIX for data that are not normally distributed (but should be), by specifying a different distribution? I have already tried transforming the data.

The SAS System

The GLM Procedure

Dependent Variable: Pancreas_rel

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3        19.2570463      6.4190154       0.89    0.4518
Error              56       403.6712273      7.2084148
Corrected Total    59       422.9282736

R-Square    Coeff Var    Root MSE    Pancreas_rel Mean
0.045533     76.68989    2.684849             3.500917

Source         DF      Type I SS    Mean Square    F Value    Pr > F
diet            1     0.04273536     0.04273536       0.01    0.9389
strain          1    11.88762207    11.88762207       1.65    0.2044
diet*strain     1     7.32668889     7.32668889       1.02    0.3177

Source         DF    Type III SS    Mean Square    F Value    Pr > F
diet            1     0.02835042     0.02835042       0.00    0.9502
strain          1    10.77579181    10.77579181       1.49    0.2266
diet*strain     1     7.32668889     7.32668889       1.02    0.3177

[Plot: Panel of Fit Diagnostics for Pancreas_rel]

[Plot: Interaction Plot for Pancreas_rel by diet]

Rick_SAS · Posted 11-05-2019 04:58 AM

Yes, PROC GLIMMIX supports many response distributions, such as binary, binomial, Poisson, lognormal, etc.

Paulet · Posted 11-05-2019 05:03 AM

Yes, that I knew. However, if something like weight or height (which should be normally distributed) is not normally distributed, you cannot just use a Poisson distribution, right?

Rick_SAS · Posted 11-05-2019 05:15 AM

Correct. You would use the DIST=NORMAL option and check whether the residuals of the model are approximately normal by looking at a Q-Q plot.

To clarify the difference between the response variable being normally distributed and the RESIDUALS being normally distributed, please see the article "On the assumptions (and misconceptions) of linear regression."
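The distinction between a normal response and normal residuals can be illustrated with a small simulation. The sketch below is written in Python (standard library only) rather than SAS, purely as an illustration, and every value in it is simulated: two hypothetical groups with very different means give a bimodal pooled response, yet the residuals after subtracting each group mean are close to normal. The helper `qq_correlation` is a made-up name; it returns the correlation between the sorted values and the normal quantiles, which is near 1 when the Q-Q plot is close to a straight line.

```python
import random
from statistics import NormalDist, fmean

def qq_correlation(values):
    """Correlation between sorted values and standard normal quantiles.

    A value close to 1 means the points of a normal Q-Q plot lie
    close to a straight line.
    """
    n = len(values)
    ordered = sorted(values)
    # Plotting positions (i + 0.5) / n avoid the infinite 0% and 100% quantiles.
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, my = fmean(theoretical), fmean(ordered)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, ordered))
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in ordered)
    return sxy / (sxx * syy) ** 0.5

# Simulated data (an assumption for illustration): two groups, same spread,
# very different means.
rng = random.Random(42)
group_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
group_b = [rng.gauss(10.0, 1.0) for _ in range(200)]

# The pooled response is bimodal, so it is far from normal.
response = group_a + group_b

# Residuals from a one-factor model: each value minus its own group mean.
mean_a, mean_b = fmean(group_a), fmean(group_b)
residuals = [y - mean_a for y in group_a] + [y - mean_b for y in group_b]

r_response = qq_correlation(response)
r_residuals = qq_correlation(residuals)
print(f"Q-Q correlation, raw response: {r_response:.3f}")
print(f"Q-Q correlation, residuals:    {r_residuals:.3f}")
```

The residuals, not the raw response, are what the Q-Q plot in the fit diagnostics panel is checking.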

Paulet · Posted 11-05-2019 05:25 AM

Yes, but that is exactly my question: I have data that are not normally distributed, even after I transform them. I would like to use GLIMMIX with these as well.

Rick_SAS · Posted 11-05-2019 06:51 AM

At first glance, these diagnostic plots look reasonable, except for the three outliers in the pancreas_rel variable. Which plots are bothering you?

If you show us the procedure statements that you are using, we might be able to offer additional advice.

Paulet · Posted 11-05-2019 09:27 AM

Sadly, I cannot delete the outliers. However, the two plots on the bottom left bother me the most: the points do not fall on a straight line, and the histogram does not follow the normal distribution curve.

Rick_SAS · Posted 11-05-2019 10:10 AM

I don't know what to suggest. A Q-Q plot that is curved up like that indicates right-skewed residuals. The LOG and SQRT transformations are normalizing transformations that are also variance-stabilizing (they address heterogeneity of variance). Or the issue could be that the model does not fit the data well and you need to add interactions or nonlinear effects.
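As a rough illustration of how the LOG and SQRT transformations act on right skew, here is a minimal Python sketch (standard library only, not SAS). The numbers are invented and stand in for positive measurements from a single treatment cell, so the residuals are simply deviations from the cell mean; `skewness` is a hypothetical helper that computes the moment-based skewness coefficient.

```python
import math
from statistics import fmean

def skewness(values):
    """Moment-based skewness: third central moment over variance^(3/2)."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

# Hypothetical right-skewed measurements (made up for illustration only).
raw = [1.2, 1.5, 1.8, 2.0, 2.1, 2.4, 2.9, 3.5, 4.8, 9.7, 12.3]

skew_raw = skewness(raw)
skew_sqrt = skewness([math.sqrt(v) for v in raw])
skew_log = skewness([math.log(v) for v in raw])

print(f"skewness, raw:  {skew_raw:.2f}")
print(f"skewness, sqrt: {skew_sqrt:.2f}")
print(f"skewness, log:  {skew_log:.2f}")
```

Both transformations reduce the skewness in this toy sample, with LOG pulling in the right tail more strongly than SQRT, but neither removes it entirely when a few values are far from the rest.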

RosieSAS · Posted 11-05-2019 10:39 AM

However, in a designed experiment we often cannot add terms such as interactions or nonlinear effects, because the model is fixed by the design. In that case, if the residual plots still do not look good even after a transformation such as LOG or SQRT (as with the pancreas_rel variable in this post), and the outliers cannot be deleted, what can we do? Thanks!

Rick_SAS · Posted 11-05-2019 10:59 AM

If your goal is prediction, the predictions of OLS are still valid even without the normality assumption.

Inferences are robust to mild deviations from the normality-of-residual assumptions, but you could point out in your report that the normality assumptions are dubious. If you want distribution-free inferential statistics, use bootstrap methods.
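For readers who have not used the bootstrap, the idea for a mean comparison can be sketched as follows. This is a minimal percentile-bootstrap example in Python (standard library only, not SAS), under stated assumptions: the two samples are invented, the function name `bootstrap_mean_diff_ci` is hypothetical, and each group is resampled independently with replacement. It is one simple variant of the bootstrap, not a full treatment of a factorial design.

```python
import random
from statistics import fmean

def bootstrap_mean_diff_ci(a, b, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap confidence interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping the group sizes.
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(fmean(resampled_a) - fmean(resampled_b))
    diffs.sort()
    # Take the alpha/2 and 1 - alpha/2 percentiles of the bootstrap differences.
    lower = diffs[round((alpha / 2) * n_boot)]
    upper = diffs[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical right-skewed samples with outliers (made up for illustration).
group_a = [2.1, 3.4, 1.9, 8.7, 2.8, 3.1, 2.5, 12.2]
group_b = [1.8, 2.2, 2.9, 2.0, 3.3, 2.4, 6.1, 2.7]

observed = fmean(group_a) - fmean(group_b)
ci_low, ci_high = bootstrap_mean_diff_ci(group_a, group_b)
print(f"observed difference in means: {observed:.4f}")
print(f"95% percentile bootstrap CI:  ({ci_low:.4f}, {ci_high:.4f})")
```

The interval makes no normality assumption about the residuals; if it excludes zero, that is evidence of a difference in means at roughly the chosen level.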

RosieSAS · Posted 11-05-2019 11:23 AM

Thanks @Rick_SAS! My interest is usually in mean comparisons. However, most nonparametric methods do not provide mean comparison results, especially for factorial designs. Edgar Brunner proposed nonparametric methods for factorial designs (https://link.springer.com/article/10.1007%2Fs003620000039). I tried them once, but the results were not much different from ANOVA, so I'm not really comfortable with this approach. That leaves three options: (1) raw data with mild deviations from the normality-of-residuals assumption, analyzed with PROC GLIMMIX; (2) raw data analyzed with a nonparametric method; or (3) transformed data with better-looking residual plots. Which one should we prefer?

Paulet · Posted 11-05-2019 01:54 PM

I did indeed use an interaction in this model! However, the log transformation did not really improve things.
