Paulet
Calcite | Level 5

Does anyone know a good nonparametric alternative procedure in SAS for a GLMM?

 

I was thinking about GAMPL; however, I am not confident about it.

13 REPLIES
Rick_SAS
SAS Super FREQ

GAMPL and ADAPTIVEREG are both nonparametric procedures. For a GAMPL example, see "Nonparametric regression for binary response data in SAS"
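
As a rough illustration of the GAMPL syntax, here is a minimal sketch. It uses the Sashelp.Cars sample data rather than the data from this thread, and the choice of variables is an arbitrary assumption. Note that GAMPL fits smooth and parametric fixed effects; to my knowledge it has no RANDOM statement, so it is not a drop-in replacement for a mixed model.

```sas
/* Sketch: generalized additive model with PROC GAMPL.             */
/* Sashelp.Cars and the chosen variables are illustrative only.    */
ods graphics on;
proc gampl data=sashelp.cars plots=components;
   model mpg_city = spline(weight)      /* smooth, nonparametric term */
                    param(horsepower)   /* ordinary linear term       */
                    / dist=normal;
run;
```

For a binary response, the DIST= option on the MODEL statement would be changed to DIST=BINARY.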

 

You might also consider defining a spline effect and using GLIMMIX. To learn more about modeling with spline effects, see this example that uses restricted cubic splines.

You can also read about how to interpret the regression coefficients for a spline-effects model.
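
A spline effect in GLIMMIX might look like the following sketch. The data set (Sashelp.Cars), the variables, the number of knots, and the random intercept are all assumptions chosen for illustration; they are not taken from the poster's study.

```sas
/* Sketch: restricted cubic spline effect inside a mixed model.      */
/* Data set, variables, and random effect are illustrative only.     */
proc glimmix data=sashelp.cars;
   class make;
   /* natural (restricted) cubic spline, 5 knots placed at percentiles */
   effect spl = spline(weight / naturalcubic basis=tpf(noint)
                                knotmethod=percentiles(5));
   model mpg_city = spl / dist=normal solution;
   random intercept / subject=make;   /* random intercept per manufacturer */
run;
```

The EFFECT statement must appear after the CLASS statement and before the MODEL statement that uses the effect.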

 

Paulet
Calcite | Level 5

Thanks for your answer!

 

Is it also possible to use GLIMMIX for data that are not normally distributed (but should be), by using a different distribution? I already tried to transform the data.

The SAS System

The GLM Procedure

Dependent Variable: Pancreas_rel

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3       19.2570463     6.4190154      0.89   0.4518
Error             56      403.6712273     7.2084148
Corrected Total   59      422.9282736

R-Square   Coeff Var   Root MSE   Pancreas_rel Mean
0.045533    76.68989   2.684849            3.500917

Source        DF      Type I SS    Mean Square   F Value   Pr > F
diet           1     0.04273536     0.04273536      0.01   0.9389
strain         1    11.88762207    11.88762207      1.65   0.2044
diet*strain    1     7.32668889     7.32668889      1.02   0.3177

Source        DF    Type III SS    Mean Square   F Value   Pr > F
diet           1     0.02835042     0.02835042      0.00   0.9502
strain         1    10.77579181    10.77579181      1.49   0.2266
diet*strain    1     7.32668889     7.32668889      1.02   0.3177

[Figure: Panel of Fit Diagnostics for Pancreas_rel]

[Figure: Interaction Plot for Pancreas_rel by diet]

 

Rick_SAS
SAS Super FREQ

Yes, PROC GLIMMIX supports many response distributions, such as binary, binomial, Poisson, lognormal, etc.
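
For example, the response distribution is selected with the DIST= option on the MODEL statement. The sketch below reuses the variable names from the GLM output posted earlier in the thread; the data set name work.pancreas is hypothetical, and the gamma distribution is only one possible choice for a strictly positive, right-skewed response.

```sas
/* Sketch: same fixed effects as the posted GLM model, but with a    */
/* non-normal response distribution.                                  */
/* work.pancreas is a hypothetical data set name.                     */
proc glimmix data=work.pancreas;
   class diet strain;
   model Pancreas_rel = diet|strain / dist=gamma link=log solution;
run;
```

Replacing dist=gamma link=log with dist=lognormal fits a normal model to the log of the response, so estimates are then reported on the log scale.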

Paulet
Calcite | Level 5

Yes, I knew that. However, if something like weight or height (which should be normally distributed) is not normally distributed, you cannot just use a Poisson distribution, right?

Rick_SAS
SAS Super FREQ

Correct. You would use the DIST=NORMAL option and check whether the residuals of the model are approximately normal by looking at a Q-Q plot. 

 

To clarify the difference between the response variable being normally distributed and the RESIDUALS being normally distributed, please see the article "On the assumptions (and misconceptions) of linear regression"
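
A minimal sketch of that check follows. It assumes the poster's variables (diet, strain, Pancreas_rel) live in a hypothetical data set named work.pancreas; the PLOTS= option requests panels of raw and studentized residuals, each of which includes a Q-Q plot.

```sas
/* Sketch: fit with DIST=NORMAL and inspect the residual diagnostics. */
/* work.pancreas is a hypothetical data set name.                      */
ods graphics on;
proc glimmix data=work.pancreas plots=(residualpanel studentpanel);
   class diet strain;
   model Pancreas_rel = diet|strain / dist=normal;
run;
```

Each panel shows residuals versus the linear predictor, a histogram, a Q-Q plot, and a box plot, so the normality check applies to the residuals and not to the raw response.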

Paulet
Calcite | Level 5

Yes, but that is exactly my question: I have data that are not normally distributed, even after I transform them. I would like to use GLIMMIX with these as well.

 

[Attached images: DiagnosticsPanel.png, DiagnosticsPanel2.png, DiagnosticsPanel3.png]

Rick_SAS
SAS Super FREQ

At first glance, these diagnostic plots look reasonable, except for the three outliers in the pancreas_rel variable. Which plots are bothering you?

 

If you show us the procedure statements that you are using, we might be able to offer additional advice.

Paulet
Calcite | Level 5

Sadly, I cannot delete the outliers. However, the two plots on the bottom left bother me the most: the points do not follow a straight line, and the distribution does not follow the normal curve.

Rick_SAS
SAS Super FREQ

I don't know what to suggest. A Q-Q plot that is curved up like that indicates right-skewed residuals. The LOG and SQRT transformations are normalizing transformations that are also variance-stabilizing (they address heterogeneity of variance). Or the issue could be that the model does not fit the data well and you need to add interactions or nonlinear effects.
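
Trying the two transformations could be sketched as follows. The data set names work.pancreas and work.pancreas_t and the new variable names are hypothetical; the model is the one from the GLM output posted earlier in the thread.

```sas
/* Sketch: create LOG- and SQRT-transformed responses, then refit.   */
/* Data set and new variable names are hypothetical.                  */
data work.pancreas_t;
   set work.pancreas;
   /* LOG needs strictly positive values; SQRT needs non-negative ones */
   if Pancreas_rel > 0  then log_resp  = log(Pancreas_rel);
   if Pancreas_rel >= 0 then sqrt_resp = sqrt(Pancreas_rel);
run;

ods graphics on;
proc glm data=work.pancreas_t plots=diagnostics;
   class diet strain;
   model log_resp = diet|strain;   /* swap in sqrt_resp to compare */
run;
quit;
```

Comparing the diagnostics panel of each fit with the panel for the untransformed response shows whether the skewness in the residuals is reduced.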

RosieSAS
Obsidian | Level 7

However, in an experimental design we often cannot add terms such as interactions or nonlinear effects, because the model is fixed by the design. In that case, if the residual plots still do not look good even after a transformation such as LOG or SQRT (as with the pancreas_rel variable in this post), and the outliers cannot be deleted, what can we do? Thanks!

Rick_SAS
SAS Super FREQ

If your goal is prediction, the predictions of OLS are still valid even without the normality assumption.

 

Inferences are robust to mild deviations from the normality-of-residual assumptions, but you could point out in your report that the normality assumptions are dubious. If you want distribution-free inferential statistics, use bootstrap methods.
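
One common way to bootstrap in SAS is to resample with PROC SURVEYSELECT, compute the statistic for each resample with a BY statement, and then take percentiles. The sketch below is only an illustration under several assumptions: the data set work.pancreas is hypothetical, diet is assumed to have exactly two levels (consistent with DF=1 in the posted output), and whole observations are resampled without regard to the design cells.

```sas
/* Sketch: percentile bootstrap CI for the difference in mean         */
/* Pancreas_rel between the two diet levels.                           */
/* work.pancreas and all output data set names are hypothetical.       */

/* 1. Draw 1000 resamples with replacement, same size as the data */
proc surveyselect data=work.pancreas noprint seed=12345
     out=work.boot(rename=(Replicate=SampleID))
     method=urs samprate=1 reps=1000 outhits;
run;

/* 2. Mean response for each diet level within each resample */
proc means data=work.boot noprint nway;
   by SampleID;
   class diet;
   var Pancreas_rel;
   output out=work.bootmeans mean=MeanResp;
run;

/* 3. Difference: last diet level minus first (sorted order) */
data work.bootdiff;
   set work.bootmeans;
   by SampleID;
   retain first_mean;
   if first.SampleID then first_mean = MeanResp;
   if last.SampleID and not first.SampleID then do;
      Diff = MeanResp - first_mean;
      output;
   end;
   keep SampleID Diff;
run;

/* 4. 95% percentile interval from the bootstrap distribution */
proc univariate data=work.bootdiff noprint;
   var Diff;
   output out=work.bootci pctlpts=2.5 97.5 pctlpre=CI_;
run;

proc print data=work.bootci noobs;
run;
```

If the interval excludes zero, that suggests a difference in means without relying on normal residuals. For a factorial design, resampling within each diet-by-strain cell (a STRATA statement on sorted data) may be more appropriate than resampling whole observations.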

RosieSAS
Obsidian | Level 7

Thanks @Rick_SAS! My interest is usually mean comparison. However, most nonparametric methods do not provide mean comparisons, especially for factorial designs. Edgar Brunner proposed a nonparametric method for factorial designs (https://link.springer.com/article/10.1007%2Fs003620000039). I tried it once, but the results were not much different from ANOVA, so I'm not really comfortable with this method. There seem to be three options: (1) raw data with mild deviations from the normality-of-residuals assumption, analyzed with PROC GLIMMIX; (2) raw data analyzed with a nonparametric method; or (3) transformed data with better residual plots. Which one should we prefer?

Paulet
Calcite | Level 5
I did indeed use an interaction in this model! However, the log transformation did not really improve the fit.

