I have a dataset that contains three response variables, each measured in a 0.5 m^2 plot: density of individuals, length of the longest individual, and percent cover. The sample sizes are very small, so I fitted a PROC MIXED model for each response variable with REML and the Kenward-Roger correction. I am having trouble satisfying the model assumptions with respect to heteroscedasticity and normality. The residual vs. predicted mean plot shows a diagonal pattern (see photos below) under every transformation I have tried (log, sqrt, power, etc.). I thought that fitting a beta distribution for percent cover and a Poisson distribution for density might solve the problem. The issue is that PROC MIXED cannot specify a distribution. PROC GLIMMIX can, but it does not appear to support REML and the Kenward-Roger correction. I am a bit stumped at this point and am hoping someone can help. Thank you!
Sample Dataset:
/* Names containing a period must be written as name literals, which requires VALIDVARNAME=ANY */
options validvarname=any;
data WORK.DATASET2;
infile datalines dsd truncover;
input F1:BEST4. Year:$4. Month:$5. Date:MMDDYY10. Cape:$2. Site:$2. Species:$19. Transect:$1. Quadrat:$2. 'Percent.Cover'n:BEST12. Density:BEST12. 'Frond.Length'n:BEST5. Length_75:32.;
format F1 BEST4. Date MMDDYY10. 'Percent.Cover'n BEST12. Density BEST12. 'Frond.Length'n BEST5.;
/* DSD expects comma-delimited records */
datalines;
1,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,1,30,9,16.5,8.1867769506
2,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,2,51,11,26,11.514100371
3,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,3,14,5,13,6.846325042
4,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,4,3,1,8,4.75682846
5,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,5,4,1,2.1,1.7444738796
6,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,6,17,4,25.4,11.314237411
7,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,7,15,8,14,7.2376241554
8,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,8,21,4,23,10.502577066
9,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,9,16,5,9.5,5.4111890111
10,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,10,24,6,8.9,5.1527907317
11,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,11,17,5,2,1.6817928305
12,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,12,52,9,18.1,8.7752385459
13,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,13,17,2,13.8,7.1599388764
14,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,14,30,7,26,11.514100371
15,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,15,62,14,33.7,13.986930782
16,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,16,30,9,26,11.514100371
17,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,17,20,10,16.1,8.0374707792
18,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,18,6,3,8,4.75682846
19,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,19,10,3,11.1,6.0812412691
20,2016,May,05/06/2016,CF,BB,Saccharina sessilis,1,20,18,8,2.5,1.9881768219
;;;;
Sample code for Density:
ods graphics on;
PROC MIXED DATA = dataset2 plots(MAXPOINTS=none)=all;
CLASS Year Month Cape Site Transect Quadrat;
MODEL 'Density'n = Year Month(Year) Year|Cape/SOLUTION ddfm = KR CL ALPHA=0.05 INTERCEPT;
RANDOM Quadrat(Transect) Transect(Site) Site(Cape) /CL ALPHA=0.05 TYPE=VC;
LSMEANS Year|Cape / PDIFF CL ALPHA=0.05;
RUN;
ods graphics off;
The pattern of residuals vs predicted is problematic. Fix that first. I think your model is overspecified.
Try
Month Year|Cape
instead of
Year Month(Year) Year|Cape
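Written out in full, the suggested change might look like the sketch below. It is untested and assumes that everything other than the fixed-effects list stays exactly as in the original PROC MIXED call.

```sas
ods graphics on;
/* Original model with the fixed effects reduced to Month Year|Cape */
proc mixed data=dataset2 plots(maxpoints=none)=all;
  class Year Month Cape Site Transect Quadrat;
  /* Year|Cape expands to Year, Cape and Year*Cape */
  model 'Density'n = Month Year|Cape / solution ddfm=kr cl alpha=0.05 intercept;
  random Quadrat(Transect) Transect(Site) Site(Cape) / cl alpha=0.05 type=vc;
  lsmeans Year|Cape / pdiff cl alpha=0.05;
run;
ods graphics off;
```

Comparing the residual vs. predicted plot from this run against the original one should show whether the extra Month(Year) term was contributing to the pattern.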
What is cape?
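On the GLIMMIX part of the question: as far as I know, PROC GLIMMIX accepts DDFM=KENWARDROGER on the MODEL statement, and its default estimation method for models with random effects, METHOD=RSPL, is a restricted (REML-like) pseudo-likelihood. A minimal, untested sketch for density as a Poisson response follows. The choice of the log link, the reduced fixed-effects list, and the residual panel are assumptions for illustration, not something taken from the original post.

```sas
/* Sketch only: Poisson mixed model for Density with Kenward-Roger degrees of freedom */
proc glimmix data=dataset2 method=rspl plots=residualpanel;
  class Year Month Cape Site Transect Quadrat;
  model Density = Month Year|Cape / dist=poisson link=log solution ddfm=kenwardroger;
  /* Same nested random effects as the PROC MIXED model */
  random Quadrat(Transect) Transect(Site) Site(Cape);
  /* ILINK reports the least-squares means on the count scale as well */
  lsmeans Year Cape Year*Cape / pdiff cl ilink;
run;
```

For percent cover, DIST=BETA would need the response rescaled to lie strictly between 0 and 1, so that model is left out of the sketch.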