vitaaquaticus
Fluorite | Level 6

I have a dataset with three response variables, each measured in a 0.5 m^2 plot: density of individuals, length of the longest individual, and percent cover. The sample sizes are very small, so I fitted a PROC MIXED model for each response variable with REML and the Kenward-Roger correction. I am having trouble satisfying the model assumptions of homoscedasticity and normality. The residual vs. predicted mean plot shows a diagonal pattern (see photos below) under every transformation I tried (log, sqrt, power, etc.). I thought that fitting a beta distribution for percent cover and a Poisson distribution for density might solve the problem. The issue is that PROC MIXED cannot specify a distribution. PROC GLIMMIX can, but it does not appear to support REML and the Kenward-Roger correction. I am a bit stumped at this point and hope someone can help. Thank you!

 

Sample Dataset:

data WORK.DATASET2;
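  /* The data lines are space-delimited, so DSD (comma-delimited) is dropped. */
  /* Species spans two tokens; it is read in two pieces and rejoined below.   */
  /* Dots are not valid in SAS names, so underscores are used instead.        */
  infile datalines truncover;
  length Species $19;
  input F1 Year :$4. Month :$5. Date :MMDDYY10. Cape :$2. Site :$2. Genus :$10. Epithet :$8. Transect :$1. Quadrat :$2. Percent_Cover Density Frond_Length Length_75;
  Species = catx(' ', Genus, Epithet);
  drop Genus Epithet;
  format Date MMDDYY10.;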
datalines;
1 2016 May 05/06/2016 CF BB Saccharina sessilis 1 1 30 9 16.5 8.1867769506
2 2016 May 05/06/2016 CF BB Saccharina sessilis 1 2 51 11 26 11.514100371
3 2016 May 05/06/2016 CF BB Saccharina sessilis 1 3 14 5 13 6.846325042
4 2016 May 05/06/2016 CF BB Saccharina sessilis 1 4 3 1 8 4.75682846
5 2016 May 05/06/2016 CF BB Saccharina sessilis 1 5 4 1 2.1 1.7444738796
6 2016 May 05/06/2016 CF BB Saccharina sessilis 1 6 17 4 25.4 11.314237411
7 2016 May 05/06/2016 CF BB Saccharina sessilis 1 7 15 8 14 7.2376241554
8 2016 May 05/06/2016 CF BB Saccharina sessilis 1 8 21 4 23 10.502577066
9 2016 May 05/06/2016 CF BB Saccharina sessilis 1 9 16 5 9.5 5.4111890111
10 2016 May 05/06/2016 CF BB Saccharina sessilis 1 10 24 6 8.9 5.1527907317
11 2016 May 05/06/2016 CF BB Saccharina sessilis 1 11 17 5 2 1.6817928305
12 2016 May 05/06/2016 CF BB Saccharina sessilis 1 12 52 9 18.1 8.7752385459
13 2016 May 05/06/2016 CF BB Saccharina sessilis 1 13 17 2 13.8 7.1599388764
14 2016 May 05/06/2016 CF BB Saccharina sessilis 1 14 30 7 26 11.514100371
15 2016 May 05/06/2016 CF BB Saccharina sessilis 1 15 62 14 33.7 13.986930782
16 2016 May 05/06/2016 CF BB Saccharina sessilis 1 16 30 9 26 11.514100371
17 2016 May 05/06/2016 CF BB Saccharina sessilis 1 17 20 10 16.1 8.0374707792
18 2016 May 05/06/2016 CF BB Saccharina sessilis 1 18 6 3 8 4.75682846
19 2016 May 05/06/2016 CF BB Saccharina sessilis 1 19 10 3 11.1 6.0812412691
20 2016 May 05/06/2016 CF BB Saccharina sessilis 1 20 18 8 2.5 1.9881768219
;;;;

 

[Two attached screenshots of the residual vs. predicted plots: Screen Shot 2020-02-19 at 10.00.39.png, Screen Shot 2020-02-19 at 10.00.44.png]

 

Sample code for Density:

 

ods graphics on;
PROC MIXED DATA = dataset2 plots(MAXPOINTS=none)=all;
    CLASS Year Month Cape Site Transect Quadrat;
    MODEL 'Density'n = Year Month(Year) Year|Cape / SOLUTION DDFM=KR CL ALPHA=0.05 INTERCEPT;
    RANDOM Quadrat(Transect) Transect(Site) Site(Cape) / CL ALPHA=0.05 TYPE=VC;
    LSMEANS Year|Cape / PDIFF CL ALPHA=0.05;
RUN;
ods graphics off;

 

4 REPLIES
PGStats
Opal | Level 21

The pattern of residuals vs predicted is problematic. Fix that first. I think your model is overspecified.

 

Try

 

Month Year|Cape

instead of 

 

Year Month(Year) Year|Cape
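
For concreteness, the change suggested above could be sketched as follows. This reuses the PROC MIXED call from the question and changes only the fixed-effects list in the MODEL statement; every other statement and option is carried over from the original post and is not a recommendation in its own right.

```sas
/* Sketch only: the original PROC MIXED call with the simplified fixed effects. */
ods graphics on;
proc mixed data=dataset2 plots(maxpoints=none)=all;
    class Year Month Cape Site Transect Quadrat;
    /* Month replaces Month(Year); Year|Cape expands to Year, Cape and Year*Cape */
    model 'Density'n = Month Year|Cape / solution ddfm=kr cl alpha=0.05 intercept;
    /* Random effects are unchanged from the question */
    random Quadrat(Transect) Transect(Site) Site(Cape) / cl alpha=0.05 type=vc;
    lsmeans Year|Cape / pdiff cl alpha=0.05;
run;
ods graphics off;
```

Comparing the residual vs. predicted panel from this run against the original one shows whether the diagonal pattern comes from the fixed-effects specification.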

 

PG
vitaaquaticus
Fluorite | Level 6
Thanks for the suggestion. I just tried it - it shows a very similar residual vs predicted plot.
vitaaquaticus
Fluorite | Level 6
Cape is a regional level. Site is a local level. It is like a city within a county. But in this case, a site within a cape.
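
With Site nested within Cape as described, the Poisson idea from the original question could be tried in PROC GLIMMIX along the lines below. This is a minimal sketch, not a validated model: it assumes the same effects as the PROC MIXED call and the simplified fixed effects suggested earlier in the thread. PROC GLIMMIX does accept DDFM=KR in the MODEL statement, and its default estimation method for non-normal responses, METHOD=RSPL, is a restricted (residual) pseudo-likelihood that reduces to REML when the response is normal.

```sas
/* Sketch only: Poisson mixed model for Density, assuming the effects used in the thread. */
ods graphics on;
proc glimmix data=dataset2 method=rspl plots=residualpanel;
    class Year Month Cape Site Transect Quadrat;
    /* Log link with a Poisson response; Kenward-Roger degrees of freedom */
    model 'Density'n = Month Year|Cape / dist=poisson link=log solution ddfm=kr;
    /* Random effects copied from the PROC MIXED call in the question */
    random Quadrat(Transect) Transect(Site) Site(Cape);
    /* ILINK reports the least-squares means on the count scale as well */
    lsmeans Year*Cape / ilink cl;
run;
ods graphics off;
```

Two caveats: DDFM=KR is not available with METHOD=LAPLACE or METHOD=QUAD, so the pseudo-likelihood default has to stay in place. For percent cover, DIST=BETA needs the response rescaled to a proportion strictly between 0 and 1.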
