Need help finding the right SAS procedure for non-normal seed germination data

New Contributor
Posts: 2

Hi all,

I would like help analyzing my data set. I am looking at the effect of storage conditions (humidity and temperature) on the germination of dormant seeds over time. I set up my experiment as a split-split plot (main plot: humidity, subplot: temperature, sub-subplot: time) and did two runs (seasons) to see whether treatment results are consistent. The results of PROC UNIVARIATE indicate that my data are not normal and are highly positively skewed (skewness = 2.2). Can I still run an ANOVA with this? I would like to see whether the factors I have (and their interactions) are significant, and also whether the two runs differ significantly (which would indicate whether the runs can be combined). Attached are the data set and the results of PROC UNIVARIATE. Any help (specifically in writing the analysis code) is greatly appreciated.

Thanks! 

Respected Advisor
Posts: 2,655

Re: Need help finding the right SAS procedure for non-normal seed germination data

Recall that the ANOVA assumption is that the residuals be approximately normally distributed, not necessarily the response variable. It really, really looks like your response variable is a count, and it really looks like it is zero-inflated (the median is zero). I suggest looking at the documentation for PROC GENMOD, especially for the zero-inflated models. The hard part will be correctly specifying a split-split plot there; that is relatively easy to do in MIXED with RANDOM and REPEATED, or in GLIMMIX with the RESIDUAL option in the RANDOM statement. However, neither of those will accurately fit a zero-inflated model.
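
As a rough sketch of the residuals point, one could fit the model first and then examine the residuals rather than the raw response. The data set and variable names (germ_combined, germ, humidity, temp, month, rep, season) are borrowed from the code posted later in this thread. The RANDOM terms are an assumed split-split layout and resid_out is a hypothetical output data set name; neither is confirmed by the original poster.

```sas
/* Sketch only: names come from the code later in the thread;
   the RANDOM terms are an assumed split-split layout. */
proc mixed data=germ_combined;
   class season rep humidity temp month;
   /* OUTP= writes predicted values and conditional residuals (variable Resid) */
   model germ = humidity|temp|month|season / outp=resid_out;
   random rep(season)                 /* block within run   */
          humidity*rep(season)        /* main-plot error    */
          humidity*temp*rep(season);  /* subplot error      */
run;

/* Check normality of the residuals, not of the raw response */
proc univariate data=resid_out normal;
   var resid;
   qqplot resid / normal(mu=est sigma=est);
run;
```

If the Q-Q plot of the residuals is still strongly curved, that supports moving to a generalized linear mixed model instead of a transformation.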

Check the SAS-L archives for many, many threads on fitting zero-inflated models.

Steve Denham

New Contributor
Posts: 2

Re: Need help finding the right SAS procedure for non-normal seed germination data

Thanks for this!

Re: Data. Yes, the response variable is a count, specifically percent germination (i.e. number of germinated seeds/50 seeds).

I'll check the archives and see what I'll find. Thanks again for your help.

Orville

Respected Advisor
Posts: 2,655

Re: Need help finding the right SAS procedure for non-normal seed germination data

Umm. That would be a proportion, bounded below by zero and above by 1. Neither the Poisson nor the negative binomial is really applicable, because you have a maximum count of 50 (out of 50), which would show up in your data as 100. So that means no "zero inflation."
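
For reference, a proportion out of a known number of trials is usually written in events/trials form. The following is a minimal sketch of that syntax, assuming germ is stored as a percentage and that every observation is based on 50 seeds, as stated above. The names germ_counts, n_germ, and n_seeds, and the simple random-intercept structure, are assumptions for illustration, and convergence on these data is not guaranteed.

```sas
/* Sketch only: n_seeds = 50 is from the thread; germ_counts, n_germ,
   and the random-intercept structure are assumptions. */
data germ_counts;
   set germ_combined;
   n_seeds = 50;
   n_germ  = round(germ/100 * n_seeds);   /* percent -> count of germinated seeds */
run;

proc glimmix data=germ_counts;
   class humidity temp rep season month;
   /* events/trials syntax: germinated seeds out of seeds tested */
   model n_germ/n_seeds = humidity|temp|season|month / dist=binomial link=logit;
   random intercept / subject=humidity*temp*season*rep;
   lsmeans humidity*temp / ilink cl;
run;
```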

I thought about this a little bit and really wanted to use a binomial distribution, but it has convergence problems. So, since you do not have any 100% observations, I added a trivial amount to all values and used a beta distribution.

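data germ2;
   set germ_combined;
   value  = germ/100;          /* percent germination -> proportion */
   value2 = value + 0.0001;    /* move zeros inside (0,1) for the beta distribution */
run;

proc glimmix data=germ2 method=rspl abspconv=1e-8;
   monthx = month;             /* continuous copy of month for the sp(pow) structure */
   class humidity temp rep season month;
   nloptions tech=quanew maxiter=2000;
   model value2 = humidity|temp|season|month / dist=beta;
   /* R-side spatial power covariance over months within each humidity*temp*season*rep unit */
   random month / residual subject=humidity*temp*season*rep type=sp(pow)(monthx);
   lsmeans humidity|temp|season|month / cl ilink;
run;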

This ran for me.  The point estimates obtained with the ilink option should be adjusted for the 0.0001 added.
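
One way that adjustment might be done is sketched below. It assumes the GLIMMIX step above is rerun with `ods output lsmeans=lsm_out;` added before its RUN statement, so that the inverse-linked estimates (Mu, LowerMu, UpperMu) are saved. The data set names lsm_out and lsm_adj and the *_adj variables are hypothetical.

```sas
/* Sketch only: assumes the GLIMMIX step was rerun with
      ods output lsmeans=lsm_out;
   lsm_out, lsm_adj, and the *_adj variables are hypothetical names. */
data lsm_adj;
   set lsm_out;
   /* remove the constant added before fitting; floor at zero */
   mu_adj    = max(mu      - 0.0001, 0);
   lower_adj = max(lowermu - 0.0001, 0);
   upper_adj = max(uppermu - 0.0001, 0);
run;

proc print data=lsm_adj;
   var effect humidity temp season month mu_adj lower_adj upper_adj;
run;
```

Multiplying the adjusted values by 100 returns them to the percent scale used in the original data.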

I have some ideas about how to approach the zero inflation idea, using a fixed offset, but that is for a later post.

Steve Denham
