Hi,
I am not sure which PROC to choose to analyse the following model. I would greatly appreciate any advice.
the model:
proc glimmix data=mydata; /* I used GLIMMIX to start */
class a b c ID;
model Y = a|b|c;
/* Y is a categorical variable that can take 9 values: from 3 to 9 (= number of days)*/
/* a is the population name*/
/* b is a categorical variable (treatment 1) with 3 values: 1, 2, 3 for example */
/* c is a categorical variable (treatment 2) with 2 values: 1, 2 for example */
/* ID= individual, each individual has 4 replicates*/
random int/sub=ID(a);
run;
I started with PROC GLIMMIX. It runs fine, but I am not sure whether it is designed for a categorical Y variable.
Thanks for your help!
I don't think you have to consider number of days as a categorical variable. If it is a duration, you may want to consider fitting with a gamma distribution, and if it is a count of days (such as days when a condition is observed), you may want to consider fitting with a Poisson distribution.
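As a rough sketch of those two alternatives, assuming the data set and variable names from the original post (mydata, Y, a, b, c, ID) and a log link, which is an assumption on my part and not something the data dictate:

```sas
/* Sketch only: same fixed and random effects as the posted model. */
/* Option 1: treat Y as a duration (positive, right-skewed) with a gamma distribution. */
proc glimmix data=mydata;
  class a b c ID;
  model Y = a|b|c / dist=gamma link=log;   /* log link is an assumed choice */
  random intercept / subject=ID(a);         /* same as: random int/sub=ID(a) */
run;

/* Option 2: treat Y as a count of days with a Poisson distribution. */
proc glimmix data=mydata;
  class a b c ID;
  model Y = a|b|c / dist=poisson link=log;
  random intercept / subject=ID(a);
run;
```

You would fit only one of these, depending on whether Y is better described as a duration or as a count.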
If your research question restricts the dependent variable to being a categorical variable, then you should consider fitting with a multinomial distribution and an appropriate link function (most likely a cumulative logit).
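A minimal sketch of the multinomial option, again assuming the names from the original post and treating the day values as ordered categories:

```sas
/* Sketch only: ordinal response with a cumulative logit link. */
/* With dist=multinomial and link=cumlogit, GLIMMIX treats the levels of Y as ordered. */
proc glimmix data=mydata;
  class a b c ID;
  model Y = a|b|c / dist=multinomial link=cumlogit;
  random intercept / subject=ID(a);
run;
```

Note that Y is not listed in the CLASS statement here. The multinomial distribution takes care of the response levels.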
Your current approach assumes that the residuals of the model fit a Gaussian distribution. Take a look at the diagnostic plots to see if that is an appropriate assumption.
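One way to request those plots, sketched for the model as posted (no DIST= option, so the Gaussian default applies):

```sas
/* Sketch only: request a panel of studentized residual plots */
/* (residuals vs. linear predictor, histogram, Q-Q plot, box plot). */
ods graphics on;
proc glimmix data=mydata plots=studentpanel;
  class a b c ID;
  model Y = a|b|c;                      /* Gaussian by default */
  random intercept / subject=ID(a);
run;
```

Strong skew in the histogram or a curved Q-Q plot would suggest that the Gaussian assumption is not a good fit.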
Steve Denham
Hello Steve,
thank you very much for your answer.
The variable "number of days" is more a date than a count variable. Value "3" means we observed that trait (=spores from a fungi) three days after the beginning of the experiment, value 4=four days after the beginning and so on.
I thought the number of days had to be categorical because it can ONLY take a few values. For some samples, it only takes the values 3, 4 and 5 (meaning that we observed that trait on days 3, 4 and 5 after the start of the experiment). If Y is only 3, 4 and 5, can I fit a gamma distribution?
Thanks again!
Yes, you can consider it to be gamma.
Is "spores from a fungi" actually a response variable, maybe a count (number of spores) or maybe binary (whether spores were observed on a given day) or maybe the first day on which spores were observed after the beginning of the experiment? Perhaps "number of days" is actually an explanatory variable? I can't tell from your description. We would need more detail to be of more help.