celdelmas
Calcite | Level 5

Hi,

 

I am not sure which PROC to choose to analyse the following model. I would greatly appreciate any advice:

 

the model:

proc glimmix data=mydata; /* I used GLIMMIX to start */

class a b c ID;

model Y = a|b|c;

/* Y is a categorical variable that can take 9 values: from 3 to 9 (= number of days)*/

/* a is the population name*/

/*b is a categorical variable (treatment 1) with 3 values: 1, 2, 3 for example */

/* c is categorical variable (treatment 2) with 2 values: 1, 2 for example*/

/* ID= individual, each individual has 4 replicates*/

random int/sub=ID(a);

run;

 

I started with PROC GLIMMIX. It runs fine, but I am not sure it is designed for a categorical Y variable?

 

Thanks for your help!

 

 

Accepted solution
SteveDenham
Jade | Level 19

I don't think you have to consider number of days as a categorical variable. If it is a duration, you may want to consider fitting with a gamma distribution, and if it is a count of days (such as days when a condition is observed), you may want to consider fitting with a Poisson distribution.
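
As a rough sketch of those two options (not part of the original reply), the model from the question could be refit by adding DIST= and LINK= to the MODEL statement. The data set and variable names are taken from the question. The log link is an assumption, chosen because it is the usual choice for both distributions.

```sas
/* Sketch only: duration-type response, gamma distribution with a log link */
proc glimmix data=mydata;
  class a b c ID;
  model Y = a|b|c / dist=gamma link=log;
  random intercept / subject=ID(a);   /* random intercept for each individual within population */
run;

/* Sketch only: count-type response, Poisson distribution with a log link */
proc glimmix data=mydata;
  class a b c ID;
  model Y = a|b|c / dist=poisson link=log;
  random intercept / subject=ID(a);
run;
```

Both fits keep the fixed effects and the random intercept from the question unchanged, so only the assumed distribution of Y differs between them.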

 

If your research question restricts the dependent variable to being a categorical variable, then you should consider fitting with a multinomial distribution and an appropriate link function (most likely a cumulative logit, which treats the categories as ordered).
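
A minimal sketch of that ordinal approach, assuming the same data set and variables as in the question (this code is an illustration, not from the original reply):

```sas
/* Sketch only: Y treated as an ordered categorical response */
proc glimmix data=mydata;
  class a b c ID;
  /* multinomial distribution with a cumulative logit link (proportional odds) */
  model Y = a|b|c / dist=multinomial link=cumlogit solution;
  random intercept / subject=ID(a);
run;
```

The SOLUTION option is added here only to print the fixed-effect estimates, including one intercept per cumulative category boundary.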

 

Your current approach, with no distribution specified, assumes that the residuals of the model follow a Gaussian distribution. Take a look at the diagnostic plots to see if that is an appropriate assumption.
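
One way to request those diagnostic plots, sketched under the assumption that ODS Graphics is available in your session (illustration only, using the names from the question):

```sas
ods graphics on;

/* Sketch only: default Gaussian fit with residual diagnostics requested */
proc glimmix data=mydata plots=(residualpanel studentpanel);
  class a b c ID;
  model Y = a|b|c;                    /* no DIST= option, so a normal distribution is assumed */
  random intercept / subject=ID(a);
run;
```

Each panel shows residuals against the predicted values together with a histogram, a Q-Q plot and a box plot, which is usually enough to judge whether the Gaussian assumption is reasonable.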

 

Steve Denham



celdelmas
Calcite | Level 5

Hello Steve,

 

thank you very much for your answer.

 

The variable "number of days" is more a date than a count variable. A value of 3 means we observed the trait (= spores from a fungi) three days after the beginning of the experiment, a value of 4 means four days after the beginning, and so on.

I thought the number of days had to be categorical because it can ONLY take a few values. For some samples, only the values 3, 4 and 5 occur (meaning that we observed the trait on days 3, 4 and 5 after the start of the experiment). If Y only takes the values 3, 4 and 5, can I fit a gamma distribution?

 

 

Thanks again!

lvm
Rhodochrosite | Level 12

Yes, you can consider it to be gamma.

sld
Rhodochrosite | Level 12

Is "spores from a fungi" actually a response variable? It could be a count (number of spores), a binary outcome (whether spores were observed on a given day), or the first day on which spores were observed after the beginning of the experiment. Or perhaps "number of days" is actually an explanatory variable? I can't tell from your description. We would need more detail to be of more help.
