- choosing the right PROC to analyse categorical res...

04-08-2016 09:51 AM

Hi,

I am not sure which PROC to choose to analyse the following model, and I would greatly appreciate any advice:

the model:

proc glimmix data=mydata; /* I used GLIMMIX to start */

class a b c ID;

model Y = a|b|c;

/* Y is a categorical variable that can take 9 values: from 3 to 9 (= number of days) */

/* a is the population name */

/* b is a categorical variable (treatment 1) with 3 values: 1, 2, 3 for example */

/* c is a categorical variable (treatment 2) with 2 values: 1, 2 for example */

/* ID = individual; each individual has 4 replicates */

random int / sub=ID(a);

run;

I started with PROC GLIMMIX. It works fine, but I am not sure whether it is designed for a categorical Y variable.

Thanks for your help!

Accepted Solutions

Solution

04-22-2016 07:42 AM

04-08-2016 01:38 PM

I don't think you have to treat number of days as a categorical variable. If it is a duration, you may want to consider fitting a gamma distribution, and if it is a count of days (such as days when a condition is observed), you may want to consider fitting a Poisson distribution.
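As a rough sketch of those two options, the model from the question could be refit as below. The data set and variable names (mydata, Y, a, b, c, ID) are taken from the original post. The log link is an assumption on my part (a common choice for both distributions), not something the data dictate, so treat this as a starting point rather than a final model.

```sas
/* Option 1: treat Y as a duration and fit a gamma distribution. */
/* link=log is an assumed choice; gamma requires Y > 0.          */
proc glimmix data=mydata;
  class a b c ID;
  model Y = a|b|c / dist=gamma link=log;
  random intercept / subject=ID(a);
run;

/* Option 2: treat Y as a count of days and fit a Poisson distribution. */
proc glimmix data=mydata;
  class a b c ID;
  model Y = a|b|c / dist=poisson link=log;
  random intercept / subject=ID(a);
run;
```

Only the DIST= and LINK= options on the MODEL statement change; the CLASS and RANDOM statements stay as in the original code.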

If your research question restricts the dependent variable to being categorical, then you should consider fitting a multinomial distribution with an appropriate link function (most likely a cumulative logit).
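A minimal sketch of the multinomial alternative, again reusing the names from the original post. It assumes the levels of Y are ordered (day 3 before day 4, and so on), which is what a cumulative logit requires. METHOD=LAPLACE and the SOLUTION option are my additions for illustration and can be dropped.

```sas
/* Ordinal model: multinomial response with a cumulative logit link.  */
/* method=laplace is an assumed choice of estimation method; the      */
/* default pseudo-likelihood method could be used instead.            */
proc glimmix data=mydata method=laplace;
  class a b c ID;
  model Y = a|b|c / dist=multinomial link=cumlogit solution;
  random intercept / subject=ID(a);
run;
```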

Your current approach assumes that the residuals of the model fit a Gaussian distribution. Take a look at the diagnostic plots to see if that is an appropriate assumption.
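One way to get those diagnostic plots is sketched below, using the model exactly as posted. With no DIST= option and a numeric Y, GLIMMIX fits a normal distribution with an identity link. The choice of the studentized residual panel is mine; PLOTS=RESIDUALPANEL is a similar alternative.

```sas
/* Refit the original (Gaussian) model and request a panel of */
/* studentized residual plots to check the assumption.        */
ods graphics on;

proc glimmix data=mydata plots=studentpanel;
  class a b c ID;
  model Y = a|b|c;
  random intercept / subject=ID(a);
run;
```

If the residual histogram or Q-Q plot in the panel looks clearly skewed, or the residual spread changes with the fitted values, the Gaussian assumption is questionable.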

Steve Denham

All Replies

04-14-2016 03:34 AM

Hello Steve,

thank you very much for your answer.

The variable "number of days" is more a date than a count variable. A value of 3 means we observed the trait (= spores from a fungus) three days after the beginning of the experiment, a value of 4 means four days after the beginning, and so on.

I thought the number of days had to be categorical because it can take ONLY a few values. For some samples, only the values 3, 4 and 5 occur (meaning that we observed the trait on days 3, 4 and 5 after the start of the experiment). If Y is only 3, 4 and 5, can I fit a gamma distribution?

Thanks again!

04-16-2016 04:44 PM

Yes, you can consider it to be gamma.

04-19-2016 11:20 PM

Is the spore observation actually a response variable? It could be a count (number of spores), a binary outcome (whether spores were observed on a given day), or the first day on which spores were observed after the beginning of the experiment. Or perhaps "number of days" is actually an explanatory variable? I can't tell from your description. We would need more detail to be of more help.