- PROC GLM WITH BINOMIAL RESPONSE VARIABLES

Posted 07-23-2018 05:20 PM
(4182 views)

Hello all,

I need help to determine what is wrong with my code (SAS 9.4). I have collected data from four different farms with 3 different treatments applied to the individual animals (each animal is the experimental unit). The response variable is binomial (0=no, 1=yes). I am trying to run PROC GLM to determine different averages. Running Tukey's for 1 degree comparisons. I am also trying to determine if there is a significant farm*treatment interaction. I am including my current working code with sample data below (I have over 400 observations so I thought to decrease the number of observations for this post).

```
DATA resync;
INPUT @4 farm $ @3 treatment $ @response;
datalines;
Farm1 | A | 0 |
Farm1 | A | 1 |
Farm1 | A | 0 |
Farm1 | B | 0 |
Farm1 | B | 1 |
Farm1 | C | 1 |
Farm1 | C | 0 |
Farm1 | C | 1 |
Farm1 | C | 0 |
Farm2 | A | 1 |
Farm2 | A | 1 |
Farm2 | A | 0 |
Farm2 | A | 0 |
Farm2 | A | 0 |
Farm2 | A | 1 |
Farm2 | B | 1 |
Farm2 | B | 0 |
Farm3 | B | 1 |
Farm3 | B | 0 |
Farm3 | B | 1 |
Farm3 | C | 0 |
Farm3 | C | 0 |
Farm3 | C | 0 |
Farm3 | C | 1 |
Farm3 | C | 1 |
Farm4 | A | 1 |
Farm4 | A | 0 |
Farm4 | B | 1 |
Farm4 | B | 0 |
Farm4 | C | 1 |
Farm4 | C | 1 |
;
PROC GLM data=resync;
CLASS farm treatment farm*treatment;
MODEL farm treatment farm*treatment = response;
MEANS treatment / TUKEY;
PROC PRINT;
RUN;
```

Any help would be greatly appreciated!

4 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

That's not how you specify a model. At minimum, it would look like the code below. See if you can get that to run, though I suspect it's still wrong. If the observed variable is binomial, you want to specify that somehow. I'll move your question to the stats forum, and hopefully someone with more statistical knowledge than me can answer it 🙂

```
PROC GLM data=resync;
CLASS farm treatment ;
MODEL response= farm treatment farm*treatment ;
*MEANS treatment / TUKEY;
run;
quit;
```
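The DATA step in the question would probably also need attention before any procedure can run: `@4`, `@3` and `@response` are column-pointer controls rather than field descriptions, so the three variables are not read as intended. A minimal sketch, assuming the raw lines really are pipe-delimited as posted (only a few of the posted rows are repeated here):

```sas
/* Sketch: list input, treating both '|' and blank as delimiters */
data resync;
  infile datalines dlm='| ';
  input farm $ treatment $ response;
  datalines;
Farm1 | A | 0 |
Farm1 | A | 1 |
Farm1 | B | 0 |
Farm1 | C | 1 |
Farm2 | A | 1 |
Farm2 | B | 0 |
;

/* Check that farm, treatment and response were read correctly */
proc print data=resync;
run;
```

Because both the pipe and the blank are listed in `DLM=`, runs of them are treated as a single separator, so the trailing `|` on each line is harmless.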


With binary responses you would use PROC LOGISTIC, or possibly PROC GLIMMIX with the model option DIST=BINOMIAL.
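As a rough sketch of the PROC LOGISTIC route, assuming the data set `resync` from the question has been read in correctly with `response` coded 0/1 and one row per animal:

```sas
/* Sketch: logistic regression on the one-row-per-animal data */
proc logistic data=resync;
  /* PARAM=GLM is required for the LSMEANS statement in PROC LOGISTIC */
  class farm treatment / param=glm;
  /* EVENT='1' models the probability that response = 1 */
  model response(event='1') = farm treatment farm*treatment;
  /* Tukey-adjusted pairwise treatment comparisons;
     ILINK also reports estimates on the probability scale */
  lsmeans treatment / diff adjust=tukey ilink;
run;
```

The farm*treatment test appears in the "Type 3 Analysis of Effects" table. Note that in the posted sample some farm-by-treatment cells are empty (for example, no treatment C on Farm2), so with that subset alone some LS-means may be reported as non-estimable; the full data set may not have this problem.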

--

Paige Miller



To elaborate on the responses by @Reeza and @PaigeMiller:

Clearly, you need to use a procedure for data that are binary or binomial. GLM is definitely *not* the correct procedure, because it assumes the response is normally distributed (conditional on the predictors).

In your data snippet, it does not look like each individual cow is independent of all other cows. Does each line in your data snippet represent one cow? If so, it seems that there are one *or more* cows receiving a particular treatment at each of four farms.

Cows on the same farm receiving the same treatment are subsamples, and the statistical model should incorporate cows accordingly. Assuming that FARM is a fixed effects factor, I see two options, one of which uses the LOGISTIC procedure, one of which uses the GLIMMIX procedure (you could also use GENMOD):

(1) Combine the data over multiple cows on the same farm and receiving the same treatment so that a new response is defined by the number of cows with outcome=1 (i.e., number of "successes") out of total number of cows. You could then use the LOGISTIC (or GENMOD) procedure with a binomial distribution using the "events/trials" response specification. See http://documentation.sas.com/?docsetId=statug&docsetTarget=statug_logistic_syntax22.htm&docsetVersio...

(2) Use the data in the current format in the GLIMMIX procedure, specifying a mixed model with a RANDOM statement which clusters cows within sets of cows at a given farm receiving the same treatment.
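Option (1) could be sketched roughly as follows. The names `resync_agg`, `events` and `trials` are made up for illustration, and the model terms are an assumption based on the effects the original question wanted to test:

```sas
/* Step 1: collapse to one row per farm-by-treatment cell */
proc sql;
  create table resync_agg as
  select farm,
         treatment,
         sum(response)   as events,  /* animals with response = 1 */
         count(response) as trials   /* animals with a non-missing response */
  from resync
  group by farm, treatment;
quit;

/* Step 2: binomial logit model using the events/trials syntax */
proc logistic data=resync_agg;
  class farm treatment / param=glm;
  model events/trials = farm treatment farm*treatment;
  lsmeans treatment / diff adjust=tukey ilink;
run;
```

With one row per cell, including farm*treatment makes the model saturated; dropping that term gives the main-effects model if the interaction turns out not to be needed.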

Both approaches will produce the same results, but the first approach using LOGISTIC is more intuitive and that's what I would recommend.

I hope this helps. I think you will want to do some studying about logistic regression (or in this case, logit models because farm and treatment are categorical) and how to implement these models using SAS.


@sld wrote:

To elaborate on the responses by @Reeza and @PaigeMiller:

Clearly, you need to use a procedure for data that are binary or binomial. GLM is definitely *not* the correct procedure, because it assumes the response is normally distributed (conditional on the predictors).

GLM (and linear regression) assume the *errors* are normally distributed. But, you are correct that in this case, the errors are not normally distributed and thus do not meet the requirements of GLM.

--

Paige Miller
