Poisson regression coding - is offset term necessary? And how to inter...

Posted 09-02-2019 06:58 PM

I am trying to build a causal model for the effect of play type on the rate of injuries. The database originally recorded each injury individually, so I used proc sql to build a table with the number of injuries for each game, the relevant exposure variables (number of running plays and passing plays), and potential confounding variables (weather conditions, a categorical variable, and temperature, a quantitative variable). An example is included below:

The mean number of injuries/game is 0.6 and given the presence of count data (number of injuries/1 team-game) for the response variable I presumed that Poisson regression would be the proper model to answer my research question. Here is my attempt to create my desired model, including the effect size of each additional 10 passing/running plays:

```
proc genmod data = football;
class weather (ref = "indoors") / param = ref;
model numinjuries = passing_plays running_plays weather temp / dist = poisson;
estimate "10 passing plays" passing_plays 10;
estimate "10 running plays" running_plays 10;
run;
```

I have read through several of the related questions in this community, but I cannot determine whether I created an appropriate model to answer my research question, whether it is coded correctly, or how exactly I should interpret the results. Is this actually a rate, so that I should create ln(passing_plays) and offset by that? Or does the same exposure of 1 team-game for all of the injury counts make it a count rather than a rate, so that I should not use an offset term? If so, how should I code that? The resulting estimate for 10 passing plays was 1.2. Would the correct interpretation then be (in the case of counts): for each additional 10 passing plays, teams were expected to have 1.2 times the number of injuries (or 20% more injuries)?

Thank you in advance for your help.

3 REPLIES


Your model doesn't need an offset if you want to estimate the number of injuries __per game__. An offset would let you estimate the number of injuries __per play__.
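To make the contrast concrete, a per-play version of the model might look like the sketch below. It assumes that total plays (passing plus running) is the exposure of interest. The names `football_rate`, `total_plays`, `log_total_plays`, and `pass_share` are hypothetical and do not appear in the original data, and replacing the two play counts with a passing share is just one possible way to keep play type in the model once total plays becomes the offset.

```sas
/* Sketch only: football_rate, total_plays, log_total_plays and
   pass_share are hypothetical names introduced for illustration. */
data football_rate;
  set football;
  total_plays = passing_plays + running_plays;
  if total_plays > 0 then do;
    log_total_plays = log(total_plays);        /* the offset must be on the log scale */
    pass_share = passing_plays / total_plays;  /* fraction of plays that were passes */
  end;
  /* games with zero plays keep a missing offset and are dropped by genmod */
run;

proc genmod data = football_rate;
  class weather (ref = "indoors") / param = ref;
  model numinjuries = pass_share weather temp
        / dist = poisson offset = log_total_plays;
run;
```

With this offset, the exponentiated coefficients would be read as rate ratios for injuries per play rather than per game.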

Because the default link function chosen by genmod for the Poisson distribution is log, you should exponentiate the estimates to get the actual values (you can do this by hand or request the EXP option in your estimate statement). Thus, the number of injuries is estimated to increase by a factor of exp(1.2) = 3.32 for 10 extra passing plays.
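For reference, the EXP option goes after a slash in the ESTIMATE statement. The sketch below simply repeats the model from the question with that option added, so the output should report the exponentiated estimates alongside the log-scale ones.

```sas
proc genmod data = football;
  class weather (ref = "indoors") / param = ref;
  model numinjuries = passing_plays running_plays weather temp / dist = poisson;
  /* each estimate is 10 * beta on the log scale; EXP adds exp(10 * beta) */
  estimate "10 passing plays" passing_plays 10 / exp;
  estimate "10 running plays" running_plays 10 / exp;
run;
```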

hth

PG


The beta coefficient for passing plays was actually 0.0171, so I believe the estimate was already exponentiated for me: e ^ (10 x 0.0171) = e ^ 0.171 = about 1.2. My interpretation of this is as follows: for every 10 additional pass plays, the expected number of injuries per team-game is 20% higher. Given the overall mean rate of 0.60 injuries/team-game, this corresponds to approximately 2 additional expected injuries per team in each 16-game season (.60 x .2 x 16 = about 2). Would this interpretation be correct?

Also would this be expressing an incidence rate ratio or a different effect?
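The arithmetic in this reply can be checked with a short DATA step. This is only a sketch using the figures quoted above (a coefficient of 0.0171, a mean of 0.60 injuries per team-game, and a 16-game season); the variable names are made up for illustration.

```sas
/* Sketch only: variable names are hypothetical; the inputs are the
   figures quoted in the post. */
data _null_;
  beta = 0.0171;             /* log-scale coefficient per passing play */
  mean_injuries = 0.60;      /* mean injuries per team-game */
  games = 16;                /* games per season */
  rr10 = exp(10 * beta);     /* ratio for 10 additional passing plays */
  extra_per_season = mean_injuries * (rr10 - 1) * games;
  put rr10= extra_per_season=;  /* writes both values to the log */
run;
```

Unrounded, the ratio is about 1.19, which gives roughly 1.8 additional expected injuries per 16-game season, in line with the "about 2" above.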
