jb4105
Calcite | Level 5

I am trying to create a causal model for the effect of play type on the rate of injuries. The database originally recorded each injury individually, so I used proc sql to build a table with a new variable giving the number of injuries in each game, along with the relevant exposure variables (number of running plays and passing plays) and potential confounding variables (weather conditions, a categorical variable; temperature, a quantitative variable). An example is included below:

Screen Shot 2019-09-02 at 6.45.22 PM.png

 

The mean number of injuries/game is 0.6 and given the presence of count data (number of injuries/1 team-game) for the response variable I presumed that Poisson regression would be the proper model to answer my research question. Here is my attempt to create my desired model, including the effect size of each additional 10 passing/running plays:

 

proc genmod data = football;
    class weather (ref = "indoors") / param = ref;
    model numinjuries = passing_plays running_plays weather temp / dist = poisson;
    estimate "10 passing plays" passing_plays 10;
    estimate "10 running plays" running_plays 10;
run;

 

I have read through several of the related questions in this community, but I cannot determine whether I created an appropriate model to answer my research question, whether it is coded correctly, and how exactly I should interpret the results. Is this actually a rate, so that I should create ln(passing_plays) and offset by that? Or does the same exposure of 1 team-game for all of the injury counts make it a count rather than a rate, so that I should not use an offset term? If so, how should I code that? The resulting estimate for 10 passing plays was 1.2. Would the correct interpretation then be (in the case of counts): for each additional 10 passing plays, teams were expected to have 1.2 times the number of injuries (or 20% more injuries)?

 

Thank you in advance for your help.

3 REPLIES
PGStats
Opal | Level 21

Your model doesn't need an offset if you want to estimate the number of injuries per game. An offset would let you estimate the number of injuries per play.
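For comparison, a per-play rate model could be sketched roughly as follows. This is only an illustration under assumptions: the data set football_rate and the variables total_plays, log_plays and pass_share are hypothetical names that do not appear in the original post, and using the share of passing plays as the predictor is one possible choice, not the only one.

```sas
/* Sketch only: football_rate, total_plays, log_plays and pass_share
   are hypothetical names introduced for illustration. */
data football_rate;
    set football;
    total_plays = passing_plays + running_plays;
    /* log() is undefined for zero plays, so leave those rows missing */
    if total_plays > 0 then do;
        log_plays  = log(total_plays);            /* offset variable        */
        pass_share = passing_plays / total_plays; /* share of passing plays */
    end;
run;

proc genmod data = football_rate;
    class weather (ref = "indoors") / param = ref;
    /* offset= makes the model describe injuries per play, not per game */
    model numinjuries = pass_share weather temp
          / dist = poisson link = log offset = log_plays;
run;
```

Rows with a missing offset are not used in the fit, so games with zero recorded plays would drop out of this version of the model.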

Because the default link function chosen by genmod for the Poisson distribution is log, you should exponentiate the estimates to get the actual values (you can do this by hand or request the EXP option in your estimate statement). Thus, if the 1.2 you report is the estimate on the log scale, the number of injuries is estimated to increase by a factor of exp(1.2) = 3.32 for 10 extra passing plays.
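As a minimal sketch of the EXP option mentioned above, the original estimate statements could be written like this. The model itself is unchanged from the question; only the / exp option is added.

```sas
proc genmod data = football;
    class weather (ref = "indoors") / param = ref;
    model numinjuries = passing_plays running_plays weather temp / dist = poisson;
    /* exp asks for the exponentiated contrast, i.e. the multiplicative
       change in expected injuries for 10 additional plays */
    estimate "10 passing plays" passing_plays 10 / exp;
    estimate "10 running plays" running_plays 10 / exp;
run;
```

With this option the output reports both the log-scale estimate and its exponentiated value, which makes it easier to tell which scale a reported number is on.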

hth

PG
jb4105
Calcite | Level 5

The beta coefficient for passing plays was actually 0.0171 per play, so I believe the estimate was already exponentiated for me: e^(10 x 0.0171) = e^0.171 = 1.19, or roughly 1.2. My interpretation of this is as follows: for every 10 additional passing plays, there are about 20% more expected injuries per team-game. Given the overall mean rate of 0.60 injuries/team-game, this corresponds to approximately 2 additional expected injuries per team over a 16-game season (0.60 x 0.2 x 16 = about 2). Would this interpretation be correct?
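The arithmetic in this post can be laid out step by step. This is a rough sketch using only the figures quoted here (coefficient 0.0171, mean 0.60 injuries/team-game, 16 games); applying the percentage increase to the overall mean is itself an approximation.

```latex
\begin{aligned}
\text{ratio for 10 extra passing plays} &= e^{10 \times 0.0171} = e^{0.171} \approx 1.19 \\
\text{extra injuries per team-game}     &\approx 0.60 \times (1.19 - 1) \approx 0.11 \\
\text{extra injuries per 16-game season} &\approx 0.11 \times 16 \approx 1.8
\end{aligned}
```

Rounding the ratio up to 1.2 first gives 0.60 x 0.2 x 16 = 1.92, which is where the figure of about 2 comes from.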

 

Also, would this be expressing an incidence rate ratio or a different effect?

 



Ksharp
Super User

The OFFSET option is not necessary if all of the counts are measured over the same unit of exposure (e.g., every count is observed over one day or over one month).
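One way to see this is to write out the log-link Poisson model with an exposure term. The symbols below are assumptions introduced for illustration: \mu_i is the expected count for observation i, t_i its exposure, x_i its covariates, and \beta_0, \beta the intercept and coefficients.

```latex
\begin{aligned}
\log \mu_i &= \log t_i + \beta_0 + x_i^{\top}\beta
  && \text{(general case, offset } \log t_i\text{)} \\
\log \mu_i &= (\beta_0 + \log t) + x_i^{\top}\beta
  && \text{(same exposure } t_i = t \text{ for every observation)}
\end{aligned}
```

When the exposure is the same for every observation, the offset is a constant that is absorbed into the intercept, and the coefficients \beta are unchanged. With an exposure of exactly one team-game, \log t = 0 and the offset vanishes entirely.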

 

 

http://support.sas.com/kb/24/188.html
