I am trying to build a causal model for the effect of play type on the rate of injuries. The database originally recorded each injury individually, so I used proc sql to build a table with a new variable giving the number of injuries for each game, along with the relevant exposure variables (number of running plays and passing plays) and potential confounding variables (weather conditions, a categorical variable, and temperature, a quantitative variable). An example is included below:
The mean number of injuries per game is 0.6. Because the response variable is a count (number of injuries per 1 team-game), I presumed that Poisson regression would be the proper model to answer my research question. Here is my attempt at the model, including the effect size of each additional 10 passing or running plays:
proc genmod data = football;
class weather (ref = "indoors") / param = ref;
model numinjuries = passing_plays running_plays weather temp / dist = poisson;
estimate "10 passing plays" passing_plays 10;
estimate "10 running plays" running_plays 10;
run;
I have read through several of the related questions in this community, but I cannot determine whether I created an appropriate model to answer my research question, whether it is coded correctly, and how exactly I should interpret the results. Is this actually a rate, so that I should create ln(passing_plays) and use it as an offset? Or does the identical exposure of 1 team-game for all of the injury counts make it a count rather than a rate, so that I should not use an offset term? If an offset is needed, how should I code it? The resulting estimate for 10 passing plays was 1.2. Would the correct interpretation then be (in the case of counts): for each additional 10 passing plays, teams were expected to have 1.2 times the number of injuries (or 20% more injuries)?
Thank you in advance for your help.
Your model doesn't need an offset if you want to estimate the number of injuries per game. An offset would let you estimate the number of injuries per play.
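If you did want the per-play version, a minimal sketch could look like the following. It assumes the exposure is the total number of plays in the game; total_plays, log_plays and football_rate are hypothetical names that are not in your table. The only change from your model is the offset, and whether the play counts should also stay in as predictors in a per-play model is a separate modeling choice.

```sas
/* Sketch only: total_plays, log_plays and football_rate are hypothetical names */
data football_rate;
    set football;
    total_plays = passing_plays + running_plays;
    /* log() is undefined at 0, so log_plays is left missing for games with no plays */
    if total_plays > 0 then log_plays = log(total_plays);
run;

proc genmod data = football_rate;
    class weather (ref = "indoors") / param = ref;
    /* offset = log_plays makes this a model for injuries per play */
    model numinjuries = passing_plays running_plays weather temp
          / dist = poisson offset = log_plays;
run;
```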
Because the default link function that GENMOD uses for the Poisson distribution is the log, the parameter estimates are on the log scale and must be exponentiated to get multiplicative effects (you can do this by hand or request the EXP option in your ESTIMATE statement). So if 1.2 is the estimate on the log scale, the number of injuries is estimated to increase by a factor of exp(1.2) = 3.32 for 10 extra passing plays.
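For reference, here is how the EXP option could be added to the ESTIMATE statements in the model from the question. This is a sketch that reuses the same data set and variable names; only the "/ exp" part is new.

```sas
proc genmod data = football;
    class weather (ref = "indoors") / param = ref;
    model numinjuries = passing_plays running_plays weather temp / dist = poisson;
    /* exp requests exp(L'beta), with its confidence limits, in the estimate output */
    estimate "10 passing plays" passing_plays 10 / exp;
    estimate "10 running plays" running_plays 10 / exp;
run;
```

Comparing the exponentiated row with the L'Beta estimate in the output shows which of the reported numbers is on the log scale.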
hth
The beta coefficient for passing plays was actually 0.0171, so I believe the estimate statement already exponentiated this for me: e ^ (10 x 0.0171) = e ^ 0.171 = about 1.2. My interpretation is as follows: for every 10 additional pass plays, the expected number of injuries per team-game is 20% greater. Given the overall mean rate of 0.60 injuries/team-game, this corresponds to approximately 2 additional expected injuries per team over a 16-game season (.60 x .2 x 16 = about 2). Would this interpretation be correct?
Also, would this be expressing an incidence rate ratio or a different effect?
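The arithmetic in this interpretation can be checked with a short DATA step. This is only a sketch using the numbers quoted in this thread; the variable names (beta, ratio, mean_inj, extra_season) are made up for illustration.

```sas
data _null_;
    beta         = 0.0171;                     /* per-play coefficient quoted above */
    ratio        = exp(10 * beta);             /* multiplicative effect of 10 extra passing plays */
    mean_inj     = 0.60;                       /* mean injuries per team-game quoted above */
    extra_season = mean_inj * (ratio - 1) * 16; /* extra expected injuries over 16 games */
    /* write both values to the log */
    put ratio= 6.3 extra_season= 5.2;
run;
```

Note that exp(0.171) is about 1.19, which rounds to the 1.2 reported, so the season figure comes out a little under the 2 obtained with the rounded 20%.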
The OFFSET option is not necessary if all of the counts are measured over the same unit of exposure (e.g. every count is observed over one day or over one month).