Is it appropriate to do a t-test/ANOVA for a count, given that it is not normally distributed?
If I use PROC GENMOD with a Poisson distribution instead (count as the outcome, categorical variable as the dependent variable), is it still a test for a difference in the observed means of the count variable? Can I put the observed means and standard deviations calculated with PROC MEANS next to the p-value from GENMOD, or would I be mixing things? Should I take the log of my means and present those instead? Thank you.
Real-life count data rarely follow a Poisson distribution. When the data are overdispersed, Poisson-based p-values are usually too liberal (too small). Make sure you check for overdispersion if you go that route.
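One common way to run that check is to look at the Pearson chi-square divided by its degrees of freedom in the GENMOD goodness-of-fit table; values well above 1 suggest overdispersion. A minimal sketch, assuming a dataset named mydata with variables count and gender (all hypothetical names, not from the original post):

```sas
/* Hypothetical dataset and variable names: mydata, count, gender */

/* Plain Poisson fit: inspect the Pearson Chi-Square Value/DF in the
   "Criteria For Assessing Goodness Of Fit" table */
proc genmod data=mydata;
   class gender;
   model count = gender / dist=poisson link=log type3;
run;

/* Same model, with standard errors scaled by the Pearson dispersion estimate */
proc genmod data=mydata;
   class gender;
   model count = gender / dist=poisson link=log scale=pearson type3;
run;

/* Negative binomial alternative, which estimates a dispersion parameter */
proc genmod data=mydata;
   class gender;
   model count = gender / dist=negbin link=log type3;
run;
```

If the three fits give similar p-values, overdispersion is probably not driving the result; if the plain Poisson p-value is much smaller than the other two, it should be treated with suspicion.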
As for your model, "outcome variable" and "dependent variable" usually mean the same thing. If your model is of the type count = categorical, then you could, as a first safe step, try non-parametric tests. If those aren't significant, proceed with great caution with tests based on stronger assumptions.
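For a count = categorical comparison, a rank-based test is one option for that first step. A minimal sketch with PROC NPAR1WAY, again assuming hypothetical names mydata, count, and gender:

```sas
/* Hypothetical dataset and variable names: mydata, count, gender */
/* WILCOXON requests the Wilcoxon rank-sum test for two groups and the
   Kruskal-Wallis test when the CLASS variable has more than two levels */
proc npar1way data=mydata wilcoxon;
   class gender;
   var count;
run;
```

Note that these tests compare the distributions of ranks between groups, so they address a slightly different question than a difference in means.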
Is a chi-square test another approach? It is designed more for count data.
@proctice wrote:
Is it appropriate to do a t-test/anova for a count, given it is not normally distributed?
To test what hypothesis?
To test a question like this:
Is the average count for men higher than the average count for women?
model count = gender / dist=poisson;
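The MODEL statement above could be fleshed out roughly as follows. This is only a sketch, assuming a dataset named mydata (a hypothetical name) that holds the count and gender variables. With a single categorical predictor and a log link, the fitted group means from the Poisson model equal the observed group means, so the back-transformed estimates should match the means from PROC MEANS; the standard deviations from PROC MEANS, however, are not quantities the Poisson model uses.

```sas
/* Hypothetical dataset name: mydata */

/* Observed means and standard deviations by group */
proc means data=mydata n mean std;
   class gender;
   var count;
run;

/* Poisson regression of count on gender */
proc genmod data=mydata;
   class gender;
   model count = gender / dist=poisson link=log type3;
   /* ILINK back-transforms the estimates to the count scale;
      DIFF tests the group difference on the log scale */
   lsmeans gender / ilink diff cl;
run;
```

The difference reported by DIFF is on the log scale, so exponentiating it gives the ratio of the two mean counts rather than their difference.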
Sorry, I meant independent, not dependent.
I plan to do all the overdispersion testing later, when I put several variables into the model. Right now, I am just doing a bivariate analysis to decide which variables to include in my bigger model.
I like the non-parametric suggestion! I'll do that instead.