## Why is there a difference between two levels of a discrete distribution?

Occasional Contributor
Posts: 15

I get the feeling the following request is a lot simpler than I am making it out to be. Assume I have a variable with a finite number of possible nominal values (A through E, for example). According to PROC ANOVA, the distribution of this variable differs between level 1 and level 2 of a second variable. I would like to determine which, if any, of the values occur with significantly different frequencies across the two levels, but I am stuck on finding a simple way to program this and on which PROC to use to move forward.

Occasional Contributor
Posts: 8

## Re: Why is there a difference between two levels of a discrete distribution?

Apologies if I'm misreading your question, but using ANOVA with a nominal dependent variable is not appropriate. It sounds like you want to see if the distribution of values in one nominal variable differs across levels of another variable. If so, I'd use PROC FREQ and add the CHISQ option to the TABLES statement to get the Pearson chi-square test. To investigate how much each cell in the two-way table deviates from its expected value under the null hypothesis of no association between the two variables, you could also add the CELLCHI2 option. This adds an extra number to each cell showing (observed - expected)^2 / expected. These cell values sum to the overall Pearson chi-square statistic, so higher values mark the cells that deviate most from expectation. The code would look something like this:

```
proc freq data=mydata;
  tables var1 * var2 / chisq cellchi2;
run;
```
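
To make the cell contributions concrete, here is a minimal self-contained sketch of the same call. The data set, the variable names (`level`, `value`, `count`) and the counts themselves are made up for illustration; they are not from the original poster's data. It assumes the data are stored as one row per cell with a frequency count, which is why a WEIGHT statement is used. The EXPECTED option is added so the expected counts print next to the observed ones.

```
/* Hypothetical counts: two levels, five nominal values A-E */
data mydata;
  input level value $ count;
  datalines;
1 A 30
1 B 20
1 C 25
1 D 15
1 E 10
2 A 10
2 B 20
2 C 25
2 D 15
2 E 30
;
run;

proc freq data=mydata;
  weight count;  /* each row is a cell count, not one observation */
  /* rows = level, columns = value; suppress percentages to keep cells readable */
  tables level * value / chisq cellchi2 expected norow nocol nopercent;
run;

/* Hand check for cell (level=1, value=A):
   expected = 100 * 40 / 200 = 20
   contribution = (30 - 20)**2 / 20 = 5 */
```

With these made-up counts, only the A and E cells contribute to the chi-square statistic, since B, C and D match their expected counts exactly in both levels.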

Occasional Contributor
Posts: 15

## Re: Why is there a difference between two levels of a discrete distribution?

Point taken.  I was doing a distribution analysis for many variables, nominal, ordinal, interval, and ratio, so I just did a massive ANOVA for speed's sake.

That said, can a cell's contribution to the chi-square statistic be used to generate a p-value for how that cell differs from the corresponding cell in the other level?

SAS Super FREQ
Posts: 4,242