- Re: Why is there a difference between two levels of a discrete distrib...

Posted 01-15-2018 02:07 PM
(1536 views)

I get the feeling the following request is a lot simpler than I am making it out to be. Assume I have a variable with a finite number of possible nominal values (A - E, for example). According to a PROC ANOVA, there is a difference between the distribution of this variable at level 1 and at level 2. I would like to determine which, if any, of the values occur with significantly different frequencies across the two levels, and I am just flat out stuck figuring out a simple way to program this or which PROC statements to use to move forward.

3 REPLIES


Apologies if I'm misreading your question, but using ANOVA with a nominal dependent variable is not appropriate. It sounds like you want to see if the distribution of values in one nominal variable differs across levels of another variable. If so, I'd use PROC FREQ and add the CHISQ option to the TABLES statement to get the Pearson Chi-Square test. To investigate how much each cell in the two-way table deviates from its expected value under the null hypothesis of no association between the two variables, you could also add the CELLCHI2 option. This would add an extra number to each cell showing (observed - expected)^2 / expected. Higher values mean greater deviation. The code would look something like this:

```
proc freq data=mydata;
  /* CHISQ: Pearson chi-square test; CELLCHI2: each cell's contribution to it */
  tables var1 * var2 / chisq cellchi2;
run;
```
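To make the cell chi-square values easier to read, it can help to print the observed count, the expected count, and their difference in the same cell. The sketch below is only an illustration: the data set `counts`, its variables `level`, `value`, and `count`, and the numbers themselves are made up, not taken from this thread. It assumes the data are already summarized as one row per cell, which is why a WEIGHT statement is used.

```sas
/* Hypothetical cell counts: level (1 or 2) by nominal value (A-E). */
/* These numbers are invented for illustration only.                */
data counts;
  input level value $ count;
  datalines;
1 A 30
1 B 25
1 C 20
1 D 15
1 E 10
2 A 12
2 B 22
2 C 21
2 D 25
2 E 20
;
run;

proc freq data=counts;
  weight count;  /* each input row is a cell count, not a single observation */
  /* EXPECTED: expected count under independence                   */
  /* DEVIATION: observed minus expected                            */
  /* CELLCHI2: (observed - expected)**2 / expected                 */
  tables level * value / chisq cellchi2 expected deviation
                         norow nocol nopercent;
run;
```

With raw, one-row-per-subject data, the WEIGHT statement would be dropped and the TABLES statement would stay the same.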


Point taken. I was doing a distribution analysis for many variables, nominal, ordinal, interval, and ratio, so I just did a massive ANOVA for speed's sake.

That said, can the cell's contribution to the chi-squared value be used in such a way to generate a p-value for its difference from the same cell value in the other level?
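One common way to approach this, offered as a sketch rather than a definitive answer: the cell chi-square is the square of the Pearson residual, whose variance under independence is less than 1, so comparing it to a chi-square distribution with 1 degree of freedom tends to be conservative. Adjusted (standardized) residuals are approximately standard normal under the null hypothesis and are the usual basis for a per-cell p-value. The code below assumes a SAS release in which PROC FREQ supports `CROSSLIST(STDRES)`. The residual `z = 2.5` is a hypothetical value typed in by hand, not a result from any data, and the Bonferroni correction is one possible multiplicity adjustment among several.

```sas
/* Assumes your SAS release supports the CROSSLIST(STDRES) option. */
proc freq data=mydata;
  tables var1 * var2 / chisq crosslist(stdres);
run;

/* Turn one standardized residual into a two-sided p-value. */
data _null_;
  z = 2.5;  /* hypothetical residual read off the CROSSLIST table */
  k = 5;    /* tests performed: in a 2 x 5 table the two levels' residuals
               for the same value are equal and opposite, so 5 are distinct */
  p_raw  = 2 * (1 - probnorm(abs(z)));  /* unadjusted two-sided p-value */
  p_bonf = min(1, k * p_raw);           /* Bonferroni-adjusted p-value  */
  put z= p_raw= p_bonf=;
run;
```

Because every cell is being tested after an overall significant result, some adjustment for multiple comparisons is advisable before calling any single cell significant.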


