Dear SAS Community,
I would like to know whether there is an association between a nominal variable, 'Ethnicity' (8 unordered levels), and an ordinal variable, 'Overall1' (9 ordered levels), which measures overall acceptance. In other words, does the consumer's ethnicity influence the overall acceptance of the fruit presented to them?
If I try proc freq with the chisq option I get the following table:
proc freq data=one;
tables Ethnicity*Overall1 /chisq;
run;
| Statistic | DF | Value | Prob |
|---|---|---|---|
| Chi-Square | 40 | 48.3949 | 0.1702 |
| Likelihood Ratio Chi-Square | 40 | 50.1632 | 0.1302 |
| Mantel-Haenszel Chi-Square | 1 | 0.4461 | 0.5042 |
| Phi Coefficient | | 0.1349 | |
| Contingency Coefficient | | 0.1337 | |
| Cramer's V | | 0.0603 | |

WARNING: 37% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

Which statistic test should I use to see if there is an association between these two variables?
I would greatly appreciate your response!
If the association you want to assess is, as you say, the effect of ethnicity, then a non-model-based approach would be to use the 2nd CMH statistic from PROC FREQ:
proc freq; table ethnicity*overall1 / cmh; run;
Or you could use a model-based approach - see the Type3 test of ethnicity:
proc logistic; class ethnicity / param=glm; model overall1=ethnicity; run;
That was super helpful, thank you so much!
So with the second statistic you mean I should use the Row mean scores differ (0.7198)? I thought I should use the general association (0.1707).
Cochran-Mantel-Haenszel Statistics (Based on Table Scores)

| Statistic | Alternative Hypothesis | DF | Value | Prob |
|---|---|---|---|---|
| 1 | Nonzero Correlation | 1 | 0.4461 | 0.5042 |
| 2 | Row Mean Scores Differ | 5 | 2.8717 | 0.7198 |
| 3 | General Association | 40 | 48.3767 | 0.1707 |
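For context on the three rows above: statistic 1 treats both variables as ordinal, statistic 2 treats the row variable as nominal and the column variable as ordinal, and statistic 3 treats both as nominal. The following is only a sketch, assuming the data set `one` from the question and a numeric `Overall1`. The `cmh2` and `scores=rank` options are an optional variation and were not suggested in the thread. With a single stratum and rank scores, the row mean scores statistic corresponds to the Kruskal-Wallis test.

```sas
/* Ethnicity must be the row (nominal) variable and Overall1 the
   column (ordinal) variable for the row mean scores test to compare
   ethnicity groups. */
proc freq data=one;
   /* cmh2 prints only CMH statistics 1 and 2;
      scores=rank replaces the default table scores with rank scores
      (an assumption for illustration, not part of the original advice) */
   tables Ethnicity*Overall1 / cmh2 scores=rank;
run;
```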
OK, thank you! Good to know that general association is the one to use in the case of two nominal variables.
What about two ordinal variables? Should I use plcorr?
proc freq data=one;
tables Freq*Score/plcorr;
run;
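If ordinal measures of association are of interest alongside the polychoric correlation, PROC FREQ can print them in the same step. This is a minimal sketch using the poster's variable names `Freq` and `Score` and the data set `one`. Which measure is appropriate depends on the question being asked, so treat the option list as illustrative.

```sas
proc freq data=one;
   /* measures: gamma, Kendall's tau-b, Stuart's tau-c, Somers' D,
                Pearson and Spearman correlations
      plcorr:   polychoric correlation
      cl:       confidence limits for the measures */
   tables Freq*Score / measures plcorr cl;
   /* asymptotic tests that Kendall's tau-b and the Spearman
      correlation are zero */
   test kentb scorr;
run;
```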
And if I want to see whether a continuous variable has an effect on an ordinal variable, should I use Kendall?
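If a rank correlation is what you have in mind, one way to get it is PROC CORR. The sketch below assumes `DM` (the continuous variable named later in this thread) and a numeric `Overall1` in the data set `one`. It reports Kendall's tau-b and the Spearman correlation with their p-values, and it is only one possible approach.

```sas
proc corr data=one kendall spearman;
   /* both variables must be numeric; Overall1 is assumed to be
      stored as the numeric scores 1 to 9 */
   var DM Overall1;
run;
```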
Good to know, thank you StatDave!
If I try this proc logistic code to test the association between the continuous predictor and the ordinal response then I don't get the type 3 effect table. Am I missing something?
proc logistic data=one;
model Overall1=DM;
run;
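As far as I know, PROC LOGISTIC prints the Type 3 Analysis of Effects table only when the model includes CLASS variables, which would explain its absence here. With a single continuous predictor, the Wald test in the "Analysis of Maximum Likelihood Estimates" table and the "Testing Global Null Hypothesis: BETA=0" table carry the equivalent information. A sketch that displays just those two tables, assuming the data set and variables above:

```sas
/* GlobalTests and ParameterEstimates are the ODS table names for
   the global null hypothesis tests and the maximum likelihood
   estimates; ods select limits the output of the next step to them */
ods select GlobalTests ParameterEstimates;
proc logistic data=one;
   /* no CLASS statement: DM enters as a single continuous effect */
   model Overall1 = DM;
run;
```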
Oh ok. Thank you so much StatDave for your great help on this!
https://blogs.sas.com/content/iml/2023/12/11/polychoric-correlation.html
And macro %magree might give you a hand.
Very useful, thank you very much!
You can run PROC TRANSREG with a model statement like this:
MODEL MONOTONE(substitute your ordinal variable) = CLASS(substitute your nominal variable) / TEST;
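Filled in with the variable names from the original question, the template above might look like the following. This is a sketch only: it assumes the data set `one` and a numeric `Overall1`, and as I understand it the tests reported after a monotone transformation are approximate.

```sas
proc transreg data=one;
   /* monotone(): order-preserving transformation of the ordinal response
      class():    expands the nominal predictor into dummy variables
      test:       requests the hypothesis tests */
   model monotone(Overall1) = class(Ethnicity) / test;
run;
```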
Thanks for the info!