Statistical Procedures

NMB82 · Posted 02-10-2021 01:33 PM

I'm running a chi-square test with two binary variables. The data set is large (>1 million rows) and unbalanced (rare event). The test is statistically significant (p < .0001), but Cramer's V is very small (.006). I took this to mean that there is no meaningful relationship and that the small p-value is due to the large sample size and the resulting power. However, the odds ratio is 4.4.

I'm trying to understand how one effect size (Cramer's V) can be so unlike another (the odds ratio). Is Cramer's V sensitive to data imbalance? Is the OR preferred in this case?

SteveDenham · Posted 02-11-2021 08:13 AM

Looking through the formulas for calculating Cramer's V, phi, and the contingency coefficient, which are essentially the same in this case (for a 2x2 table V equals the absolute value of phi, and the contingency coefficient is nearly identical when the values are this small), it appears that the small value is determined by the imbalance. I just have a hard time thinking of these parameters as effect sizes, when they are a measure of agreement, and when the sample is strongly imbalanced towards one row, the values will be small. Is there some confounding variable, such that these values are influenced by Simpson's paradox?
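To make the point about imbalance concrete, here is a minimal sketch in Python. The cell counts are hypothetical, chosen only to resemble the situation described (about a million rows, a rare event, an odds ratio near 4.4); they are not the original poster's data. The function name `two_by_two_measures` is likewise just for illustration. It uses the standard 2x2 formulas for the odds ratio and phi, and the fact that the Pearson chi-square (without continuity correction) equals n times phi squared.

```python
import math

def two_by_two_measures(a, b, c, d):
    """Return (odds_ratio, phi, chi_square, p_value) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # For a 2x2 table, Cramer's V is the absolute value of phi.
    phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    # Pearson chi-square without continuity correction.
    chi_square = n * phi ** 2
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return odds_ratio, phi, chi_square, p_value

# Hypothetical rare-event table: rows = exposed / unexposed, columns = event / no event.
rare = two_by_two_measures(22, 4978, 1000, 994000)

# Hypothetical balanced table with nearly the same odds ratio.
balanced = two_by_two_measures(677, 323, 323, 677)

for label, (odds_ratio, phi, chi_square, p_value) in [("rare", rare), ("balanced", balanced)]:
    print(f"{label}: OR={odds_ratio:.2f}  phi={phi:.4f}  chi2={chi_square:.1f}  p={p_value:.2e}")
```

With these made-up counts both tables have an odds ratio of roughly 4.4, yet phi is below 0.01 in the rare-event table and about 0.35 in the balanced one. The odds ratio does not depend on the marginal totals, while phi does, which is consistent with the two measures disagreeing on strongly unbalanced data.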

SteveDenham

reneecandy · Posted 10-16-2024 03:58 AM

I've come across the same question. May I ask whether you found any evidence (a paper, thesis, journal article, or book chapter) to support this limitation of Cramer's V? Eager for your reply! Many thanks!

Huang Ling

2024.10.10

SteveDenham · Posted 10-16-2024 02:16 PM

Well, I wouldn't call it refereed so much as crowdsourced, but https://en.wikipedia.org/wiki/Cram%C3%A9r%27s_V provides a lot of insight for those of us who aren't familiar with it. The formulas there use k for the number of columns and r for the number of rows, so we are not limited to square tables. The summation formulas given don't seem to be hampered by cells with zero counts, so that isn't an issue. The biggest issue with the use of Cramer's V is that it can be a heavily biased estimator that tends to overestimate the strength of association, and unbalanced data increases this.
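As a rough illustration of the bias issue, the sketch below computes Cramer's V from the summation formula for an r-by-k table, along with a bias-corrected variant. The correction used here is the one usually attributed to Bergsma (2013); treating it as the relevant correction is an assumption on my part. The function name `cramers_v` and both example tables are hypothetical, and the code assumes every row and column has a nonzero total.

```python
import math

def cramers_v(table, bias_corrected=False):
    """Cramer's V for an r x k table of counts, given as a list of rows."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-square from observed and expected counts.
    chi_square = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (table[i][j] - expected) ** 2 / expected

    phi_sq = chi_square / n
    if not bias_corrected:
        return math.sqrt(phi_sq / min(k - 1, r - 1))

    # Bias correction (assumed: Bergsma's adjustment of phi^2, k and r).
    phi_sq_adj = max(0.0, phi_sq - (k - 1) * (r - 1) / (n - 1))
    k_adj = k - (k - 1) ** 2 / (n - 1)
    r_adj = r - (r - 1) ** 2 / (n - 1)
    return math.sqrt(phi_sq_adj / min(k_adj - 1, r_adj - 1))

small = [[6, 4], [4, 6]]              # hypothetical table, n = 20
large = [[22, 4978], [1000, 994000]]  # hypothetical rare-event table, n = 1,000,000

for name, table in [("small", small), ("large", large)]:
    print(name, cramers_v(table), cramers_v(table, bias_corrected=True))
```

In the small table the uncorrected V is 0.2 while the corrected value is 0, so the upward bias matters there. In the large table the two values are nearly identical, which suggests that with around a million rows the bias correction changes very little.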

Is that of any help to you?

SteveDenham

Is Cramer's V effect size sensitive to unbalanced data (rare event) in Chi Square tests?
