DanCh
Fluorite | Level 6

Hello everyone,

 

I've been asked to compute a Cochran-Mantel-Haenszel (CMH) test on 2 variables. Both are ordinary variables (columns) with values for a large number of observations (rows).

 

Based on the short Wikipedia page and the Biostat Handbook, I tried to understand what the CMH test is. Here is what I understood:

Cochran–Mantel–Haenszel test for repeated tests of independence : Use the Cochran–Mantel–Haenszel test when you have data from 2×2 tables that you've repeated at different times or locations.

So it seems the test should be done over 3 factors: rows, columns and strata:

There are three nominal variables: the two variables of the 2×2 test of independence, and the third nominal variable that identifies the repeats.

Indeed, in R, the mantelhaen.test {stats} function needs 3 factors or a 3-dimensional array, which makes sense.
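To make the three-variable definition concrete, here is a minimal sketch of the classic CMH statistic for a set of 2×2 tables, written in plain Python so it does not depend on SAS or R. The function name `cmh_2x2` and the counts are made up for illustration. The `correct` flag mimics a Yates-style continuity correction; as far as I know, R's `mantelhaen.test` applies one by default (`correct = TRUE`) for 2×2×K data while PROC FREQ's CMH statistics do not, so the uncorrected value is the one to compare with SAS.

```python
import math

def cmh_2x2(strata, correct=False):
    """CMH statistic (1 df) for a list of 2x2 tables ((a, b), (c, d)).

    Returns (statistic, p_value). Illustrative sketch, not a library API.
    """
    diff_sum = 0.0  # sum over strata of (observed a - expected a)
    var_sum = 0.0   # sum over strata of Var(a) under the null hypothesis
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than 2 observations adds nothing
        r1, r2 = a + b, c + d   # row totals
        c1, c2 = a + c, b + d   # column totals
        diff_sum += a - r1 * c1 / n
        var_sum += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    delta = abs(diff_sum)
    if correct and delta >= 0.5:
        delta -= 0.5  # Yates-style continuity correction
    stat = delta * delta / var_sum
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Made-up counts: the same 2x2 table observed in two strata (e.g. two sites).
tables = [
    ((10, 5), (4, 11)),
    ((8, 7), (3, 12)),
]
stat, p = cmh_2x2(tables)
print(f"CMH = {stat:.4f}, p = {p:.4f}")
```

Passing a list with a single table is allowed: the sums then run over one stratum only, which is the two-variable case discussed in this thread.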

 

On the other hand, in SAS, you can easily write a 2-factor CMH test:

PROC FREQ data=MY_DATA;
   TABLE VAR1 * VAR2 / CMH; /* or / CHISQ, which gives the first MH p-value too */
RUN;

The SAS documentation is not crystal clear about it, and its examples are made over 3 factors, but this 2-dimensional test runs normally and gives you 3 statistics and p-values, including a "correlation" one. All statistics and p-values are different from those of a standard chi-square test.

 

This SAS doc link gives another definition of the CMH test (sorry, no anchor; search for "Mantel-Haenszel"), which is not what I understood from Wikipedia. It states that:

The Mantel-Haenszel chi-square statistic tests the alternative hypothesis that there is a linear association between the row variable and the column variable. Both variables must lie on an ordinal scale.

But actually, I can run PROC FREQ with the CMH option on 2 character variables with no error. Is that normal?

 

I have to compute the CMH test with both R and SAS but I want to know what I am doing.

 

What is the difference between the SAS version and the common conception of the CMH test?

 

Thanks for helping.

1 ACCEPTED SOLUTION

Accepted Solutions
StatDave
SAS Super FREQ

The CMH tests provided by PROC FREQ are the same as the CMH tests defined in the statistical literature. The CMH tests generally test for association within a set of tables, where each table represents a stratum.  However, there is no requirement that there is more than one stratum, so it can just as well be used with a single table. In the case of a single table, the CMH correlation test is the same as the Mantel-Haenszel test which is provided by the CHISQ option. The three CMH tests allow for the row and column variables defining each table to have nominal or ordinal scale so that an appropriate hypothesis can be tested. See this note for more on that. 
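The single-table equivalence can be checked numerically. The sketch below is plain Python with made-up counts and hypothetical function names. It assumes integer scores 1, 2, 3 for the ordered levels and the usual textbook forms of the two statistics: the Mantel-Haenszel chi-square as (n - 1) times the squared Pearson correlation of the scores, and the CMH correlation statistic as a sum over strata of observed-minus-expected score cross-products divided by the summed variances.

```python
import math

def _score_sums(table, u, v):
    """Return n, sum(u), sum(v), sum(u^2), sum(v^2), sum(u*v) over observations."""
    n = su = sv = suu = svv = suv = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            n += count
            su += count * u[i]
            sv += count * v[j]
            suu += count * u[i] ** 2
            svv += count * v[j] ** 2
            suv += count * u[i] * v[j]
    return n, su, sv, suu, svv, suv

def mh_chi_square(table, u, v):
    """Mantel-Haenszel chi-square for one table: (n - 1) * r**2."""
    n, su, sv, suu, svv, suv = _score_sums(table, u, v)
    cross = suv - su * sv / n   # sum of cross-products of deviations
    ss_u = suu - su * su / n    # sum of squares of row scores
    ss_v = svv - sv * sv / n    # sum of squares of column scores
    r = cross / math.sqrt(ss_u * ss_v)
    return (n - 1) * r * r

def cmh_correlation(strata, u, v):
    """CMH correlation statistic (1 df) over a list of R x C tables."""
    num = 0.0
    var = 0.0
    for table in strata:
        n, su, sv, suu, svv, suv = _score_sums(table, u, v)
        num += suv - su * sv / n
        var += (suu - su * su / n) * (svv - sv * sv / n) / (n - 1)
    return num * num / var

# Made-up 3x3 table with ordered levels (e.g. low, medium, high).
table = [
    [20, 10, 5],
    [10, 15, 10],
    [5, 10, 20],
]
scores = [1, 2, 3]
print(mh_chi_square(table, scores, scores))
print(cmh_correlation([table], scores, scores))  # one stratum only
```

With a single stratum the two printed values agree up to floating-point rounding; with several strata only `cmh_correlation` applies.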


DanCh
Fluorite | Level 6
Your note was very interesting, but the SAS doc still talks about linear association, which doesn't make a lot of sense to me for unordered categorical variables. Could you explain a little more, please?
StatDave
SAS Super FREQ

As shown in the note I referred you to, the row and column variables in each stratum table can be ordinal (ordered levels) or nominal (unordered). If both the row and column variables have multiple possible values, and are ordered (such as low, medium, high), then you might be particularly interested to test whether the column variable increases as the row variable increases. The correlation CMH statistic tests this linear association hypothesis. Obviously, the linear association hypothesis is not of interest if both variables are unordered. In that case you would use the third CMH statistic - the general association statistic.
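For a single table, the general association statistic is, to my understanding, the Pearson chi-square rescaled by (n - 1)/n, with the same (R - 1)(C - 1) degrees of freedom. That would also explain why the CMH output is close to, but not identical to, the CHISQ output. A minimal plain-Python sketch with made-up counts and hypothetical function names (single stratum only; the multi-stratum version needs matrix algebra and is not shown):

```python
def pearson_chi_square(table):
    """Pearson chi-square for one R x C table; returns (statistic, df, n)."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df, n

def general_association_single_table(table):
    """General association statistic for ONE stratum: (n - 1) / n * Pearson."""
    stat, df, n = pearson_chi_square(table)
    return (n - 1) / n * stat, df

# Made-up counts for two unordered (nominal) variables with 3 levels each.
nominal_table = [
    [12, 8, 10],
    [6, 14, 10],
    [9, 9, 12],
]
pearson, df, n = pearson_chi_square(nominal_table)
general, _ = general_association_single_table(nominal_table)
print(f"Pearson = {pearson:.4f}, general association = {general:.4f}, df = {df}")
```

No scores are involved here, which is why this statistic remains meaningful when neither variable is ordered.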

Reeza
Super User

@DanCh wrote:
Your note was very interesting, but the SAS doc still talks about linear association, which doesn't make a lot of sense to me for unordered categorical variables. Could you explain a little more, please?

From the SAS 9.4 documentation:

 

The alternative hypothesis for the correlation statistic is that there is a linear association between X and Y in at least one stratum. If either X or Y does not lie on an ordinal (or interval) scale, this statistic is not meaningful.

Reeza
Super User

That documentation reference is about twenty years old. 

See the latest version which aligns fairly well with what you're stating:

http://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.2&docsetId=procstat&docsetTarget=proc...

 

Unless you're using SAS 8 for some reason?

DanCh
Fluorite | Level 6
OK, I didn't see that the version was different (curious, though, that the test changed between versions). I think I get it now, thanks to you two.
Reeza
Super User

I don't think the test changed, I think the documentation did.

And SAS 8 was initially released in 1999. 

 
