Understanding the SAS conception of CMH test

02-09-2018 09:30 AM

Hello everyone,

I've been asked to compute a Cochran-Mantel-Haenszel (CMH) test on 2 variables. Both variables are ordinary values (columns) for a large number of observations (rows).

Based on the short Wikipedia page and the Biostat Handbook, I tried to understand what the CMH test is, and here is what I understood:

Cochran–Mantel–Haenszel test for repeated tests of independence: Use the Cochran–Mantel–Haenszel test when you have data from 2×2 tables that you've repeated at different times or locations.

So it seems it should be done over 3 factors: rows, columns, and strata:

There are three nominal variables: the two variables of the 2×2 test of independence, and the third nominal variable that identifies the repeats.
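For reference, the stratified 2×2 version described in the quote can be sketched in a few lines. This is a minimal Python illustration (neither SAS nor R code), assuming the usual textbook form of the statistic: the squared sum over strata of observed-minus-expected counts in the top-left cell, divided by the summed hypergeometric variances. The function name `cmh_2x2xk`, the `correct` flag, and the counts are made up for illustration. The flag mimics the continuity correction that R's `mantelhaen.test` applies by default to 2×2×K tables; as far as I know, PROC FREQ reports the uncorrected statistic, which is one reason the two programs can disagree on the same data.

```python
from math import erfc, sqrt


def cmh_2x2xk(tables, correct=False):
    """CMH chi-square (1 df) for 2x2 tables [[a, b], [c, d]], one per stratum."""
    diff = 0.0  # sum over strata of (observed a - expected a)
    var = 0.0   # sum over strata of the hypergeometric variance of a
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than 2 observations adds nothing
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        diff += a - row1 * col1 / n
        var += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    # Optional continuity correction, skipped when |diff| is below 0.5.
    yates = 0.5 if correct and abs(diff) >= 0.5 else 0.0
    stat = (abs(diff) - yates) ** 2 / var
    p_value = erfc(sqrt(stat / 2))  # upper tail of a chi-square with 1 df
    return stat, p_value


# Hypothetical counts: two strata (e.g. two locations), one 2x2 table each.
example_tables = [
    [[10, 5], [4, 11]],
    [[8, 7], [3, 12]],
]
stat_plain, p_plain = cmh_2x2xk(example_tables)
stat_corrected, p_corrected = cmh_2x2xk(example_tables, correct=True)
print(stat_plain, p_plain)
print(stat_corrected, p_corrected)
```

When comparing outputs across programs, it is worth checking first whether both report the corrected or the uncorrected statistic.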

Indeed, in R, the `mantelhaen.test` function (from the `stats` package) needs 3 factors or a 3-dimensional array, which makes sense.
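To make the "3 factors versus 3-dimensional array" equivalence concrete, here is a small Python sketch (not R) that cross-tabulates three factor columns into one 2-D count table per stratum. The helper name `stratified_tables` and the toy data are hypothetical; levels are simply sorted alphabetically.

```python
from collections import Counter


def stratified_tables(rows, cols, strata):
    """Cross-tabulate three equal-length factor columns into {stratum: table}."""
    counts = Counter(zip(strata, rows, cols))
    row_levels = sorted(set(rows))
    col_levels = sorted(set(cols))
    return {
        s: [[counts[(s, r, c)] for c in col_levels] for r in row_levels]
        for s in sorted(set(strata))
    }


# Hypothetical toy data: one entry per observation.
treatment = ["drug", "drug", "placebo", "placebo", "drug", "placebo"]
outcome = ["yes", "no", "no", "yes", "yes", "no"]
site = ["A", "A", "A", "B", "B", "B"]

tables = stratified_tables(treatment, outcome, site)
for stratum, table in tables.items():
    print(stratum, table)
```

Each stratum's table is one "slice" of the 3-dimensional array; the three input lists play the role of the three factors.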

On the other hand, in SAS, you can easily write a 2-factor CMH test:

`PROC FREQ data=MY_DATA; TABLE VAR1 * VAR2 /CMH; /*or /CHISQ, which gives the first MH p-value too*/ RUN;`

The SAS documentation is not crystal clear about it, and its examples are made over 3 factors, but this 2-dimensional test runs normally and gives you 3 statistics and p-values, including a "correlation" one. All statistics and p-values differ from those of a standard chi-square test.
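One way to see why the numbers differ from a standard chi-square test: with no stratification variable, the whole data set is treated as a single stratum, and for a single 2×2 table the CMH statistic works out to (n − 1)/n times the Pearson chi-square. The Python sketch below (not SAS code) checks that relationship on made-up counts; the function names are hypothetical.

```python
def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def cmh_single_2x2(a, b, c, d):
    """CMH statistic when the single 2x2 table is the only stratum."""
    n = a + b + c + d
    expected = (a + b) * (a + c) / n
    variance = (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    return (a - expected) ** 2 / variance


# Hypothetical counts.
a, b, c, d = 10, 5, 4, 11
n = a + b + c + d
pearson = pearson_chi2_2x2(a, b, c, d)
cmh = cmh_single_2x2(a, b, c, d)
print(pearson, cmh)
print(cmh / pearson, (n - 1) / n)  # the two ratios should agree
```

With many observations the factor (n − 1)/n is close to 1, so the two statistics are close but not identical.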

This SAS doc link gives another definition of the CMH test (sorry, no anchor; search for "Mantel-Haenszel"), which is not what I understood from Wikipedia. It states that:

The Mantel-Haenszel chi-square statistic tests the alternative hypothesis that there is a linear association between the row variable and the column variable. Both variables must lie on an ordinal scale.

But actually, I can run PROC FREQ with the CMH option on 2 character variables with no error. Is that normal?

I have to compute the CMH test with both R and SAS but I want to know what I am doing.

What is the difference between the SAS conception and the common conception of the CMH test?

Thanks for helping.

Accepted Solutions

Solution

02-13-2018
05:03 AM

Posted in reply to DanCh

02-09-2018 09:53 AM

All Replies

Posted in reply to StatDave_sas

02-09-2018 10:51 AM

Your note was very interesting, but the SAS doc still talks about linear association, which doesn't make a lot of sense to me with unordered categorical variables. Could you explain a little more, please?

Posted in reply to DanCh

02-09-2018 10:57 AM

As shown in the note I referred you to, the row and column variables in each stratum table can be ordinal (ordered levels) or nominal (unordered). If both the row and column variables have multiple possible values, and are ordered (such as low, medium, high), then you might be particularly interested to test whether the column variable increases as the row variable increases. The correlation CMH statistic tests this linear association hypothesis. Obviously, the linear association hypothesis is not of interest if both variables are unordered. In that case you would use the third CMH statistic - the general association statistic.
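The contrast between the correlation statistic and the general association statistic can be illustrated on a single table (one stratum). The Python sketch below is not SAS code and rests on two assumptions: that for one stratum the correlation statistic equals (n − 1)·r², with r the Pearson correlation between row and column scores, and that the general association statistic equals (n − 1)/n times the Pearson chi-square. Scores default to the level positions 1, 2, 3, ..., which, as far as I understand the PROC FREQ documentation, is what table scores amount to for character variables. The function names and counts are made up.

```python
def correlation_statistic(table, row_scores=None, col_scores=None):
    """(n - 1) * r^2 for one r x c table, r = correlation of row/column scores."""
    n_rows, n_cols = len(table), len(table[0])
    u = row_scores or list(range(1, n_rows + 1))
    v = col_scores or list(range(1, n_cols + 1))
    cells = [(table[i][j], u[i], v[j]) for i in range(n_rows) for j in range(n_cols)]
    n = sum(count for count, _, _ in cells)
    mean_u = sum(count * ui for count, ui, _ in cells) / n
    mean_v = sum(count * vj for count, _, vj in cells) / n
    s_uv = sum(count * (ui - mean_u) * (vj - mean_v) for count, ui, vj in cells)
    s_uu = sum(count * (ui - mean_u) ** 2 for count, ui, _ in cells)
    s_vv = sum(count * (vj - mean_v) ** 2 for count, _, vj in cells)
    return (n - 1) * s_uv ** 2 / (s_uu * s_vv)


def general_association_statistic(table):
    """(n - 1) / n times the Pearson chi-square for one r x c table."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    pearson = sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )
    return (n - 1) / n * pearson


# Hypothetical 3x3 counts with levels ordered low, medium, high.
ordered = [[20, 10, 5], [10, 15, 10], [5, 10, 20]]
# Same data with the last two column levels listed in the other order.
reordered = [[row[0], row[2], row[1]] for row in ordered]

print(correlation_statistic(ordered), correlation_statistic(reordered))
print(general_association_statistic(ordered), general_association_statistic(reordered))
```

Relabeling the column levels changes the correlation statistic but leaves the general association statistic untouched, which is why the former only makes sense when the level order carries meaning.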

Posted in reply to DanCh

02-09-2018 11:11 AM

DanCh wrote:

Your note was very interesting, but the SAS doc still talks about linear association, which doesn't make a lot of sense to me with unordered categorical variables. Could you explain a little more, please?

and the SAS 9.4 Documentation.

The alternative hypothesis for the correlation statistic is that there is a linear association between `X` and `Y` in at least one stratum. **If either** `X` or `Y` does not lie on an ordinal (or interval) scale, this statistic is not meaningful.

Posted in reply to DanCh

02-09-2018 11:05 AM

That documentation reference is about twenty years old.

See the latest version which aligns fairly well with what you're stating:

Unless you're using SAS 8 for some reason?

Posted in reply to Reeza

02-13-2018 05:04 AM

OK, I didn't see that the version was different (curious, though, that the test changed between versions). I think I get it now, thanks to you two.

Posted in reply to DanCh

02-13-2018 10:44 AM

I don't think the test changed; I think the documentation did.

And SAS 8 was initially released in 1999.