Hello everyone,

I've been asked to compute a Cochran-Mantel-Haenszel (CMH) test on 2 variables. Both variables are ordinary columns of values, recorded for a large number of observations (rows).

Based on the short Wikipedia page and the Biostat Handbook, I tried to understand what the CMH test is, and here is what I understood:

"Cochran–Mantel–Haenszel test for repeated tests of independence: Use the Cochran–Mantel–Haenszel test when you have data from 2×2 tables that you've repeated at different times or locations."

So it seems the test should be done over 3 factors (rows, columns and strata):

"There are three nominal variables: the two variables of the 2×2 test of independence, and the third nominal variable that identifies the repeats."

Indeed, in R, the mantelhaen.test{stats} function needs 3 factors or a 3-dimensional array, which makes sense.

On the other hand, in SAS, you can easily write a CMH test with only 2 factors:

    PROC FREQ data=MY_DATA;
      TABLE VAR1 * VAR2 / CMH; /* or /CHISQ, which gives the first MH p-value too */
    RUN;

The SAS documentation is not crystal clear about this, and its examples are made over 3 factors, but this 2-dimensional test runs normally and gives you 3 statistics and p-values, including a "correlation" one. All statistics and p-values are different from those of a standard chi-square test.

This SAS doc link gives another definition of the CMH test (sorry, no anchor, search for "Mantel-Haenszel"), which is not what I understood from Wikipedia. It states that:

"The Mantel-Haenszel chi-square statistic tests the alternative hypothesis that there is a linear association between the row variable and the column variable. Both variables must lie on an ordinal scale."

But in practice, I can run PROC FREQ with the CMH option on 2 character variables with no error. Is this normal?

I have to compute the CMH test with both R and SAS, but I want to know what I am doing. What is the difference between the SAS version and the common conception of the CMH test?

Thanks for helping.
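To make the quoted SAS definition concrete, here is a minimal Python sketch of a "linear association" statistic on a single two-way table with no strata. It is not SAS or R code. It assumes the usual formula Q = (n - 1) * r^2 with 1 degree of freedom, where r is the Pearson correlation between row scores and column scores weighted by the cell counts. The function names are hypothetical, and the default scores 1, 2, 3, ... (category positions) are an assumption about how non-numeric categories would be scored.

```python
import math


def _total(table):
    """Total count over all cells of a two-way table (list of rows)."""
    return sum(sum(row) for row in table)


def mantel_haenszel_chi2(table, row_scores=None, col_scores=None):
    """Return (Q, p_value) with Q = (n - 1) * r**2 for one two-way table."""
    rows, cols = len(table), len(table[0])
    # Assumption: categories are scored by position unless scores are given.
    u = list(row_scores) if row_scores is not None else list(range(1, rows + 1))
    v = list(col_scores) if col_scores is not None else list(range(1, cols + 1))
    n = _total(table)
    # One (count, row score, column score) triple per cell.
    cells = [(table[i][j], u[i], v[j]) for i in range(rows) for j in range(cols)]
    mean_u = sum(c * a for c, a, _ in cells) / n
    mean_v = sum(c * b for c, _, b in cells) / n
    s_uv = sum(c * (a - mean_u) * (b - mean_v) for c, a, b in cells)
    s_uu = sum(c * (a - mean_u) ** 2 for c, a, _ in cells)
    s_vv = sum(c * (b - mean_v) ** 2 for c, _, b in cells)
    r = s_uv / math.sqrt(s_uu * s_vv)
    q = (n - 1) * r ** 2
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(q / 2))
    return q, p


def pearson_chi2(table):
    """Standard Pearson chi-square statistic, for comparison."""
    n = _total(table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    return sum(
        (table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
        / (row_tot[i] * col_tot[j] / n)
        for i in range(len(table))
        for j in range(len(table[0]))
    )


if __name__ == "__main__":
    example = [[10, 20], [30, 40]]  # made-up counts, for illustration only
    q, p = mantel_haenszel_chi2(example)
    print("Q =", q, "p =", p)
    print("Pearson chi-square =", pearson_chi2(example))
```

For a 2×2 table, r^2 equals the Pearson chi-square divided by n, so under these assumptions Q is (n - 1)/n times the Pearson chi-square. That would be consistent with getting values that are close to, but not identical to, a standard chi-square test.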