Hi all,
I want to apply a CMH test to compare aortic regurgitation (AR) between two groups. Aortic regurgitation is an ordinal variable (tvrn) with 7 modalities (1-None, 2-Trace, 3-Mild, 4-Mild to moderate, 5-Moderate, 6-Moderate to severe, 7-Severe). When there is no patient in one of the intermediate levels, I get different p-values for the CMH "Row Mean Scores Differ" test depending on whether the ordinal variable is defined as a character variable or as a numeric variable. I would like to understand the reason for this difference and to be sure I use the correct format for this ordinal variable. Thank you
proc sort data=essai2;
   by tvrn trt;
run;

proc freq data=essai2 order=data;
   tables trt*tvrn / cmh;
run;
Example: No patient with 5-Moderate or 6-Moderate to severe AR
Results when the ordinal variable is character: [output not shown]
Results when the ordinal variable is numeric: [output not shown]
Hi @SaloméT and welcome to the SAS Support Communities!
PROC FREQ assigns numeric scores to the levels of the row and column variables (i.e., trt and tvrn in your case), and these scores are used in the calculation of the first two CMH statistics. Because you don't specify the SCORES= option of the TABLES statement, the default scoring method (SCORES=TABLE) is used. For numeric variables these scores equal the variable values; for character variables they are the row/column numbers (starting from 0 rather than from 1 as the documentation states, which does not change the results).
So, as long as all levels 1 through 7 of tvrn exist in the data (and the ORDER= option implies the order 1, 2, ..., 7), it doesn't matter if that variable is numeric or character because either way the scores are essentially the same. However, if categories 5 and 6 are missing (everything else being the same), the scores for a character variable tvrn would be 0, 1, 2, 3, 4, which is not equivalent to 0, 1, 2, 3, 6 (or 1, 2, 3, 4, 7 for that matter). The difference between levels 4 (mild to moderate) and 7 (severe) would appear smaller (than it likely should be) in the first case.
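To see the effect on a small scale, here is a minimal sketch with made-up counts. The dataset demo, the variable tvrn_c and all numbers are hypothetical and are not the original poster's data. It builds a numeric and a character copy of the same ordinal variable, with levels 5 and 6 absent, and requests the CMH statistics for both:

```sas
/* Hypothetical counts: nobody at levels 5 or 6 */
data demo;
   input trt $ tvrn count;
   tvrn_c = put(tvrn, 1.);   /* character copy of the same level codes */
   datalines;
A 1 10
A 2 8
A 3 5
A 4 3
A 7 1
B 1 5
B 2 6
B 3 7
B 4 5
B 7 4
;
run;

proc freq data=demo;
   weight count;             /* each record stands for COUNT patients */
   tables trt*tvrn   / cmh;  /* numeric: table scores are the values 1, 2, 3, 4, 7 */
   tables trt*tvrn_c / cmh;  /* character: table scores are based on column numbers */
run;
```

If the reasoning above holds, the "Row Mean Scores Differ" statistics of the two tables should differ, because level 7 is scored as 7 in the first table but only as the fifth column in the second.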
If the numeric values 1, 2, ..., 7 (i.e. equal "distances" between adjacent levels) are deemed appropriate to reflect the seven modalities, then you should use the numeric version of tvrn. Thus the difference between levels 4 and 7 will be handled in the same way, whether observations with levels 5 and 6 occur in the data or not.
You can add the SCOROUT option to the TABLES statement to make the scores visible in the output.
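For instance, assuming tvrn is stored as a numeric variable in the dataset essai2 from the original question, the request could look like this (a sketch, not run against the actual data):

```sas
proc freq data=essai2;
   /* SCOROUT displays the row and column scores used for the CMH statistics. */
   /* With the default ORDER=INTERNAL, numeric levels are sorted by value.    */
   tables trt*tvrn / cmh scorout;
run;
```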
Thanks for the explanation. How should we deal with a situation where a few categories are missing and we want to calculate CMH2 for an ordinal response?
Hello @vvvv_ggg and welcome to the SAS Support Communities!
The CMH2 option requests the same statistics as the CMH option, except for the "General Association" statistic, which is not included with CMH2. So, what I wrote in response to the original poster of this thread applies to the CMH2 option as well.
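As a sketch (again assuming the dataset essai2 and a numeric tvrn as in the original question), only the option name changes:

```sas
proc freq data=essai2;
   /* CMH2: correlation and row mean scores statistics, no general association */
   tables trt*tvrn / cmh2 scorout;
run;
```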
If your question addresses a different problem caused by missing categories, please open a new thread in the Statistical Procedures forum, which is more appropriate for statistical questions than the "New SAS User" forum. Make sure that you describe the issue in sufficient detail (ideally provide sample data illustrating it).
This is also a general recommendation: Even if your issue is similar to one that was discussed weeks (let alone years) ago, it is best practice to open a new thread. This way you reach the largest possible audience and are most likely to get a helpful answer quickly, whereas only a few people would even notice an addition to an old discussion. Also, only the thread starter can mark a reply as the accepted solution, which helps later readers find the relevant post. See "How to get fast, helpful answers" for this and more tips. Note that you can include a link to an old thread (as I did in the previous sentence; use the "Insert/edit link" button between the smiley and the camera icon) if that helps to explain a similar problem.
Good luck!