Solved: Re: Are McNemar and Paired t-tests appropriate for 2:1 matching?

tka726 · Posted 10-28-2020 11:02 AM

I have a matched dataset (2:1). Is it appropriate to use McNemar's test and Paired t-tests?

Rick_SAS · Posted 10-28-2020 11:50 AM

If I understand your question, these tests are for 1:1 matching. Typically, the observations are the same subject before and after some intervention. McNemar's test applies to a 2x2 frequency table of paired binary responses and tests whether the two marginal proportions are equal, whereas a paired t-test detects whether the mean response differs before and after the intervention.
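For the 1:1 case described here, a minimal sketch of both tests might look like the following. The data set name `pairs` and the variables `before`, `after` (binary responses) and `y_before`, `y_after` (continuous responses) are hypothetical and assume one row per pair.

```sas
/* Hypothetical wide-format data: one row per matched pair */

/* McNemar's test on the 2x2 table of paired binary responses */
proc freq data=pairs;
   tables before*after / agree;   /* AGREE requests McNemar's test for a 2x2 table */
run;

/* Paired t-test on the continuous response */
proc ttest data=pairs;
   paired y_before*y_after;       /* tests mean(y_before - y_after) = 0 */
run;
```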

SteveDenham · Posted 10-28-2020 12:48 PM

It really depends on the matching criteria - if it is within a subject, such as a pre-/post-, I feel confident in the use of a paired t-test. However, if it is matching based on inspection or a propensity score or other factors where the experimental unit and the observational unit are not the same, I would go with a standard t test and allow for heteroscedastic variances.
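As a rough sketch of the unpaired alternative, PROC TTEST with a CLASS statement reports both the pooled and the Satterthwaite (unequal-variance) results. The data set name `data1` and the variables `Y` and `treat` are assumed here for illustration.

```sas
/* Two-sample t-test comparing treated vs. control, ignoring the matching.   */
/* The Satterthwaite row of the output allows for heteroscedastic variances. */
proc ttest data=data1;
   class treat;   /* grouping variable: 1 = treated, 0 = control */
   var Y;         /* continuous response */
run;
```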

I can't comment as much on McNemar's test, except that it requires a square contingency table and really depends on what occurs in the off-diagonal cells. So long as you are confident about the matching criteria (like a patient and 2 siblings), I think you would be OK.

SteveDenham

StatDave · Posted 10-28-2020 12:59 PM

For the case with a binary response, this sort of matched data can be analyzed with a conditional logistic model. That model can be fit in PROC LOGISTIC using the STRATA statement. In the STRATA statement, specify a variable that has a unique value for each set of three matched observations. The input data should have a separate observation for each subject/item in each matched set. A simple example for matched pairs data can be seen in the Examples section of the PROC LOGISTIC documentation.
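A minimal sketch of such a conditional logistic model is shown here. The data set name `matched` and the variables `outcome` (binary, coded 0/1), `treat`, and `matchID` are assumptions for illustration, with one observation per subject.

```sas
/* Conditional logistic regression for matched sets (sketch) */
proc logistic data=matched;
   strata matchID;                     /* unique value per matched set */
   model outcome(event='1') = treat;   /* models the probability that outcome = 1 */
run;
```

Because the model conditions on each stratum, no intercept is estimated; the coefficient for `treat` is the within-set log odds ratio.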

SteveDenham · Posted 10-28-2020 02:22 PM

And for continuous data, any approach that recognizes both the weighting and hierarchical/clustered nature of the design after matching would work. Mixed models come to mind (as they usually do for me).

SteveDenham

tka726 · Posted 10-28-2020 04:35 PM

Thanks!! I am not that familiar with proc mixed. How would the code look?
Assume the data look like this: a treated patient [treat=1] can be matched with up to 2 controls [treat=0] without replacement, and the matched set is recorded in 'matchID' (a unique value for each set of matched observations).

Obs  ptid    Y  treat  matchID
  1     1   80      1        1
  2     2  100      0        1
  3     3   60      0        1
  4     4   20      1        2
  5     5  100      0        2
  6     6   75      0        2
  7     7   75      1        3
  8     8   65      0        3
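For anyone wanting to reproduce this, the listing above could be read into a data set with a DATA step along these lines. The name `data1` is assumed to match the PROC MIXED call, and the Obs column is omitted because SAS numbers observations automatically.

```sas
/* Read the eight example rows into data1 (one row per patient) */
data data1;
   input ptid Y treat matchID;
   datalines;
1 80 1 1
2 100 0 1
3 60 0 1
4 20 1 2
5 100 0 2
6 75 0 2
7 75 1 3
8 65 0 3
;
run;
```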

PROC MIXED DATA = data1 METHOD = ML;
  CLASS ptid matchID;
  MODEL Y = treat;
  REPEATED INT / TYPE = cs SUBJECT = ptid;
RUN;

SteveDenham · Posted 10-29-2020 08:38 AM

Close. Replace ptid with matchID as the subject. Also consider changing from a REPEATED statement to a RANDOM statement, so that you get this:

random int/subject=matchid type=cs;
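Put together with the rest of the original call, the suggestion might look like the sketch below. The RANDOM statement is taken as written above; the SOLUTION option is an addition for illustration, so that the fixed-effect estimate for treat is printed.

```sas
/* Random intercept for each matched set (sketch based on the reply above) */
proc mixed data=data1 method=ml;
   class matchID;
   model Y = treat / solution;             /* SOLUTION prints the estimate for treat */
   random int / subject=matchID type=cs;   /* TYPE=CS kept as given in the reply */
run;
```

Since treat is numeric (0/1) and not listed in the CLASS statement, its coefficient can be read as the estimated treated-minus-control difference in mean Y.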

SteveDenham
