I have a matched dataset (2:1). Is it appropriate to use McNemar's test and Paired t-tests?
If I understand your question, these tests are for 1:1 matching. Typically, the observations are the same subject before and after some intervention. McNemar's test is for association in a 2x2 frequency table whereas a paired t test is to detect whether there is a difference in the mean response before/after the intervention.
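For the 1:1 case described here, a minimal sketch of both tests might look like the following. The data set name pairs and the variables pre_bin, post_bin (binary) and pre_score, post_score (continuous) are hypothetical, and one row per matched pair is assumed.

```sas
/* Hypothetical data set PAIRS: one row per matched pair */

/* McNemar's test: the AGREE option requests it for a 2x2 table */
proc freq data=pairs;
   tables pre_bin*post_bin / agree;
run;

/* Paired t test on the within-pair difference in means */
proc ttest data=pairs;
   paired pre_score*post_score;
run;
```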
It really depends on the matching criteria. If the matching is within a subject, such as a pre-/post- design, I feel confident in the use of a paired t-test. However, if the matching is based on inspection, a propensity score, or other factors where the experimental unit and the observational unit are not the same, I would go with a standard t test and allow for heteroscedastic variances.
I can't comment so much on McNemar's test, except that it requires a square contingency table and really depends on what occurs in the off-diagonal cells. So long as you are confident about the matching criteria (like a patient and 2 siblings), I think you would be OK.
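As a rough sketch of the unequal-variance option, PROC TTEST with a CLASS statement reports the Satterthwaite result alongside the pooled one, so no extra option is needed. The data set matched and the variables group and response are assumed names, not taken from the original question.

```sas
/* Hypothetical data set MATCHED: one row per subject,           */
/* GROUP has two levels (e.g. case/control), RESPONSE is numeric */
proc ttest data=matched;
   class group;
   var response;
run;
/* Read the "Satterthwaite" row of the t test table for the      */
/* comparison that does not assume equal variances.              */
```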
SteveDenham
For the case with a binary response, this sort of matched data can be analyzed with a conditional logistic model. That model can be fit in PROC LOGISTIC using the STRATA statement. In the STRATA statement, specify a variable that has a unique value for each set of three matched observations. The input data should have a separate observation for each subject/item in each matched set. A simple example for matched pairs data can be seen in the Examples section of the PROC LOGISTIC documentation.
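A minimal sketch of that setup, assuming a data set named matched with one row per subject, a numeric matchid shared by the three members of each matched set, a 0/1 response named outcome, and a single predictor named exposure (all of these names are hypothetical):

```sas
/* Conditional logistic regression for 2:1 matched sets.      */
/* STRATA conditions on each matched set identified by        */
/* MATCHID, so no intercept is estimated for the strata.      */
proc logistic data=matched;
   strata matchid;
   model outcome(event='1') = exposure;
run;
```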
And for continuous data, any approach that recognizes both the weighting and hierarchical/clustered nature of the design after matching would work. Mixed models come to mind (as they usually do for me).
SteveDenham
Close. Replace ptid with MatchId as the subject. Also consider changing from a REPEATED statement to a RANDOM statement, so that you get this:
random int/subject=matchid type=cs;
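For context, here is one way that statement might sit in a complete PROC MIXED step. Only the RANDOM statement comes from this reply; the data set name matched and the variables group and response are assumptions, since the original code is not shown in the thread.

```sas
/* Hypothetical continuous-response model for matched sets.   */
/* MATCHID identifies each matched set and enters as a        */
/* random intercept, as suggested above.                      */
proc mixed data=matched;
   class matchid group;
   model response = group / solution;
   random int / subject=matchid type=cs;
run;
```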
SteveDenham