Callam1
Obsidian | Level 7

Hi

I am struggling to understand the results of my matching with replacement with k=5. The output reports that 99% of the 30,000 treated observations and 88% of the 28,000 control observations were successfully matched. But when I open the resulting dataset (matched_data in the code below), all the treated observations have a matching ID, while 50% of the control observations have a missing matching ID. The output appears to be inaccurate, since it seems to count control observations with a missing matching ID as matched. The matching looks very good based on the standardised differences, but can I trust it?

In the matched sample there is no matched set in which a treated observation was matched to more than one control. I do have several treated observations matched to the same control observation (as expected with matching with replacement). Is it that SAS cannot accurately process matching with replacement with k>1 when the control sample is smaller than the treatment sample?

If I select k=1, the output accurately reflects the matching and the matched dataset has no missing matching IDs. However, the matching performance is not as good, even though the number of matched treated and control observations with a matching ID is the same as with k=5. Any views?

Many thanks in advance!

 

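ods graphics on;
proc psmatch data=dataset1 region=cs;
   class is_treated binary_covariates categ_covariates;
   /* Propensity score model for the treatment indicator */
   psmodel is_treated(Treated='1') = categ_covariates continuous_covariates interaction_effect;
   /* Matching with replacement, 5 controls per treated unit, on the logit of the propensity score */
   match method=replace(k=5) stat=lps caliper=0.2;
   /* Balance diagnostics */
   assess lps var=(binary_covariates continuous_covariates)
          / plots=(boxplot stddiff barchart);
   /* Keep matched observations only; MID holds the match ID */
   output out(obs=match)=matched_data matchid=MID;
run;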
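
The discrepancy described above can be checked directly by counting missing match IDs per group in the output dataset. This is only a minimal sketch: it assumes matched_data, is_treated and MID are exactly as in the code above, and that MID is a numeric variable.

```sas
/* Count non-missing (N) and missing (NMISS) match IDs        */
/* separately for treated and control observations.           */
/* Assumes MID is numeric; dataset and variable names are     */
/* taken from the PROC PSMATCH call above.                    */
proc means data=matched_data n nmiss;
   class is_treated;
   var MID;
run;
```

If the NMISS count for the control group is far from zero, the matched dataset and the summary in the PROC PSMATCH output are not counting the same thing.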

3 REPLIES 3
SteveDenham
Jade | Level 19

How was it resolved? Kind of went from an interesting problem to "nothing to see here."

 

SteveDenham

Callam1
Obsidian | Level 7
Hi, basically the problem was that in the last line of the code I had supplied only one name for matchid. When k=n, you need n names. So it should be matchid=(mid1 mid2 mid3 …midn).
With this change there are no more missing match IDs in the matched dataset.
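
For the k=5 case from the original question, the fix described here would look roughly like the sketch below. It simply follows the description in this reply and has not been checked against the PROC PSMATCH documentation; the names mid1 to mid5 are illustrative, and the dataset and covariate names are the placeholders from the original post.

```sas
ods graphics on;
proc psmatch data=dataset1 region=cs;
   class is_treated binary_covariates categ_covariates;
   psmodel is_treated(Treated='1') = categ_covariates continuous_covariates interaction_effect;
   match method=replace(k=5) stat=lps caliper=0.2;
   assess lps var=(binary_covariates continuous_covariates)
          / plots=(boxplot stddiff barchart);
   /* Assumption, per the reply above: one match ID name per k, */
   /* so k=5 gets five names instead of the single MID.         */
   output out(obs=match)=matched_data matchid=(mid1 mid2 mid3 mid4 mid5);
run;
```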
