elbarto
Obsidian | Level 7

I am using PROC PSMATCH to do simple propensity score matching. A snapshot of my data follows:

DATA have;
input year firm_id place_id treatment var1 var2;
DATALINES;
2000	47500	33019	0	0.026167367	0.118107848
2000	81175	31079	0	0.021634581	0.11606627
2000	174572	34003	0	0.008156653	0.108517274
2000	174572	34005	0	0.007877977	0.091809787
2000	174572	34007	0	0.010265773	0.085315727
2000	174572	34015	0	0.010941376	0.086731717
2000	223005	9009	0	0.009077159	0.088563122
2000	268006	33013	0	0.01309349	0.101272136
2000	268006	50027	0	0.009887925	0.098948561
2000	174572	9009	1	0.008821509	0.104278341
2000	174572	33013	1	0.009961594	0.120187096
2000	174572	45234	1	0.004693746	0.119641706
2000	223005	31079	1	0.005699048	0.144180074
2000	223005	33019	1	0.005843101	0.112463228
2000	223005	42311	1	0.005326652	0.138652623
;
RUN;

I want to do one-to-one nearest-neighbor propensity score matching with replacement, pairing each treated observation (treatment=1) with a control observation (treatment=0). Matches must fall within the same year, BUT each treated observation must be matched to the control observation with the closest propensity score among those with a DIFFERENT firm_id. I am using the following code:

proc psmatch data=have region=allobs;
	class treatment year;
	psmodel treatment(Treated='1') = var1 var2;
	match method=replace(k=1) stat=ps exact=year caliper=.;
	output out(obs=match)=have_match matchid=_MatchID;
run;

data have_match(keep=year firm_id place_id treatment _PS_ _MATCHWGT_ _matchid);
	set have_match;
run;

However, in the resulting "have_match" dataset (screenshot: result.png), we can clearly see that row 4 and row 5 are matched with row 1, which has the same firm_id (174572). All other matches are fine because they are matched to a different firm_id. How can I change the PROC PSMATCH code so that matches are restricted to a different firm_id? Thank you!
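
For reference, the requirement above (same year, different firm_id, closest propensity score, controls reusable) can be sketched as a manual match outside the MATCH statement. EXACT= only forces matched observations to share a value, so it cannot express "different firm_id". This is a rough sketch under stated assumptions, not a PROC PSMATCH feature: it refits the same logistic model with PROC LOGISTIC, assumes that year, firm_id, and place_id together identify an observation, and uses hypothetical names (ps_all, ps, pairs, ps_dist, have_match_diff_firm). Ties in distance are broken arbitrarily.

```sas
/* Step 1: fit the same logistic model as the PSMODEL statement and
   save the propensity score. ps_all and ps are hypothetical names. */
proc logistic data=have;
	model treatment(event='1') = var1 var2;
	output out=ps_all p=ps;
run;

/* Step 2: list every treated/control pair in the same year whose
   firm_id differs, with the absolute propensity score distance. */
proc sql;
	create table pairs as
	select t.year,
	       t.firm_id  as t_firm_id,
	       t.place_id as t_place_id,
	       t.ps       as t_ps,
	       c.firm_id  as c_firm_id,
	       c.place_id as c_place_id,
	       c.ps       as c_ps,
	       abs(t.ps - c.ps) as ps_dist
	from ps_all as t
	     inner join ps_all as c
	     on t.year = c.year and t.firm_id ne c.firm_id
	where t.treatment = 1 and c.treatment = 0;
quit;

/* Step 3: keep the closest control for each treated observation.
   A control may be kept for several treated observations, which is
   matching with replacement. */
proc sort data=pairs;
	by year t_firm_id t_place_id ps_dist;
run;

data have_match_diff_firm;
	set pairs;
	by year t_firm_id t_place_id;
	if first.t_place_id;
run;
```

Each row of have_match_diff_firm then holds one treated observation (t_ columns) and its nearest eligible control (c_ columns). No caliper is applied here, mirroring caliper=. in the original call; a condition on ps_dist could be added to the WHERE clause if one is wanted.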

 

 

 

2 REPLIES
JOL
SAS Employee

Please repost this question under Analytics -> Statistical Procedures.
