Re: Problem propensity score matching proc psmatch

ammarhm · Posted 10-15-2020 05:46 AM

Hi everyone,

I am using proc psmatch to do propensity score matching for two treatment levels. The code below includes the data and the matching code (I tried several matching methods), but I still cannot achieve adequate balance for one of the variables ('Level'). The univariate analysis in the step after proc psmatch, using proc npar1way, gives a p-value of 0.008 for Level by Treatment, indicating a significant difference in Level between the two treatment groups.

I wonder if anyone could assist and suggest how to overcome this problem and improve the matching?

Much appreciated.

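DATA Have;
   /* The data lines are tab-separated; EXPANDTABS converts tabs to blanks for list input */
   infile datalines expandtabs;
   input Age Sex $ Size Stage $ Level Marker Treatment;
   DATALINES;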
82	0	23	A	8	5.4	1
82	0	15	A	8	5.4	1
61	1	42	A	9	6.7	1
62	1	12	B	7	6.9	1
56	1	19	B	6	3.7	1
56	1	11	B	6	3.7	1
56	1	16	B	6	3.7	1
56	1	10	B	6	3.7	1
84	1	33	A	6	1110.5	1
64	1	34	A	9	1454.2	1
71	0	16	A	10	5.9	1
84	0	31	A	7	66	1
66	1	35	A	6	.	1
70	1	30	A	.	.	1
65	1	19	0	6	4.2	1
62	1	26	A	6	3.7	1
62	1	15	A	6	3.7	1
39	1	20	A	8	78.9	1
59	1	19	0	7	29.3	1
59	1	25	A	7	18.4	1
57	1	18	0	8	6.4	1
54	0	45	A	6	791.8	1
56	0	23	A	13	343.9	1
73	1	21	A	12	5.1	1
68	1	20	A	9	0.8	1
75	1	20	A	9	1.7	1
64	1	14	A	10	7.5	1
73	1	28	A	17	17	0
62	0	14	A	15	5	0
71	0	11	0	9	4.1	0
66	1	18	0	10	2.1	0
77	1	15	A	8	9.3	0
77	1	12	A	8	9.3	0
77	1	12	A	8	9.3	0
77	1	14	A	8	11.5	0
77	1	13	A	8	11.5	0
57	1	31	A	17	9	0
81	1	22	A	13	7	0
41	0	21	A	13	15	0
53	1	19	0	8	212.8	0
58	1	28	A	10	22.1	0
63	1	18	A	11	10.1	0
63	1	17	A	11	10.1	0
63	1	10	A	11	14.2	0
63	1	13	A	11	14.2	0
56	1	15	0	7	4	0
54	1	15	A	8	7.2	0
54	1	15	A	8	7.2	0
72	1	21	A	6	520.4	0
63	0	17	0	6	.	0
55	1	22	A	11	3.2	0
56	1	13	0	8	2.5	0
54	1	12	A	10	3.2	0
54	1	20	A	10	3.2	0
60	1	27	A	10	53.2	0
60	1	17	0	10	23.4	0
74	1	16	0	16	6.5	0
60	1	16	A	.	.	0
59	1	27	B	7	8.5	0
59	1	17	B	7	8.5	0
59	1	31	B	7	8.5	0
59	1	22	B	7	8.5	0
59	1	24	B	7	8.5	0
70	1	15	0	11	1	0
68	1	16	A	9	1	0
65	1	16	0	5	1.2	0
67	1	17	0	10	4	0
55	1	10	0	8	49	0
65	1	16	A	12	2.5	0
65	1	11	A	12	2.5	0
65	1	10	B	12	6.8	0
65	1	13	B	12	6.8	0
65	1	22	B	12	6.8	0
65	1	26	B	12	6.8	0
63	1	20	A	12	2.2	0
59	1	24	A	7	58.8	0
69	0	27	A	7	2	0
63	1	45	A	12	5607.7	0
46	1	19	0	19	13	0
59	1	18	0	14	9.9	0
55	1	12	A	12	499.5	0
53	1	13	0	12	18.3	0
60	1	13	0	7	1.9	0
75	1	23	A	7	58.4	0
74	1	12	0	7	19.5	0
58	1	15	0	9	319.7	0
68	1	23	A	10	3.1	0
51	1	20	A	8	7.6	0
51	1	18	A	8	7.6	0
52	1	20	A	7	3.6	0
66	1	18	0	6	5.8	0
69	1	20	A	10	.	0
38	1	16	0	9	10.7	0
66	1	12	B	8	61.3	0
66	1	31	B	8	61.3	0
59	1	12	0	8	18.4	0
65	1	32	B	10	9.8	0
67	1	23	A	10	10.5	0
70	1	15	0	16	5.2	0
71	1	20	A	20	81.3	0
64	0	16	A	11	18.2	0
64	0	17	A	11	18.2	0
64	0	14	A	11	18.2	0
52	0	17	0	17	.	0
60	1	17	0	7	23.8	0
59	1	16	A	10	40.6	0
59	1	12	A	10	40.6	0
58	1	26	A	11	91.1	0
50	1	34	B	11	2292.6	0
50	1	14	B	11	2292.6	0
50	1	34	B	11	2292.6	0
69	1	20	A	9	3.9	0
69	1	17	A	9	3.9	0
69	1	23	A	9	3.9	0
68	1	12	A	16	2.3	0
68	1	14	A	16	2.3	0
68	1	13	B	16	3.8	0
68	1	12	B	16	3.8	0
68	1	14	B	16	3.8	0
68	1	14	B	16	3.8	0
68	1	26	B	16	5	0
68	1	14	B	16	5	0
68	1	26	B	16	5	0
68	1	14	B	16	5	0
55	1	13	0	8	4.6	0
49	1	14	0	17	38.5	0
59	1	31	A	11	24.9	0
65	1	20	A	7	142.1	0
66	1	16	A	20	6.3	0
66	1	15	A	20	6.3	0
62	0	20	A	17	651.7	0
62	1	30	A	8	540.8	0
62	1	16	A	8	540.8	0
61	1	33	B	8	4.1	0
61	1	13	B	8	4.1	0
66	1	22	A	10	8	0
66	1	14	A	10	8	0
53	0	20	A	14	28	0
61	0	22	A	10	41.7	0
49	1	18	0	19	5	0
85	0	12	A	10	20	0
85	0	20	A	10	20	0
85	0	18	A	10	20	0
69	0	14	0	15	6.8	0
68	0	12	0	10	27	0
67	0	15	0	10	19.5	0
76	1	26	A	9	21.8	0
64	1	27	A	6	6	0
85	0	17	0	9	0.9	0
61	1	20	A	13	924.6	0
57	1	24	A	11	84.4	0
56	1	12	0	9	29.2	0
57	1	21	A	14	10.6	0
57	1	22	A	14	10.6	0
45	1	20	A	15	5.8	0
66	0	15	0	8	5.2	0
58	1	13	0	13	11.3	0
56	1	16	0	11	1.2	0
80	0	18	0	7	13	0
80	0	25	A	6	.	0
60	1	21	A	10	3.1	0
60	1	17	0	11	3	0
53	1	22	A	13	10.6	0
68	0	17	0	11	2	0
56	1	25	A	14	87	0
56	1	13	A	14	87	0
81	1	28	A	10	2.7	0
58	0	10	A	9	20.5	0
58	0	14	A	9	20.5	0
66	1	32	B	15	121.8	0
66	1	26	B	15	121.8	0
66	1	12	B	15	121.8	0
56	1	25	A	11	8.7	0
62	1	16	0	6	.	0
73	0	17	0	11	.	0
74	0	10	0	12	5.8	0
76	1	38	A	8	8.7	0
53	1	35	A	8	8720	0
60	1	26	A	21	9.6	0
68	1	19	A	7	3.8	0
68	1	17	A	7	3.8	0
81	1	20	A	6	1.5	0
81	1	14	A	6	1.5	0
;
run;


ods graphics on;
proc psmatch data=have;
   class Stage Sex Treatment;
   /* Propensity score model for receiving Treatment=1 */
   psmodel Treatment(Treated='1') = Level Sex Age Stage Size Marker;
   /* Greedy matching, up to 4 controls per treated unit, on the logit of the propensity score */
   match method=greedy(k=4) distance=lps caliper=0.20;
   /* Alternatives already tried: */
   *match method=optimal(k=1) stat=lps caliper=0.20;
   *match stat=ps method=varratio(kmin=1 kmax=10) caliper=0.4;
   assess lps var=(Age Level Size Marker) / plots=(CDFPlot BoxPlot StdDiff);
   output out(obs=all)=Matched matchid=MID;
run;
ods graphics off;

proc npar1way data=Matched wilcoxon;
   var Level Age Size Marker;
   class Treatment;
   where mid ne .;
run;
proc freq data=Matched;
   tables Treatment*(Sex Stage) / norow nocol chisq;
   where mid ne .;
run;

ballardw · Posted 10-15-2020 01:07 PM

When you run your code do you get a warning like this in the log?

WARNING: Some treated units have less than the specified K=4
         matched controls because there are not enough
         available controls for these treated units.

If the example data set you show is all of your data, then sample size may be an issue: the combination of values of the independent variables in the PSMODEL statement almost uniquely identifies each record.
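
One quick way to check this is to count the distinct covariate combinations. This is only a sketch, assuming the data set is named Have as in the post; Patterns is a made-up output name.

```sas
/* Keep one row per distinct combination of the PSMODEL covariates. */
/* Patterns is a hypothetical output data set name.                 */
proc sort data=Have out=Patterns nodupkey;
   by Level Sex Age Stage Size Marker;
run;

/* Number of distinct covariate patterns, to compare with the record count in Have */
proc sql;
   select count(*) as n_patterns from Patterns;
quit;
```

The log from the sort also reports how many observations with duplicate key values were deleted. If n_patterns is close to the number of records, few observations share the same covariate values.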

ammarhm · Posted 10-15-2020 08:06 PM

Thanks @ballardw

This is the entire dataset I have.

Could I ask how you would approach such an issue? If you cannot match using propensity score what are the alternatives?

Is performing the analysis with inverse probability of treatment weighting an alternative?

Thanks.

ammarhm · Posted 10-16-2020 08:16 AM

Does anyone in the community have experience with inverse probability of treatment weighting (IPTW) when propensity score matching fails?
Basically, I am trying to do survival analysis (proc lifetest and proc phreg) on the final dataset, either matched by treatment or weighted with IPTW. Any advice is much appreciated.
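
To make the IPTW question concrete, here is a minimal sketch of what I have in mind. It assumes the propensity score is estimated with proc logistic instead of proc psmatch and uses unstabilized weights; PS_Out, Weighted, ps and iptw are placeholder names, not anything from my real analysis.

```sas
/* Estimate the propensity score P(Treatment=1 | covariates). */
/* PS_Out and ps are placeholder names.                       */
proc logistic data=Have;
   class Sex Stage / param=ref;
   model Treatment(event='1') = Level Sex Age Stage Size Marker;
   output out=PS_Out pred=ps;
run;

/* Unstabilized IPTW: 1/ps for treated, 1/(1-ps) for controls. */
data Weighted;
   set PS_Out;
   if missing(ps) then delete;   /* rows with missing covariates get no score */
   if Treatment = 1 then iptw = 1 / ps;
   else iptw = 1 / (1 - ps);
run;

/* Weighted covariate means by treatment group as a rough balance check */
proc means data=Weighted mean std;
   class Treatment;
   var Level Age Size Marker;
   weight iptw;
run;
```

The weight could then go on a WEIGHT statement in proc phreg, with the COVS option for robust standard errors. The time and event variables are not in the posted data, so I left that step out.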

ballardw · Posted 10-16-2020 10:31 AM

I might try reducing the number of variables on the PSMODEL statement. I don't know what most of those variables mean, though I am moderately confident about Sex and Age. You might get slightly better results by grouping ages, for example into 5- or 10-year groups, unless you have other information about how age affects this process.
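
A rough sketch of that idea follows. The 10-year bands and the reduced covariate list (Level, Sex, AgeGrp) are assumptions chosen only for illustration, and Have_Grp, AgeGrp and Matched_Grp are made-up names; which variables to keep should depend on what they mean clinically.

```sas
/* Group Age into 10-year bands labelled by their lower bound (e.g. 56 -> 50). */
/* Have_Grp and AgeGrp are hypothetical names.                                 */
data Have_Grp;
   set Have;
   if not missing(Age) then AgeGrp = 10 * floor(Age / 10);
run;

/* Same matching settings as the original post, with a smaller PS model. */
/* The covariates kept here are an illustrative assumption.              */
proc psmatch data=Have_Grp;
   class Sex AgeGrp Treatment;
   psmodel Treatment(Treated='1') = Level Sex AgeGrp;
   match method=greedy(k=4) distance=lps caliper=0.20;
   assess lps var=(Level);
   output out(obs=all)=Matched_Grp matchid=MID;
run;
```

Comparing the standardized differences for Level from this run with those from the full model would show whether the simpler model helps.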
