twagner
Calcite | Level 5

Hello all,

 

I'm conducting a propensity score match with the %GMATCH macro. I followed sample code from a textbook that uses this macro to make sure my code is appropriate.

 

Pre-match, we have 199 patients in the intervention (treatment) group and 802 patients in the comparator (control) group.

 

My 1:1 match was great, no issues. 180 patients matched to 180 patients. The code to call the %gmatch macro is below.

 

%gmatch(
  data = out_ps_dm9,
  group = study_group,
  id = id,
  mvars = logit_ps_dm9,
  wts = 1,
  dist = 1,
  dmaxk = &stdcal,
  ncontls = 1,
  seedca = 25102007,
  seedco = 26102007,
  out = matchpairs_dm9,
  print = F
);

** Currently doing a 1 to 1 match, Match subjects on the logit of the propensity score. **;

I'm struggling with the 1:2 match. The treatment (intervention) group is being matched at a rate of 1:2, but the final dataset has 359 rows for the intervention group and 359 rows for the comparator group. There are still only ~180 distinct patients from the intervention group in the final dataset; each is listed twice because each is paired with two patients in the comparator group.

 

How do I analyze the dataset with 359:359 patients? Using PROC PSMATCH and doing a 1:2 match, I had 180 patients matched to 360 patients in my final dataset.

 

Below is my code for the 1:2 match with %GMATCH:

 

%gmatch(
  DATA = out_ps_dm9, /* Data: the name of the SAS data set containing the treated and untreated subjects.*/
  group = study_group, /* Group: the variable identifying treated/untreated subjects. */
  id = id, /* Id: the variable denoting subjects’ identification numbers. */
  mvars = logit_ps_dm9, /* Mvars: the list of variables on which one is matching.*/
  wts = 1, /* Wts: the list of non-negative weights corresponding to each matching variable.*/
  dist = 1,/* Dist: the type of distance to calculate [1 indicates weighted sum (over matching variables) of absolute case-control differences]*/
  dmaxk = &stdcal, /* Dmaxk: the maximum allowable difference in the matching difference between matched treated and untreated subjects.*/
  ncontls = 2, /* Ncontls: the number of untreated subjects to be matched to each treated subject.*/
  seedca = 25102007, /* Seedca: the random number seed for sorting the treated subjects prior to matching.*/
  seedco = 26102007, /* Seedco: the random number seed for sorting the untreated subjects prior to matching. */
  out = matchpairs_dm92, /* Out: the name of a SAS data set containing the matched sample.*/
  print = F /* Print: the flag indicating whether the matched data should be printed.*/
);
** Currently doing a 1 to 2 match, matching subjects on the logit of the propensity score. **;

DATA matchpairs_dm92;
  SET matchpairs_dm92;
  pair_id = _N_;
RUN;

PROC CONTENTS DATA = matchpairs_dm92 POSITION; RUN;

DATA control_match_dm92;
  SET matchpairs_dm92;
  control_id = __IDCO;
  logit_ps = __CO1;
  KEEP pair_id control_id logit_ps;
RUN;
/* Create a data set containing the matched comparator patients (untreated subjects) */

PROC CONTENTS DATA = control_match_dm92 POSITION; RUN;

DATA case_match_dm92;
  SET matchpairs_dm92;
  case_id = __IDCA;
  logit_ps = __CA1;
  KEEP pair_id case_id logit_ps;
RUN;
/* Create a data set containing the matched intervention patients (treated subjects) */

PROC CONTENTS DATA = case_match_dm92 POSITION; RUN;

PROC SORT DATA=control_match_dm92; BY control_id; RUN;
PROC SORT DATA=case_match_dm92; BY case_id; RUN;

DATA exposed_dm92;
	SET out_ps_dm9;
	IF study_group = 1;
	case_id = id;
RUN;

PROC CONTENTS DATA = exposed_dm92 POSITION; RUN;

DATA control_dm92;
	SET out_ps_dm9;
	IF study_group = 0;
	control_id = id;
RUN;

PROC CONTENTS DATA = control_dm92 POSITION; RUN;

PROC SORT DATA=exposed_dm92; BY case_id; RUN;
PROC SORT DATA=control_dm92; BY control_id; RUN;

DATA control_match_dm92;
  MERGE control_match_dm92 (IN=f1) control_dm92 (IN=f2);
  BY control_id;
  IF f1 and f2;
RUN;

PROC CONTENTS DATA = control_match_dm92 POSITION; RUN;

DATA case_match_dm92;
  MERGE case_match_dm92 (IN=f1) exposed_dm92 (IN=f2);
  BY case_id;
  IF f1 and f2;
RUN;

PROC CONTENTS DATA = case_match_dm92 POSITION; RUN;

DATA long_dm92;
  SET control_match_dm92 case_match_dm92;
  prop_score = exp(logit_ps) / (exp(logit_ps) + 1);
RUN;

Thank you! I want to ensure that I am appropriately assessing both my groups with a 1:2 match and a 1:3 match.

Tyler

3 REPLIES
awesome_opossum
Obsidian | Level 7

For analysis, this is essentially a problem of 1) non-independence of observations, and 2) weighting of the multiple control groups in the analysis. The following assumes that all of your matched observations from both control groups (combined) are unique, not duplicated.
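That uniqueness assumption can be checked directly before choosing an analysis. The query below is only a sketch: it assumes the %GMATCH output `matchpairs_dm92` holds one row per treated-control pair, with the `__IDCA` and `__IDCO` columns used in the question's code. The column aliases are made up for illustration.

```sas
/* Sketch: count pair rows versus distinct subjects in the 1:2 output. */
/* Assumes one row per treated-control pair (as the question describes). */
PROC SQL;
  SELECT COUNT(*)               AS n_pair_rows,
         COUNT(DISTINCT __IDCA) AS n_unique_treated,
         COUNT(DISTINCT __IDCO) AS n_unique_controls
  FROM matchpairs_dm92;
QUIT;
```

If `n_unique_controls` equals `n_pair_rows`, no control was used more than once. If `n_unique_treated` is about half of `n_pair_rows`, the treated subjects are simply repeated once per matched control and should be counted once when describing the sample.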

 

The simplest solution, which is probably reasonably acceptable to most people, is to analyze the two sets of matched pairs separately and present the second as a "replication". If the results are the same for both matched groups, then you can say the same result is achieved with propensity score matching even with an independent control group. If the results differ, it would merit investigating whether the difference is due to detectable differences in the matched characteristics of the two control groups (or their difference-in-difference with the study group). Such differences could emerge, and be observed, if for example the first iteration of the propensity match is a "closer match" than the second.
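One way to build the two "replication" samples is sketched below. It assumes `matchpairs_dm92` has one row per treated-control pair with the `__IDCA`, `__IDCO`, `__CA1`, and `__CO1` columns from the question's code. The names `ps_dist`, `control_n`, `pairs_first`, and `pairs_second` are hypothetical. Ranking the two controls by distance, so that control 1 is the closer match, is a choice made here for illustration and is not necessarily the order in which %GMATCH assigned them.

```sas
/* Distance between treated and control on the matching variable. */
DATA pairs_dist;
  SET matchpairs_dm92;
  WHERE NOT MISSING(__IDCO);      /* guard, in case unmatched rows are present */
  ps_dist = ABS(__CA1 - __CO1);
RUN;

/* Within each treated subject, put the closer control first. */
PROC SORT DATA=pairs_dist;
  BY __IDCA ps_dist;
RUN;

/* Number the controls within each treated subject and split the pairs. */
DATA pairs_first pairs_second;
  SET pairs_dist;
  BY __IDCA;
  IF FIRST.__IDCA THEN control_n = 0;
  control_n + 1;                  /* sum statement: value is retained across rows */
  IF control_n = 1 THEN OUTPUT pairs_first;
  ELSE IF control_n = 2 THEN OUTPUT pairs_second;
RUN;
```

Each output data set is then a 1:1 matched sample that can be analyzed the same way as the original 1:1 match. Treated subjects that received only one control appear in `pairs_first` only.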

 

Partial solutions, which I'm sure would be unsatisfactory to the scientific community (and to me), would be to:

a) Code the two control groups (e.g. 1 vs. 0) into a variable and enter it as a covariate in a model using all the data, including the duplicated study-group observations. This would not solve the problem of non-independence, but it would more or less ensure both control groups are weighted equally and fairly, removing any bias that crept in from one propensity match iteration to the next.

b) Pool or average the responses from the two control groups into a single observation, to be matched with a single observation from the study group. This resolves the issue of non-independence and partially ensures the control observations are weighted about equally; however, it definitely distorts the true underlying variance of the observations in the control groups (which wouldn't necessarily be a problem if you had, say, 20 matched control groups and could use their observed variance).
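Option b) might look roughly like the following. This is a sketch under stated assumptions: `y` is a hypothetical continuous outcome stored in `out_ps_dm9` (the thread never names the outcome), and `set_id`, `control_avg`, and `paired` are illustrative names. A paired t-test is used only as an example of analyzing one treated value against one averaged control value.

```sas
PROC SQL;
  /* Average the hypothetical outcome y over each treated subject's controls. */
  CREATE TABLE control_avg AS
  SELECT p.__IDCA  AS set_id,
         MEAN(d.y) AS y_control_mean,
         COUNT(*)  AS n_controls
  FROM matchpairs_dm92 AS p
       INNER JOIN out_ps_dm9 AS d
       ON d.id = p.__IDCO
  GROUP BY p.__IDCA;

  /* One row per matched set: treated outcome next to the control average. */
  CREATE TABLE paired AS
  SELECT c.set_id,
         t.y AS y_treated,
         c.y_control_mean
  FROM control_avg AS c
       INNER JOIN out_ps_dm9 AS t
       ON t.id = c.set_id;
QUIT;

PROC TTEST DATA=paired;
  PAIRED y_treated*y_control_mean;
RUN;
```

As noted above, averaging shrinks the variability of the control values, so the standard errors from this kind of analysis should be read with that in mind.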

 

A slightly more complicated solution would be to construct a multilevel model. There, you can enter the study group ID as the identifier for level 2, and the control group IDs as independent observations at level 1. Since you would presumably be predicting level-2 outcomes from level-1 predictors, you could use a simple fixed-effects model, which is relatively simple. This should take care of both the non-independence and the comparative weighting of the multiple control groups in a single analysis. Also, as in a) above, you could again code and enter the propensity match iteration as a covariate in the model.
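A rough sketch of this kind of setup follows. It is one common formulation (a random intercept per matched set), not necessarily the exact model described above. It assumes `matchpairs_dm92` has one row per treated-control pair, that `id` is unique across `out_ps_dm9`, and that `y` is a hypothetical continuous outcome in `out_ps_dm9`. The names `set_id`, `set_cases`, `set_controls`, and `long_sets` are illustrative.

```sas
PROC SQL;
  /* Treated subjects: DISTINCT collapses the repeated rows to one per case. */
  CREATE TABLE set_cases AS
  SELECT DISTINCT p.__IDCA AS set_id, d.*
  FROM matchpairs_dm92 AS p
       INNER JOIN out_ps_dm9 AS d
       ON d.id = p.__IDCA;

  /* Controls: one row per matched control, tagged with its case's id. */
  CREATE TABLE set_controls AS
  SELECT p.__IDCA AS set_id, d.*
  FROM matchpairs_dm92 AS p
       INNER JOIN out_ps_dm9 AS d
       ON d.id = p.__IDCO;
QUIT;

/* Long data set: each subject appears once, grouped by matched set. */
DATA long_sets;
  SET set_cases set_controls;
RUN;

/* Random intercept per matched set; y is a hypothetical continuous outcome. */
PROC MIXED DATA=long_sets;
  CLASS set_id;
  MODEL y = study_group / SOLUTION;
  RANDOM INTERCEPT / SUBJECT=set_id;
RUN;
```

With this layout the long data set should contain each treated patient once and each matched control once, which is the structure PROC PSMATCH produced in the question (180 treated, 360 controls). For a binary outcome, a conditional logistic model stratified on `set_id` (PROC LOGISTIC with a STRATA statement) is a commonly used alternative.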

 

 

duongngoclanchi
Calcite | Level 5

Hi, I'm trying to do the 1:1 match using this exact code.

 

I am getting stuck with the errors shown in the screenshot below. Can you help me?

 

%include 'gmatch.sas';

%gmatch(
data = out_ps,
group = old,
id = id,
mvars = logit_ps,
wts = 1,
dist = 1,
dmaxk = &stdcal,
ncontls = 1,
seedca = 25102007,
seedco = 26102007,
out = matchpairs,
print = F
);

[Attached screenshot: duongngoclanchi_0-1687802266595.png]

 

duongngoclanchi
Calcite | Level 5

Hi,

I am doing the 1:1 match, but I have an issue with this part (see the screenshot below). Can you help me?

 

%include 'gmatch.sas';

%gmatch(
data = out_ps,
group = old,
id = id,
mvars = logit_ps,
wts = 1,
dist = 1,
dmaxk = &stdcal,
ncontls = 1,
seedca = 25102007,
seedco = 26102007,
out = matchpairs,
print = F
);

 

 

[Attached screenshot: duongngoclanchi_0-1687802333380.png]

 
