Hi,
I’ve been manually recalculating the Standardized Mean Difference (SMD) for categorical variables, but the values I calculate differ from those provided by PROC PSMATCH. Below is my approach and code for one specific variable, PCDK46.
This code generates the propensity score weights (_ATTWgt_) and outputs the standardized differences table.
ods graphics on;
proc psmatch data=ac_padsl region=allobs;
    class PSTUDYID PECOGBL PCDK46 PHER2 PRACE PSTAGE PSTYESYN PSTNOYN PSTUNKYN PRASIAYN PRAFRYN PRWHTYN PROTHYN PRUNKYN;
    psmodel PSTUDYID (Treated='Study') = PAGE PECOGBL PCDK46 PLINES PRACE PSTAGE PHER2;
    psweight weight=attwgt nlargestwgt=6;
    assess lps var=(PAGE PECOGBL PCDK46 PLINES PSTYESYN PSTNOYN PSTUNKYN PRASIAYN PRAFRYN PRWHTYN PROTHYN PRUNKYN PHER2)
        / varinfo plots=(barchart boxplot(display=(lps PAGE)) wgtcloud);
    id PAGE PLINES PECOGBL PCDK46 PHER2;
    output out(obs=all)=ac_OutEx1 weight=_ATTWgt_;
    ods output StdDiff=ac_myStdDiff;
run;
ods graphics off;
Manual calculation for PCDK46
/* Treatment group prevalence */
proc freq data=ac_OutEx1 noprint;
    where pstudyid = "Study";
    tables PCDK46 / nocol norow out=treatment_output (rename=(percent=prevalence_treatment));
    weight _ATTWgt_;
run;

/* Control group prevalence */
proc freq data=ac_OutEx1 noprint;
    where pstudyid = "Flatiron";
    tables PCDK46 / nocol norow out=control_output (rename=(percent=prevalence_control));
    weight _ATTWgt_;
run;
Merge Treatment and Control Data to Calculate SMD
data combined_output;
    merge treatment_output (keep=PCDK46 prevalence_treatment)
          control_output (keep=PCDK46 prevalence_control);
    by PCDK46;
run;

/* Calculate SMD */
data smd_result_PCDK46;
    set combined_output end=last;
    /* Convert prevalence to proportions */
    if PCDK46 = "Y" then do;
        p_treatment = prevalence_treatment / 100;
        p_control = prevalence_control / 100;
        /* SMD Formula */
        smd = (p_treatment - p_control) / sqrt(
            ((p_treatment * (1 - p_treatment)) + (p_control * (1 - p_control))) / 2
        );
    end;
    variable = "PCDK46";
    keep variable smd;
    if last;
run;
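Written out, the manual calculation above amounts to the following. The notation is mine and is only for illustration: w_i is the _ATTWgt_ weight, x_i is the PCDK46 value, T is the "Study" group, and C is the "Flatiron" group. Note that both the numerator and the denominator here are built from the weighted prevalences; I have not confirmed whether PROC PSMATCH builds its denominator the same way.

```latex
\[
\hat{p}_g = \frac{\sum_{i \in g} w_i \, \mathbf{1}[x_i = \text{Y}]}{\sum_{i \in g} w_i},
\qquad g \in \{T, C\}
\]
\[
\mathrm{SMD} =
\frac{\hat{p}_T - \hat{p}_C}
     {\sqrt{\dfrac{\hat{p}_T (1 - \hat{p}_T) + \hat{p}_C (1 - \hat{p}_C)}{2}}}
\]
```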
From this I get 0.063704264 for PCDK46, but PROC PSMATCH reports 0.05313. Can anyone provide any insight?
Yeah, I tried that but no luck, and this is what nlargestwgt does. Removing it has no impact. It seems like I need a corrected formula for the variance of the weighted probabilities.