I wanted to create an inverse probability of treatment weight for further analysis, but when I checked it with a chi-square test, the result was still significant. Could anyone tell me where I went wrong, or whether there is another way to create a propensity score weight?
/*create propensity score*/
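proc logistic data=data_final;
*if the covariates are coded as categories, a CLASS statement listing them is probably needed here;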
model treatment=age_group race_group insurance_group income education area facility_type
/link=glogit rsquare; *treatment is nominal and more than 3 categories, so I use the glogit here;
output out=ps pred=ps;
run;
*inverse probability of treatment weight, compute the inverse of the propensity score;
*weights are based on the entire treatment group and would give more weight to the smaller treatment groups;
data ps_w; set ps;
ps_weight=1/ps;
if treatment=_level_;
run;
*create a weight that reflects the sample size for each of the treatment groups;
proc sql;
create table ps_w_adj as
select *, (count(*)/13547)*ps_weight as ps_weight_adj /* 13547 here is sample size */
from ps_w
group by treatment;
quit;
Then I wanted to check whether the propensity score weight removes the significant difference between the groups. I tested this with a chi-square test, and the result was still significant.
proc freq data=ps_w_adj;
tables age_group*treatment/chisq measures;
weight ps_weight_adj;
run;
The propensity score weight was supposed to leave the groups with no significant difference. Could anyone help me with this? Thank you!
Without data or example output it is extremely hard to diagnose such issues.
Since the WEIGHT variable in PROC FREQ is basically treated as a count, weights spanning a wide range of values can be expected to yield a significant chi-square. Chi-square tests with even moderately large counts become very likely to report statistical significance, even when there is little practical difference between the categories.
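To illustrate the point about counts, here is a minimal sketch on a made-up 2x3 table (the data set `demo` and the variables `n` and `n_x10` are hypothetical, not the poster's data). Both PROC FREQ steps use the same cell proportions; only the total weight differs by a factor of ten.

```sas
/* Hypothetical cell counts, for illustration only */
data demo;
  input age_group $ treatment $ n;
  n_x10 = n * 10;   /* same proportions, ten times the total weight */
  datalines;
young A 50
young B 45
young C 55
old A 50
old B 55
old C 45
;
run;

/* Total weight 300: Pearson chi-square = 2.0 with 2 DF, p about 0.37 */
proc freq data=demo;
  tables age_group*treatment / chisq;
  weight n;
run;

/* Total weight 3000: chi-square = 20.0 with 2 DF, p well below 0.001 */
proc freq data=demo;
  tables age_group*treatment / chisq;
  weight n_x10;
run;
```

The imbalance is identical in both tables, yet only the second is reported as significant. If the adjusted weights in the question sum to something close to the sample size of 13547, a fairly small remaining imbalance could be flagged in the same way.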
It may help to include all the output from your Proc Freq to provide some details.
I haven't worked much with propensity scores, so I don't have any intuition there. I do use chi-square tests a fair amount and have computed them by hand, so I know that categorical counts can yield statistically significant differences that a person may not easily see.
You might want to add the CELLCHI2, EXPECTED and DEVIATION options to the TABLES statement. CELLCHI2 gives each cell's contribution to the chi-square statistic, EXPECTED gives the expected cell counts, and DEVIATION gives the difference between the observed and expected counts. That may tell you a bit about why you get statistical significance.
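As a sketch, the original PROC FREQ step with those options added might look like this (data set and variable names are taken from the question; this has not been run against the actual data):

```sas
proc freq data=ps_w_adj;
  /* cellchi2:  each cell's contribution to the chi-square statistic */
  /* expected:  expected cell counts under independence              */
  /* deviation: observed minus expected count for each cell          */
  tables age_group*treatment / chisq cellchi2 expected deviation;
  weight ps_weight_adj;
run;
```

Cells with a large contribution are the ones driving the overall statistic, so they are a reasonable place to start looking.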
Chi-square at heart compares the expected cell counts, derived from the marginal (row/column) totals, with the actual counts, which is quite different from a t-test or ANOVA. The more cells with a "large" (very much relative to sample size) difference from the expected counts, the more likely the test is to report a significant difference. If you have relatively many categories in two dimensions, it becomes very easy either to get an unreliable chi-square, from too many cells with small or zero counts, or to have a significant difference reported. Did you see any messages about "cell counts less than 5" in your output? That usually indicates that you may need a different test.
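For reference, the comparison described above is the usual Pearson statistic. The notation is my own choice: $O_{ij}$ is the observed (here, weighted) count in row $i$ and column $j$, $R_i$ and $C_j$ are the row and column totals, and $N$ is the grand total.

```latex
\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N}
```

Multiplying every count by a constant $k$ multiplies both $O_{ij}$ and $E_{ij}$ by $k$, so the statistic itself is multiplied by $k$ while the degrees of freedom stay the same. That is why larger total counts push the test toward significance.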