Solved: Calculating Cohen's d from multiply imputed data

statshuevo · Posted 04-29-2024 02:58 PM

Hi everyone, apologies if this has been asked but I couldn't find an answer for my specific question. I wanted to get confirmation that what I am doing is appropriate. I have 20 imputed datasets. The grouping variable (2 groups) is called "condition". To compute pooled means across the imputed datasets I used the estimate statement in PROC MIXED for each level of condition:

ODS TRACE ON;
ODS OUTPUT SolutionF=solutionf Estimates=est3;

PROC MIXED data=data noclprint covtest method=ml;
  class tid pid condition (ref='0');
  model totalscore=condition/solution ddfm=bw;
  random intercept/sub=tid type=un;
  estimate "Intercept: 0" intercept 1 condition 0 1;
  estimate "Intercept: 1" intercept 1 condition 1 0;
  estimate "Intercept: 1-0" intercept 0 condition 1 -1;
  store out=model;
  by _Imputation_;
run;

proc mianalyze data=est3;
  modeleffects Estimate;
  stderr stderr;
run;

For example, let's say this procedure gave me pooled estimates (means) of 8 for condition=0 and 7 for condition=1.

Then I calculated pooled SD using proc ttest and mianalyze:

proc ttest data=data;
class condition;
var totalscore;
by _Imputation_;
ods output statistics=ttest_ds;
run;
proc sort data=ttest_ds;
by class _Imputation_;
run;

/*The dataset ttest_ds has Class variables of 0 and 1 (for the groups) and "Diff (1-2)" Pooled and Satterthwaite. It also has columns for StdDev and StdErr by Imputation for the two classes.*/

data subset;
set ttest_ds;
where Class='0'; /*switch to 1 for condition=1*/
run;
proc mianalyze data=subset;
by class;
modeleffects stddev;
stderr stderr;
run;

This gave me pooled SDs of, for example, 5.8 for condition=0 and 5.6 for condition=1.

Then I calculated Cohen's d:

data cd;
mean_diff_rand1 = 8 - 7; /* pooled means for conditions = 0 and 1*/
pooled_std_dev_rand1 = sqrt((5.8**2 + 5.6**2) / 2); /* pooled SDs for conditions = 0 and 1 */
d_rand1 = mean_diff_rand1 / pooled_std_dev_rand1;
run;
proc print data=cd; /*printing result*/
run;

I hope this is enough information. Thank you kindly in advance!

SAS_Rob · Posted 04-30-2024 08:20 AM

My suggestion would be to calculate Cohen's d and its standard error within each imputed data set and then use PROC MIANALYZE to get the combined estimate.

You can find several variations of the standard error calculation here:

effect size - What is the formula for the standard error of Cohen's d - Cross Validated (stackexchan...

I believe this would be a simpler approach than the one you are taking. Also, I think the last MIANALYZE step is wrong: the StdErr column that PROC TTEST outputs is the standard error of the mean, not the standard error of the standard deviation, which is the statistic you are combining.

proc mianalyze data=subset;
by class;
modeleffects stddev;
stderr stderr; /* this is not the standard error for the standard deviation */
run;
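
A minimal sketch of the per-imputation approach is below. It is only an illustration under several assumptions: the stacked imputed data are sorted by _Imputation_, condition takes the values 0 and 1 (so 0 sorts first), and d is based on the raw group means and SDs of totalscore, which ignores the tid clustering in your PROC MIXED model. The standard error is one common large-sample approximation out of the several variants in the linked thread, and the data set and variable names (grpstats, d_by_imp, n_grp, se_d, and so on) are made up for the example.

```sas
/* Step 1: n, mean and SD of totalscore per group within each imputation */
proc means data=data noprint nway;
  by _Imputation_;
  class condition;
  var totalscore;
  output out=grpstats n=n_grp mean=mean_grp std=sd_grp;
run;

/* Step 2: Cohen's d and an approximate standard error per imputation */
data d_by_imp;
  set grpstats;
  by _Imputation_;
  retain n0 m0 s0;
  /* assumption: first row in each imputation is condition=0 */
  if first._Imputation_ then do;
    n0 = n_grp; m0 = mean_grp; s0 = sd_grp;
  end;
  /* second row is condition=1; compute d once both groups are available */
  if last._Imputation_ and not first._Imputation_ then do;
    n1 = n_grp; m1 = mean_grp; s1 = sd_grp;
    sd_pooled = sqrt(((n0 - 1)*s0**2 + (n1 - 1)*s1**2) / (n0 + n1 - 2));
    d = (m0 - m1) / sd_pooled;  /* condition 0 minus condition 1 */
    /* approximate SE: sqrt((n0+n1)/(n0*n1) + d^2/(2*(n0+n1))) */
    se_d = sqrt((n0 + n1)/(n0*n1) + d**2 / (2*(n0 + n1)));
    output;
  end;
  keep _Imputation_ d se_d;
run;

/* Step 3: combine the per-imputation d values with Rubin's rules */
proc mianalyze data=d_by_imp;
  modeleffects d;
  stderr se_d;
run;
```

Note that the pooled SD here weights each group by its degrees of freedom, which differs slightly from the simple average of the two variances in your data step unless the groups are the same size.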

statshuevo · Posted 04-30-2024 11:31 AM

I see, that makes sense. Thank you so much for your helpful response!
