Hi everyone, apologies if this has been asked but I couldn't find an answer for my specific question. I wanted to get confirmation that what I am doing is appropriate. I have 20 imputed datasets. The grouping variable (2 groups) is called "condition". To compute pooled means across the imputed datasets I used the estimate statement in PROC MIXED for each level of condition:
ODS TRACE ON;
ODS OUTPUT SolutionF=solutionf Estimates =est3;
PROC MIXED data=data noclprint covtest method=ml;
class tid pid condition (ref='0');
model totalscore=condition/solution ddfm=bw;
random intercept/sub=tid type=un;
estimate "Intercept: 0" intercept 1 condition 0 1;
estimate "Intercept: 1" intercept 1 condition 1 0 ;
estimate "Intercept: 1-0" intercept 0 condition 1 -1 ;
store out=model;
by _Imputation_;
run;
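proc sort data=est3;
by Label _Imputation_;
run;
proc mianalyze data=est3;
by Label; /* est3 has three ESTIMATE rows per imputation, so pool each label separately */
modeleffects Estimate;
stderr StdErr;
run;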
For example, let's say this procedure gave me pooled estimates (means) of 8 for condition=0 and 7 for condition=1.
Then I calculated pooled SD using proc ttest and mianalyze:
proc ttest data=data;
class condition;
var totalscore;
by _Imputation_;
ods output statistics=ttest_ds;
run;
proc sort data=ttest_ds;
by class _Imputation_;
run;
/*The dataset ttest_ds has Class variables of 0 and 1 (for the groups) and "Diff (1-2)" Pooled and Satterthwaite. It also has columns for StdDev and StdErr by Imputation for the two classes.*/
data subset;
set ttest_ds;
where Class='0'; /*switch to 1 for condition=1*/
run;
proc mianalyze data=subset;
by class;
modeleffects stddev;
stderr stderr;
run;
This gave me pooled SDs of, for example, 5.8 for condition=0 and 5.6 for condition=1.
Then I calculated Cohen's d:
data cd;
mean_diff_rand1 = 8 - 7; /* pooled means for conditions = 0 and 1*/
pooled_std_dev_rand1 = sqrt((5.8**2 + 5.6**2) / 2); /* pooled SDs for conditions = 0 and 1 */
d_rand1 = mean_diff_rand1 / pooled_std_dev_rand1;
run;
proc print data=cd; /*printing result*/
run;
I hope this is enough information. Thank you kindly in advance!
My suggestion would be to calculate Cohen's d and its standard error within each imputed data set and then use PROC MIANALYZE to get the combined estimate.
You can find several variations of the standard error calculation here:
I believe this would be simpler than the approach you are taking. Also, I think the last MIANALYZE step is wrong: the statistic you are combining is the standard deviation, but the standard error you are supplying (StdErr from PROC TTEST) is the standard error of the mean, not of the standard deviation.
proc mianalyze data=subset;
by class;
modeleffects stddev;
stderr stderr;*this is not the standard error for the standard deviation;
run;
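To make the suggestion concrete, here is a minimal sketch of the per-imputation calculation. It assumes condition is a numeric variable coded 0 and 1, that the stacked data set is sorted by _Imputation_, and that totalscore has no missing values after imputation. The data set names grp and d_by_imp and the variables d and se_d are made up for illustration. The standard error uses one common large-sample approximation; other variants exist, so substitute whichever formula you prefer. Note that this sketch ignores the clustering by tid that your PROC MIXED model accounts for, so treat it as a starting point rather than a finished analysis.

```sas
/* Group n, mean, and variance within each imputed data set */
proc means data=data noprint nway;
  by _Imputation_;
  class condition;
  var totalscore;
  output out=grp n=n mean=mean var=var;
run;

/* One row per imputation: Cohen's d and an approximate standard error */
data d_by_imp;
  merge grp(where=(condition=0) rename=(n=n0 mean=m0 var=v0))
        grp(where=(condition=1) rename=(n=n1 mean=m1 var=v1));
  by _Imputation_;
  /* pooled SD weighted by each group's degrees of freedom */
  sd_pooled = sqrt(((n0 - 1)*v0 + (n1 - 1)*v1) / (n0 + n1 - 2));
  /* same direction as in your data step: condition 0 minus condition 1 */
  d = (m0 - m1) / sd_pooled;
  /* large-sample approximation to the standard error of d (an assumption) */
  se_d = sqrt((n0 + n1)/(n0*n1) + d**2 / (2*(n0 + n1)));
  keep _Imputation_ d se_d;
run;

/* Combine the 20 estimates of d with Rubin's rules */
proc mianalyze data=d_by_imp;
  modeleffects d;
  stderr se_d;
run;
```

Here the standard error passed to MIANALYZE belongs to the statistic being combined, which is the piece that was missing in the StdDev step above.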
I see, that makes sense. Thank you so much for your helpful response!