statshuevo
Calcite | Level 5

Hi everyone, apologies if this has been asked, but I couldn't find an answer to my specific question. I would like confirmation that what I am doing is appropriate. I have 20 imputed datasets. The grouping variable (2 groups) is called "condition". To compute pooled means across the imputed datasets, I used ESTIMATE statements in PROC MIXED, one for each level of condition:

ODS TRACE ON;
ODS OUTPUT SolutionF=solutionf Estimates =est3;

PROC MIXED data=data noclprint covtest method=ml;

class tid pid condition (ref='0');

model totalscore=condition/solution ddfm=bw;

random intercept/sub=tid type=un;

estimate "Intercept: 0" intercept 1 condition 0 1;
estimate "Intercept: 1" intercept 1 condition 1 0 ;
estimate "Intercept: 1-0" intercept 0 condition 1 -1 ;

store out=model;

by _Imputation_;

run;

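/* est3 has three rows per imputation (one per ESTIMATE label), so pool each label separately */
proc sort data=est3;
by Label _Imputation_;
run;

proc mianalyze data=est3;
by Label;
modeleffects Estimate;
stderr stderr;
run;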

For example, let's say this procedure gave me pooled estimates (means) of 8 for condition=0 and 7 for condition=1.

 

Then I calculated pooled SD using proc ttest and mianalyze:

proc ttest data=data;
class condition;
var totalscore;
by _Imputation_;
ods output statistics=ttest_ds;
run;
proc sort data=ttest_ds;
by class _Imputation_;
run;

/*The dataset ttest_ds has Class variables of 0 and 1 (for the groups) and "Diff (1-2)" Pooled and Satterthwaite. It also has columns for StdDev and StdErr by Imputation for the two classes.*/

 

data subset;
set ttest_ds;
where Class='0'; /*switch to 1 for condition=1*/
run;
proc mianalyze data=subset;
by class;
modeleffects stddev;
stderr stderr;
run;

 

This gave me pooled SDs of, for example, 5.8 for condition=0 and 5.6 for condition=1.

 

Then I calculated Cohen's d:

data cd;
mean_diff_rand1 = 8 - 7; /* pooled means for conditions = 0 and 1*/
pooled_std_dev_rand1 = sqrt((5.8**2 + 5.6**2) / 2); /* pooled SDs for conditions = 0 and 1 */
d_rand1 = mean_diff_rand1 / pooled_std_dev_rand1;
run;
proc print data=cd; /*printing result*/
run;
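
For reference, the arithmetic in that DATA step works out as follows with the example values above (8 and 7 for the pooled means, 5.8 and 5.6 for the SDs):

```latex
\[
s = \sqrt{\frac{5.8^2 + 5.6^2}{2}}
  = \sqrt{\frac{33.64 + 31.36}{2}}
  = \sqrt{32.5} \approx 5.70,
\qquad
d = \frac{8 - 7}{s} \approx 0.175
\]
```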

 

I hope this is enough information. Thank you kindly in advance!

Accepted Solution
SAS_Rob
SAS Employee

My suggestion would be to calculate Cohen's d and its standard error for each imputed data set and then use PROC MIANALYZE to get the combined estimate.
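
As a rough reminder of what that combining step does, here are Rubin's rules written out for this case. The symbols are notation introduced here for illustration: d_m and SE_m are the estimate and its standard error from imputation m, and M is the number of imputations (20 in the question).

```latex
\[
\bar{d} = \frac{1}{M}\sum_{m=1}^{M} d_m,
\qquad
W = \frac{1}{M}\sum_{m=1}^{M} \mathrm{SE}_m^{2},
\qquad
B = \frac{1}{M-1}\sum_{m=1}^{M}\left(d_m - \bar{d}\right)^{2}
\]
\[
T = W + \left(1 + \frac{1}{M}\right) B,
\qquad
\mathrm{SE}(\bar{d}) = \sqrt{T}
\]
```

The within-imputation variance W is built from the supplied standard errors, which is why the value passed on the STDERR statement has to be the standard error of the statistic being combined.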

You can find several variations of the standard error calculation here:

effect size - What is the formula for the standard error of Cohen's d - Cross Validated (stackexchan...
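
One commonly used large-sample approximation is sketched below. The notation is assumed here for illustration: n_0, n_1 are the group sizes, and the means and SDs carry the same subscripts. Other variants exist and differ slightly.

```latex
\[
s_p = \sqrt{\frac{(n_0 - 1)\,s_0^{2} + (n_1 - 1)\,s_1^{2}}{n_0 + n_1 - 2}},
\qquad
d = \frac{\bar{x}_0 - \bar{x}_1}{s_p}
\]
\[
\mathrm{SE}(d) \approx \sqrt{\frac{n_0 + n_1}{n_0\,n_1} + \frac{d^{2}}{2\,(n_0 + n_1)}}
\]
```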

 

I believe this would be simpler than the approach you are taking. Also, I think the last MIANALYZE step is wrong: the StdErr you are supplying is the standard error of the mean, not the standard error of the standard deviation, which is the statistic you are combining.

 

proc mianalyze data=subset;
by class;
modeleffects stddev;
stderr stderr;*this is not the standard error for the standard deviation;
run;
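
A minimal sketch of the suggested approach, not a definitive implementation. It assumes condition is numeric with values 0 and 1, that both groups are present in every imputation, and that the simple two-group approximation for the standard error of d is acceptable. In particular, it ignores the clustering within tid that the PROC MIXED model accounts for. The data set names data_sorted, grp_stats and cohen_d and the variables n0, mean0, sd0, sd_pooled, d and se_d are made up for illustration.

```sas
/* Sort so that, within each imputation, condition=0 comes before condition=1 */
proc sort data=data out=data_sorted;
  by _Imputation_ condition;
run;

/* Group size, mean and SD per imputation and condition */
proc means data=data_sorted noprint;
  by _Imputation_ condition;
  var totalscore;
  output out=grp_stats n=n mean=mean std=sd;
run;

/* One row per imputation: Cohen's d (condition 0 minus condition 1)
   and a large-sample approximation to its standard error */
data cohen_d;
  set grp_stats;
  by _Imputation_;
  retain n0 mean0 sd0;
  if condition = 0 then do;
    /* hold the condition=0 statistics until the condition=1 row is read */
    n0 = n; mean0 = mean; sd0 = sd;
  end;
  if last._Imputation_ then do;
    /* the current row is condition=1: n, mean and sd refer to that group */
    sd_pooled = sqrt(((n0 - 1)*sd0**2 + (n - 1)*sd**2) / (n0 + n - 2));
    d = (mean0 - mean) / sd_pooled;
    se_d = sqrt((n0 + n)/(n0*n) + d**2 / (2*(n0 + n)));
    output;
  end;
  keep _Imputation_ d se_d;
run;

/* Combine the 20 per-imputation estimates */
proc mianalyze data=cohen_d;
  modeleffects d;
  stderr se_d;
run;
```

Here se_d is a standard error for d itself, so the MODELEFFECTS and STDERR variables refer to the same statistic.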

statshuevo
Calcite | Level 5

I see, that makes sense. Thank you so much for your helpful response!
