phuzface
Fluorite | Level 6

I will admit from the start that this isn't really a programming question.  It's a question about interpreting results and understanding a statistical concept.

 

I have a data set with mean salaries for different ranks of professor (full, associate, assistant) from over 1,000 US colleges/universities.  I grouped all of the schools into 4 regions (Northeast, South, Midwest, West), and each school is also rated on some kind of research level (I, IIA, IIB).  I ran a two-way ANOVA to model the overall average salary across all ranks of professor, based on both region and research level:

 

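proc glm data=Prof_Sal;
  class REGION COL_TYPE;   /* both factors are categorical */
  model AVE_SAL_ALL = REGION COL_TYPE REGION*COL_TYPE;   /* main effects plus interaction */
run;
quit;   /* PROC GLM is interactive, so QUIT ends the step */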

 

The output is below.  I don't understand what the Type I and Type III tables represent.  If the two tables are the same, that means that there's no interaction, right?  The F-statistics differ between the two tables but the p-values are the same, and I don't understand what that means either.  Please help me interpret the table!

 

[Image: Capture.PNG, the PROC GLM output referred to above]

PaigeMiller
Diamond | Level 26

In general, Type III sums of squares are appropriate for this type of model, while Type I sums of squares are not.
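One way to see the difference between the two tables, as a minimal sketch reusing the data set and variable names from the question: Type I sums of squares are sequential, so each term is adjusted only for the terms listed before it in the MODEL statement, while Type III adjusts each term for all the other terms. Fitting the model twice with the main effects in opposite order makes this visible. The SS1 and SS3 options only request the two tables explicitly; the second run is an illustration added here, not part of the original analysis.

```sas
/* Run 1: REGION is entered first, so its Type I SS is unadjusted */
proc glm data=Prof_Sal;
  class REGION COL_TYPE;
  model AVE_SAL_ALL = REGION COL_TYPE REGION*COL_TYPE / ss1 ss3;
run;
quit;

/* Run 2: same model with the main effects swapped */
proc glm data=Prof_Sal;
  class REGION COL_TYPE;
  model AVE_SAL_ALL = COL_TYPE REGION REGION*COL_TYPE / ss1 ss3;
run;
quit;
```

If the cell counts are unbalanced, the Type I rows for REGION and COL_TYPE will generally change between the two runs, while the Type III rows stay the same. The interaction is the last term in both runs, so its Type I row does not change.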

 

REGION is statistically significant. COL_TYPE is statistically significant. The interaction is statistically significant. (All at the alpha=0.05 level)

 

 

--
Paige Miller
phuzface
Fluorite | Level 6

Thanks for this information.  To make sure I'm analyzing this correctly, what this output says is that the mean salary differs statistically significantly by "region," by itself, AND by "college level," by itself.  The output also says that there are statistically significant differences between mean salaries for at least one of the combinations of "region" and "college level."  Since there are four regions and three college levels, 4 x 3 = 12, so there are twelve different combinations.
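As a rough sketch, the twelve region-by-level combinations and their mean salaries can be listed directly with PROC MEANS, again assuming the data set and variable names from the question. The N column also shows whether the cells have equal counts.

```sas
/* Count, mean, and standard deviation for each REGION x COL_TYPE cell */
proc means data=Prof_Sal n mean std;
  class REGION COL_TYPE;
  var AVE_SAL_ALL;
run;
```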

 

Because there is statistically significant evidence of interaction between "region" and "college level" in their effect on mean salary, it would be inappropriate to run separate one-way ANOVAs of mean salary on "region" and of mean salary on "college level"...right?  Instead, I need to run a Tukey test to see which of the twelve combinations have statistically significantly different mean salaries.

 

Thanks again for your help with this.

PaigeMiller
Diamond | Level 26

I agree with all that you wrote except the part where you said "I need to run a Tukey test..."

 

You don't "need to" run the Tukey test; it's one option, among many, for identifying which parts of the interaction are statistically different.
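As a hedged sketch of two such options in PROC GLM, assuming the same data set and variable names as the question: the first LSMEANS statement requests Tukey-adjusted pairwise comparisons of all twelve cell means, and the second tests the college-level effect separately within each region (sometimes called simple effects). Which one is suitable depends on the question being asked; neither is presented here as the required approach.

```sas
proc glm data=Prof_Sal;
  class REGION COL_TYPE;
  model AVE_SAL_ALL = REGION COL_TYPE REGION*COL_TYPE;
  /* Option A: all pairwise differences among the 12 cell means, Tukey-adjusted */
  lsmeans REGION*COL_TYPE / pdiff adjust=tukey cl;
  /* Option B: test for COL_TYPE differences within each REGION */
  lsmeans REGION*COL_TYPE / slice=REGION;
run;
quit;
```

Option A produces 66 pairwise comparisons (12 x 11 / 2), which can be hard to read; Option B gives one F-test per region and is often easier to interpret.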

--
Paige Miller
