clambert22
Fluorite | Level 6

Hi SAS friends! Hoping for some advice (and new ideas) here...

 

I am trying to use ANOVA to evaluate the relationship between an independent categorical variable with multiple levels and a dependent continuous variable.

 

I used PROC GLM to conduct my test and also requested some nonparametric test options and tests for unequal variance (Levene's, Welch's ANOVA). The distribution of my dependent variable is heavily skewed.

 

Here's my original code:

 

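ods graphics on;

/* One-way ANOVA with diagnostic plots; maxpoints=none lifts the cap on plotted points */
proc glm data = mydata plots(maxpoints=none)=diagnostics;
    class independent;
    model dependent = independent;
    /* hovtest: Levene's test for equal variances; welch: Welch's ANOVA */
    means independent / hovtest welch;
run;
quit;

ods graphics off;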

 

Then I realized that, because the survey design includes weighting and stratification variables, I needed to take those into account. PROC GLM allowed me to add the weighting variable but doesn't appear to have options for nonparametric tests. I switched to PROC SURVEYREG, which allowed me to include both the weighting and stratification variables, but I still don't see any test options beyond the initial ANOVA.

 

Here's my amended code:

 

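proc surveyreg data = mydata;
    class independent;    /* treat the predictor as categorical */
    weight weightvar;     /* survey sampling weight */
    strata stratavar;     /* design strata */
    model dependent = independent / anova;
run;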

 

Should I be using a different PROC? A totally different test? Is there an option that I'm missing in SURVEYREG? Help!

5 REPLIES
PaigeMiller
Diamond | Level 26

The data can be skewed; this isn't a problem for GLM or SURVEYREG. The actual requirement is that the residuals (the differences between the actual and predicted values) are normally distributed. You can examine the residuals and see whether they follow a normal distribution.

 

Assuming the residuals are normally distributed, I would think that SURVEYREG would handle the weighting properly.
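One way to examine the residuals is to save them from the model and look at them directly. This is only a minimal sketch based on the unweighted PROC GLM model from the original post; the data set name resid_out and the variable names resid and pred are made up for illustration.

```sas
/* Fit the one-way model and save residuals and predicted values.
   resid_out, resid and pred are hypothetical names. */
proc glm data = mydata;
    class independent;
    model dependent = independent;
    output out = resid_out r = resid p = pred;
run;
quit;

/* Normality tests plus a histogram and Q-Q plot of the residuals */
proc univariate data = resid_out normal;
    var resid;
    histogram resid / normal;
    qqplot resid / normal(mu=est sigma=est);
run;
```

If the points in the Q-Q plot stay close to the reference line, the normality assumption is probably reasonable. With large samples the formal tests tend to reject for even small departures, so the plots are usually more informative.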

--
Paige Miller
clambert22
Fluorite | Level 6

Thanks for the reply!! 🙂

 

I checked and unfortunately the residuals are also heavily skewed. I'm thinking that maybe I'm just not looking at this correctly and need to adjust which test I'm using/my research question?

PaigeMiller
Diamond | Level 26

Can you show us a screen capture of the residual plot?

 

If they are skewed, a transformation of the dependent variable might help make the residuals approximately normal, depending on how severe the skew is.
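As a rough illustration of what a transformation could look like, here is a sketch that uses a natural log. It assumes the dependent variable is strictly positive, which may not be true for your data; a log is only one of several possible choices. The names mydata_log and log_dependent are hypothetical.

```sas
/* Assumption: dependent > 0. Non-positive values are left missing,
   which drops those observations from the model. */
data mydata_log;
    set mydata;
    if dependent > 0 then log_dependent = log(dependent);
run;

/* Refit the survey-weighted model on the log scale */
proc surveyreg data = mydata_log;
    class independent;
    weight weightvar;
    strata stratavar;
    model log_dependent = independent / anova;
run;
```

Keep in mind that the results then describe differences in the mean of the log of the dependent variable, not the mean of the original variable, so the interpretation changes.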

--
Paige Miller
clambert22
Fluorite | Level 6

Here are all of the diagnostic plots (let me know if this is what you meant!). Thank you so much for your help!

 

 

[Attached image: screencap.png]

PaigeMiller
Diamond | Level 26

Obviously, the residuals are not normally distributed, and it's not obvious to me that you can transform the data to make them normal. So I would then consider nonparametric methods, although I'm not sure how the survey weights would apply.
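For reference, the usual nonparametric counterpart of a one-way ANOVA is the Kruskal-Wallis test, which PROC NPAR1WAY reports when the WILCOXON option is used with a class variable that has more than two levels. The sketch below ignores the survey weights and strata entirely, so treat it as exploratory only and not as a design-based test.

```sas
/* Unweighted Kruskal-Wallis test; the survey design is NOT accounted for */
proc npar1way data = mydata wilcoxon;
    class independent;
    var dependent;
run;
```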

--
Paige Miller
