Non-normal distribution in regression

01-25-2016 05:45 AM

Could someone shed some light on which statistical test to use when the errors in a regression don't follow a normal distribution?

01-25-2016 06:01 AM

Look at the documentation for the GENMOD procedure, which includes sections about goodness-of-fit tests and related statistics. The PROC GENMOD documentation also explains estimates and contrasts.

If you provide more information about your model, more can be said.

01-25-2016 06:31 AM

Thank you for your response, Rick.

It is a general question that I came across, so I was hoping for a simple answer in layman's terms rather than textbook language.

01-25-2016 06:59 AM

You might enjoy this graphical comparison of the assumptions for error distributions in linear and nonlinear models:

- The error distribution for linear models
- A comparison of a linear model and a GLM with log link function

01-25-2016 07:14 AM

Thanks again Rick.

So can I assume that the answer to my question is PROC GENMOD?

I also have another novice question: I know how to check whether data follow a normal distribution, but I don't know how to check whether the errors follow a normal distribution. Could you guide me on this?

01-25-2016 07:41 AM

PROC GENMOD is a good place to start for fitting models of this type. There are alternatives, especially if you think the errors are correlated (as in a time series), but I don't want to overwhelm you with too many options.
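As a rough illustration of what "fitting models of this type" can look like, here is a minimal sketch of a PROC GENMOD call with a non-normal response distribution. The data set (sashelp.class, which ships with SAS), the variables, and the choice of a gamma distribution with a log link are all assumptions made for the example; they are not from this thread, and the right distribution depends on your data.

```sas
/* Sketch only: sashelp.class and the gamma/log-link choice are     */
/* illustrative assumptions, not a recommendation for your data.    */
proc genmod data=sashelp.class;
   /* DIST= names the response distribution, LINK= the link function */
   model weight = height / dist=gamma link=log;
run;
```

If you omit DIST= and LINK=, PROC GENMOD fits a normal model with an identity link, which corresponds to ordinary linear regression.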

I recommend that you do an internet search for "SAS" and "regression diagnostics" or "diagnostic plots". This is a deep area that is worth learning about.

A (very) short answer is that when the response variable is continuous, you can examine the error distribution by fitting a model and then plotting the distribution of the raw residuals. Most SAS procedures, including GENMOD, have an OUTPUT statement that enables you to write the residual values to a data set. The simplest plot is a histogram of the residuals. Does the histogram look approximately "bell shaped"?

You can also plot the raw residuals versus each of the explanatory variables. If any of the plots look "fan shaped" (the size of the residuals depends on an X), that indicates that the model is not capturing the variation in the data. If so, many practitioners try to fit a more sophisticated model.
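The steps above (fit a model, write the residuals to a data set, look at a histogram, then plot residuals against each explanatory variable) might be sketched as follows. The data set sashelp.class, the model, and the names work.resid, pred, and resid are assumptions chosen for illustration.

```sas
/* Step 1: fit the model and save the raw residuals.                */
/* With no DIST= option, GENMOD fits a normal model, identity link. */
proc genmod data=sashelp.class;
   model weight = height age;
   output out=work.resid pred=pred resraw=resid;
run;

/* Step 2: histogram of the residuals with a normal curve overlaid. */
/* Does it look approximately bell shaped?                          */
proc sgplot data=work.resid;
   histogram resid;
   density resid / type=normal;
run;

/* Step 3: residuals versus an explanatory variable.                */
/* Repeat for each X and look for fan shapes.                       */
proc sgplot data=work.resid;
   scatter x=height y=resid;
   refline 0 / axis=y;
run;
```

With only 19 observations in sashelp.class the plots will be rough; the point is the workflow, not the result.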

01-25-2016 10:25 AM

Error is data...you should be able to isolate your error terms if you run a regression model.

Babloo wrote:

I know how to check whether data follow a normal distribution, but I don't know how to check whether the errors follow a normal distribution.

01-25-2016 09:54 PM - edited 01-25-2016 09:56 PM

Just a thought.

You can use PROC UNIVARIATE to check whether X and Y are each normally distributed. If both conform to a normal distribution, then you can say the residual term is normally distributed.

01-26-2016 03:55 PM

If the dependent variable is continuous but the assumptions of OLS regression are not met regarding normality of residuals, then I suggest PROC ROBUSTREG and PROC QUANTREG, both of which relax those assumptions. I wrote papers on these for last year's SGF.
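For readers who have not used these two procedures, a minimal sketch of each follows. The data set sashelp.class, the model, and the specific choices (M estimation for ROBUSTREG, the median for QUANTREG) are assumptions for illustration only.

```sas
/* Robust regression: M estimation downweights outlying responses.  */
proc robustreg data=sashelp.class method=m;
   model weight = height;
run;

/* Quantile regression: model the conditional median (0.5 quantile) */
/* instead of the conditional mean.                                 */
proc quantreg data=sashelp.class;
   model weight = height / quantile=0.5;
run;
```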

01-27-2016 01:21 AM

Are you saying that if my data follow a normal distribution, then the errors will also follow a normal distribution? If not, could you write some simple SAS code to demonstrate a normal distribution for the errors?

01-27-2016 02:34 AM

Yes. According to statistical theory, any linear combination of jointly normal variables is also normally distributed. Therefore,

Y - X = epsilon: if Y and X are jointly normal, then epsilon is also normal.

Otherwise, you could use a robust regression method, as others suggest.

This is just my two cents.

01-27-2016 07:23 AM

But the converse isn't true. That is, you can have Y be non-normal and still have normal residuals.
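A small simulation can make this concrete. The sketch below assumes a made-up model, y = 1 + 2x + error, where x is drawn from a skewed (exponential) distribution and the errors are standard normal. The data set names, the seed, and the sample size are arbitrary choices for illustration.

```sas
/* Simulate a skewed X and normal errors, so Y itself is skewed. */
data work.sim;
   call streaminit(2016);               /* arbitrary seed          */
   do i = 1 to 500;
      x = rand("Exponential");          /* skewed predictor        */
      y = 1 + 2*x + rand("Normal");     /* normal errors, sd = 1   */
      output;
   end;
run;

/* Fit the regression and save the residuals. */
proc reg data=work.sim plots=none;
   model y = x;
   output out=work.sim_resid r=resid;
run;
quit;

/* Compare the distribution of Y with that of the residuals. */
proc univariate data=work.sim_resid normal;
   var y resid;
   histogram y resid / normal;
run;
```

Under these assumptions, the histogram of y should look right-skewed while the histogram of resid should look roughly bell shaped, because the skewness in y comes from x and not from the errors.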

01-27-2016 10:35 AM

Look at it this way. If Y is dependent (conditional) on X, then it is irrelevant to test whether Y is normally distributed (independent of X). That is, using PROC UNIVARIATE to assess normality of Y is meaningless. You want to check the normality of the residuals, or better, the normality of the studentized residuals. This is automatically done in graphic form by several procedures.

01-27-2016 10:43 AM

Look at the Fit Diagnostics panel from PROC REG. I think it's produced by default these days.

These charts help assess normality of the residuals. You could also extract the residuals from PROC REG and pass them to PROC UNIVARIATE, whose NORMAL option runs several tests for normality.
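A minimal sketch of that workflow follows: request the diagnostics panel, write the residuals to a data set, and run normality tests on them. This sketch uses PROC UNIVARIATE with the NORMAL option for the tests, which is my choice of procedure for the example. The data set sashelp.class and the names work.reg_out, resid, and student_resid are also illustrative assumptions.

```sas
ods graphics on;

/* Fit the model, show only the Fit Diagnostics panel,            */
/* and save raw and studentized residuals.                        */
proc reg data=sashelp.class plots(only)=diagnostics;
   model weight = height;
   output out=work.reg_out r=resid student=student_resid;
run;
quit;

/* NORMAL requests tests for normality (Shapiro-Wilk,             */
/* Kolmogorov-Smirnov, and others); QQPLOT adds a visual check.   */
proc univariate data=work.reg_out normal;
   var resid student_resid;
   qqplot resid / normal(mu=est sigma=est);
run;
```

With small samples the formal tests have little power, so it is worth reading the Q-Q plot alongside the p-values.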

01-27-2016 10:39 AM

But OLS doesn't assume that X and Y are normally distributed, only the errors. It is more an assumption that the errors are random rather than systematic.