- Re: Non normal distribution in regression

Posted 01-25-2016 05:45 AM
(4885 views)

Could someone shed some light on which statistical test to use when the errors in a regression don't follow a normal distribution?

16 REPLIES

Look at the documentation for the GENMOD procedure, which includes sections about goodness-of-fit tests and related statistics. The documentation for PROC GENMOD also explains estimates and contrasts.

If you provide more information about your model, more can be said.

Thank you for your response Rick.

It is a general question that I came across, so I was hoping for a simple answer in layman's terms rather than textbook language.

You might enjoy this graphical comparison of the assumptions for error distributions in linear and nonlinear models:

- The error distribution for linear models
- A comparison of a linear model and a GLM with log link function

Thanks again Rick.

So can I assume that the answer to my question is PROC GENMOD?

I also have another novice question: I know how to find out whether data follow a normal distribution, but I don't know how to find out whether the errors follow a normal distribution. Could you guide me on this?

PROC GENMOD is a good place to start for fitting models of this type. There are alternatives, especially if you think the errors are correlated (as in a time series), but I don't want to overwhelm you with too many options.

I recommend that you do an internet search for "SAS" and "regression diagnostics" or "diagnostic plots". This is a deep area that is worth learning about.

A (very) short answer is that when the response variable is continuous, you can examine the error distribution by fitting a model and then plotting the distribution of the raw residuals. Most SAS regression procedures, including GENMOD, have an OUTPUT statement that enables you to write the residual values to a data set. The simplest plot is a histogram of the residuals. Does the histogram look approximately "bell shaped"?

You can also plot the raw residuals versus each of the explanatory variables. If any of the plots look "fan shaped" (the size of the residuals depends on an X), that indicates that the variance is not constant and the model is not capturing the variation in the data. If so, many practitioners try to fit a more sophisticated model.
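
The two checks described above can be sketched outside SAS as well. The following is a minimal Python illustration using only the standard library, not the GENMOD workflow itself: it fits a straight line by least squares, computes raw residuals, bins them for a crude histogram, and uses the correlation between X and the absolute residual as a rough stand-in for eyeballing a "fan shape". The simulated data, the function names, and the `fan_score` measure are all assumptions made for illustration.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def raw_residuals(x, y):
    """Observed response minus fitted value, one per observation."""
    b0, b1 = fit_line(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def histogram_counts(values, bins=9):
    """Counts per equal-width bin, for a quick look at the shape."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / step), bins - 1)] += 1
    return counts

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5

def fan_score(x, res):
    """Hypothetical helper: correlation of x with |residual|.
    Values near 0 suggest constant spread; large values suggest a fan."""
    return pearson(x, [abs(r) for r in res])

# Simulated data (an assumption, not from the thread).
random.seed(1)
x = [random.uniform(0, 10) for _ in range(500)]
y_const = [2 + 3 * xi + random.gauss(0, 1) for xi in x]            # constant spread
y_fan = [2 + 3 * xi + random.gauss(0, 0.1 + 0.3 * xi) for xi in x]  # spread grows with x

res_const = raw_residuals(x, y_const)
res_fan = raw_residuals(x, y_fan)

# Crude text histogram of the residuals: does it look bell shaped?
for count in histogram_counts(res_const):
    print("#" * (count // 4))

print("fan score, constant variance:", round(fan_score(x, res_const), 2))
print("fan score, growing variance: ", round(fan_score(x, res_fan), 2))
```

A real analysis would look at the actual plots rather than a single number; the score is only a compact way to show the difference between the two simulated cases.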

The errors are data too. Once you run a regression model, you should be able to isolate the error terms (the residuals) and examine them like any other variable.

@Babloo wrote:

I know how to find out whether data follow a normal distribution, but I don't know how to find out whether the errors follow a normal distribution.

Just a thought.

You can use PROC UNIVARIATE to check whether X and Y are each normally distributed. If they both conform to a normal distribution, then you can say the residual term is normally distributed.

Yes. According to statistical theory, any linear combination of jointly normal variables is also normally distributed. Therefore, since

Y - X*beta = epsilon, if Y and X are jointly normal, then epsilon is also normal.

Otherwise, you could use a robust regression method, as others suggest.

This is just my two cents.

But the converse isn't true. That is, you can have Y be non-normal and still have normal residuals.
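
A small simulation can make this point concrete. The sketch below is an illustration under assumed data, not anything posted in the thread: X is a hypothetical 0/1 group indicator, so Y is a mixture of two well-separated bells and is clearly non-normal, yet the residuals from the least-squares fit are close to normal. Sample skewness and excess kurtosis (both near 0 for normal data) are used as a rough summary.

```python
import random
import statistics

def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1*x; returns (b0, b1)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1

def standardized_moment(values, k):
    """Mean of the k-th power of the standardized values."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** k for v in values])

def skewness(values):
    return standardized_moment(values, 3)

def excess_kurtosis(values):
    return standardized_moment(values, 4) - 3

# Simulated data (an assumption): two groups whose means differ by 10,
# with normal errors of standard deviation 1 around each group mean.
random.seed(7)
n = 2000
x = [i % 2 for i in range(n)]
y = [10 * xi + random.gauss(0, 1) for xi in x]

b0, b1 = fit_line(x, y)
res = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print("Y:         skewness %.2f, excess kurtosis %.2f"
      % (skewness(y), excess_kurtosis(y)))
print("residuals: skewness %.2f, excess kurtosis %.2f"
      % (skewness(res), excess_kurtosis(res)))
```

The response is strongly bimodal, which shows up as a large negative excess kurtosis, while the residuals have both summaries near zero. This is why the normality check belongs on the residuals rather than on Y itself.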

Look at the Fit Diagnostics panel from PROC REG. I think it's produced by default these days.

These charts help assess normality of the residuals. You could also extract the residuals from PROC REG and pass them to PROC UNIVARIATE with the NORMAL option, which reports several tests for normality (Shapiro-Wilk, Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling).
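
To show what a formal normality test on residuals does, here is a minimal Python sketch using only the standard library. It is not one of the SAS tests; it swaps in the Jarque-Bera statistic because it fits in a few lines: the statistic combines sample skewness and kurtosis and is compared against a chi-square distribution with 2 degrees of freedom, an approximation that is only reasonable for fairly large samples. The two residual lists are simulated stand-ins, not output from a fitted model.

```python
import math
import random
import statistics

def jarque_bera(values):
    """Return (JB statistic, approximate p-value) for a list of residuals.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and
    K the sample kurtosis. Under normality JB is approximately chi-square
    with 2 degrees of freedom, whose upper tail probability is exp(-JB/2).
    """
    n = len(values)
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    z = [(v - m) / s for v in values]
    skew = statistics.fmean([zi ** 3 for zi in z])
    kurt = statistics.fmean([zi ** 4 for zi in z])
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return jb, math.exp(-jb / 2)

# Stand-in residuals (assumptions for illustration only).
random.seed(3)
res_normal = [random.gauss(0, 1) for _ in range(500)]
res_skewed = [random.expovariate(1.0) - 1.0 for _ in range(500)]  # right-skewed, mean 0

jb_normal, p_normal = jarque_bera(res_normal)
jb_skewed, p_skewed = jarque_bera(res_skewed)

print("normal-looking residuals: JB = %.2f, p = %.3g" % (jb_normal, p_normal))
print("skewed residuals:         JB = %.2f, p = %.3g" % (jb_skewed, p_skewed))
```

A small p-value is evidence against normality. With large samples even trivial departures can be flagged, so the diagnostic plots are worth reading alongside any test.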
