Re: What constitutes a non-normal distribution of...

10-14-2016 03:37 AM

Thanks for the advice, I will give this a go. A colleague of mine also suggested subsampling the large distribution of residuals, and testing the smaller subsamples for normality. Perhaps if subsamples also failed the KS and/or other tests this might be a stronger indication of an undiagnosed problem?

10-14-2016 09:11 AM

How are you conducting this analysis? GLM? GLIMMIX? Suppose you determine that the errors are slightly heavy-tailed. How will that change the way you conduct the analysis?

Are you just "verifying assumptions" or is there a real problem that you are trying to resolve?

10-14-2016 10:13 AM

I am using proc mixed.

My model code is:

title 'TOTAL Hi frequency HI v LO FULL MLM';

proc mixed data=mlm_hi covtest;

class sub sess fbin tbin;

model t_diff=tbin|fbin sess / solution outpredm = pred_hi;

repeated / subject=sub(sess) type=sp(gau) (tbin fbin);

lsmeans tbin*fbin;

run;

Rather than trying to solve a known problem, I am really trying to check that I have not violated any assumptions, hence checking the residuals. I also followed your suggestion of plotting the predicted values against the residuals, and the plot definitely does not have a fan structure. I will also try the Q-Q plots.
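For anyone following along, the predicted-versus-residual plot can be sketched roughly as below. It assumes the `pred_hi` data set written by `outpredm=` still carries its default `Pred` and `Resid` variables. The marker size and transparency are arbitrary choices, made only to keep a very large number of points readable.

```sas
/* Sketch: residuals vs. predicted values from the OUTPREDM= data set. */
/* Assumes pred_hi contains the default variables Pred and Resid.      */
proc sgplot data=pred_hi;
  scatter x=Pred y=Resid / markerattrs=(size=3) transparency=0.8;
  refline 0 / axis=y;   /* horizontal reference line at zero residual */
run;
```

A fan or funnel shape around the zero line would suggest non-constant variance; a roughly even band would not.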

10-14-2016 09:25 AM

Regarding your friend's suggestion to subsample, I think that would not be helpful. If the subsamples are size 100,000, all subsamples will reject normality. If the subsamples are size 5, every subsample will accept normality. For some value in between (500?) you might get 50% of samples reject and 50% accept.
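A small simulation can illustrate how strongly the outcome depends on the subsample size. The sketch below is only an illustration under stated assumptions: the "slightly heavy-tailed" errors are stood in for by a t distribution with 10 degrees of freedom, the number of replicates (`nrep`), the seed, and the 0.05 cutoff are all arbitrary, and the data set names `sim`, `tests`, and `ks` are made up for this example.

```sas
/* Sketch: test the same mildly non-normal distribution at three sample sizes. */
%let nrep = 50;   /* hypothetical number of subsamples per size */

data sim;
  call streaminit(12345);            /* arbitrary seed for reproducibility */
  do n = 5, 500, 100000;
    do rep = 1 to &nrep;
      do i = 1 to n;
        x = rand('T', 10);           /* t(10): slightly heavier tails than normal */
        output;
      end;
    end;
  end;
run;

/* Run the normality tests for every (n, rep) subsample and capture the table. */
ods exclude all;
proc univariate data=sim normal;
  by n rep;
  var x;
  ods output TestsForNormality=tests;
run;
ods exclude none;

/* Flag Kolmogorov-Smirnov rejections at the assumed 0.05 level. */
data ks;
  set tests;
  where Test = 'Kolmogorov-Smirnov';
  reject = (pValue < 0.05);
run;

/* The mean of the 0/1 flag is the proportion of subsamples that rejected. */
proc means data=ks mean;
  class n;
  var reject;
run;
```

The proportion of rejections per value of `n` shows how much of the test result reflects sample size rather than the shape of the distribution, which is the same for every subsample here.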

Look at the normal Q-Q plot, which will graphically indicate whether the data are approximately normal:

proc univariate normal;

QQPLOT x / normal;

run;
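Applied to the model posted in this thread, the call might look like the sketch below. It assumes the residuals are in the default `Resid` variable of the `pred_hi` data set created by `outpredm=`. The `mu=est sigma=est` suboptions add a reference line based on the sample mean and standard deviation.

```sas
/* Sketch: normal Q-Q plot of the residuals saved by PROC MIXED. */
/* Assumes pred_hi contains the default Resid variable.          */
proc univariate data=pred_hi normal;
  var Resid;
  qqplot Resid / normal(mu=est sigma=est);
run;
```

Points that follow the reference line through the middle and bend away only at the extremes point to mildly heavy tails rather than a grossly non-normal distribution.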