Analysis of Negatively skewed nested data

Posted 10-01-2019 03:05 PM

Hi,

I need help analyzing my data, which is negatively skewed (skewness approximately -2.5), with around 35% of the values at 0. My experiment: each person is scanned under different cases, with 3 trials, and each trial produces 12 scans on a person. So I clearly have a nested structure. I tried fitting gamma and lognormal distributions to this data, but they all run into convergence issues. These are residuals from fitting a normal distribution. Can anyone suggest what I can do better with this data? Thank you so much.

```
title "Pelvic Lateral Deviation 504 analysis";
proc glimmix data=full_sta1 plots=all;
class case pt trial;
model PlumbResult_0504_LateralDeviatio= case/ddfm=KR ;
random intercept/subject=pt(case) ;
random trial(pt*case);
run;
```

2 Replies


You could try a Box-Cox transformation, which will transform the data to something approximately normally distributed, if such a transformation exists. This can be done in PROC TRANSREG (and maybe other procedures as well). See: https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_odsgraph_sect010.htm&docsetVersio...

I would try this on the average for each person, rather than on the 3 trials x 12 scans for each person.
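A minimal sketch of that suggestion, under stated assumptions: the data set `full_sta1` and the response name come from the original post, while `pt_means`, `mean_dev`, `pos_dev`, the shift of 1, and the lambda grid are hypothetical choices made only for illustration. Box-Cox needs a strictly positive response, so the sketch reflects the averages around zero and shifts them, which assumes every average is at or below zero.

```sas
/* Average the response over trials and scans for each person within case */
proc means data=full_sta1 noprint nway;
   class case pt;
   var PlumbResult_0504_LateralDeviatio;
   output out=pt_means mean=mean_dev;
run;

/* Assumption: all averages are <= 0, so -mean_dev + 1 is strictly positive */
data pt_means_pos;
   set pt_means;
   pos_dev = -mean_dev + 1;
run;

/* Search a grid of Box-Cox lambdas with CASE as the only predictor */
proc transreg data=pt_means_pos;
   model boxcox(pos_dev / lambda=-2 to 2 by 0.25) = class(case);
run;
```

Because the response is reflected, larger values of `pos_dev` correspond to more negative deviations, so the direction of any CASE effect flips on this scale.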

--

Paige Miller



Your description does not give us enough information to determine whether the statistical model is correct. For example, how many levels of CASE are there, and how does CASE relate to TRIAL? Are there 3 CASEs with one TRIAL each? Or 3 TRIALs for each CASE? What research question do you have that would be addressed by 12 SCANs in each TRIAL?

Your residual plot and data plot show that there is an upper bound (which is zero) to your response "PlumbResult_0504_LateralDeviatio". Neither the lognormal nor the gamma distribution is appropriate for data with an upper bound; both the lognormal and the gamma have a lower bound at zero and an upper bound of infinity. Both should have failed miserably with a response that has zero or negative values (the log of zero or of a negative number is not defined, and I would guess that there was a message to that effect in the log window; always pay attention to the log window).
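One way to confirm the bounds before choosing a distribution is to summarize the response directly. This is only a sketch: the data set and variable names are taken from the original post, and the column aliases `n_obs`, `n_zero`, and `n_positive` are hypothetical.

```sas
/* Range and skewness of the raw response */
proc means data=full_sta1 n nmiss min max skewness;
   var PlumbResult_0504_LateralDeviatio;
run;

/* Count zeros and positive values; a comparison evaluates to 1 or 0 in PROC SQL */
proc sql;
   select count(*) as n_obs,
          sum(PlumbResult_0504_LateralDeviatio = 0) as n_zero,
          sum(PlumbResult_0504_LateralDeviatio > 0) as n_positive
   from full_sta1;
quit;
```

If `n_positive` is 0 and the maximum is 0, the response is bounded above at zero, as the plots suggest.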

So, we need to know more about what your response is measuring, in addition to more about your experimental design. Guessing wildly, you might have more luck redefining your response as (-1)*response, if that was sensible in context; that redefined response *might* follow the exponential distribution, and then the gamma *might* work (although gamma mixed models can be very persnickety).
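As a rough sketch of that wild guess, and not a recommended model: the code below negates the response and requests a gamma fit. The names `full_neg` and `neg_dev` are hypothetical, and the RANDOM statements are copied unchanged from the original post even though they may not match the actual design. Note also that the gamma distribution is defined only for strictly positive values, so the roughly 35% of observations at zero would remain a problem after negation.

```sas
/* Flip the sign so the bound at zero becomes a lower bound */
data full_neg;
   set full_sta1;
   neg_dev = -PlumbResult_0504_LateralDeviatio;
run;

/* Gamma mixed model with a log link; RANDOM statements as in the original post */
proc glimmix data=full_neg;
   class case pt trial;
   model neg_dev = case / dist=gamma link=log;
   random intercept / subject=pt(case);
   random trial(pt*case);
run;
```

Check the log after running this, since it should show how the zero values were handled.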

Given your current description, I doubt your RANDOM specifications are right, but we await more information.
