Obsidian | Level 7



I need help analyzing my data, which are negatively skewed (skewness approximately -2.5), with around 35% of the values at 0. My experiment: each person is scanned under different cases, with 3 trials, and each trial produces 12 scans on a person. So I clearly have a nested structure. I tried fitting gamma and lognormal distributions to this data, but they all run into convergence issues. The attached plots show the residuals from fitting a normal distribution. Can anyone suggest what I could do better with this data? Thank you so much.

[Attachments: StudentPanel16.png, Histogram504.png]

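title "Pelvic Lateral Deviation 504 analysis";
proc glimmix data=full_sta1 plots=all;
	class case pt trial;
	/* Normal-theory model; KR requests Kenward-Roger denominator degrees of freedom */
	model PlumbResult_0504_LateralDeviatio = case / ddfm=KR;
	/* Random intercept for each person within case */
	random intercept / subject=pt(case);
	/* Random effect for trial within each person-by-case combination */
	random trial(pt*case);
run;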


Diamond | Level 26

You could try a Box-Cox transformation, which will transform the data to something approximately normally distributed, if such a transformation exists. This can be done in PROC TRANSREG (and maybe other procedures as well).


I would try this on the average for each person, rather than on the 3 trials x 12 scans for each person.
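A minimal sketch of that suggestion, assuming the data set and variable names from the original post. Box-Cox needs strictly positive values, so the sketch reflects the averaged response and adds 1 before transforming. That shift, the names pt_means, pt_means_pos, mean_dev and dev_pos, the lambda grid, and the choice to average within each person-by-case combination are all illustrative assumptions, not something stated in the thread.

```sas
/* Average the scans to one value per person within each case */
proc means data=full_sta1 noprint nway;
	class case pt;
	var PlumbResult_0504_LateralDeviatio;
	output out=pt_means mean=mean_dev;
run;

/* Box-Cox requires positive values: reflect and shift (assumed shift of 1) */
data pt_means_pos;
	set pt_means;
	dev_pos = -mean_dev + 1;
run;

/* Search a grid of lambda values for the Box-Cox transformation */
proc transreg data=pt_means_pos;
	model boxcox(dev_pos / lambda=-2 to 2 by 0.25) = class(case);
run;
```

If the averages still pile up at the bound, no power transformation is likely to make them look normal, so it is worth inspecting the histogram of mean_dev first.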

Paige Miller
Rhodochrosite | Level 12 sld

Your description does not give us enough information to determine whether the statistical model is correct. For example, how many levels of CASE are there, and how does CASE relate to TRIAL? Are there 3 CASEs with one TRIAL each? Or 3 TRIALs for each CASE? What research question do you have that would be addressed by 12 SCANs in each TRIAL? 


Your residual plot and data plot show that there is an upper bound (which is zero) to your response "PlumbResult_0504_LateralDeviatio". Neither the lognormal nor the gamma distribution is appropriate for data with an upper bound; both the lognormal and the gamma have a lower bound at zero and no upper bound. Both should have failed miserably with a response that has zero and negative values (the log of zero or of a negative number is not defined, and I would guess that there was a message to that effect in the log window; always pay attention to the log window).
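One quick way to confirm the bound and the share of zeros before choosing a distribution might look like the sketch below. It assumes the data set and variable names from the original post; prop_zero is a hypothetical column name.

```sas
/* Range and skewness of the response: a maximum of 0 confirms the upper bound */
proc means data=full_sta1 n nmiss min max mean skewness;
	var PlumbResult_0504_LateralDeviatio;
run;

/* Proportion of observations exactly at zero.
   Note: missing values count as "not zero" in this expression. */
proc sql;
	select mean(PlumbResult_0504_LateralDeviatio = 0) as prop_zero
	from full_sta1;
quit;
```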


So, we need to know more about what your response is measuring, in addition to more about your experimental design. Guessing wildly, you might have more luck redefining your response as (-1)*response, if that was sensible in context; that redefined response might follow the exponential distribution, and then the gamma might work (although gamma mixed models can be very persnickety). 
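A rough sketch of the sign flip, assuming the names from the original post; full_neg and neg_dev are hypothetical. The RANDOM statement is only a placeholder copied from the original code, since the correct random structure depends on design details not yet given. As far as I know, the gamma needs strictly positive responses, so the observations at zero would likely be dropped or need separate handling.

```sas
/* Reflect the response so that it is bounded below by zero */
data full_neg;
	set full_sta1;
	neg_dev = -PlumbResult_0504_LateralDeviatio;
run;

/* Gamma mixed model with a log link on the reflected response */
proc glimmix data=full_neg;
	class case pt trial;
	model neg_dev = case / dist=gamma link=log;
	/* Placeholder random effect; revise once the design is clarified */
	random intercept / subject=pt(case);
run;
```

Check the log after running this: it should report how many observations were read and how many were actually used, which shows whether the zeros were excluded.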


Given your current description, I doubt your RANDOM specifications are right but we await more information.


