Hi,
I need help analyzing my data, which is negatively skewed (skewness approx. -2.5) with around 35% of the values at 0. My experiment is: each person is scanned under different cases, with 3 trials, and each trial produces 12 scans of a person. So I clearly have a nested structure. I tried fitting gamma and lognormal distributions to this data, but they all run into convergence issues. These are the residuals from fitting a normal distribution. Can anyone suggest what I can do better with this data? Thank you so much.
title "Pelvic Lateral Deviation 504 analysis";
proc glimmix data=full_sta1 plots=all;
class case pt trial;
model PlumbResult_0504_LateralDeviatio= case/ddfm=KR ;
random intercept/subject=pt(case) ;
random trial(pt*case);
run;
You could try a Box-Cox transformation, which will transform the data to something approximately normally distributed, if such a transformation exists. This can be done in PROC TRANSREG (and maybe other procedures as well). See: https://documentation.sas.com/?docsetId=statug&docsetTarget=statug_odsgraph_sect010.htm&docsetVersio...
I would try this on the average for each person, rather than on the 3 trials x 12 scans for each person.
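A minimal sketch of that suggestion, in case it helps. It assumes the dataset and variable names from the original post, and that the response is never above zero (as the plots seem to show). The names pt_means, mean_dev and dev_pos are made up for illustration. Because Box-Cox needs strictly positive values, the sketch reflects the averages and adds 1 before transforming; that shift is an assumption, not something PROC TRANSREG requires you to do this way.

```sas
/* Average the 3 trials x 12 scans down to one value per person within case */
proc means data=full_sta1 noprint nway;
   class case pt;
   var PlumbResult_0504_LateralDeviatio;
   output out=pt_means mean=mean_dev;   /* mean_dev is a hypothetical name */
run;

/* Box-Cox needs positive values: reflect and shift (assumes mean_dev <= 0) */
data pt_means;
   set pt_means;
   dev_pos = 1 - mean_dev;              /* hypothetical shifted response */
run;

/* Search a grid of lambda values for the Box-Cox transformation */
proc transreg data=pt_means;
   model boxcox(dev_pos / lambda=-2 to 2 by 0.25) = class(case);
run;
```

The output lists the lambda with the highest log likelihood along with a confidence interval; if that interval covers a convenient value such as 0 (log) or 0.5 (square root), that is usually the one to use.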
Your description does not give us enough information to determine whether the statistical model is correct. For example, how many levels of CASE are there, and how does CASE relate to TRIAL? Are there 3 CASEs with one TRIAL each? Or 3 TRIALs for each CASE? What research question do you have that would be addressed by 12 SCANs in each TRIAL?
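If it is easier than describing the design in words, a couple of cross-tabulations would answer most of these questions. This is only a sketch and assumes the dataset and CLASS variable names from the posted code:

```sas
/* Count scans in each cell to show how CASE, TRIAL and PT relate */
proc freq data=full_sta1;
   tables case*trial / norow nocol nopercent;  /* trials per case, scans per trial */
   tables pt*case    / norow nocol nopercent;  /* is each person in one case or several? */
run;
```

If each person shows up under more than one CASE, then PT is crossed with CASE rather than nested in it, and that changes what the RANDOM statements should look like.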
Your residual plot and data plot show that there is an upper bound (which is zero) to your response "PlumbResult_0504_LateralDeviatio". Neither the lognormal nor the gamma distribution is appropriate for data with an upper bound; both the lognormal and the gamma have a lower bound at zero and no upper bound. Both should have failed miserably with a response that contains zeros and negative values (the log of zero is not defined, and I would guess that there was a message to that effect in the log window; always pay attention to the log window).
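A quick way to confirm the range of the response before choosing a distribution is sketched below. It assumes the dataset and variable names from the posted code; the column names n_total, n_zero and n_positive are made up for illustration.

```sas
/* Range and shape of the response */
proc means data=full_sta1 n nmiss min max mean skewness;
   var PlumbResult_0504_LateralDeviatio;
run;

/* How many values are exactly zero, and are any above zero? */
proc sql;
   select count(*)                                  as n_total,
          sum(PlumbResult_0504_LateralDeviatio = 0) as n_zero,
          sum(PlumbResult_0504_LateralDeviatio > 0) as n_positive
   from full_sta1;
quit;
```

If n_positive is 0 and the maximum is 0, the response really is bounded above at zero.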
So, we need to know more about what your response is measuring, in addition to more about your experimental design. Guessing wildly, you might have more luck redefining your response as (-1)*response, if that was sensible in context; that redefined response might follow the exponential distribution, and then the gamma might work (although gamma mixed models can be very persnickety).
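For what it is worth, that wild guess could be sketched as follows. The names full_neg and neg_dev are made up for illustration, and the RANDOM statements are copied unchanged from the original post, so they carry the same doubts about the design. Note also that the gamma distribution needs strictly positive values, so the roughly 35% of observations at exactly 0 would not be valid gamma responses and would need separate handling.

```sas
/* Reflect the response so that it is bounded below at zero */
data full_neg;
   set full_sta1;
   neg_dev = -1 * PlumbResult_0504_LateralDeviatio;  /* hypothetical name */
run;

/* Gamma mixed model on the reflected response (log link) */
proc glimmix data=full_neg plots=all;
   class case pt trial;
   model neg_dev = case / dist=gamma link=log;
   random intercept / subject=pt(case);   /* copied from the original post */
   random trial(pt*case);                 /* copied from the original post */
run;
```

Check the log for notes about invalid response values or convergence before trusting any of the output.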
Given your current description, I doubt your RANDOM specifications are right but we await more information.