04-20-2016 01:45 PM
This study is difficult to explain in writing, so please bear with me. I have a dataset of around 30 patients, and I am trying to determine how different covariates are associated with a particular outcome.
The patients have colon abnormalities, and each has colon diameter measurements at 5-cm intervals (from the distal point at the anus to more proximal in the colon). In each individual's colon there is a point where contractions (termed HAPCs, which signify a healthy colon) stop. All diameter measurements before the point where the HAPCs stop are considered normal (pre-hapc), and all diameter measurements after that point are considered abnormal (post-hapc).
Individuals have different colon lengths, so they may have a different number of diameter measurements (e.g., one patient may have measurements out to 85 cm, whereas another may have measurements out to 65 cm), and the point where the contractions stop also differs. For each patient, I therefore averaged the 5-cm diameter measurements in the area where contractions were detected (pre-hapc) and in the area where contractions were no longer detected (post-hapc). I had to do this patient by patient; I could not use the same formula for everyone because, as mentioned, each patient has a different colon length and a different point where the HAPCs are no longer detected.
Essentially, our goal is to see whether the colon size (measured by diameter) is larger in the abnormal (post-hapc) section than in the normal (pre-hapc) section. I took the difference between the average pre-hapc diameter and the average post-hapc diameter and calculated the percent difference.
This percent difference is the outcome that I want to investigate in different subgroups.
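To make the calculation concrete, here is a minimal sketch of the per-patient averaging described above. It rests on several assumptions that are not in my actual data: a hypothetical long-format dataset named long with one row per patient per 5-cm position, hypothetical variables position_cm, diameter, and hapc_stop_cm (the position where HAPCs stop for that patient), the boundary position counted as pre-hapc, and the percent difference taken relative to the pre-hapc average.

```sas
/* Sketch only: dataset and variable names other than mrn and
   pctdiffprepost are hypothetical. */
proc sql;
  create table patient_level as
  select mrn,
         /* average diameter where contractions were detected */
         mean(case when position_cm <= hapc_stop_cm then diameter else . end)
           as pre_mean_diam,
         /* average diameter where contractions were no longer detected */
         mean(case when position_cm > hapc_stop_cm then diameter else . end)
           as post_mean_diam,
         /* assumed definition: change relative to the pre-hapc average */
         100 * (calculated post_mean_diam - calculated pre_mean_diam)
             / calculated pre_mean_diam as pctdiffprepost
  from long
  group by mrn;
quit;
```

Because the cutoff is stored per patient, one query would handle different colon lengths and different stopping points, and it yields one row per patient.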
My main statistical question is, how do I determine the association between percent difference in diameter and other variables (like age, weight, height, gender, etc.), accounting for the fact that there is so much variability in colon lengths among this patient population? First, I transposed my data from wide to long format. Then I used proc mixed with the following code:
proc mixed data=alldata;
class mrn diam;
model pctdiffprepost=_name_ age_onset/s chisq ddfm=kr;
repeated diam/subject=mrn type=un;
lsmeans diam/alpha=0.05 cl diff;
run;
where mrn is basically patient id, pctdiffprepost is the percent difference in diameter between pre-hapc and post-hapc, and diam is the different diameter measurements at 5-cm increments throughout the colon. My problem is, when I ran this code, I received the following warning:
WARNING: Unable to make hessian positive definite.
I'm not sure what I am doing wrong; I am not a statistician and am pretty new to mixed models. Hopefully someone can help me figure out (or at least give me a hint about) how to construct models that account for the repeated diameter measurements and the different number of measurements for each individual. I am using SAS version 9.3. Thank you.
04-25-2016 11:23 AM
A couple things here. Frank Harrell has an excellent wiki that talks about percent differences as a horrible endpoint. Take a look at:
So perhaps treating the post measurement as the response, with the pre measurement as a covariate, will help.
Once all of the modeling is done, change scores could be calculated from the resulting least squares means.
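As a rough illustration of that suggestion, something like the following might be a starting point. This is a sketch under assumptions, not a tested model for your data: postlong is a hypothetical long-format dataset with one row per post-hapc measurement, post_diam is that measurement, pre_mean_diam is the patient's average pre-hapc diameter, and gender stands in for whichever subgroup variable is of interest. Only mrn and age_onset come from your code.

```sas
/* Sketch only: postlong, post_diam, pre_mean_diam, and gender are
   hypothetical names. */
proc mixed data=postlong;
  class mrn gender;
  /* post measurement as the response, pre measurement as a covariate */
  model post_diam = pre_mean_diam age_onset gender / solution ddfm=kr;
  /* random intercept per patient handles repeated measurements and
     unequal numbers of measurements per patient */
  random intercept / subject=mrn;
  lsmeans gender / cl diff;
run;
```

By default LSMEANS evaluates covariates at their means, so subtracting the mean pre-hapc diameter from a least squares mean would give an adjusted change score. A random intercept is a much simpler covariance structure than type=un, which may be easier to estimate with about 30 patients.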
04-25-2016 01:25 PM
Thanks for your response. What's stated in the article makes sense. I will look into analyzing the data as you stated below.