10-19-2014 09:04 AM
I need you advice on how to analyze my data. The data comes from the ophthalmic field (eyes). A small study was conducted to test a new treatment. There was no control group, a single arm study.
n subjects were chosen by random, and one eye was chosen by random for each one of them - the chosen eye received a treatment. The second eye did not.
Before treatment, some measure of the eye (continuous), I'll call it Y, was measures for both eyes. Then the treatment was applied on one eye, and there were several follow up visits: 1 week, 2 weeks and 1 month.
The question was to test if the treatment works, to simplify, I could simply take one time period, say 2 weeks, and run a paired t-test (compared to the Y value before treatment). Ideally, I would like to also check how long does the treatment effect works.
The problem was, that I plotted the untreated eye (which was measured at every follow-up visit) and found a correlation (illustration is attached - a plot of 1 subject, no real data).
Now if I compare the 2 weeks data point with the before data point, using a paired t-test, I do find a statistically significant difference, as can be seen in this representative graph of 1 subject, there is a decrease in Y. However, I see the same pattern in the blue (untreated) line, and I think there must be a better way to model this.
My data currently look like this:
|Subject||Time||Y treated||Y untreated|
I wanted your advice on how to model this data (I thought on mixed model, or GEE). Should I transfer the file so each subject will have 4 additional rows (the untreated eye), or keep it like this and add this variable (untreated eye) as a covariate ?
Usually in this field only the treated eye is analyzed. However, in his small study the population is different and the untreated eye vary more than in the usual population.
I tried this:
proc mixed data = A;
class Subject_ID Time;
model Y_Treated = Time Y_untreated;
lsmeans Time / adjust=TUKEY;
Am I close ? For some reason the diff option did not work (did not become blue). Should I use the contrast or estimate statements if I want to compare a specific time point to the baseline ? I tried and got different results than the multiple comparisons gave me, I must have written it wrong.
The most important question I have is if this way (what I did) takes into account the variability of the untreated eye, as it is very correlated to Y.
Thank you !
This is weird, I ran a paired t-test with the Y levels after 2 weeks and before the treatment, and it was statistically significant. Then I ran a similar code to the above just without the untreated_Y in the model, and although I got the same means, it was not statistically significant. I can't figure out why (anything to do with power ?). When I added the untreated_Y, it is significant again, my assumption is that it's due to the fact that I have taken into account the variance with time (represented by the untreated value). Is it valid and correct (my way of thinking) ?
10-20-2014 08:18 AM
I would arrange the data in 'long' form, so that each eye by time combination is a single row, with response as the value for the eye of interest (treated or untreated). The analysis then would be (in GLIMMIX, so that I can use a more versatile SLICEDIFF= option):
proc glimmix data=long;
class eye subject_id time;
model response = eye time eye*time;
random time/residual type=(several types available here, I would look at CS, CSH, UN and SP(POW)).
10-20-2014 08:35 AM
Steve, thank you for helping out.
I tried your code and it didn't do much (stopped after the dimensions table), while I got this warning:
WARNING: Obtaining minimum variance quadratic unbiased estimates as starting values for the
covariance parameters failed
My sample is rather small, could that be the reason (number of subjects) ?
I understand your logic where the subjects are blocks, and the time is like a repeated measures (R side). I was thinking about this option, but my missing part was the random statement for the time.
10-20-2014 08:57 AM
I tried UN since I have noticed that there is a strong correlation (for both eyes) between the response at 2 weeks and before treatment and between 1 month and before treatment, with hardly any correlation between 1 week and before treatment. I didn't find a suitable correlation structure.
The N's are small, 10 subjects.
The means in the untreated group are close, from 16.7 to 18.5, with SD's of 3.8-5.1. In the treated eye, the means vary from 13.6-18.5 with SD's of 2.0-5.9.
Like I mentioned before, I noticed a large variability in the untreated eye, and a correlation between the eyes, that's why when I tried the first mixed model, I added the untreated eye as a covariate, instead of simply running a paired t-test of D = (2 weeks)-(before), which was statistically significant.
10-20-2014 09:07 AM
UN requires estimation of 10 parameters (4 variances--one for each time point, and 6 covariances-[4 points taken 2 at a time]) so with N=10, the warning you got is not unsurprising.
The variability looks pretty stable based on the raw values. I would give type=CS strong consideration.
Also, this may be a case where treating the pretreatment value as a covariate might be informative. The post treatment curves look almost parallel, and adjusting the means to a common baseline would improve the precision of the treatment comparisons post treatment. Still, there looks to be real interest in comparing pre to post treatment values, which would be lost under this approach.
10-20-2014 09:17 AM
you are right. If the interest would have been to compare two treatments, then ideally the baseline values could be informative. However, this is initially a single arm data, and the main interest is to see if after treatment the values changed significantly.
I ran it with CS, and it did go. The eye factor (treated vs. untreated) is statistically significant, and so is the time. The interaction between them isn't.
The last table: "Simple Effect Comparisons of Eye*Time Least Squares Means By Time" compares the two eyes by time. Non of the comparisons is statistically significant (how can it be if the eye factor is ?), with for both '1 week" and "2 weeks" the P value is around 0.06.
What missing for me in the output is the determination if the response at '1 week' and '2 weeks' differ from 'before" for the treated eye, which is the conclusion with most interest.
10-20-2014 09:36 AM
OK. We need to slice both ways, then.
lsmeans eye*time/slicediff=(eye time);
With the significant interaction, there should now be a significant difference showing up here. However, it could be that the interaction is due to a significant difference that is not captured by simple effect comparisons. That crossing of the graph lines for the two eyes says that the difference in the pretreatment values is opposite in sign to the differences in the post treatment values--and that is a sure case for a significant interaction.
10-21-2014 03:12 AM
The graph is a representative of one subject. In other subject a crossing might not exist.
I ran the updated code, the interaction is still not significant (P=0.39).
On the slicing output, the difference between '1 week' and 'pre treatment' is significant, however the difference between '2 weeks' and 'pre treatment' is not significant (P=0.06). When I ran a simple paired t-test, the latter difference was significant, and also when I ran a mixed model with the untreated eye as a covariate it was significant. Graphically and descriptively I can see differences, and despite the small sample visually it looks like there is something there. Could that be the result of not enough power due to the small sample size ?
The mean difference from the slicing output is exactly the same like the raw one I calculated for the paired t-test. However, the standard errors were around 1.6 and 0.45 for the paired t-test, and now it's around 1.6 for both. For '2 weeks' I had a relatively small standard error for the paired t-test and therefore it was significant. Now when the standard errors are the same for all differences, it isn't.
It is clear to me that at '2 weeks' there is a non-random behavior, I see it clearly graphically. The model doesn't capture this, probably due to lack of power. By saying the difference is not significant, I wouldn't be giving the correct conclusion in my opinion.
10-21-2014 07:17 AM
Hmm. That difference in the standard errors is due to the assumption of homogeneity of variance in the mixed model, so perhaps you can duplicate the different standard errors with:
proc glimmix data=long;
class eye subject_id time;
model response = eye time eye*time;
random time/residual group=eye type=(several types available here, I would look at CS, CSH, UN and SP(POW)).
Regarding the last statement, lack of power is probably the key--and it is very likely why using the pretreatment value as a covariate does give a difference, as a major source of variation (initial differences) gets removed from the residual error. Still, saying the difference is not significant isn't the same as saying you accept the null. If you want additional power for a comparison in a small study, set alpha at a higher level.
But what is really important is the effect size, which can be approximated in a mixed model setting by the t value. This indication of the number of standard errors of the difference that the difference between two means represents captures far more information than a p value (my opinion only, and may be at odds with reviewers).
10-21-2014 08:27 AM
The code doesn't run properly with the addition of the group option. It's the same warning message as before, so I guess there are too many parameters to estimate now.
From SAS manual: Specifying a GROUP effect can greatly increase the number of estimated covariance parameters, which can adversely affect the optimization process.
The absolute value of the t value of the '1 week' vs 'before' comparison is 3.04 (P=0.004), while for '2 weeks' it is 1.9 (P=0.06).
I agree that saying the difference is not significant isn't the same as saying I accept the null. Changing the alpha value is problematic, I don't think anyone would accept results with alpha larger than 5%.
10-22-2014 08:01 AM
So it all comes down to lack of sample size: little power and an inability to fit more complex models. So you have to make do with what you have--marginal significance at one time point, and strong significance at another. Given the 'effect size' t values, I would say that there may be a trend to recovery by Week 2, that depends on individual values. Have you looked at the BLUPs for the individuals? Is there a small set that have returned to baseline values, while the others remain at or near Week 1 values?
I guess I am wandering more into interpretation here than methodology, and since this isn't a field I feel I have a lot of expertise in, I "hesitate to elucidate, for fear I may deviate".
10-26-2014 03:36 AM
You are very correct. There is a trend to recovery. However, according to the clinicians, at the 2 weeks time point, the difference, being less strong, is still clinically significant. Unfortunately, the model, due to lack of power, doesn't capture this.
I have not tried to look at the BLUPs. How do I do that in SAS ? Some individuals did recover while others had a big difference remaining.
Please do not hesitate with your advice, so far without knowing the data you pretty much guessed how it looks like ! Thanks you, it's most appreciated !
Do you think that I can somehow solve this problem by subtracting the levels of the response in the untreated eye from the levels of the response in the treated eye ? Or subtracting the decrease / increase in the untreated eye from the one of the treated eye ?
10-27-2014 08:16 AM
Change the random statement to this as a first attempt:
random time/residual type=cs subject=subject_id solution;
The keyword is solution.
See if that gives a BLUP at each time point. If not, then add
output out=out1 pred(blup noilink)=predicted;
and you should have the BLUPs in a dataset.