04-04-2016 09:10 AM
I am trying to calculate the power for a trial that will be analyzed using survival analysis. To do so, I have data from a previous study. I ran PROC LIFETEST and got the table of "Product-Limit Survival Estimates". I saved the estimates into a SAS data set, so let's assume that I now have a data set with two columns: time point and probability. Plotting these columns gives the survival curve. What I want to do is smooth the curve, choose points from it (every fifth point, or every tenth point, etc.), and enter these points into PROC POWER as input. Smoothing can be done using PROC LOESS. What I want to ask is: how can I save "predictions" from a smoothed curve? In other words, let's say I managed to smooth the curve and I want the predicted values at time = 10, 20, 30, 40. How can I get these from SAS?
Thank you in advance !
04-04-2016 09:48 AM
Something like this?
proc loess data=Melanoma;
   model Incidences=Year / smooth=0.1 0.25 0.4 0.6 residual;
   ods output OutputStatistics=Results;
run;

proc print data=Results(obs=5);
   id obs;
run;
04-04-2016 10:51 AM
Here is an example:
/* create values of explanatory variables at which to score model */
data ScoreIt;
   do Height = 55 to 70 by 5;
      output;
   end;
run;

/* score model */
proc loess data=sashelp.class plots=none;
   model weight = height;
   ods output ScoreResults = ScoreResults;
   score data=ScoreIt;
run;

proc print data=ScoreResults;
run;
04-06-2016 03:35 AM
Thank you both. The solution you provided works, but I have run into an unexpected problem.
This solution works fine if all I have is time and survival. What I did not specify is that my data set also contains two other variables: treatment group and index (=1,2,3,4). I wish to perform this task by index and treatment group. Every combination of treatment group and index has different times (different heights in Rick's example). How can I run this so that, for every unique combination of index and treatment group, I enter the times (heights) into a data set (all times in the file for this combination) and then predict the survival?
In other words, for each unique combination I want a different smoothing curve.
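One way to get a separate curve per combination is BY-group processing. The sketch below is only an illustration: the data set name `Surv` and the variable names `Trt`, `Index`, `Time` and `Survival` are assumptions, not names from the posts. It relies on the score data set carrying the same BY variables, sorted the same way, so that each combination is scored at its own times with its own fit.

```sas
/* Hypothetical names: Surv (input data), Trt, Index, Time, Survival */
proc sort data=Surv;
   by Trt Index Time;
run;

/* Score at the observed times of each combination; the BY variables
   are kept so that rows are matched to the corresponding fitted curve */
data ScoreIt;
   set Surv(keep=Trt Index Time);
run;

proc loess data=Surv plots=none;
   by Trt Index;                      /* one smoothing curve per combination */
   model Survival = Time;
   score data=ScoreIt;
   ods output ScoreResults=ScoreResults;
run;
```

The resulting `ScoreResults` data set should then contain one block of predictions per `Trt`/`Index` combination.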
04-06-2016 05:46 AM
Thank you, it's working.
I have just encountered a bigger problem; maybe you will have an idea.
As I mentioned, the whole idea is to take the smoothed values and put them into PROC POWER. The smoothed survival (the predictions) gave me values above 1, and PROC POWER won't accept them. If I change every value larger than 1 to 1, I might get two values of 1 for one scenario, and PROC POWER won't accept that either. What should I do?
04-06-2016 07:45 AM
Yes, that is a problem that can occur when you use a routine like LOESS to model a probability. LOESS does not know that the quantity must be in the interval (0,1). Another problem you might encounter is that a survival curve should be monotonic, whereas a LOESS curve can have local extrema. I guess I don't understand why you aren't using a survival routine (LIFEREG?) for this analysis.
Without seeing your data, it is hard to know how to respond. I can think of two general approaches:
1) If you want to stay with LOESS, choose a large bandwidth. You might get lucky and end up with a curve that is bounded within (0,1).
2) If you are willing to abandon LOESS, use a different regression procedure that models probabilities.
04-06-2016 07:58 AM
Thank you Rick.
I do not HAVE to use LOESS; I just don't know any other smoothing option. I have times and survival probabilities. This is data from an old study, and I want to use it to plan a new one using PROC POWER. I thought that instead of using data-specific numbers in PROC POWER, I would use the smoothed values. If there is an alternative to LOESS that is more suitable, I will gladly change (hopefully the code isn't much harder than the one you posted earlier).
04-06-2016 08:10 AM
Unfortunately, I do not understand the structure of your data well enough to make a recommendation. If you can make up and post some representative data, that would encourage other experts to chime in.
04-06-2016 08:17 AM - edited 04-06-2016 08:18 AM
I will try to simplify.
This is example data, and not even a good one (I have more observations per group).
This came from LIFETEST. What I want now is to take these two survival curves and smooth them.
I have this data:
data example;
   input Group $ Time Survival;   /* Group holds character values (T, C) */
   datalines;
T 0 1
T 10 0.97
T 120 0.93
T 180 0.90
T 270 0.83
C 0 1
C 13 0.96
C 130 0.94
;
run;
I want to get a new column with new probabilities, of the smoothed curves.
04-06-2016 03:55 PM
You could use the logit transform on the survival probabilities, run it through scoring with Proc Loess, then inverse-logit transform back to the probability scale. That will restrict the estimates to [0,1].
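A minimal sketch of that idea, assuming a data set named `Example` with numeric variables `Time` and `Survival`. Two things here are assumptions rather than part of the suggestion: the small constant `eps`, which is needed because a survival of exactly 1 has an infinite logit, and the name `p_LogitS` for the predicted-value column in the ScoreResults table.

```sas
data LogitData;
   set Example;
   /* Survival = 1 (at Time = 0) has no finite logit, so values are pulled
      slightly away from 0 and 1; eps is an arbitrary choice */
   eps = 1e-4;
   S = min(max(Survival, eps), 1 - eps);
   LogitS = log(S / (1 - S));
run;

proc loess data=LogitData plots=none;
   model LogitS = Time;
   score data=LogitData;              /* score at the observed times */
   ods output ScoreResults=ScoreLogit;
run;

data ScoreProb;
   set ScoreLogit;
   /* inverse logit: the result always lies strictly between 0 and 1 */
   SmoothSurvival = 1 / (1 + exp(-p_LogitS));
run;
```

The clamped point at Time = 0 becomes a large logit value and may pull the curve near the origin, so dropping that observation before smoothing could be a reasonable alternative.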
04-06-2016 04:44 PM
After thinking about this problem some more, I don't see how LOESS could possibly be giving you predicted values above 1. The default interpolation is linear. If all of your Y values are between 0 and 1, it is mathematically impossible for a linear interpolation (which is an average) to predict a value outside of [0,1]. (If you have told LOESS to do cubic interpolation, then that is your problem; use linear interpolation.)
LOESS does not do extrapolation, so I suggest you check your response variable to make sure that all values are between 0 and 1.
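Both checks can be sketched as follows, assuming a data set named `Example` with variables `Time` and `Survival`, and assuming the predicted column in the ScoreResults table is named `p_Survival`. The `interp=linear` option only states the default explicitly.

```sas
/* Confirm that every response value lies between 0 and 1 */
proc means data=Example min max;
   var Survival;
run;

proc loess data=Example plots=none;
   model Survival = Time / interp=linear;   /* linear interpolation, stated explicitly */
   score data=Example;
   ods output ScoreResults=ScoreResults;
run;

/* Range of the predicted values, for comparison with the response */
proc means data=ScoreResults min max;
   var p_Survival;
run;
```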
I sure would like to see an example that does what you claim. Perhaps I am misunderstanding the explanation of your data...
04-10-2016 02:15 AM
data example;
   input Time Survival;
   datalines;
0 1
40 0.9382
41 0.9164
56 0.8945
61 0.8727
70 0.8509
88 0.8291
92 0.8073
97 0.7855
136 0.7636
137 0.7418
144 0.72
145 0.6982
153 0.6764
169 0.6545
176 0.6327
235 0.6101
244 0.5875
246 0.5649
298 0.5423
308 0.5197
325 0.4971
346 0.4745
402 0.4519
487 0.4294
505 0.4068
568 0.3842
722 0.3616
786 0.339
956 0.3164
;
run;
Here is some example data. If you run PROC LOESS on it, the first predicted value (for the observation where Survival = 1) comes out as 1.0035. The code is:
data ScoreIt;
   set Example;
run;

proc loess data=Example plots=none;
   model Survival = Time;
   ods output ScoreResults = ScoreResults;
   score data=ScoreIt;
run;
04-11-2016 11:29 AM
First, do you really need to smooth this? Survival probabilities change in discrete lumps, and smoothing will misrepresent the estimates.
Second, if you must interpolate, the simplest way to correct values greater than 1.0 is to set them equal to 1.0 in a data step, and explain your methods.
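A minimal sketch of such a data step, assuming the scored output is a data set named `ScoreResults`, sorted by `Time`, with the predictions in a column named `p_Survival`. Going one step beyond the suggestion above, it also keeps the curve from ever rising, and then keeps every fifth point as the original question asked. The names `Capped`, `ForPower` and `SmoothS` are made up for illustration.

```sas
data Capped;
   set ScoreResults;
   retain PrevS 1;                    /* a survival curve starts at 1 */
   /* cap at the previous value (initially 1) and floor at 0, so the
      curve stays within [0,1] and never increases */
   SmoothS = max(0, min(p_Survival, PrevS));
   PrevS = SmoothS;
   drop PrevS;
run;

data ForPower;
   set Capped;
   if mod(_n_, 5) = 0;                /* keep every fifth point */
run;
```

Tied values can still remain after this step (for example, two points equal to 1), so they may need to be dropped by hand before the points are typed into PROC POWER.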
If that's not good enough, do your smoothing with a spline effect in Proc Glimmix, and constrain the intercept to be equal to 1.0.