BlueNose
Quartz | Level 8

Hello all,

 

I am trying to calculate the power for a trial that will be analyzed using survival analysis, and I have data from a previous study to base it on. I ran PROC LIFETEST and got the table of "Product-Limit Survival Estimates". I saved the estimates into a SAS data set, so let's assume I now have a data set with two columns: time point and probability. Plotting these columns gives the survival curve. What I want to do is smooth the curve, choose points from it (every fifth point, or every tenth point, etc.), and enter these points into PROC POWER as input. Smoothing can be done using PROC LOESS. What I want to ask is: how can I save "predictions" from a smoothed curve? In other words, say I managed to smooth the curve and I want the predicted values at time = 10, 20, 30, 40. How can I get these from SAS?

 

Thank you in advance !

Norman21
Lapis Lazuli | Level 10

Something like this?

 

   proc loess data=Melanoma;
      model Incidences=Year/smooth=0.1 0.25 0.4 0.6 residual;
      ods output OutputStatistics=Results;
   run;

   proc print data=Results(obs=5);
      id obs;
   run;

From http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_loess_sect00...

 

Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation

Rick_SAS
SAS Super FREQ

What you are asking for is called "scoring a regression model."  My advice is to use the SCORE statement in PROC LOESS.

Here is an example:

/* create values of explanatory variables at which to score model */
data ScoreIt;
do Height = 55 to 70 by 5;
   output;
end;
run;

/* score model */
proc loess data=sashelp.class plots=none;
   model weight = height;
   ods output ScoreResults = ScoreResults;
   score data=ScoreIt;
run;

proc print data=ScoreResults; run;
BlueNose
Quartz | Level 8

Thank you both. The solution you provided works, but I have run into an unexpected problem.

 

It works fine if all I have is time and survival. What I did not mention is that my data set also contains two other variables: treatment group and index (= 1, 2, 3, 4). I want to perform this task by index and treatment group. Every combination of treatment group and index has different times (different heights, in Rick's example). How can I run this so that, for every unique combination of index and treatment group, all the times in the file for that combination go into the scoring data set and the survival is then predicted at those times?

 

In other words, I want a separate smoothed curve for each unique combination.

 

Thank you.

Rick_SAS
SAS Super FREQ

Sort the data by Treatment and Index. Then use a BY statement:

BY Treatment Index;

inside the procedure that does the analysis.
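
Put together, the sort-then-BY approach might look like the sketch below. The data set name Surv and its variables (Treatment, Index, Time, Survival) are assumptions about the layout described above, not names taken from the original post. As far as I understand the SCORE statement, when the score data set contains the BY variables in sorted order, each group's fit is scored only on that group's rows; it is worth confirming this in the PROC LOESS documentation for your release.

```sas
/* Surv is a hypothetical data set with Treatment, Index, Time, Survival */
proc sort data=Surv;
   by Treatment Index Time;
run;

/* fit one smoothed curve per Treatment-by-Index combination */
proc loess data=Surv plots=none;
   by Treatment Index;
   model Survival = Time;
   score data=Surv;   /* score each group at its own observed times */
   ods output ScoreResults=ScoreResults;
run;

proc print data=ScoreResults; run;
```

Scoring the input data set against itself gives each combination predictions at exactly the times recorded for it, which avoids building a separate scoring data set per group.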

BlueNose
Quartz | Level 8

Thank you, it's working.

 

I have just encountered a bigger problem; maybe you will have an idea.

 

As I mentioned, the whole idea is to take the smoothed values and put them into PROC POWER. The smoothed survival (the predictions) gave me values above 1, which PROC POWER will not accept. If I change every value larger than 1 to 1, I might end up with two values of 1 for one scenario, and PROC POWER will not accept that either. What should I do?

Rick_SAS
SAS Super FREQ

Yes, that is a problem that can occur when you use a routine like LOESS to model a probability. LOESS does not know that the quantity must be in the interval (0,1). Another problem you might encounter is that a survival curve should be monotonic, whereas a LOESS curve can have local extrema.  I guess I don't understand why you aren't using a survival routine (LIFEREG?) for this analysis.

 

Without seeing your data, it is hard to know how to respond.  I can think of two general approaches:

1) If you want to stay with LOESS, choose a large bandwidth. You might get lucky and end up with a curve that is bounded within (0,1).

2) If you are willing to abandon LOESS, use a different regression procedure that models probabilities.

BlueNose
Quartz | Level 8

Thank you Rick.

 

I do not HAVE to use LOESS; I just don't know any other smoothing option. I have times and survival probabilities from an old study, and I want to use them to plan a new one with PROC POWER. I thought that, instead of entering data-specific numbers into PROC POWER, I would use the smoothed values. If there is an alternative to LOESS that is more suitable, I will gladly switch (hopefully the code isn't much harder than what you posted earlier).

Rick_SAS
SAS Super FREQ

Unfortunately, I do not understand the structure of your data well enough to make a recommendation.  If you can make up and post some representative data, that would encourage other experts to chime in.

BlueNose
Quartz | Level 8

I will try to simplify.

 

This is example data, and not even a good example (I have more observations per group).

 

It came from LIFETEST. What I want now is to take these two survival curves and smooth them.

 

I have this data:

 

 

data example;
input Group $ Time Survival;   /* Group holds T/C, so read it as character */
datalines;
T 0     1
T 10    0.97
T 120   0.93
T 180   0.90
T 270   0.83
C  0    1
C  13   0.96
C  130  0.94
;
run;

I want to get a new column with the new probabilities from the smoothed curves.

EastwoodDC
Obsidian | Level 7

You could use the logit transform on the survival probabilities, run it through scoring with Proc Loess, then inverse-logit transform back to the probability scale. That will restrict the estimates to [0,1].
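
A minimal sketch of that logit round trip, assuming a data set named example with numeric Time and Survival columns. Two things here are my assumptions rather than part of the suggestion: rows with Survival equal to 0 or 1 are dropped because their logit is not finite, and the predicted column in the ScoreResults table is expected to be named p_LogitS (p_ plus the response name), which is worth checking with PROC CONTENTS.

```sas
/* move to the logit scale; S = 1 (e.g. at time 0) and S = 0 have no finite logit */
data LogitData;
   set example;
   if 0 < Survival < 1;
   LogitS = log(Survival / (1 - Survival));
run;

/* smooth on the logit scale and score at the observed times */
proc loess data=LogitData plots=none;
   model LogitS = Time;
   score data=LogitData;
   ods output ScoreResults=ScoreLogit;
run;

/* inverse logit: the result lies strictly between 0 and 1 */
data ScoreProb;
   set ScoreLogit;
   SmoothedSurvival = 1 / (1 + exp(-p_LogitS));   /* p_LogitS: assumed column name */
run;
```

The back-transformed values cannot leave the unit interval, but nothing in this sketch forces the curve to be monotone, so the result should still be inspected before it goes into PROC POWER.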

Rick_SAS
SAS Super FREQ

After thinking about this problem some more, I don't see how LOESS could possibly be giving you predicted values above 1. The default interpolation is linear. If all of your Y values are between 0 and 1, it is mathematically impossible for a linear interpolation (which is an average) to predict a value outside of [0,1].  (If you have told LOESS to do cubic interpolation, then that is your problem; use linear interpolation.)

 

LOESS does not do extrapolation, so I suggest you check your response variable to make sure that all values are between 0 and 1.

 

I sure would like to see an example that does what you claim. Perhaps I am misunderstanding the explanation of your data...

BlueNose
Quartz | Level 8

 

data example;
input Time Survival;
datalines;
0   1
40 0.9382
41 0.9164
56 0.8945
61 0.8727
70 0.8509
88 0.8291
92 0.8073
97 0.7855
136 0.7636
137 0.7418
144 0.72
145 0.6982
153 0.6764
169 0.6545
176 0.6327
235 0.6101
244 0.5875
246 0.5649
298 0.5423
308 0.5197
325 0.4971
346 0.4745
402 0.4519
487 0.4294
505 0.4068
568 0.3842
722 0.3616
786 0.339
956 0.3164
;
run;

Hi Rick,

 

Here is some example data. If you run PROC LOESS on it, the first predicted value (at Time = 0, where the observed survival is 1) comes out as 1.0035. The code is:

 

DATA ScoreIt;
	set Example;
RUN;

proc loess data=Example plots=none;
   model Survival = Time;
   ods output ScoreResults = ScoreResults;
   score data=ScoreIt;
run;

 

Thank you. 

Rick_SAS
SAS Super FREQ

Thanks for the example. My thinking was incorrect. Please ignore my previous ramblings.

EastwoodDC
Obsidian | Level 7

First, do you really need to smooth this? Survival probabilities change in discrete lumps, and smoothing will misrepresent the estimates.

 

Second, if you must interpolate, the simplest way to correct values greater than 1.0 is to set them equal to 1.0 in a data step, and explain your methods.
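
That data step could be as short as the sketch below. It assumes the scored output is the ScoreResults table from the code posted earlier in the thread and that its predicted column is named p_Survival; both names are assumptions to verify against your own output.

```sas
/* cap smoothed survival at 1; missing predictions are left untouched */
data ScoreCapped;
   set ScoreResults;
   if p_Survival > 1 then p_Survival = 1;   /* p_Survival: assumed column name */
run;
```

As noted earlier in the thread, capping can leave more than one time point with a survival of exactly 1, so duplicates may need to be dropped before the values are passed to PROC POWER.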

 

If that's not good enough, do your smoothing with a spline effect in Proc Glimmix, and constrain the intercept to be equal to 1.0. 

 
