I am using PROC PHREG to develop a model, and I am trying to use the model results to score a new dataset. I want to score each observation with survival probabilities at time (current years)+1, time+2, time+3, time+4, time+5, and so on.
First, I used "baseline", "covariates", "out" to get the survival probabilities, but for each observation I got a list of probabilities across the whole time range (min to max), which is not what I want. I want one probability per observation at each specific time.
Second, I used "store" and PROC PLM. That gave me one predicted value per observation, but the predicted values were not probabilities. I am not sure what those values are, or whether they can be transformed into probabilities.
Thanks a lot. Any suggestions will be appreciated!
At a bare minimum you should show the actual code you used in PROC PHREG, including how you created the output data set used for scoring, and then the PROC PLM code you used. Better still, provide example data used by PROC PHREG and then passed to PROC PLM.
Logs may be helpful as well if they contain any warnings or notes other than data set size.
Someone may then be able to tell whether you missed an option or used the wrong variable when attempting to get your probabilities.
Are you sure that all grouping variables, if present, in the data used to fit the model were also in the data to score, with the same range of values?
If you develop a model on data where a variable never takes the value "Apple", then scores likely cannot be calculated for data containing "Apple" as a value of that variable.
I think you got exactly what you want. In the Cox model, the estimated survival curve is a step function: it only jumps where you have an observed event. Therefore, if you have the estimated survival probability at the left limit of an interval, that estimate is the survival probability for the whole interval. For a given target time, use the estimate at the largest listed time that does not exceed it.
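Building on the step-function point, here is a minimal sketch of how to get one probability per observation at fixed horizons instead of the full curve. It assumes a SAS/STAT release where the TIMELIST= option of the BASELINE statement is available outside Bayesian analysis. All data set and variable names (train, newdata, time, status, x1, x2, pred, surv) are hypothetical placeholders, not taken from the original post.

```sas
/* Hypothetical names throughout: train = modeling data,          */
/* newdata = data to score, time/status = survival outcome,       */
/* x1 x2 = covariates present in both data sets.                  */
proc phreg data=train;
   model time*status(0) = x1 x2;
   /* COVARIATES= supplies the observations to score.             */
   /* TIMELIST= asks for estimates only at the listed times       */
   /* (assumed to be supported in your SAS/STAT release).         */
   baseline covariates=newdata out=pred survival=surv
            timelist=1 2 3 4 5;
run;

/* pred should now hold one row per scored observation per        */
/* listed time, with the survival estimate in surv.               */
proc print data=pred;
run;
```

If TIMELIST= is not available, the same result can be obtained from the full BASELINE output by keeping, for each observation, the last row whose time does not exceed the target. As for the PROC PLM output, the predicted value is most likely the linear predictor x'beta rather than a probability. Under the Cox model it is tied to survival through S(t | x) = S0(t)**exp(x'beta), so it still needs a baseline survival estimate (with the same reference covariate values) before it becomes a probability.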