Hi everyone,
I would like to identify a biomarker threshold that can be used for survival prognosis. I have a multivariable Cox regression with a continuous variable that represents biomarker expression, and I would like to identify the level of expression that affects survival.
Is there any procedure or macro that helps with this?
Thanks in advance
Hello,
I have moved this topic to the 'Statistical Procedures' board as it is about survival analysis and PROC PHREG.
Or did you use another procedure than PROC PHREG?
Or did you use PROC LIFEREG?
Or did you use PROC LOGISTIC (discrete-time logistic hazard model)?
Thanks,
Koen
I am indeed asking about PROC PHREG. I included the biomarker expression as a continuous variable in the Cox model.
Hello,
I have a follow-up question.
Why do you want to find a threshold?
Let me guess:
You want to find a threshold for biomarker expression to make a new risk factor X.
The dichotomous risk factor variable X
- takes the value 1 if the biomarker expression is equal to or above the threshold and
- takes the value 0 if the biomarker expression is below the threshold.
You want to find the threshold that maximizes the hazard ratio for the main effect X.
Correct?
Koen
@sbxkoenk Yes. So?
Hello @Ubai ,
You say: "I introduced the biomarker expression as continuous variable in the Cox model."
Do you have any other explanatory variables as well?
Things become more complicated if your biomarker expression is interacting with other explanatory variables.
But, supposing you have NO other explanatory variables (or you have them, but biomarker expression is only a main effect and not involved in any interaction):
A good threshold can be "guessed" from the effect of your (continuous) biomarker expression on the hazard. You might need to build a spline effect with it (or another transformed feature); otherwise you cannot judge well whether the hazard ratio per unit of expression stays constant over the whole "profile".
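To make "spline effect" concrete, here is a minimal sketch of one common choice, a restricted (natural) cubic spline basis. It is written in Python only so that it is self-contained; statistical software normally builds these columns for you, and the basis it uses may differ from this one. The function name `rcs_basis` and the knot positions are hypothetical, chosen for illustration.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns [x, s_1(x), ..., s_{k-2}(x)] for k knots. The curve built from
    these columns is linear below the first knot and above the last knot.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("need at least 3 knots")

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0
        return u ** 3 if u > 0 else 0.0

    denom = t[-1] - t[-2]
    basis = [float(x)]
    for j in range(k - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / denom
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / denom)
        basis.append(term)
    return basis


# Hypothetical knots on the biomarker scale (often placed at quantiles)
example_knots = [2.0, 5.0, 8.0, 11.0]
example_row = rcs_basis(6.5, example_knots)  # one row of the design matrix
```

Each patient's biomarker value is replaced by such a row, the Cox model is fitted on those columns, and the fitted log hazard ratio is then plotted against the biomarker to look for a bend.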
Another solution, the easiest one, is to do a grid search.
This solution is brute force rather than intelligent!
You just try 20 (or XX) thresholds to find out about the best one.
It's a mere loop over 20 (or XX) possibilities followed by comparison of the 20 (or XX) results. To be built inside a macro or via data-driven code generation!
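A minimal sketch of that loop, in Python rather than SAS so that it runs on its own; in SAS the inner fit would be a PROC PHREG call inside the macro loop. Several things here are assumptions for illustration: the function names, the toy data, the `min_group` safeguard, fitting the dichotomized biomarker with no other covariates, and ranking thresholds by partial log-likelihood instead of by raw hazard ratio (which can be unstable when one group is tiny).

```python
import math

def risk_rows(times, events, x):
    """For each observed event return (x_i, n0, n1): the failing subject's
    group and the numbers still at risk in group 0 and group 1 at that time."""
    rows = []
    for t_i, d_i, x_i in zip(times, events, x):
        if d_i:
            n1 = sum(1 for t_j, x_j in zip(times, x) if t_j >= t_i and x_j == 1)
            n0 = sum(1 for t_j, x_j in zip(times, x) if t_j >= t_i and x_j == 0)
            rows.append((x_i, n0, n1))
    return rows

def partial_loglik(beta, rows):
    """Cox partial log-likelihood (Breslow form) for a single 0/1 covariate."""
    return sum(beta * x_i - math.log(n0 + n1 * math.exp(beta))
               for x_i, n0, n1 in rows)

def fit_binary_cox(times, events, x, max_iter=50, tol=1e-10):
    """Damped Newton-Raphson for the log hazard ratio of one 0/1 covariate."""
    rows = risk_rows(times, events, x)
    beta = 0.0
    for _ in range(max_iter):
        grad = hess = 0.0
        for x_i, n0, n1 in rows:
            p = n1 * math.exp(beta) / (n0 + n1 * math.exp(beta))
            grad += x_i - p
            hess -= p * (1.0 - p)
        if hess == 0.0:
            break
        step = max(-1.0, min(1.0, grad / hess))  # cap the step for stability
        beta -= step
        if abs(step) < tol:
            break
    return beta, partial_loglik(beta, rows)

def grid_search_threshold(times, events, biomarker, candidates, min_group=10):
    """Fit one model per candidate threshold; return (best, all_results).
    Each result is (threshold, hazard_ratio, partial_loglik)."""
    results = []
    for c in candidates:
        x = [1 if b >= c else 0 for b in biomarker]
        if min(sum(x), len(x) - sum(x)) < min_group:
            continue  # skip thresholds that leave one group too small
        beta, ll = fit_binary_cox(times, events, x)
        results.append((c, math.exp(beta), ll))
    if not results:
        raise ValueError("no candidate threshold left both groups large enough")
    best = max(results, key=lambda r: r[2])
    return best, results

# Toy data, invented for illustration only (12 subjects, 2 censored)
biomarker = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
times = [20, 7, 19, 18, 9, 16, 6, 15, 4, 3, 8, 1]
events = [0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1]

best, results = grid_search_threshold(times, events, biomarker,
                                      candidates=[4, 7, 10], min_group=3)
```

Because the winning threshold is picked precisely for giving the strongest effect, its hazard ratio tends to look better than it really is, so the result should be checked on other data or by resampling.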
The last possibility is an intelligent search for the best threshold.
But that can be mathematically cumbersome. You need to write an objective function that you can then maximize subject to constraints. You need SAS/OR or SAS Optimization for that (PROC OPTMODEL or PROC OPTLSO).
LSO stands for local search optimization; a genetic algorithm (GA) is sometimes the easier route.
Kind regards,
Koen
Hi @sbxkoenk,
Thanks for the detailed answer. I do have a fully adjusted Cox model. All established factors associated with survival were included in the model. I have prepared a DAG, and I think it is plausible to assume that the biomarker expression has a main effect on survival and does not interact with other explanatory variables such as treatment.
My plan was to plot a smooth hazard ratio curve using spline effects and try to guess the threshold from it. However, my sample size is relatively small: about 100 patients.
Yes, 100 patients is not that many.
I actually never do survival analysis on living organisms (patients / animals / plants).
I only do it on things (like machines or machine parts). I never have problems with small datasets there 😁.
I would try it anyway with that spline effect. Maybe you see a kink in the curve somewhere.
Good luck,
Koen