I am using restricted cubic splines with logistic regression. I can't find much documentation on the output, so I am not sure how to interpret it. I am using 5 knots (percentile list).
proc logistic data=EVENTS0;
   effect spl = spline(PRALBUM / details naturalcubic basis=tpf(noint)
                       KNOTMETHOD=PERCENTILELIST(5 27.5 50 72.5 95));
   model MAJOR(event="1") = spl / selection=none alpha=0.10;
   oddsratio PRALBUM / at(PRALBUM=2.0 to 5.0 by .10) cl=pl;
   ods output ORPlot=orp;
quit;
I specified 5 knots, which it shows above, so what does the ‘basis details for spline effect’ mean?
How do I interpret the estimates for the splines?
I thought I could exponentiate the estimates to get the ORs for each spline term, but the results don't agree with the ORs below.
Any help is appreciated!!
Accepted Solutions
See this note, particularly the "Models involving constructed effects such as splines" section and this section on the ESTIMATE statement where its use with spline models is discussed. The following uses the kyphosis data example in this note. The PROC LOGISTIC statements below fit a spline model similar to yours and compute the odds ratio for the predictor at 10. As discussed in the first note above, you can use the HAZARDRATIO statement in PROC PHREG to determine the coefficients of the linear combination of model parameters that estimates the effect of a unit increase in the predictor. Those coefficients are used in an ESTIMATE statement to reproduce the results from the ODDSRATIO statement. Alternatively, you can use the nonpositional syntax in the ESTIMATE statement, along with the E option, to see the coefficients for the estimated log odds at 10 and at 11. The difference between those two sets of coefficients is the set of coefficients for the unit change, as given by the HAZARDRATIO statement. The effect plot gives a visual confirmation of the values.
proc phreg data=kyphosis;
   effect spl = spline(StartVert / naturalcubic basis=tpf(noint));
   model Kyphosis = spl;
   hazardratio StartVert / at(StartVert=10) e;
run;
proc logistic data=kyphosis;
   effect spl = spline(StartVert / naturalcubic basis=tpf(noint));
   model Kyphosis(event="1") = spl;
   oddsratio StartVert / at(StartVert=10);
   estimate 'or' spl 1 8.992647 / e exp cl;
   estimate 'at 10' intercept 1 spl[1,10] / e exp cl;
   estimate 'at 11' intercept 1 spl[1,11] / e exp cl;
   effectplot / link;
run;
But I have to ask, what is the need to know the actual coefficients of the linear combination that the ODDSRATIO statement uses if the ODDSRATIO statement determines them for you?
The best documentation for spline effects is the doc for the EFFECT statement.
Briefly, the "Details" table is telling you that the EFFECT statement resulted in four columns in the design matrix. The "break knots" define the regions on which each basis function is nonzero, and the table shows that the interior basis functions are cubic polynomials (power=3).
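One way to see those four columns directly is to write out the design matrix. The following is a minimal sketch, assuming the OUTDESIGN= and OUTDESIGNONLY options of PROC LOGISTIC and reusing the data set and EFFECT statement from the original question. The output data set name design is an arbitrary choice, not something from the thread.

```sas
/* Sketch: write the spline design matrix to a data set without fitting the model. */
/* The data set name "design" is hypothetical.                                    */
proc logistic data=EVENTS0 outdesign=design outdesignonly;
   effect spl = spline(PRALBUM / naturalcubic basis=tpf(noint)
                       knotmethod=percentilelist(5 27.5 50 72.5 95));
   model MAJOR(event="1") = spl;
run;

/* Inspect the first few rows: one column per spline basis function. */
proc print data=design(obs=5);
run;
```

Plotting each basis column against PRALBUM should make it easier to see why a single parameter estimate has no stand-alone interpretation.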
To understand the parameter estimates, see the article, "Visualize a regression with splines."
I do not know the answer to your question about odds ratios.
You'll find the answer here:
https://blogs.sas.com/content/iml/2019/10/16/visualize-regression-splines.html
Exponentiating a predictor's parameter estimate only works when the predictor is not involved in interactions or in constructed effects such as splines. In general, the log odds ratio is a linear combination of model parameters, and the odds ratio is the exponential of that combination. The ODDSRATIO statement does this computation correctly. So your odds ratio results are properly telling you the change in odds for a unit increase in the original predictor at various points on that predictor.
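Written out, the linear combination looks roughly like this. The notation is my own and not from the SAS documentation: b_j(x) stands for the j-th spline basis column evaluated at x, m for the number of columns, and x_0 for the point given in the AT option.

```latex
\begin{align*}
\operatorname{logit} p(x) &= \beta_0 + \sum_{j=1}^{m} \beta_j\, b_j(x) \\
\log \mathrm{OR}(x_0) &= \operatorname{logit} p(x_0 + 1) - \operatorname{logit} p(x_0)
   = \sum_{j=1}^{m} \beta_j \bigl[\, b_j(x_0 + 1) - b_j(x_0) \,\bigr] \\
\mathrm{OR}(x_0) &= \exp\Bigl( \sum_{j=1}^{m} \beta_j \bigl[\, b_j(x_0 + 1) - b_j(x_0) \,\bigr] \Bigr)
\end{align*}
```

The intercept cancels, and every spline parameter contributes with a weight equal to the change in its basis column between x_0 and x_0 + 1. In the kyphosis example from the accepted solution, the weights 1 and 8.992647 are those basis differences at x_0 = 10, which is why exponentiating a single parameter estimate does not reproduce the ODDSRATIO results.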
I know that the parameter estimates from logistic regression are changes in log odds. I guess my question is: how are these odds ratios actually calculated from the parameter estimates? For example, I see that if someone has a level of 3.6 then the OR is 0.379, but I can't figure out how this is calculated from the parameter estimates.
One more question - I understand the estimates 'at 10' and 'at 11', but what is this estimate statement estimating the OR of?
estimate 'or' spl 1 8.992647 / e exp cl;
I tried that in my program (with a different value) and it did not match the result from the Odds Ratios Estimates.
For example, I tried estimate 'or' spl 1 3.5 / e exp cl; and the exponentiated result is 0.5510, which doesn't match the 0.398 from the Odds Ratio Estimates section.
Thanks again, your help is greatly appreciated!
Thank you very much for your patient answer! But how is "8.992647" calculated from the spline? 🤔
exp("at 11")/exp("at 10")=exp("or")?
spl 1 8.992647=spl [1,8.992647 ] ?
"8.992647"=10-1?
Thanks for your answer, I’d be very grateful!
You need to run the example that you marked as the correct solution. When you run the PROC PHREG example, you will see that the coefficients displayed by the E option for the hazard ratio for STARTVERT are
spl1 Coefficient = 1.0
spl2 Coefficient = 8.992647
That is why those values are used in the first ESTIMATE statement in PROC LOGISTIC.
The result of this calculation (estimate 'or' spl 1 8.992647 / e exp cl;)
is the same as the result of (estimate 'or' spl [-1,10] [1,11]/ e exp cl;).
So can I get odds ratio from "estimate" by using a loop macro and specifying a reference?
Thank you very much!
It's not clear what you are asking, but if you mean that you want to estimate the odds ratio for a unit increase in STARTVERT at more than one value - for example, at 10, 12, and 14 instead of just at 10 - then the easiest way is to simply specify the values in the AT option in the ODDSRATIO statement:
oddsratio StartVert / at(StartVert=10 12 14);
To do the same with ESTIMATE statements, you would need to produce multiple ESTIMATE statements - the first with values 10 and 11 like you've done, the next with values 12 and 13, and the last with values 14 and 15.
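For concreteness, the three ESTIMATE statements could look like the following. This is a sketch that reuses the kyphosis model and the nonpositional syntax already shown in this thread; the labels are arbitrary.

```sas
proc logistic data=kyphosis;
   effect spl = spline(StartVert / naturalcubic basis=tpf(noint));
   model Kyphosis(event="1") = spl;

   /* Odds ratios for a unit increase at 10, 12, and 14. */
   oddsratio StartVert / at(StartVert=10 12 14);

   /* Same quantities from ESTIMATE: log odds at x+1 minus log odds at x. */
   estimate 'OR at 10' spl [-1,10] [1,11] / exp cl;
   estimate 'OR at 12' spl [-1,12] [1,13] / exp cl;
   estimate 'OR at 14' spl [-1,14] [1,15] / exp cl;
run;
```

The exponentiated estimates should agree with the ODDSRATIO table. The confidence limits may differ slightly if the ODDSRATIO statement uses profile-likelihood limits (CL=PL), since the ESTIMATE limits are Wald-based.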
The "Odds Ratio" table is not produced for a model with constructed fixed effects, but the ESTIMATE statement does produce results, including confidence intervals.
The reference point will be set at the 5th. Yes, you're absolutely right: I just want to estimate the odds ratio for a unit increase in STARTVERT over the range from the minimum to the maximum in steps of 0.1.
This is a sad story 😩. I want to generate a spline plot from PROC GLIMMIX with an EFFECT statement, using ESTIMATE statements to obtain the odds ratios and confidence intervals, as shown below.
Can I do this with a macro loop that specifies a reference (the reference point will be set at the 5th) in the ESTIMATE statement? Thank you very much!
The macro is as follows:
%macro est(ref=, start=, end=, by=);
   %Do i = 1 %To %eval(%SysFunc( Ceil( %SysEvalF( ( &End - &Start ) / &By ) ) ) +1) ;
      %Let value=%SysEvalF( ( &Start - &By ) + ( &By * &I ) ) ;
      estimate "&value." spl [-1, &ref] [1, &value] / exp cl;
   %end;
%mend est;
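A possible way to call the macro is sketched below using PROC LOGISTIC and the kyphosis example from earlier in the thread rather than PROC GLIMMIX. It assumes the %est macro above has already been compiled. The reference value 10, the range 1 to 18, and the output data set name or_curve are illustrative assumptions, and the ODS table name Estimates is my understanding of what the ESTIMATE statement produces.

```sas
/* Sketch: generate one ESTIMATE statement per grid value, each compared with a fixed reference. */
/* ref=, start=, end=, by= values and the data set name or_curve are hypothetical.              */
proc logistic data=kyphosis;
   effect spl = spline(StartVert / naturalcubic basis=tpf(noint));
   model Kyphosis(event="1") = spl;
   %est(ref=10, start=1, end=18, by=1)
   ods output Estimates=or_curve;   /* collect the estimates for plotting */
run;
```

Note that spl [-1, &ref] [1, &value] estimates the log odds at each value minus the log odds at the reference, so the exponentiated result is an odds ratio relative to the reference point, not the odds ratio for a unit increase that the ODDSRATIO statement reports. The row where the value equals the reference is a zero contrast.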