tka726
Obsidian | Level 7

I am using restricted cubic splines with logistic regression. I can’t seem to find much documentation on the output, so I am not sure how to interpret it. I am using 5 knots (percentile list).

 

proc logistic data=EVENTS0;
effect spl = spline(PRALBUM / details naturalcubic basis=tpf(noint)
                       KNOTMETHOD=PERCENTILELIST(5 27.5 50 72.5 95)); 
model MAJOR(event="1") = spl / selection=none alpha=0.10;
oddsratio PRALBUM / at(PRALBUM=2.0 to 5.0 by .10) cl=pl;
ods output ORPlot=orp;
quit;

 

tapruzzese_0-1634177629181.png

 

I specified 5 knots, which it shows above, so what does the ‘basis details for spline effect’ mean?

 

tapruzzese_1-1634177629185.png

 

How do I interpret the estimates for the splines?

tapruzzese_2-1634177629191.png

I thought I could exponentiate the estimates to get the ORs for each spline term, but the results don't agree with the ORs below.

tapruzzese_3-1634177629199.png

Any help is appreciated!!

Rick_SAS
SAS Super FREQ

The best documentation for spline effects is the doc for the EFFECT statement.

Briefly, the "Details" table is telling you that the EFFECT statement resulted in four columns in the design matrix. The "break knots" define the regions on which each basis function is nonzero, and the table shows that the basis functions associated with the interior knots are cubic polynomials (power=3).

 

To understand the parameter estimates, see the article, "Visualize a regression with splines."  

 

I do not know the answer to your question about odds ratios.

 

StatDave
SAS Super FREQ

Exponentiating a predictor's parameter estimate only works when the predictor is not involved in interactions or in constructed effects such as splines. In general, the computation of the odds ratio is a linear combination of model parameters. This is correctly done by the ODDSRATIO statement. So, your odds ratio results are properly telling you the change in odds for a unit increase in the original predictor at various points on that predictor.
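As a rough sketch of what "a linear combination of model parameters" means for a spline effect, the relationship can be written out as follows. The symbols b_j (the spline basis columns), β_j (their parameter estimates), k (the number of basis columns), and x_0 (the point at which the odds ratio is evaluated) are notation introduced here for illustration, not names from the SAS output.

```latex
% Linear predictor with a spline effect made of k basis columns b_1(x), ..., b_k(x)
\operatorname{logit} P(Y = 1 \mid x) = \beta_0 + \sum_{j=1}^{k} \beta_j \, b_j(x)

% Log odds ratio for a unit increase at x_0: the intercept cancels in the difference
\log \mathrm{OR}(x_0) = \sum_{j=1}^{k} \beta_j \, \bigl[ b_j(x_0 + 1) - b_j(x_0) \bigr]

% The odds ratio itself is the ratio of the odds at x_0 + 1 and at x_0
\mathrm{OR}(x_0) = \frac{\mathrm{odds}(x_0 + 1)}{\mathrm{odds}(x_0)}
                 = \exp\Bigl( \sum_{j=1}^{k} \beta_j \, \bigl[ b_j(x_0 + 1) - b_j(x_0) \bigr] \Bigr)
```

Because the differences b_j(x_0 + 1) - b_j(x_0) change with x_0, the odds ratio differs from point to point, and exponentiating any single β_j does not give an odds ratio for the original predictor.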

tka726
Obsidian | Level 7

I know that the parameter estimates from logistic regression are the change in log odds. I guess my question is: how are these odds ratios actually calculated from the parameter estimates? For example, I see that if someone has a level of 3.6 then the OR is 0.379, but I can't figure out how this is calculated from the parameter estimates.

StatDave
SAS Super FREQ

See this note, particularly the "Models involving constructed effects such as splines" section and this section on the ESTIMATE statement where its use with spline models is discussed. The following uses the kyphosis data example in this note. The PROC LOGISTIC statements below fit a spline model similar to yours and compute the odds ratio for the predictor at 10. As discussed in the first note above, you can use the HAZARDRATIO statement in PROC PHREG to determine the coefficients of the linear combination of model parameters that estimates a unit increase in the predictor. Those coefficients are used in an ESTIMATE statement to reproduce the results from the ODDSRATIO statement. Alternatively, you can use the nonpositional syntax in the ESTIMATE statement, along with the E option, to see the coefficients for the estimated log odds at 10 and at 11. The difference between those two sets of coefficients is the set of coefficients for the unit change, as reported by the HAZARDRATIO statement. The effect plot gives a visual confirmation of the values.

proc phreg data=kyphosis;
effect spl = spline(StartVert / naturalcubic basis=tpf(noint));
model Kyphosis = spl;
hazardratio StartVert / at(StartVert=10) e;
run;
proc logistic data=kyphosis;
effect spl = spline(StartVert / naturalcubic basis=tpf(noint));
model Kyphosis(event="1") = spl;
oddsratio StartVert / at(StartVert=10);
estimate 'or' spl 1 8.992647 / e exp cl;
estimate 'at 10' intercept 1 spl[1,10] / e exp cl;
estimate 'at 11' intercept 1 spl[1,11] / e exp cl;
effectplot / link;
run;

But I have to ask, what is the need to know the actual coefficients of the linear combination that the ODDSRATIO statement uses if the ODDSRATIO statement determines them for you? 

tka726
Obsidian | Level 7
Thank you!! That is very helpful! I like to manually check my answers when I can.
One more question - I understand the estimates 'at 10' and 'at 11', but what is this estimate statement estimating the OR of?
estimate 'or' spl 1 8.992647 / e exp cl;
I tried that in my program (with a different value) and it did not match the result from the Odds Ratios Estimates.
For example, I tried estimate: 'or' spl 1 3.5 / e exp cl and the exponentiated result is 0.5510 which doesn't match with the 0.398 from the Odds Ratios Estimates section.
Thanks again, your help is greatly appreciated!
tka726
Obsidian | Level 7
Ignore above, I understand perfectly now. Thanks again!
April2211
Fluorite | Level 6

Thank you very much for your patient answer! But how is "8.992647" calculated from the spline? 🤔

 

April2211
Fluorite | Level 6

exp("at 11")/exp("at 10")=exp("or")?

spl 1 8.992647=spl [1,8.992647 ] ?

"8.992647"=10-1?

 Thanks for your answer, I’d be very grateful!

 

Rick_SAS
SAS Super FREQ

You need to run the example that you marked as the correct solution. When you run the PROC PHREG example, you will see that the parameter estimates for the hazard ratios for STARTVERT are 

spl1 Coefficient = 1.0

spl2 Coefficient = 8.992647

That is why those values are used in the first ESTIMATE statement in PROC LOGISTIC.
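Written out as a worked equation, the two coefficients are the changes in the two spline basis columns when STARTVERT goes from 10 to 11. The notation b_1, b_2, β_1, β_2 is introduced here for illustration. The reading b_1(x) = x is an assumption: it takes the first column of the natural cubic TPF(NOINT) basis to be the predictor itself, which is consistent with the coefficient of exactly 1.0.

```latex
% Unit-increase log odds ratio at StartVert = 10 in the kyphosis example
\log \mathrm{OR}(10)
  = \underbrace{\bigl[ b_1(11) - b_1(10) \bigr]}_{=\,1} \, \beta_1
  + \underbrace{\bigl[ b_2(11) - b_2(10) \bigr]}_{=\,8.992647} \, \beta_2

% Equivalent statement in terms of the 'at 10' and 'at 11' estimates
\mathrm{OR}(10) = \frac{\exp(\text{log odds at } 11)}{\exp(\text{log odds at } 10)}
```

So 8.992647 is not 10 - 1 and is not a value of the predictor. It is the change in the second basis column between 10 and 11, which is why substituting a predictor value in its place does not reproduce the ODDSRATIO results.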

April2211
Fluorite | Level 6

The result of this calculation (estimate 'or' spl 1 8.992647 / e exp cl;)  

is the same as the result of  (estimate 'or' spl [-1,10] [1,11]/ e exp cl;).

So can I get odds ratio from "estimate" by using a loop macro and specifying a reference?

Thank you very much!

StatDave
SAS Super FREQ

It's not clear what you are asking, but if you mean that you want to estimate the odds ratio for a unit increase in STARTVERT at more than one value - for example, at 10, 12, and 14 instead of just at 10 - then the easiest way is to simply specify the values in the AT option in the ODDSRATIO statement: 

oddsratio StartVert / at(StartVert=10 12 14);

To do the same with ESTIMATE statements, you would need to produce multiple ESTIMATE statements - the first with values 10 and 11 like you've done, the next with values 12 and 13, and the last with values 14 and 15.
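A minimal sketch of the two approaches side by side, assuming the kyphosis data set from the note is available as WORK.KYPHOSIS. The ESTIMATE labels are arbitrary, and the nonpositional form is the one shown earlier in this thread.

```sas
/* Sketch only: assumes the kyphosis data set from the SAS note is in WORK */
proc logistic data=kyphosis;
   effect spl = spline(StartVert / naturalcubic basis=tpf(noint));
   model Kyphosis(event="1") = spl;

   /* ODDSRATIO: unit-increase odds ratio at each listed value */
   oddsratio StartVert / at(StartVert=10 12 14);

   /* ESTIMATE: log odds at x+1 minus log odds at x, exponentiated by EXP */
   estimate 'OR at 10' spl [-1,10] [1,11] / exp cl;
   estimate 'OR at 12' spl [-1,12] [1,13] / exp cl;
   estimate 'OR at 14' spl [-1,14] [1,15] / exp cl;
run;
```

The exponentiated estimates should agree with the corresponding rows of the ODDSRATIO output. Adding the E option to any ESTIMATE statement displays the coefficients that were used.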

April2211
Fluorite | Level 6

The "Odds Ratio" table is not produced for the model with constructed fixed effects, but the ESTIMATE statement does produce results, including confidence intervals.

The reference point will be set at the 5th. Yes, you're absolutely right: I just want to estimate the odds ratio for a unit increase in STARTVERT over the range from the minimum to the maximum, in steps of 0.1.

April2211
Fluorite | Level 6

This is a sad story😩. I want to generate a spline plot from PROC GLIMMIX with an EFFECT statement, using ESTIMATE statements to obtain the odds ratios and confidence intervals, as shown below.

April2211_0-1639466284520.png

 

Can I use a macro loop and specify a reference (the reference point will be set at the 5th) in the ESTIMATE statement to achieve this? Thank you very much!

The macro is as follows

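/* Generate one ESTIMATE statement per grid value from START to END in steps of BY */
%macro est(ref=, start=, end=, by=);
   %local i value;
   /* Number of grid points = ceil((end - start) / by) + 1 */
   %do i = 1 %to %eval(%sysfunc(ceil(%sysevalf((&end - &start) / &by))) + 1);
      %let value = %sysevalf((&start - &by) + (&by * &i));
      /* Log odds at the grid value minus log odds at the reference */
      estimate "&value." spl [-1, &ref] [1, &value] / exp cl;
   %end;
%mend est;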
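One way the macro might be used is sketched below with PROC LOGISTIC and the kyphosis example from earlier in the thread, since the GLIMMIX model itself is not shown. The data set name, the reference value 5, and the 6-to-16 grid are assumptions chosen for illustration. Note that each generated statement estimates the odds ratio of a grid value relative to the fixed reference, which is not the same quantity as the unit-increase odds ratio from the ODDSRATIO statement.

```sas
/* Sketch only: assumes the kyphosis data set from the SAS note is in WORK */
%macro est(ref=, start=, end=, by=);
   %local i value;
   %do i = 1 %to %eval(%sysfunc(ceil(%sysevalf((&end - &start) / &by))) + 1);
      %let value = %sysevalf((&start - &by) + (&by * &i));
      /* Log odds at the grid value minus log odds at the reference */
      estimate "&value. vs &ref." spl [-1, &ref] [1, &value] / exp cl;
   %end;
%mend est;

proc logistic data=kyphosis;
   effect spl = spline(StartVert / naturalcubic basis=tpf(noint));
   model Kyphosis(event="1") = spl;
   /* Expands to ESTIMATE statements for StartVert = 6, 8, ..., 16 versus 5 */
   %est(ref=5, start=6, end=16, by=2)
run;
```

The macro call has to appear inside the procedure step, because it expands to ESTIMATE statements rather than to a complete step.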

 
