SAS_Muggle
Fluorite | Level 6

Hello,

I have some data for a binary Y (Y/N) and a continuous X.

I cut X into five categories in increments of 10 and plotted the predicted probability for each. The relationship does not look linear to me. However, I would prefer to keep X continuous, because I am interested in its relationship with Y. More specifically, I want to find the optimal cutpoint for discriminating Y. I hypothesize that the log odds of Y increase with X and then level off. I want to know the value of X that best predicts success, beyond which higher X is no longer beneficial.

What is the best way to do this? I am looking into adding a spline effect, but how do I know where the knots should be? I assume the knots are those critical values. And how can I then estimate the odds ratio (OR) for the intervals between the knots?

2 REPLIES
StatDave
SAS Super FREQ

If you use a spline effect, you can avoid choosing knot points by using a natural cubic spline. See the example in this note. Note that even if the curve on the log odds scale is as you expect, it might look somewhat different on the event probability scale, as it does in the example. When you say you want to find the point after which "X is no longer beneficial," I assume you mean on the probability scale, though you are thinking in terms of the log odds curve. So it would be best to plot both curves to be certain of what you want.
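
To make the difference between the two scales concrete, here is a minimal sketch in Python (not SAS, and not the code from the note). It assumes a made-up log odds curve that rises linearly and then stays flat, which is the shape hypothesized in the question; the constants B0, B1, and CUT are hypothetical and are not estimated from any data.

```python
import math

# Hypothetical values chosen only for illustration (not estimates from data).
B0, B1, CUT = -1.0, 0.1, 30.0

def log_odds(x):
    """Log odds that rise linearly in x up to CUT and are constant afterwards."""
    return B0 + B1 * min(x, CUT)

def prob(x):
    """Event probability implied by the log odds (inverse logit)."""
    return 1.0 / (1.0 + math.exp(-log_odds(x)))

# Compare 10-unit steps on both scales.
xs = [0, 10, 20, 30, 40, 50]
for lo, hi in zip(xs, xs[1:]):
    d_eta = log_odds(hi) - log_odds(lo)
    d_p = prob(hi) - prob(lo)
    print(f"{lo:>2} -> {hi:>2}: change in log odds = {d_eta:.2f}, "
          f"change in probability = {d_p:.3f}")
```

Below the assumed cutpoint, every 10-unit step adds the same amount to the log odds (an odds ratio of exp(1), about 2.72, in this toy curve), yet the corresponding probability steps are not all equal. Above the cutpoint both scales are flat and the odds ratio is 1. That is why looking at both plots matters before deciding where X "stops helping."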

Consider the probability curve in the example in the note, and suppose you wanted to pick the value where the curve begins to drop. You could use the method shown in the Risk Difference section of the note to estimate the sequential one-unit probability differences. Using the tests or confidence intervals for those differences, you could select the point at which the difference becomes significantly negative. In that example, the change from StartVert=7 to 8 is -0.035 and is the first difference to be significant (p=0.038).
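
The sequential-difference idea can also be sketched outside SAS. The Python below is only an illustration under several assumptions: the data are simulated, the logit is a plain quadratic in X rather than the natural cubic spline used in the note, and the standard errors come from a hand-written delta method rather than the note's Risk Difference approach. The helper names (invert, fit_logistic, row, predict) are hypothetical.

```python
import math
import random

def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    a = [list(r) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, r in enumerate(m)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        d = a[c][c]
        a[c] = [v / d for v in a[c]]
        for r in range(n):
            if r != c:
                f = a[r][c]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[c])]
    return [r[n:] for r in a]

def fit_logistic(X, y, iters=25):
    """Newton-Raphson logistic fit; returns coefficients and their covariance.
    The covariance is the inverse information at the last iterate visited."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for xr, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xr))))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (yi - p) * xr[i]
                for j in range(k):
                    info[i][j] += w * xr[i] * xr[j]
        cov = invert(info)
        step = [sum(cov[i][j] * grad[j] for j in range(k)) for i in range(k)]
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-10:
            break
    return beta, cov

def row(x):
    """Design row for an assumed quadratic logit in centered x."""
    xc = x - 5.0
    return [1.0, xc, xc * xc]

# Simulated data (assumption): 60 observations at each x = 1..10.
random.seed(1)
xs = list(range(1, 11))
X, y = [], []
for x in xs:
    xc = x - 5.0
    p_true = 1.0 / (1.0 + math.exp(-(1.0 + 0.2 * xc - 0.15 * xc * xc)))
    for _ in range(60):
        X.append(row(x))
        y.append(1 if random.random() < p_true else 0)

beta, cov = fit_logistic(X, y)

def predict(x):
    return 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, row(x)))))

# Sequential one-unit differences p(x+1) - p(x) with delta-method Wald tests.
results = []
for x in xs[:-1]:
    p0, p1 = predict(x), predict(x + 1)
    g = [p1 * (1 - p1) * v1 - p0 * (1 - p0) * v0
         for v0, v1 in zip(row(x), row(x + 1))]
    var = sum(g[i] * cov[i][j] * g[j] for i in range(3) for j in range(3))
    diff, se = p1 - p0, math.sqrt(var)
    pval = math.erfc(abs(diff / se) / math.sqrt(2.0))  # two-sided normal p-value
    results.append((x, diff, se, pval))
    print(f"{x} -> {x + 1}: diff = {diff:+.3f}, se = {se:.3f}, p = {pval:.3f}")
```

Reading down the printed differences, one would look for the first step whose difference is negative with a small p-value, in the same spirit as the StartVert=7 to 8 change described above. Because many sequential tests are examined, treating the selected point as exploratory seems prudent.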
