BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
PamG
Quartz | Level 8

How does one calculate the slope between knots in an interaction model with a bspline of degree 1?  I would like to report the rate of change between the knot points for M and F in the example given below.

 

Without an interacting variable, it is given as explained in https://documentation.sas.com/doc/en/imlug/15.2/imlug_langref_sect072.htm.

How is it calculated if you have an interaction variable?

proc logistic data=sashelp.class;
class sex(ref="F")/param=ref;
effect htS=spline(height/ degree=1 knotmethod=percentilelist(25,75));
*effect htS=spline(height/ basis=bspline degree=1 knotmethod=percentilelist(25,75));
model age=htS sex htS*sex;
store model;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
StatDave
SAS Super FREQ

The model on that example has considerable problems with separation issues and huge standard errors. Instead, consider the neuralgia data in the example titled "Logistic Modeling with Categorical Predictors" in the LOGISTIC documentation.

 

The following fits the interaction model between SEX and the spline on the continuous DURATION variable. The DETAILS option in the EFFECT statement shows that the knots are at 8 and 26, so the following DATA step adds observations at those values for both SEX levels in order to compute predicted values which are displayed by PROC PRINT after the model is fit in PROC LOGISTIC. All that is needed in the MODEL statement is the interaction term since the degrees of freedom for the two main effects get included with the interaction effect. So, the model fit is identical to specifying the main effects and interaction. The ESTIMATE statements use the nonpositional syntax to request the predicted values (on the log odds and, with the ILINK option, on the probability scale) for both sexes at both knot points. These are then confirmed by the predicted values at those points by the OUTPUT statement and displayed by PROC PRINT using the added observations.

data a; 
duration=8; sex='M'; output;
duration=8; sex='F'; output;
duration=26; sex='M'; output;
duration=26; sex='F'; output;
run;
data b; set a neuralgia; run;
proc logistic data=b;
effect spld=spline(duration/ degree=1 details knotmethod=percentilelist(25,75)); 
class sex/param=glm;
model pain=spld*sex;
estimate 
 'F8'  intercept 1 spld*sex[1, 1 8],
 'F26' intercept 1 spld*sex[1, 1 26] / ilink e;
estimate 
 'M8'  intercept 1 spld*sex[1, 2 8],
 'M26' intercept 1 spld*sex[1, 2 26] / ilink e;
ods output coef=c;
store log;
output out=out p=p xbeta=xb;
run;
proc print data=out(obs=4);
run;

The slopes between the knots for each sex can then be computed, on the log odds scale, as the differences of the values at the knots using the following ESTIMATE statements. This is computed with the usual rise/run concept of a slope, so the difference is divided by 26-8=18. Plots of the fitted model on both the log odds (LINK option) and on the probability scale (no option) are given by the EFFECTPLOT statements.

proc plm source=log;
effectplot;
effectplot/link yrange=(-2,2);
estimate 'F logit slope' spld*sex [1, 1 8][-1, 1 26],
         'M logit slope' spld*sex [1, 2 8][-1, 2 26] / divisor=18 cl;
run;

Finally, the slopes on the probability scale cannot be obtained using ESTIMATE statements, but can be obtained with the NLMeans macro using the fitted model from the STORE statement and the coefficients of the ESTIMATE statements as saved by the ODS OUTPUT statement in PROC LOGISTIC. Note that the use of the f=, fset=, and flabel= option requires the most recent release of the macro. See the macro documentation for details and more examples on using the macro and these options.

%nlmeans(instore=log, coef=c, link=logit, f=(mu1-mu2)/18, fset=1, flabel=F Prob Slope)
%nlmeans(instore=log, coef=c, link=logit, f=(mu1-mu2)/18, fset=2, flabel=M Prob Slope)

Note that this use of a rise/run computation assumes a straight line fit between the knots which generally is not true using splines and particularly when considering the model on the probability scale. But the plots seem to show reasonably straight lines in this example. Also note that the same approach could be used for a computation between other points 

View solution in original post

12 REPLIES 12
PamG
Quartz | Level 8
Thanks for the links, Ksharp. Both the links do not address my question. My spline is a linear one (degree=1), so I would like to get the slope at the points of inflection.
Rick_SAS
SAS Super FREQ

This is a cumulative logit model. When combined with the spline effects, I think it will be hard to obtain an analytical expression for the derivative of the predicted responses. It's much easier to use a numerical approximation. I propose:

 

  1. Create scoring data that uses small steps between points in the independent variable (HEIGHT).
  2. Use the DIF function to form the finite-difference derivatives.  See also the article, "Compute derivatives for nonparametric regression models."

Here is the code that creates the scoring data and that visualizes the predicted responses. You can use the links I've provided to obtain the finite-difference derivatives at any (sex, height) combination.

 

proc means data=sashelp.class;
class sex;
var height;
run;

data scoreData;
sex='F';
do height = 51.4 to 66.5 by 0.1;
   output;
end;
sex = 'M';
do height = 57.5 to 72 by 0.1;
   output;
end;
run;


proc logistic data=sashelp.class;
class sex(ref="F")/param=ref;
effect htS=spline(height/ degree=1 knotmethod=percentilelist(25,75));
model age=htS sex htS*sex;
score data=scoreData out=ScoreWide;
run;

title "Visualize the Cumulative Logit Model";
proc sgpanel data=ScoreWide;
panelby sex;
series x=height y=P_11;
series x=height y=P_12;
series x=height y=P_13;
series x=height y=P_14;
series x=height y=P_15;
series x=height y=P_16;
run;

Rick_SAS_0-1760365056446.png

 

StatDave
SAS Super FREQ

Can you confirm if the response for your actual data (rather than the CLASS data example you show)  has two or multiple levels?  If it is actually binary, then this could be done relatively easily with ESTIMATE statements if you want the slopes between points on the log odds scale. If you want to estimate the slopes on the probability scale, then you would also need the NLMeans macro.

PamG
Quartz | Level 8
Thanks for pointing that out. I typed out my code without checking the data. My outcome is binary and what I want is to get the slope , on log odds, probability and odds ratio scale. I have changed my code -
data class; set sashelp.class; wt=(weight > 90); run;


proc logistic data=class desc;
class sex(ref="F")/param=ref;
effect htS=spline(height/ degree=1 knotmethod=percentilelist(25,75));
*effect htS=spline(height/ basis=bspline degree=1 knotmethod=percentilelist(25,75));
model wt=htS sex htS*sex;
store model;
run;
PamG
Quartz | Level 8
I would like the slope not on the transformed spline basis but the original variable. So I can say for example that the slope is x for females when height is less than the 25th percentile.
StatDave
SAS Super FREQ

The model on that example has considerable problems with separation issues and huge standard errors. Instead, consider the neuralgia data in the example titled "Logistic Modeling with Categorical Predictors" in the LOGISTIC documentation.

 

The following fits the interaction model between SEX and the spline on the continuous DURATION variable. The DETAILS option in the EFFECT statement shows that the knots are at 8 and 26, so the following DATA step adds observations at those values for both SEX levels in order to compute predicted values which are displayed by PROC PRINT after the model is fit in PROC LOGISTIC. All that is needed in the MODEL statement is the interaction term since the degrees of freedom for the two main effects get included with the interaction effect. So, the model fit is identical to specifying the main effects and interaction. The ESTIMATE statements use the nonpositional syntax to request the predicted values (on the log odds and, with the ILINK option, on the probability scale) for both sexes at both knot points. These are then confirmed by the predicted values at those points by the OUTPUT statement and displayed by PROC PRINT using the added observations.

data a; 
duration=8; sex='M'; output;
duration=8; sex='F'; output;
duration=26; sex='M'; output;
duration=26; sex='F'; output;
run;
data b; set a neuralgia; run;
proc logistic data=b;
effect spld=spline(duration/ degree=1 details knotmethod=percentilelist(25,75)); 
class sex/param=glm;
model pain=spld*sex;
estimate 
 'F8'  intercept 1 spld*sex[1, 1 8],
 'F26' intercept 1 spld*sex[1, 1 26] / ilink e;
estimate 
 'M8'  intercept 1 spld*sex[1, 2 8],
 'M26' intercept 1 spld*sex[1, 2 26] / ilink e;
ods output coef=c;
store log;
output out=out p=p xbeta=xb;
run;
proc print data=out(obs=4);
run;

The slopes between the knots for each sex can then be computed, on the log odds scale, as the differences of the values at the knots using the following ESTIMATE statements. This is computed with the usual rise/run concept of a slope, so the difference is divided by 26-8=18. Plots of the fitted model on both the log odds (LINK option) and on the probability scale (no option) are given by the EFFECTPLOT statements.

proc plm source=log;
effectplot;
effectplot/link yrange=(-2,2);
estimate 'F logit slope' spld*sex [1, 1 8][-1, 1 26],
         'M logit slope' spld*sex [1, 2 8][-1, 2 26] / divisor=18 cl;
run;

Finally, the slopes on the probability scale cannot be obtained using ESTIMATE statements, but can be obtained with the NLMeans macro using the fitted model from the STORE statement and the coefficients of the ESTIMATE statements as saved by the ODS OUTPUT statement in PROC LOGISTIC. Note that the use of the f=, fset=, and flabel= option requires the most recent release of the macro. See the macro documentation for details and more examples on using the macro and these options.

%nlmeans(instore=log, coef=c, link=logit, f=(mu1-mu2)/18, fset=1, flabel=F Prob Slope)
%nlmeans(instore=log, coef=c, link=logit, f=(mu1-mu2)/18, fset=2, flabel=M Prob Slope)

Note that this use of a rise/run computation assumes a straight line fit between the knots which generally is not true using splines and particularly when considering the model on the probability scale. But the plots seem to show reasonably straight lines in this example. Also note that the same approach could be used for a computation between other points 

PamG
Quartz | Level 8

This works brilliantly!!! Thanks so much StatDave!  

I am not able to run the NLMEANS macro.  I downloaded the latest version from the link provided but it gives an error saying that "parameter where, covdrop and null was defined in the macro".  These arguments are also optional so I am not sure why it is not working.

StatDave
SAS Super FREQ
Did you also download the latest version of the NLEST macro? NLEST is called by NLMeans so you will need the latest version of that too. A link to NLEST is in the NLMeans documentation. If you still get an error after updating both, attach the section of your log showing your NLMeans call and any messages that follow.
PamG
Quartz | Level 8

I found out that I also need to download the %NLEST macro, which I did and it works.  Thanks a lot StatDave!  Also the results coincides with the solution given by RickSAS.  I do find your solution easier to use with fewer lines of code.

PamG
Quartz | Level 8

Thank you @StatDave  for quick responses and a detailed solution with all the explanantions!!!!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 215 views
  • 1 like
  • 4 in conversation