yunqiang12
Calcite | Level 5

For the following regression, suppose the maximum weight is set to 100. The upper 95% confidence line of the fitted values will cross the horizontal line weight=100, so how can I find the x values at the crossing points for each sex group? I found a similar question at the following link: https://communities.sas.com/t5/Statistical-Procedures/Confidence-Bands-Formula-in-PROC-REG/td-p/2289.... The answer there uses the scoring method to get an approximate value, but I want the exact value. Can anyone help me with it? Thanks a lot.

 

proc glm data=sashelp.class;
class sex;
model weight =sex height;
quit;

ANCOVAPlot.png

Accepted Solutions
Rick_SAS
SAS Super FREQ

I don't really see the purpose of trying to do this. Presumably, your parameter estimates are based on data, so you will get an approximate value (an estimate) no matter what you do. Also, for most regression models, the solution you seek requires that you solve a nonlinear equation, which will require an approximate solution.

 

I will outline the method. I assume you are asking, "If I have a formula for the confidence limits for the mean predicted value, how could I find an exact value?" This is a root-finding problem. If L(x) is the formula for the lower bound and you want to find where it crosses the line Y=100, then solve the equation L(x)-100=0 for x. Similarly for the upper bound: U(x)-100=0.

 

So there are two issues: How can you get a formula for the upper/lower CLM, and how do you solve for a root?

 

1. For an arbitrary regression model, a formula might not exist, but it does for OLS models. The formula for the CLM (predicted mean) is in the SAS/STAT documentation. Be sure to use the formulas for the predicted mean (NOT individual predictions). The formula is essentially x`*b +/- t_crit*StdErr(x), where t_crit is the (1 - alpha/2) quantile of the t distribution with the error DF. (Or use z_crit ~ 1.96 if you have a large sample.) Notice, however, that StdErr(x) involves a quadratic form in the inverse of the X`X matrix (the same matrix that appears in the "hat matrix"), where X is the design matrix. Not an easy thing to write exactly, although I suppose it's possible for the case of 1 regressor and a linear model.

2. You can use the FROOT function in SAS/IML to solve for the root of an arbitrary function. If your regression is a linear equation with one explanatory variable, you might be able to invert the formula and solve the equation by hand in terms of the regression coefficients, but I wouldn't want to do it.
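For reference, the two steps above can be written out for the one-regressor case. This is a sketch under the usual OLS assumptions; the symbols $s$ (root MSE), $\nu$ (error DF), $y_0$ (the threshold, e.g. 100), $S_{xx}=\sum_i (x_i-\bar{x})^2$, and $d = x-\bar{x}$ are notation introduced here, not taken from the SAS documentation.

```latex
% Upper/lower confidence limit for the predicted mean at a row vector v
U(v),\,L(v) \;=\; v\,b \;\pm\; t_{1-\alpha/2,\,\nu}\; s\,\sqrt{v\,(X^{\top}X)^{-1}\,v^{\top}}

% One regressor, model y = b_0 + b_1 x, so v = (1, x) and b_0 + b_1\bar{x} = \bar{y}:
U(x),\,L(x) \;=\; \bar{y} + b_1 d \;\pm\; t\,s\,\sqrt{\frac{1}{n} + \frac{d^{2}}{S_{xx}}},
\qquad d = x - \bar{x}

% Setting the limit equal to y_0, isolating the square root, and squaring
% gives a quadratic equation in d:
\left(b_1^{2} - \frac{t^{2}s^{2}}{S_{xx}}\right) d^{2}
\;-\; 2\,b_1\,(y_0 - \bar{y})\, d
\;+\; (y_0 - \bar{y})^{2} - \frac{t^{2}s^{2}}{n} \;=\; 0
```

Squaring introduces an extraneous root, so only the root with the right sign is kept: $y_0 - \bar{y} - b_1 d \ge 0$ for the upper limit and $\le 0$ for the lower limit. The result is "exact" only in the sense that it is a closed-form function of the estimates $b_1$, $\bar{y}$, and $s$, which are themselves estimated from the data.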

 

In short, what you ask might be possible for a linear model of one variable, but I would be reluctant to attempt it myself. Since most people will use a numerical computation of the inverse of X`X and a numerical root-finding method, there is no advantage over the "scoring method" that I showed in the other post.
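As an illustration of the numerical route described above, here is a minimal SAS/IML sketch for the model in the question (weight = sex height on sashelp.class). It is one possible approach, not a definitive implementation. It assumes a reference-cell design matrix (intercept, an indicator for Sex="F", and Height), which should give the same fitted values as the PROC GLM parameterization. The module name UpperMinusTarget, the variable names, and the bracketing interval {50 75} are choices made for this example.

```sas
proc iml;
use sashelp.class;
read all var {Height Weight} into D;
read all var {Sex} into Sex;
close sashelp.class;

h = D[,1];  y = D[,2];  n = nrow(D);
X    = j(n,1,1) || (Sex = "F") || h;   /* intercept, F indicator, Height */
XpXi = inv(X`*X);                      /* inverse of X`X */
b    = XpXi * X`*y;                    /* OLS estimates */
dfe  = n - ncol(X);                    /* error degrees of freedom */
s    = sqrt( ssq(y - X*b) / dfe );     /* root MSE */
tc   = quantile("t", 0.975, dfe);      /* two-sided 95% critical value */
target = 100;                          /* horizontal line Weight=100 */

/* upper 95% CLM minus the target, as a function of scalar height */
start UpperMinusTarget(ht) global(XpXi, b, s, tc, target, isF);
   v = 1 || isF || ht;                 /* row of the design matrix */
   return( v*b + tc*s*sqrt(v*XpXi*v`) - target );
finish;

groups = {"M","F"};
cross  = j(2,1,.);
do i = 1 to 2;
   isF = (groups[i] = "F");            /* select the sex group */
   cross[i] = froot("UpperMinusTarget", {50 75});
end;
print cross[rowname=groups label="Height where upper 95% CLM = 100"];
quit;
```

FROOT needs the function to change sign on the bracketing interval and returns a missing value otherwise, so the interval should be chosen by looking at the plot. For the lower limit, change the plus sign in front of tc to a minus sign.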

 

yunqiang12
Calcite | Level 5

Hi Rick, thanks a lot for your response. In the pharmaceutical field, we need to know when the attributes of the drug will exceed the criteria. As you said, this is possible when there is one predictor.

Rick_SAS
SAS Super FREQ

If you want more help, post some sample data, the model you want to fit, and the threshold value.

yunqiang12
Calcite | Level 5

Thanks Rick. Please see the following code; I can't make the FROOT function work.

 

data d3lots;
input Lot $ Time Result;
datalines;
A 0 0
A 3 0.06
A 6 0.08
A 9 0.11
A 12 0.13
A 18 0.21
A 24 0.21
A 36 0.34
B 0 0
B 3 0.05
B 6 0.08
B 9 0.11
B 12 0.11
B 18 0.19
B 24 0.2
B 36 0.31
C 0 0
C 3 0.05
C 6 0.08
C 9 0.11
C 12 0.11
C 18 0.2
C 24 0.22
C 36 0.43
;
run;

ods trace on;
proc glm data=d3lots;
class Lot;
model Result=Time Lot Time*Lot /e3 solution p clm inverse;
ods output InvXPX=inv FitStatistics=rootmse;
run;
ods trace off;

/* get x'x-1 matrix and betahat */
data inv(drop=Parameter Result) bhat(keep=Result);
set inv;
where Parameter ne 'Result';
run;

/* get sigmahat */
data rtmse;
set rootmse(keep=RootMSE);
run;

/* find the x value when 95% confidence interval of predicted mean cross the criteria Y value. */
proc iml;
start Func(x);
0.3=x*bhat + tinv(0.975, 22)*sqrt(x*inv*x`); /*0.3 is the criteria. */
finish Func;
interval = {25 36};
roots = froot( "Func", interval);
print roots;
quit;

Rick_SAS
SAS Super FREQ

The FUNC function would have to have a body like

return ( x*bhat + tinv(0.975, 22)*sqrt(x*inv*x`) - 0.3 );   /* find the zero of this function */

 

However, the equation you wrote is a multivariate function. Your original picture indicates that x is a scalar value, but in your equation, x would have to be an 8-dimensional vector. The locus of x values for which the equation is satisfied is a 7-dimensional surface, which cannot be easily found. (For example, the 2-dimensional surface y=x1^2 + x2^2 intersects the plane y=0.3 on the 1-dimensional circle x1^2 + x2^2=0.3.)

 

In other words, there are infinitely many values of x that satisfy your equation.
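One way to get back to a scalar problem is to fix the Lot and let only Time vary, so that every entry of the row vector is determined by a single number. The following is a minimal SAS/IML sketch of that idea for the d3lots data, under assumptions that are not part of the original posts: a reference-cell design matrix with 6 columns (Lot A as the reference level) instead of the 8-column GLM parameterization, the bracketing interval {0 36}, and the names UpperMinusCrit, dB, and dC. Two details differ from the code above: the standard error of the predicted mean includes the root MSE as a factor, and the error DF for this model is 24 - 6 = 18.

```sas
proc iml;
use d3lots;
read all var {Time Result} into D;
read all var {Lot} into Lot;
close d3lots;

tm = D[,1];  y = D[,2];  n = nrow(D);
isB = (Lot = "B");  isC = (Lot = "C");          /* Lot A is the reference */
X    = j(n,1,1) || tm || isB || isC || tm#isB || tm#isC;
XpXi = inv(X`*X);                               /* inverse of X`X */
b    = XpXi * X`*y;                             /* OLS estimates */
dfe  = n - ncol(X);                             /* 24 - 6 = 18 */
s    = sqrt( ssq(y - X*b) / dfe );              /* root MSE */
tc   = quantile("t", 0.975, dfe);
crit = 0.3;                                     /* the criterion */

/* upper 95% CLM minus the criterion, as a function of scalar time */
start UpperMinusCrit(t0) global(XpXi, b, s, tc, crit, dB, dC);
   v = 1 || t0 || dB || dC || t0*dB || t0*dC;   /* design row for this lot */
   return( v*b + tc*s*sqrt(v*XpXi*v`) - crit );
finish;

lots  = {"A","B","C"};
cross = j(3,1,.);
do i = 1 to 3;
   dB = (lots[i] = "B");                        /* dummy values for lot i */
   dC = (lots[i] = "C");
   cross[i] = froot("UpperMinusCrit", {0 36});
end;
print cross[rowname=lots label="Time where upper 95% CLM = 0.3"];
quit;
```

The estimates are computed inside PROC IML here, which avoids having to pass the InvXPX table from PROC GLM into the module. If the PROC GLM output is used instead, the matrices must be read into IML with USE/READ and listed in the GLOBAL clause, because IML modules with arguments do not see variables from the main scope.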

 

 
