FelixHugh
Calcite | Level 5

Hi, I want to constrain a linear regression model (y = b0+b1*x1+b2*x2+e) so that the predicted value (b0+b1*x1+b2*x2) is positive. Is there a SAS procedure that can be used for this purpose? Any help is greatly appreciated!

SteveDenham
Jade | Level 19

Easy way: Fit log(y) and back transform.

More difficult way: Use PROC MODEL with a RESTRICT statement.  I know it can be done, but I have never tried it myself.

Steve Denham
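
For reference, the "easy way" above might look something like the following. This is only a sketch: the data set WORK.HAVE and the variable names y, x1, and x2 are assumed for illustration, and it needs y > 0 in every observation used for the fit. The back-transformed prediction is always positive, but it estimates the median of y rather than the mean when the errors are normal on the log scale.

```sas
/* Assumed input: WORK.HAVE with variables y (> 0), x1, x2 */
data work.have_log;
   set work.have;
   if y > 0 then logy = log(y);   /* log is undefined for y <= 0; logy stays missing */
run;

/* Ordinary least squares on the log scale */
proc reg data=work.have_log;
   model logy = x1 x2;
   output out=work.pred_log p=pred_logy;   /* predicted log(y) */
run;
quit;

/* Back-transform: exp() of any real number is positive */
data work.pred;
   set work.pred_log;
   pred_y = exp(pred_logy);
run;
```

The coefficients from this fit are on the log scale, so they are not directly comparable to coefficients from a model fitted to y itself.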

FelixHugh
Calcite | Level 5

Thank you Steve for your suggestions!

I did try PROC MODEL, but I found it difficult to set up a RESTRICT statement that keeps all the predicted values of y positive.

When I used log(y) as the dependent variable, the parameter estimates were not close to the values in the paper I am trying to replicate.

Ksharp
Super User

Doc Steve,

If I want all the coefficients (b0, b1, b2) to be greater than zero, what should I do?

Best.

Xia Keshan

FelixHugh
Calcite | Level 5

Hi, sorry to jump in. I think you could try specifying multiple restrictions in the RESTRICT statement (restrict b0>0, b1>0, b2>0).

Rick_SAS
SAS Super FREQ

The RESTRICT statement in PROC REG only handles equality constraints, but you can use the BOUNDS statement in PROC NLIN to restrict the range of the parameters.
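
A minimal sketch of the BOUNDS approach, fitting the linear model from the original question in PROC NLIN. The data set name WORK.HAVE and the starting values are assumptions chosen for illustration, not recommendations.

```sas
/* Assumed input: WORK.HAVE with variables y, x1, x2 */
proc nlin data=work.have;
   parms b0=0.1 b1=0.1 b2=0.1;          /* starting values are guesses */
   bounds b0 >= 0, b1 >= 0, b2 >= 0;    /* keep every coefficient nonnegative */
   model y = b0 + b1*x1 + b2*x2;
run;
```

Bounding the coefficients this way does not by itself guarantee positive predicted values if x1 or x2 can be negative.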

Rick_SAS
SAS Super FREQ

As stated, your problem is impossible unless b1=b2=0 and b0>0.  Otherwise there will always be a value of (x1, x2) for which the predicted value will be negative.

If you know that (x1,x2) are restricted to some domain (like the unit square), then it is possible.
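
To spell out the argument, here is a short worked version. The unit square is used only because it is the example domain mentioned above; the corner conditions follow from the fact that a linear function on a rectangle attains its minimum at a corner.

```latex
\begin{align*}
\hat{y}(x_1, x_2) &= b_0 + b_1 x_1 + b_2 x_2 \\[4pt]
\text{If } b_1 \neq 0:\quad
\hat{y}\!\left(-\frac{b_0 + 1}{b_1},\; 0\right) &= b_0 - (b_0 + 1) = -1 < 0 \\[4pt]
\text{On } [0,1]^2:\quad
\hat{y} > 0 \text{ everywhere} &\iff
b_0 > 0,\;\; b_0 + b_1 > 0,\;\; b_0 + b_2 > 0,\;\; b_0 + b_1 + b_2 > 0
\end{align*}
```

The same construction with x2 in place of x1 covers the case b2 ≠ 0.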

FelixHugh
Calcite | Level 5

Thank you Rick for your reply!

The dependent variable is variance in stock market returns, which is why I want to make sure the predicted variance is positive.

The independent variables include variance and market return from last period. Market return can be negative.

Rick_SAS
SAS Super FREQ

In survival analysis, logistic regression, Poisson regression, and other models, the linear portion of the model is transformed by a "link function" to ensure that the result is positive.
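
As an illustration, take the log link (one common choice; the appropriate link depends on the model). The linear predictor can take any real value, but the modeled mean is always positive:

```latex
\begin{align*}
\log \mathrm{E}[y] &= b_0 + b_1 x_1 + b_2 x_2 \\
\mathrm{E}[y] &= \exp(b_0 + b_1 x_1 + b_2 x_2) > 0
\quad \text{for all } (x_1, x_2)
\end{align*}
```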

lvm
Rhodochrosite | Level 12

Because your response variable is a variance, I recommend that you model it as a gamma distribution, with a log link. That is, use GENMOD or GLIMMIX, and choose dist=gamma and link=log. The gamma often works very well as an approximation for the true distribution of a variance at small (finite) sample sizes. You can still get predictions for the original scale in an output file. For instance,

proc glimmix;
   model var = .... / dist=gamma link=log s;
   output out=pred pred(blup ilink)=predicted;
run;

There are many other options for the output file.

Schabenberger and Pierce (2002 textbook) give an example of a regression analysis of a variance dependent variable using this idea (but with GENMOD).
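
A comparable GENMOD call might look like the sketch below. The data set WORK.HAVE and the variable names var_t (current-period variance), var_lag (last-period variance), and ret_lag (last-period market return) are hypothetical, and this is not the textbook's example.

```sas
/* Assumed input: WORK.HAVE with var_t (> 0), var_lag, ret_lag */
proc genmod data=work.have;
   model var_t = var_lag ret_lag / dist=gamma link=log;
   /* PRED= is the predicted mean on the original scale (inverse link applied); */
   /* XBETA= is the linear predictor on the log scale                           */
   output out=work.pred pred=predicted xbeta=linpred;
run;
```

Because predicted = exp(linpred), the predicted variance is positive even when ret_lag is negative.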

FelixHugh
Calcite | Level 5

Thank you very much!

I will try that.

Ksharp
Super User

Doc Rick.

What about survival analysis? In survival analysis, the dependent variable (survival time) is always > 0.
