Hi
I am trying to match output that SAS produces in weighted ridge regression by doing an optimization routine.
What I implemented in SAS:
proc reg data=dataset
outest=b outseb ridge = 0.5;
OurPick5: model varY = varX1 varX2 varX3;
weight var5;
run;
proc
In doing this in excel through an optimization routine, I am unable to get the output that SAS produces.
I have learnt the following that is also perplexing to me and would like a betterunderstanding:
1. If my weights are constant and 1, I get the same adjusted coeffcieints in the ridge regression (after adjustment of multiplying with the inverse of the standard deviation of the Xs) as I do with a ridge regression with weights that are constant yet notEqualTo 1. I expect this in OLS but didnt expect this in ridge. Can someone explain why please?
2. When I have a sequence of the weights, whyadjustment do I need to apply to my excel/optimiaztion coefficients to recover the ridge and weighted coefficients that SAS produces.
Hope this makes sense, any help would be greatly appreciated!
Yes. "#" is the elementwise multiplication operator. The line multiplies the i_th row of X by w[i].
As a rule of thumb, weighted regression uses the normal equations X`WX on the left and X`WY on the right. Thus you can get equivalent results by multiplying each observation by the square-root of the weight and using ordinary regression (in Excel, for example). That is, define V = sqrt(W)*X and U = sqrt(W)*Y and perform an ordinary regression of U on V. Notice that if you rescale the vector of weights by any constant, then the results are exactly the same. Some textbooks assume that the weights are often standardized so that sum(W) = 1, which makes some formulas easier to write.
So (1) the answers are the same because you are multiplying the weights by a constant.
(1) the answers are the same because you are multiplying the weights by a constant.
(2) I don't know what your Excel computations look like, but try multiplying the X's and Y's by sqrt of the weights.
Thanks Rick
I get the results of SAS if I multiply the weights with the residuals but not if I multiply the data with the weights (or their sqrt). I find this very puzzling... Im guessing there is some adjustment for the betas that I am missing to replicate what the results of SAS are?
Also, does SAS do some normalization if the weights dont sum up to one? Because in my case, the weights do not sum to one and I dont want them to (i.e. I dont want to use the reweight command)
I suggest you post your SAS code and results so that we can see what you are talking about.
I don't think that PROC REG explicitly standardizes the weight variable, but as I've indicated the parameter estimates are identical to what you get when the weights are standardized.
Hi
Here is the SAS code
------------------------------------------------------
proc reg data=GlobalRegData outest=b outseb ridge = 0.5; OurPick5: model InflationAdj_Log_Ch = InflationAdj_Log_Lag12 GdpRealYoY_Lag12 Yld10yr_Ch_Lag12 GdpRealYoY_Ch_Lag12 InflationAdj_Log_Ch_Lag12; weight SqrtGdpwgtNorm; where 195001<=Eom<=200512 and CountryID^=100; run; proc print data=b; run;
-----------------------------------------------------
The output from SAS is:
SAS Output
The SAS System |
12510 |
7209 |
5301 |
5 | 8.14800 | 1.62960 | 466.74 | <.0001 |
7203 | 25.14870 | 0.00349 | ||
7208 | 33.29669 |
0.05909 | 0.2447 |
-0.00886 | 0.2442 |
-666.73591 |
-0.53073 | 0.01686 | -31.49 | <.0001 |
-0.15417 | 0.00621 | -24.84 | <.0001 |
3.63783 | 0.11334 | 32.10 | <.0001 |
4.43167 | 0.24145 | 18.35 | <.0001 |
-1.35277 | 0.09557 | -14.15 | <.0001 |
-0.19969 | 0.01158 | -17.24 | <.0001 |
SAS Output
OurPick5 | PARMS | InflationAdj_Log_Ch | . | . | 0.059088 | -0.53073 | -0.15417 | 3.63783 | 4.43167 | -1.35277 | -0.19969 | -1 |
OurPick5 | SEB | InflationAdj_Log_Ch | . | . | 0.059088 | 0.01686 | 0.00621 | 0.11334 | 0.24145 | 0.09557 | 0.01158 | -1 |
OurPick5 | RIDGE | InflationAdj_Log_Ch | 0.5 | . | 0.060758 | -0.33352 | -0.10023 | 1.90394 | 2.36412 | -0.14068 | -0.09679 | -1 |
OurPick5 | RIDGESEB | InflationAdj_Log_Ch | 0.5 | . | 0.060758 | 0.01067 | 0.00397 | 0.05857 | 0.15231 | 0.04671 | 0.00678 | -1 |
Objective function | 509.574 | ||||||
Weighted residual sum of squares | 478.758 | ||||||
Ridge penalty function | 30.815 | ||||||
GdpRealYoY_Ch | Constant | InflationAdj_Log_Lag12 | GdpRealYoY_Lag12 | Yld10yr_Ch_Lag12 | GdpRealYoY_Ch_Lag12 | InflationAdj_Log_Ch_Lag12 | |
Betas | -0.0508 | 0.0644 | 0.0315 | -0.0054 | -0.0284 | ||
3604.5 | 3604.5 | 3604.5 | 3604.5 | 3604.5 | |||
Ridge coefficients | 0.5000 | 0.5000 | 0.5000 | 0.5000 | 0.5000 | ||
-0.33747425 | -0.102608608 | 1.852412258 | 2.407528659 | -0.122062167 | -0.096160888 | ||
Notice the slight difference: I am sure this is because of weights and ridge together. When I run the corresponding results for no ridge/weights and ridge/no weights, I can match the SAS output to the 6th decimal place.
Have you read the article about "Understanding ridge regression in SAS", including the part about centered and scaling the data? There is a SAS/IML version of ridge regression, which should be similar to your MATLAB version.
Yes Rick, I came across this. I de-meaned my data and scaled each variable with its stdev. which is what I read there.
I didnt see anything in terms of doing anything to weights on the page though...
Did you use WEIGHTED means to center the data?
The following program is from the blog post that I linked to. The only substantive changes are (1) use weighted means to center the data, and (2) use weights in matrix computations. The program gets the same parameter estimates as PROC REG, up to numerical precision.
/* Ridge regression example from PROC REG documentation */
data acetyl;
input x1-x4 @@;
x1x2 = x1 * x2;
x1x1 = x1 * x1;
call streaminit(12345);
w = rand("uniform");
datalines;
1300 7.5 0.012 49 1300 9 0.012 50.2 1300 11 0.0115 50.5
1300 13.5 0.013 48.5 1300 17 0.0135 47.5 1300 23 0.012 44.5
1200 5.3 0.04 28 1200 7.5 0.038 31.5 1200 11 0.032 34.5
1200 13.5 0.026 35 1200 17 0.034 38 1200 23 0.041 38.5
1100 5.3 0.084 15 1100 7.5 0.098 17 1100 11 0.092 20.5
1100 17 0.086 29.5
;
proc reg data=acetyl outest=b ridge=0.5 noprint;
model x4=x1 x2 x3 x1x2 x1x1;
weight w;
run;
proc print data=b(where=(_TYPE_="RIDGE")) noobs;
var _RIDGE_ Intercept x1 x2 x3 x1x2 x1x1;
run;
/* Ridge regression coefficients computed in SAS/IML */
proc iml;
use acetyl;
read ALL var {x1 x2 x3 x1x2 x1x1} into X;
read all var {x4} into y;
read all var {w};
close acetyl;
use b where(_TYPE_="RIDGE");
read all var {_RIDGE_} into k;
read all var {Intercept x1 x2 x3 x1x2 x1x1} into RegB;
close;
start WtMean(x, w);
m = (x#w)[+,] / sum(w); /* compute weighted mean */
return( m );
finish;
/* ridge regression */
xBar = WtMean(X,w); /* save row vector of means */
yBar = WtMean(y,w); /* save mean of Y */
X = X - xBar; /* center X and y; do not add intercept */
y = y - yBar;
/* include weights for matrix computations */
X = sqrt(w)#X;
y = sqrt(w)#y;
D = vecdiag(X`*X);
Z = X / sqrt(D`); /* Z`Z = corr(X) */
b = (1/sqrt(D)) # inv(Z`*Z + k*I(ncol(X))) * (Z`*y); /* formula in REG doc */
/* Recover the intercept */
b0 = ybar - xbar * b;
b = b0 || b`;
print b[colname={"Intercept" "x1" "x2" "x3" "x1x2" "x1x1"}];
diff = (RegB-b);
print diff;
Hi Rick
This is very helpful!
One question: I dont understand the computation done in this line:
/* include weights for matrix computations */
X = sqrt(w)#X;
Is the previous de-meaned X being multiplied by its weight?
Yes. "#" is the elementwise multiplication operator. The line multiplies the i_th row of X by w[i].
Thanks a lot Rick, that worked.
Even SAS user help didnt know all the info to recreate the results from SAS.
They should not just hire you they should make you their client service head 😉
I know you meant it as a joke/compliment, but I wouldn't be critical of Tech Support. How to generalize an analysis to handle weights is rarely clear or simple. Most textbooks and papers in journals do not describe how an algorithm should be modified for weights, and in this case the SAS documentation does not provide details for the weighted computation.
Even for a simple statistic like a weighted standard deviation, there is no general consensus about how to compute it, which is why SAS provided the VARDEF= option in many procedures for controlling how variances are computed.
Thanks Rick.
I have a related question: I implemented your IML code with a small alteration. While yours works perfectly, mine has the following error:
107 D = vecdiag(X`*X);
ERROR: (execution) Invalid argument or operand; contains missing values.
The (X`*X) is causing the issue.
Here is the code Im running
/* Ridge regression example from PROC REG documentation data acetyl; input x1-x4 @@; x1x2 = x1 * x2; x1x1 = x1 * x1; call streaminit(12345); w = rand("uniform"); datalines; 1300 7.5 0.012 49 1300 9 0.012 50.2 1300 11 0.0115 50.5 1300 13.5 0.013 48.5 1300 17 0.0135 47.5 1300 23 0.012 44.5 1200 5.3 0.04 28 1200 7.5 0.038 31.5 1200 11 0.032 34.5 1200 13.5 0.026 35 1200 17 0.034 38 1200 23 0.041 38.5 1100 5.3 0.084 15 1100 7.5 0.098 17 1100 11 0.092 20.5 1100 17 0.086 29.5 ; proc reg data=acetyl outest=b ridge=0.5 noprint; model x4=x1 x2 x3 x1x2 x1x1; weight w; run; proc print data=b(where=(_TYPE_="RIDGE")) noobs; var _RIDGE_ Intercept x1 x2 x3 x1x2 x1x1; run; */ libname CmeData "\\link\CME_SAS_inputData_differentPeriods"; run; data GlobalRegData; set data_database; output GlobalRegData; run; /*************************************************************** Global Curvature ****************************************************************/ proc reg data=GlobalRegData outest=bb ridge = 0.5; OurPick3: model Curvature_Ch = Curvature_Lag12 Yld10yr_Ch; weight SqrtGdpwgtNorm; where 195001<=Eom<=200512 and CountryID^=100; run; proc print data=bb; run; /* Ridge regression coefficients computed in SAS/IML */ proc iml; use GlobalRegData; read ALL var {Curvature_Lag12 Yld10yr_Ch} into X; read all var {Curvature_Ch} into y; read all var {SqrtGdpwgtNorm}; close GlobalRegData; /* use b where(_TYPE_="RIDGE"); read all var {_RIDGE_} into k; read all var {Intercept x1 x2 x3 x1x2 x1x1} into RegB; close; */ start WtMean(x, w); m = (x#w)[+,] / sum(w); /* compute weighted mean */ return( m ); finish; w=SqrtGdpwgtNorm; /* ridge regression */ xBar = WtMean(X,w); /* save row vector of means */ yBar = WtMean(y,w); /* save mean of Y */ X = X - xBar; /* center X and y; do not add intercept */ y = y - yBar; print xBar; /* include weights for matrix computations */ X = sqrt(w)#X; y = sqrt(w)#y; D = vecdiag((X`)*X); print (X`); Z = X / sqrt(D`); /* Z`Z = corr(X) */ b = (1/sqrt(D)) # inv(Z`*Z + k*I(ncol(X))) * (Z`*y); /* formula in REG doc */
As the error message says, you can't multiply two matrices that have missing values. Imitate PROC REG and remove the missing values from the data before you perform the analysis.
Hello JB1983
Did you get standard error or t value in the ridge regression output?
if yes then how?
if no then how will you explain the significance of beta coefficients. Also what about model fit using f value, r square/adj r square.
Please help.
Thank you.
Rajneesh Ranjan Jha
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.