plf515
Lapis Lazuli | Level 10

In SAS PROC ROBUSTREG you can set K1, which affects the efficiency of the procedure. But I didn't see anything in the documentation about exactly what "efficiency" means, or about the advantages of changing K1 from its default value.

 

Any insights would be appreciated.

Accepted Solutions
Rick_SAS
SAS Super FREQ

1. K1 does not affect the efficiency of the procedure; it affects the efficiency of the estimator.

We know that, under the usual assumptions of linear regression, the least squares estimates of the betas are BLUE (best linear unbiased estimators). The ROBUSTREG doc seems to be saying that, when the scaling parameter k1 is properly chosen, the M estimator achieves a certain percentage of the efficiency of the OLS estimator. In other words, when the errors really are Gaussian, the M estimates for the betas have more variance than the OLS estimates (they have to), but not too much more.
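One way to make "efficiency" concrete, assuming the doc means asymptotic relative efficiency under Gaussian errors (my reading, not a quote from the documentation), is to compare the large-sample variances of the two estimators of the same coefficient:

```latex
\mathrm{eff}\bigl(\hat\beta_{M}\bigr)
  \;=\;
  \frac{\operatorname{Avar}\bigl(\hat\beta_{\mathrm{OLS}}\bigr)}
       {\operatorname{Avar}\bigl(\hat\beta_{M}\bigr)}
  \;\le\; 1
  \qquad \text{(Gaussian errors)}
```

Under that reading, an efficiency of 0.85 would mean the M estimate needs roughly 1/0.85, or about 1.18 times as many observations to match the variance of the OLS estimate.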

 

2. The k1 parameter simply scales the function used to penalize large residuals. For OLS, the penalty function is the quadratic function, and we try to minimize the sum of the SQUARES of the residuals. For M estimation, we replace the quadratic function with a different function that caps the penalty assigned to extreme residuals. The Tukey and Yohai functions are two choices. You minimize the sum of the "Tukey function" (or "Yohai function") of the residuals. The following program graphs the Tukey function (with k1=3.440) and the Yohai function (with k1=0.868) alongside the quadratic function. For large residuals (large values of |s|), the penalty from Tukey or Yohai is much less than for the quadratic function that OLS uses.

 

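data Rho;
/* polynomial coefficients for the middle piece of the Yohai function */
b0 = 1.792; b1 = -0.972; b2 = 0.432; b3 = -0.052; b4 = 0.002;
do s = -5 to 5 by 0.1;            /* s = size of the residual */
   /* Tukey function: polynomial for |s| <= k1, constant 1 beyond that */
   k1 = 3.440;
   t = s / k1;
   if abs(s) <= k1 then
      Tukey = 3*t**2 - 3*t**4 + t**6;
   else Tukey=1;

   /* Yohai function: quadratic near 0, polynomial blend, then constant */
   k1 = 0.868;
   t = s / k1;
   if abs(s) <= 2*k1 then
      Yohai = s**2/2;
   else if 2*k1 < abs(s) and abs(s) <= 3*k1 then
      Yohai = k1**2 * (b0+b1*t**2 + b2*t**4 + b3*t**6 + b4*t**8);
   else Yohai = 3.25*k1**2;

   /* OLS penalty, for comparison */
   Quadratic = s**2;
   if Quadratic > 3 then Quadratic=.;  /* cap the height of the quadratic function */
   output;
end;
run;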

proc sgplot data=rho;
series x=s y=Tukey / curvelabel;
series x=s y=Yohai / curvelabel;
series x=s y=Quadratic / curvelabel;
xaxis label="Size of Residual";
yaxis label="Value of Penalty Function";
run;
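
To tie point 2 back to point 1, here is a minimal sketch that numerically evaluates the asymptotic efficiency implied by the Tukey function at k1=3.440. It assumes standard normal errors and a known scale, and it uses the textbook formula for an M estimator, eff = (E[psi'])**2 / E[psi**2], where psi is the derivative of the penalty function. The data set name TukeyEff, the variable names, and the grid size n are my own choices for illustration; none of this is what PROC ROBUSTREG does internally.

```sas
/* Sketch: asymptotic Gaussian efficiency of the Tukey M estimator.   */
/* Assumes standard normal errors and known scale (illustration only) */
data TukeyEff;
   k1 = 3.440;                    /* Tukey tuning constant used in the plot */
   n  = 20000;                    /* midpoint-rule cells on [-k1, k1]       */
   ds = 2*k1 / n;
   EPsiPrime = 0;                 /* accumulates E[psi'(s)]                 */
   EPsiSq    = 0;                 /* accumulates E[psi(s)**2]               */
   do i = 1 to n;
      s   = -k1 + (i - 0.5)*ds;   /* midpoint of the i-th cell              */
      t   = s / k1;
      phi = pdf('NORMAL', s);     /* standard normal density                */
      /* derivatives of rho(s) = 3*t**2 - 3*t**4 + t**6 with t = s/k1;      */
      /* both are 0 for |s| > k1, so the integrals stop at +/- k1           */
      psi      = (6*t - 12*t**3 + 6*t**5) / k1;
      psiPrime = (6 - 36*t**2 + 30*t**4) / k1**2;
      EPsiPrime = EPsiPrime + psiPrime * phi * ds;
      EPsiSq    = EPsiSq    + psi**2   * phi * ds;
   end;
   Efficiency = EPsiPrime**2 / EPsiSq;
   put Efficiency= 6.4;           /* write the result to the log            */
   keep k1 Efficiency;
run;
```

If the arithmetic is right, the value written to the log should be close to 0.85. Rerunning with a larger k1 pushes the efficiency toward 1 (the estimator behaves more like OLS), at the cost of less protection against outliers.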
