Solved: Re: Advantages of higher or lower K1 in PROC ROBUSTREG

plf515 · Posted 12-17-2019 12:50 PM

In SAS PROC ROBUSTREG you can set K1, which affects the efficiency of the procedure. But I didn't see anything in the documentation about exactly what "efficiency" means nor about the advantages of changing K1 from its default value.

Any insights would be appreciated.

Rick_SAS · Posted 12-17-2019 02:03 PM

1. K1 does not affect the efficiency of the procedure; it affects the efficiency of the estimator.

Under the usual assumptions of linear regression, the least squares estimates of the betas are BLUE (best linear unbiased estimators). The ROBUSTREG doc seems to be saying that the efficiency of the M estimator is a certain percentage of the efficiency of the OLS estimates when the scaling parameter k1 is properly chosen. In other words, the M estimates for the betas have more variance (they have to), but not too much more.
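To make "efficiency" concrete, here is one standard way to write it. This is a sketch from general M-estimation theory under normal errors with the scale treated as known, not a quote from the ROBUSTREG doc. The symbol psi (the derivative of the penalty function rho) is my notation.

```latex
\[
\operatorname{eff}(k_1)
  \;=\; \frac{\text{asymptotic variance of } \hat\beta_{\mathrm{OLS}}}
             {\text{asymptotic variance of } \hat\beta_{\mathrm{M}}}
  \;=\; \frac{\bigl(E[\psi'(s)]\bigr)^{2}}{E\bigl[\psi(s)^{2}\bigr]},
  \qquad s \sim N(0,1), \quad \psi = \rho'_{k_1}.
\]
```

The ratio is at most 1 and approaches 1 as k1 grows, because the penalty function then looks quadratic over the range where nearly all residuals fall. A smaller k1 buys more resistance to outliers at the cost of a lower ratio.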

2. The k1 parameter simply scales the function used to penalize large residuals. For OLS, the penalty function is the quadratic function and we try to minimize the sum of the SQUARES of the residuals. For M estimation, we replace the quadratic function with a different function that caps the penalty given to extreme residuals. The Tukey and Yohai functions are two choices. You minimize the sum of the "Tukey function" (or "Yohai function") of the residuals. The following program graphs the Tukey and Yohai functions alongside the quadratic function. For large residuals (large values of s), the penalty from Tukey or Yohai is much less than for the quadratic function that OLS uses.

data Rho;
   /* polynomial coefficients for the middle piece of the Yohai function */
   b0 = 1.792; b1 = -0.972; b2 = 0.432; b3 = -0.052; b4 = 0.002;
   do s = -5 to 5 by 0.1;
      /* Tukey function: polynomial in t = s/k1 inside [-k1, k1], constant outside */
      k1 = 3.440;
      t = s / k1;
      if abs(s) <= k1 then
         Tukey = 3*t**2 - 3*t**4 + t**6;
      else Tukey = 1;

      /* Yohai function: quadratic near 0, polynomial bridge, then constant */
      k1 = 0.868;
      t = s / k1;
      if abs(s) <= 2*k1 then
         Yohai = s**2/2;
      else if 2*k1 < abs(s) and abs(s) <= 3*k1 then
         Yohai = k1**2 * (b0 + b1*t**2 + b2*t**4 + b3*t**6 + b4*t**8);
      else Yohai = 3.25*k1**2;

      /* OLS penalty for comparison */
      Quadratic = s**2;
      if Quadratic > 3 then Quadratic = .;  /* cap the height of the quadratic function */
      output;
   end;
run;

proc sgplot data=Rho;
   series x=s y=Tukey / curvelabel;
   series x=s y=Yohai / curvelabel;
   series x=s y=Quadratic / curvelabel;
   xaxis label="Size of Residual";
   yaxis label="Value of Penalty Function";
run;
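To see how k1 trades robustness for efficiency, here is a rough numerical sketch for the Tukey function only. It assumes standard normal errors with known scale and uses the textbook ratio (E[psi'])**2 / E[psi**2], where psi is the derivative of the Tukey function with its constant factor dropped (the factor cancels in the ratio). The data set name TukeyEff, the variable names, the midpoint-rule integration, and the two k1 values other than 3.440 are my own choices for illustration, not anything ROBUSTREG does internally.

```sas
/* Sketch: Gaussian efficiency of the Tukey M estimator as a function of k1. */
/* Assumes s ~ N(0,1); integrates over [-k1, k1], where psi is nonzero.      */
data TukeyEff;
   nSteps = 20000;                        /* number of midpoint-rule cells */
   do k1 = 3.440, 3.880, 4.685;           /* illustrative tuning constants */
      ds = 2*k1 / nSteps;
      EPsiPrime = 0;
      EPsiSq    = 0;
      do i = 1 to nSteps;
         s = -k1 + (i - 0.5)*ds;          /* midpoint of cell i */
         t = s / k1;
         w = pdf('NORMAL', s) * ds;       /* standard normal density times cell width */
         psi      = s * (1 - t**2)**2;            /* proportional to d(Tukey)/ds */
         psiPrime = (1 - t**2) * (1 - 5*t**2);    /* derivative of psi with respect to s */
         EPsiPrime = EPsiPrime + psiPrime * w;
         EPsiSq    = EPsiSq    + psi**2   * w;
      end;
      Efficiency = EPsiPrime**2 / EPsiSq;
      output;
   end;
   keep k1 Efficiency;
run;

proc print data=TukeyEff noobs;
   format Efficiency percent8.1;
run;
```

The efficiency should increase with k1. Commonly cited figures for the Tukey function are roughly 85% at k1 = 3.440 and 95% at k1 = 4.685, so a higher k1 gives estimates closer to OLS when the errors really are normal, while a lower k1 downweights moderate outliers more aggressively.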

