willi_m
Calcite | Level 5

Hi,

To estimate a LASSO regression (and later its extensions, such as the adaptive LASSO and the elastic net, followed by cross-validation), I want to implement the coordinate descent algorithm in SAS/IML.

 

In this algorithm, my objective function is the penalized residual sum of squares (PRSS), which I minimize successively along each coordinate of my vector beta. For the LASSO regression, each coordinate is updated with the soft-thresholding function.
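
For reference, a sketch of the standard coordinate-wise minimizer, assuming the objective is PRSS(beta) = ||y - X*beta||^2 + lambda*sum(|beta_j|), which is the form used in the code in this post. The symbols a_j and S are notation introduced here for illustration:

```latex
a_j = x_j^\top\bigl(y - X_{-j}\beta_{-j}\bigr), \qquad
\beta_j \leftarrow \frac{S(a_j,\ \lambda/2)}{x_j^\top x_j}, \qquad
S(a, t) = \operatorname{sign}(a)\,\max(|a| - t,\ 0)
```

Under this form, the update is exactly 0 whenever |a_j| <= lambda/2, which is what removes a predictor from the model.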

 

The problem I have run into is that the algorithm does not select the right predictors and does not shrink the coefficients. Could someone help me understand how to fix the problem, or at least locate it in my code?

To test the code, I used simulated data:

data donnee ;
array x{*} x1-x15 ;
do i=1 to 8 ;
    do j=1 to dim(x);
    x{j} = rannor(5)*20+6 ;
    end;
y= 3*x1+2*x5 + 6 + rannor(1);  *so the true Y;
output;
end;
drop i j ;
run;



So the algorithm should select x1, x5, and the intercept.
The algorithm:

 

quit;
PROC IML ;
use donnee ;
read all into G ;
close donnee ;

G = repeat(1,nrow(G))||G ;   *in order to have a column of 1 to estimate the intercept;
Y = G[,17];
X=G[,1:16];

p=ncol(X);
done=0;
lambda=0.5;
idx=do(1,p,1);
beta0 = inv(t(X)*X + lambda*I(p))*t(X)*Y;      *initial beta  (ridge regression);
PRSS0 = t(y-X*beta0)*(y-X*beta0)+lambda*(abs(beta0))[+]; *objective function  initial value;
beta1=beta0;


do j=1 to p ;
    b=setdif(idx,j);  *correspond to indice (-j) in the formula;
    do k=1 to 1000 until(done);
        a= t(X[,j])*(y-X[,b]*beta0[b]);
        Slamb = a#(1-(lambda/(2#abs(a))));
        beta1[j]=Slamb;                            *update the coordinate  beta;
        PRSS1 = t(y-X*beta1)*(y-X*beta1)+lambda*(abs(beta1))[+];    *recompute the PRSS with the new value of coordinate;
        if PRSS1 - PRSS0 >= 0 then do;  *stop criterion ;
            PRSS0 = PRSS1 ;    
            beta0 = beta1 ;
            end;
        else do ; *  if PRSS1 (with beta1) is less than PRSS0 (with initial beta) then PRSS1-PRSS0 is negative and we keep the coordinate;   
            done=1 ;
            end;
              
    end;
end;

 

Quit;

 

The algorithm ends up selecting the first 3 predictors and sets the others to 0, but these are not the right predictors.

If someone can help me, I would be very grateful.

 

PS: I want to use this algorithm (and not the LARS algorithm implemented in PROC GLMSELECT) because afterwards I want to try classification problems with penalization, such as penalized logistic regression.

1 REPLY 1
Rick_SAS
SAS Super FREQ

Do you have a reference that shows the formulas that you are attempting to implement? I suspect your DO loop

do k=1 to 1000 until(done);

is not correct, since DONE is never set to 1. 
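
For comparison, here is a minimal sketch of how a cyclic coordinate descent loop is often arranged: the outer loop repeats full sweeps until the coefficients stop changing, and the inner loop visits each coordinate once per sweep. It assumes the objective ||y - X*beta||^2 + lambda*sum(|beta_j|) from the original post. The module name SoftThresh, the tolerance tol, the zero starting vector, and the choice to leave the intercept unpenalized are all assumptions made for illustration, not part of the original code.

```sas
proc iml;
/* Soft-thresholding: sign(a) * max(|a| - t, 0). Hypothetical helper name. */
start SoftThresh(a, t);
   return( sign(a) # ((abs(a) - t) <> 0) );
finish;

use donnee;
read all var ("x1":"x15") into X0;
read all var {"y"} into y;
close donnee;

X = j(nrow(X0), 1, 1) || X0;     /* prepend a column of 1s for the intercept */
p = ncol(X);
lambda = 0.5;
beta = j(p, 1, 0);               /* start from zero (assumption) */
xx = X[##, ];                    /* sum of squares of each column, 1 x p */
tol = 1e-8;                      /* convergence tolerance (assumption) */

do iter = 1 to 1000 until(maxChange < tol);   /* one pass = one full sweep */
   maxChange = 0;
   do j = 1 to p;
      r = y - X*beta + X[,j]*beta[j];         /* residual without coordinate j */
      a = t(X[,j]) * r;
      if j = 1 then bnew = a / xx[j];         /* intercept not penalized (assumption) */
      else bnew = SoftThresh(a, lambda/2) / xx[j];
      maxChange = max(maxChange, abs(bnew - beta[j]));
      beta[j] = bnew;                         /* update in place before moving on */
   end;
end;
print iter, beta;
quit;
```

Note that the stopping flag here is recomputed on every sweep, so it cannot carry over from one coordinate to the next. With unstandardized predictors, a given lambda penalizes each coefficient on a different scale, so standardizing the columns first may be worth considering.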
