Hi,
To estimate a LASSO regression (and later its extensions such as the adaptive LASSO and the elastic net, followed by cross-validation), I want to code the coordinate descent algorithm in SAS/IML.
In this algorithm my objective function is the penalized residual sum of squares (PRSS), which I minimize successively along each coordinate of my vector beta. For the LASSO regression, each coordinate is updated with the soft-thresholding function.
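For reference, here is the usual form of that update, written out as a sketch. It assumes the PRSS is defined as in the code further down (residual sum of squares plus lambda times the sum of absolute coefficients); the symbols a_j and S are notation introduced here for illustration.

```latex
\mathrm{PRSS}(\beta) = \lVert y - X\beta \rVert_2^2 + \lambda \sum_{j} \lvert \beta_j \rvert

% partial residual correlation for coordinate j (all other coordinates held fixed)
a_j = x_j^{\top}\bigl(y - X_{-j}\,\beta_{-j}\bigr)

% coordinate-wise minimizer: soft-threshold a_j at lambda/2, then rescale
\beta_j \leftarrow \frac{S(a_j,\ \lambda/2)}{x_j^{\top} x_j},
\qquad
S(a, t) = \operatorname{sign}(a)\,\max\bigl(\lvert a \rvert - t,\ 0\bigr)
```

When |a_j| > lambda/2 the numerator equals a_j(1 - lambda/(2|a_j|)), which matches the Slamb expression in the code. The max(., 0) part, which sets a coefficient exactly to zero, and the division by x_j'x_j, which is needed unless the columns of X are scaled to unit norm, do not appear in that expression.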
The problem I have run into is that the algorithm does not select the right predictors and does not shrink the coefficients. Can someone help me understand how to fix the problem, or at least locate it in my code?
To test the code, I used simulated data:
data donnee ;
   array x{*} x1-x15 ;
   do i=1 to 8 ;
      do j=1 to dim(x);
         x{j} = rannor(5)*20+6 ;
      end;
      y= 3*x1+2*x5 + 6 + rannor(1); *so the true Y;
      output;
   end;
   drop i j ;
run;
So I would expect the algorithm to select x1, x5 and the intercept.
The algorithm:
quit;
PROC IML ;
use donnee ;
read all into G ;
close donnee ;
G = repeat(1,nrow(G))||G ; *in order to have a column of 1 to estimate the intercept;
Y = G[,17];
X=G[,1:16];
p=ncol(X);
done=0;
lambda=0.5;
idx=do(1,p,1);
beta0 = inv(t(X)*X + lambda*I(p))*t(X)*Y; *initial beta (ridge regression);
PRSS0 = t(y-X*beta0)*(y-X*beta0)+lambda*(abs(beta0))[+]; *objective function initial value;
beta1=beta0;
do j=1 to p ;
   b=setdif(idx,j); *correspond to indice (-j) in the formula;
   do k=1 to 1000 until(done);
      a= t(X[,j])*(y-X[,b]*beta0[b]);
      Slamb = a#(1-(lambda/(2#abs(a))));
      beta1[j]=Slamb; *update the coordinate beta;
      PRSS1 = t(y-X*beta1)*(y-X*beta1)+lambda*(abs(beta1))[+]; *recompute the PRSS with the new value of coordinate;
      if PRSS1 - PRSS0 >= 0 then do; *stop criterion ;
         PRSS0 = PRSS1 ;
         beta0 = beta1 ;
      end;
      else do ; * if PRSS1 (with beta1) is less than PRSS0 (with initial beta) then PRSS1-PRSS0 is negative and we keep the coordinate;
         done=1 ;
      end;
   end;
end;
Quit;
The algorithm ends up selecting the first 3 predictors and sets the others to 0, but those are not the right predictors.
If someone can help me, I would be very grateful.
PS: I want to use this algorithm (and not the LARS algorithm implemented in PROC GLMSELECT) because afterwards I want to try classification problems with penalization, such as penalized logistic regression.
Do you have a reference that shows the formulas that you are attempting to implement? I suspect your DO loop
do k=1 to 1000 until(done);
is not correct, since DONE is never set to 1.
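For comparison, here is a minimal sketch of cyclic coordinate descent for the lasso in SAS/IML, not a definitive implementation. It assumes the objective is ssq(y - X*beta) + lambda*sum(abs(beta)), with every coefficient penalized, including the intercept, to match the PRSS in the question. The module names SoftThresh and LassoCD and the arguments maxIter and tol are hypothetical names chosen for illustration, not built-in routines.

```sas
proc iml;
/* Soft-thresholding operator: sign(a) * max(|a| - thresh, 0) */
start SoftThresh(a, thresh);
   return( sign(a) # ((abs(a) - thresh) <> 0) );
finish;

/* Hypothetical helper: cyclic coordinate descent for
   ssq(y - X*beta) + lambda*sum(abs(beta)).
   Assumption: all columns are penalized, as in the question's PRSS;
   in practice the intercept is usually left unpenalized. */
start LassoCD(X, y, lambda, maxIter, tol);
   p = ncol(X);
   beta = j(p, 1, 0);                 /* start from beta = 0 */
   r = y - X*beta;                    /* current residual */
   converged = 0;
   /* sweep over ALL coordinates repeatedly until beta stops changing */
   do iter = 1 to maxIter until(converged);
      betaOld = beta;
      do jj = 1 to p;
         xj = X[, jj];
         a = t(xj) * (r + xj*beta[jj]);            /* x_j` * (y - X[,-j]*beta[-j]) */
         bNew = SoftThresh(a, lambda/2) / ssq(xj); /* threshold, then rescale */
         r = r - xj*(bNew - beta[jj]);             /* keep the residual in sync */
         beta[jj] = bNew;
      end;
      converged = ( max(abs(beta - betaOld)) < tol );
   end;
   return( beta );
finish;

use donnee;
read all var ("x1":"x15") into Z;
read all var {y} into y;
close donnee;
X = j(nrow(Z), 1, 1) || Z;            /* prepend the intercept column */
beta = LassoCD(X, y, 0.5, 1000, 1e-8);
print beta;
quit;
```

In this sketch the outer loop sweeps over all coordinates and stops when no coefficient changes by more than tol, rather than iterating on one coordinate at a time. Also note that the simulated data set has only 8 observations for 16 columns, so even a correct implementation may not recover x1 and x5; generating more observations is likely to give a clearer test.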