jwg
Calcite | Level 5

I am at my wit's end trying to find SAS implementations of several standard statistical procedures, having come from the world of R and Python. Can someone please help me? I am trying really hard to appreciate SAS.

 

How can one fit a logistic regression with a ridge (L2) penalty in SAS? According to comments here and here, this should already be implemented in SAS with PROC HPGENSELECT. But how? I am new to SAS, having come from the world of R. I am a little disoriented and am generally having a hard time finding R analogues in SAS.

 

Note: The Newton-Raphson with ridging method in PROC HPGENSELECT, which is applied "as needed", is probably there for numerical reasons when maximizing the likelihood (especially when there is multicollinearity). I am guessing that the ridge parameter there is tiny. Proper ridge regression, by contrast, penalizes large coefficients, and you must be able to tune the ridge parameter to your needs.
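
The distinction drawn in this note can be written out. This is only a sketch in generic notation, assuming the usual form of a ridged Newton-Raphson step; the symbols g, H, delta, and lambda are illustrative and are not taken from the PROC HPGENSELECT documentation.

```latex
% Ridged Newton-Raphson step (numerical stabilization only):
% g = gradient and H = Hessian of the log likelihood \ell, \delta > 0 small
\beta^{(t+1)} = \beta^{(t)} + \bigl(-H(\beta^{(t)}) + \delta I\bigr)^{-1} g(\beta^{(t)})
% A fixed point still satisfies g(\beta) = 0, i.e. it is the unpenalized MLE.

% Ridge (L2-penalized) logistic regression changes the objective itself:
\hat\beta_\lambda = \arg\max_\beta \Bigl\{ \ell(\beta) - \tfrac{\lambda}{2} \sum_{j \ge 1} \beta_j^2 \Bigr\}
% so the stationarity condition for each penalized coefficient becomes
g_j(\beta) - \lambda \beta_j = 0
```

In the first case the added term only conditions the linear solve; in the second, lambda is a tuning parameter that shrinks the estimates.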

Rick_SAS
SAS Super FREQ

I interpret @SteveDenham's response in the previous thread to mean that PROC HPGENSELECT supports the LASSO method for variable selection. I don't think he meant to imply that the procedure has an option for user-controlled ridge regression (the way that PROC REG does).  In other words, I think the sources you linked to claim that HPGENSELECT uses ridge regression internally as part of the LASSO method, not that the ridging method is surfaced to the user.

jwg
Calcite | Level 5

The user in the linked thread was clearly asking for an implementation of ridge logistic regression, so your interpretation seems strange to me. I understood @SteveDenham to mean that this functionality would be bundled in with the Lasso method, since the user was directly asking about *ridge* logistic regression.

 

I agree that in the comments section of your article that I also linked to, you answered a user's question about how to implement ridge logistic regression by advertising a different product instead (Lasso). It seems you think this is not doable in SAS without the tedious effort of researching and writing your own macro for it.

Rick_SAS
SAS Super FREQ

I don't want to argue about other people's intentions, so let me rephrase my answer. I don't think HPGENSELECT provides control over the ridging method. It sounds like you want to implement this method in SAS. I suggest you use the SAS/IML matrix language rather than a macro. As you know, logistic regression has no closed-form solution; it requires an iterative method to maximize the log likelihood (LL).

There is an example of parameter estimation for the logistic model in the SAS/IML doc. You can start with that example, but instead of inverting the weighted normal equations (the line XPXI = INV(...)) you would solve a ridged equation that might look something like (untested):

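A = xx`*(w#xx) + lambda*I(ncol(xx));  /* add ridging to the weighted normal equations */
RHS = xx`*(w#(y-p)) - lambda*b;       /* penalized gradient; without -lambda*b the iteration converges to the unpenalized MLE */
db = solve(A, RHS);                   /* Newton step */
b = b + db;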

 

Here lambda is a fixed ridge parameter. Since you are coming from R, you should have no problem writing IML code, which is similar in spirit. If you are new to SAS/IML, see "Ten tips for learning the SAS/IML language". If you get stuck, the SAS/IML Support Community is available for assistance.
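
Since the original poster mentions coming from R and Python, here is a minimal sketch of the same ridged Newton-Raphson update in plain Python, so the logic can be checked without a SAS installation. The function names `solve` and `ridge_logistic`, the penalty scaling (lambda/2 times the sum of squares), and the choice to leave the intercept unpenalized are assumptions made for illustration; none of this comes from the SAS/IML documentation example. Note that the sketch subtracts lambda*b from the gradient as well as adding lambda to the diagonal, which is what makes the iteration converge to the penalized estimate rather than the ordinary MLE.

```python
import math


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_logistic(X, y, lam, max_iter=50, tol=1e-10):
    """Maximize LL(b) - (lam/2) * sum(b[j]**2 for j >= 1) by Newton-Raphson.

    X is a list of rows whose first entry is 1 (the intercept column);
    the intercept is left unpenalized (an assumption of this sketch).
    """
    k = len(X[0])
    b = [0.0] * k
    pen = [0.0] + [lam] * (k - 1)  # per-coefficient ridge weights
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-sum(xj * bj for xj, bj in zip(row, b))))
             for row in X]
        # Penalized gradient: X'(y - p) - lam * b
        grad = [sum(row[j] * (yi - pi) for row, yi, pi in zip(X, y, p))
                - pen[j] * b[j] for j in range(k)]
        # Negative penalized Hessian: X' W X + lam * I, with W = diag(p(1-p))
        A = [[sum(row[i] * row[j] * pi * (1.0 - pi) for row, pi in zip(X, p))
              + (pen[i] if i == j else 0.0) for j in range(k)]
             for i in range(k)]
        db = solve(A, grad)
        b = [bj + dj for bj, dj in zip(b, db)]
        if max(abs(d) for d in db) < tol:
            break
    return b
```

Calling `ridge_logistic(X, y, 0.0)` gives the ordinary maximum likelihood fit, so running it for a few values of `lam` shows how the slope estimates shrink as the penalty grows.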

 

Good luck!

jwg
Calcite | Level 5

Thank you for your help, @Rick_SAS. I will try to implement this later.

Ksharp
Super User
Maybe you want this option.

proc logistic;
model .........../   RIDGING=ABSOLUTE | RELATIVE | NONE


jwg
Calcite | Level 5
I think you are mistaken.
StatDave
SAS Super FREQ

There are two types of shrinkage (aka penalization, aka regularization): L1 (or lasso) regularization which adds an absolute value penalty, and L2 regularization (or ridging) that adds a quadratic penalty.  A combination of these is the so-called elastic net.  L1 regularization (lasso) and the combination (elastic net) are available in PROC HPGENSELECT.  L2 regularization (ridging) should be possible in NLMIXED by simply adding the penalty in the log likelihood. For example, these statements add the quadratic penalty on the parameters of a logistic model using a 0.1 shrinkage parameter which could, of course, be adjusted. The data is the remission data in the first example in the LOGISTIC documentation.

 

proc nlmixed data=remission;
   parms b0=0 b1=0 b2=0 b3=0;
   p = 1/(1+exp(-(b0+b1*li+b2*cell+b3*temp)));
   /* per-observation log likelihood minus the quadratic (L2) penalty; the intercept b0 is not penalized */
   ll = (remiss=1)*log(p) + (remiss=0)*log(1-p)
        - 0.1*(b1**2+b2**2+b3**2);
   model remiss ~ general(ll);
run;

 

Compare the results to those from the unpenalized (unridged) logistic model:

 

proc logistic data=remission;
model remiss(event='1') = cell li temp;
run;
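
For reference, the three kinds of shrinkage named above can be written as penalized log likelihoods. This is a sketch in generic notation: the symbols lambda, lambda_1, and lambda_2 are illustrative, the intercept is assumed to be unpenalized, and some texts scale the quadratic term by 1/2, which only rescales lambda.

```latex
% \ell(\beta) is the logistic log likelihood; sums run over the non-intercept coefficients
\text{L1 (lasso):}\quad       \ell(\beta) - \lambda \sum_{j \ge 1} |\beta_j|
\text{L2 (ridge):}\quad       \ell(\beta) - \lambda \sum_{j \ge 1} \beta_j^2
\text{elastic net:}\quad      \ell(\beta) - \lambda_1 \sum_{j \ge 1} |\beta_j| - \lambda_2 \sum_{j \ge 1} \beta_j^2
```

One practical point when reading the NLMIXED statements: general(ll) takes the log-likelihood contribution of each observation, so a penalty subtracted inside ll is applied once per row. With n observations, a per-row value of 0.1 therefore corresponds to a total penalty of 0.1 times n times the sum of squared coefficients.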

 

 
