Bond007
Obsidian | Level 7

Hi,

 

I have researched how to replicate SAS PROC LOGISTIC with scikit-learn's LogisticRegression in Python and arrived at the following observations:

1. SAS fits an unpenalized regression, whereas Python applies penalty='l2' by default, so I changed it to penalty='none'.

2. SAS has a default convergence criterion of GCONV=1E-8. Python's default convergence tolerance is tol=1e-4, so I changed it to 1e-8.

3. I used solver='lbfgs' (the default in Python).

 

According to the articles I found online, changes 1 and 2 should make the intercept and the coefficients of all model variables agree to a few decimal places between the SAS and Python implementations.
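To make observations 1 and 2 concrete, here is a minimal sketch of what an unpenalized fit with a relative gradient stopping rule looks like. It uses plain Newton-Raphson on a tiny made-up dataset with one predictor; the function name, the data, and the max_iter value are illustrative assumptions, and the stopping rule is only loosely modeled on the GCONV criterion, not a reproduction of the PROC LOGISTIC internals.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic_newton(x, y, gconv=1e-8, max_iter=25):
    """Unpenalized logistic regression with an intercept and one slope."""
    b0, b1 = 0.0, 0.0
    for iteration in range(1, max_iter + 1):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        loglik = sum(yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
                     for yi, pi in zip(y, p))
        # Score vector: gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), 2x2 and symmetric.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve H * step = g by hand for the 2x2 case.
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        # Relative gradient criterion: g' H^-1 g scaled by |log-likelihood|.
        rel_grad = (g0 * s0 + g1 * s1) / (abs(loglik) + 1e-6)
        if rel_grad < gconv:
            return b0, b1, iteration
        b0, b1 = b0 + s0, b1 + s1
    raise RuntimeError("did not converge within max_iter iterations")

# Made-up, non-separable data so that a finite maximum likelihood estimate exists.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1, n_iter = fit_logistic_newton(x, y)
print(f"intercept={b0:.6f} slope={b1:.6f} iterations={n_iter}")
```

Because the unpenalized log-likelihood is concave, any optimizer that truly converges should land on the same estimates, so a large gap between two packages usually points to a fit that stopped early or to a difference in how the model was set up.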

 

SAS code:

 

proc logistic DESCENDING data=dataset_name;

model target_var=&modelvariables/ selection = none CLPARM=WALD lackfit RSQ STB ;

output out=result p=pred;

run;

 

Python code:

from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(penalty='none', solver='lbfgs', tol=1e-8)
clf.fit(X, y)

 

When I compare the SAS and Python outputs, the intercepts differ by about 2 and one variable's coefficient has the opposite sign.

 

Example: 

Intercept: SAS (-7.03), Python (-5.21)

Coefficient for one of the variables: SAS (2.56), Python (-2.45)

 

The remaining variable coefficients match when rounded to one decimal place.

 

Can anyone suggest whether any other parameter changes are required in Python?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Bond007
Obsidian | Level 7

I used the following code (solver changed from lbfgs to newton-cg), and the SAS and Python logistic regression results now match.

clf = LogisticRegression(penalty='none', solver='newton-cg', tol=1e-8)
clf.fit(X, y)
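
The thread does not say why changing the solver helped. One plausible explanation (an assumption, not something confirmed here) is that the lbfgs fit stopped at its iteration cap before converging. The sketch below shows one way to check this: fit the same model with both solvers and compare the coefficients and n_iter_. The synthetic data, the true coefficients, and the max_iter value are made up for illustration. Depending on the scikit-learn version, an unpenalized fit is spelled penalty='none' or penalty=None, so the sketch avoids that by using a very large C, which makes the L2 penalty negligible.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic, non-separable data (illustrative only).
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
true_beta = np.array([1.0, -2.0, 0.5])
p = 1.0 / (1.0 + np.exp(-(-1.0 + X @ true_beta)))
y = (rng.random(n) < p).astype(int)

fits = {}
for solver in ("lbfgs", "newton-cg"):
    # Very large C: effectively no penalty. Generous max_iter so neither solver is cut off.
    clf = LogisticRegression(C=1e12, solver=solver, tol=1e-8, max_iter=10_000)
    clf.fit(X, y)
    # If n_iter_ reaches max_iter, the solver stopped at the cap instead of converging.
    hit_cap = bool(clf.n_iter_.max() >= clf.max_iter)
    fits[solver] = np.r_[clf.intercept_, clf.coef_.ravel()]
    print(solver, fits[solver].round(4), "iterations:", clf.n_iter_, "hit max_iter:", hit_cap)

max_gap = float(np.max(np.abs(fits["lbfgs"] - fits["newton-cg"])))
print("largest coefficient gap between solvers:", max_gap)
```

If the two solvers disagree on your own data, raising max_iter or standardizing the predictors before fitting lbfgs is worth trying before concluding that SAS and Python are fitting different models.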


6 REPLIES 6
Bond007
Obsidian | Level 7

I tried C=1e9 instead of penalty='none', and the output did not change. The issue is not resolved.

Reeza
Super User
STB displays the standardized estimates. If you remove STB from the SAS code, do your answers match?
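
To check whether the numbers being compared are the standardized ones, the standardized estimate can be computed on the Python side from the raw coefficient. This is a minimal sketch assuming the usual definition for the logit link, where the raw coefficient is multiplied by the sample standard deviation of the predictor and divided by pi/sqrt(3); treat that formula as my reading of the SAS documentation. The function name and the numbers are made up.

```python
import math
import statistics

def standardized_estimate(beta, x_values):
    """Raw logit coefficient scaled by s_x / (pi / sqrt(3))."""
    s_x = statistics.stdev(x_values)  # sample standard deviation (n - 1 denominator)
    return beta * s_x / (math.pi / math.sqrt(3.0))

# Hypothetical predictor values and raw coefficient, for illustration only.
x_values = [1.0, 2.0, 3.0, 4.0, 5.0]
beta = 2.0
std_beta = standardized_estimate(beta, x_values)
print(round(std_beta, 4))
```

scikit-learn's coef_ holds the raw estimates, so it should be compared with the Estimate column, not with the standardized column.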
Bond007
Obsidian | Level 7

The SAS logistic regression output gives both the Estimate and Standardized Estimate columns, along with a few other columns (Standard Error, Chi-Square, Pr > ChiSq).
I am trying to replicate the SAS model exactly in Python, so in this case the SAS model cannot be edited.

Ksharp
Super User
You got different signs (SAS 2.56, Python -2.45),
which means you are modeling a different level of Y in SAS and Python.
Try removing the DESCENDING option from PROC LOGISTIC,
or try target_var(event='0') or target_var(event='1').
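
A small sketch of why the modeled level matters: since P(y=0) = 1 - P(y=1), modeling the other level of Y negates every coefficient, including the intercept. The coefficient values below are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical intercept and slope for a model of P(y = 1 | x).
b0, b1 = -1.5, 0.8
xs = [-2.0, 0.0, 1.0, 3.5]

p_event1 = [sigmoid(b0 + b1 * x) for x in xs]
# Model of P(y = 0 | x): same magnitudes, every sign flipped.
p_event0 = [sigmoid(-b0 - b1 * x) for x in xs]

# The two models describe the same probabilities from opposite sides.
max_err = max(abs((1.0 - p1) - p0) for p1, p0 in zip(p_event1, p_event0))
print(max_err)
```

In scikit-learn, the coefficients of a binary fit refer to clf.classes_[1] (the larger label), and DESCENDING makes PROC LOGISTIC model the higher ordered value as well. Note that a swapped event level would flip all signs together, whereas in the numbers posted above only one coefficient changed sign and both intercepts are negative.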