Bond007
Obsidian | Level 7

Hi,

 

I have researched how to replicate PROC LOGISTIC from SAS with LogisticRegression in Python and arrived at the following observations:

1. SAS fits an unpenalized regression, whereas Python uses penalty='l2' by default, so I changed it to penalty='none'.

2. SAS has a default convergence criterion of GCONV=1E-8. Python's default convergence tolerance is tol=1e-4, so I changed it to 1e-8.

3. I used solver='lbfgs' (the default in Python).

 

According to the articles I found online, changes 1 and 2 should make the intercept and the coefficients for all model variables agree between the SAS and Python implementations to within a few decimal places.

 

SAS code:

 

proc logistic DESCENDING data=dataset_name;

model target_var=&modelvariables/ selection = none CLPARM=WALD lackfit RSQ STB ;

output out=result p=pred;

run;

 

Python code:

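from sklearn.linear_model import LogisticRegression
# penalty='none' turns off regularization (newer scikit-learn releases spell this penalty=None)
clf = LogisticRegression(penalty='none', solver='lbfgs', tol=1e-8)
clf.fit(X, y)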

 

When I compare the SAS and Python outputs, the intercepts differ by about 2 and one variable's coefficient has the opposite sign.

 

Example: 

Intercept: SAS (-7.03), Python (-5.21)

Coefficient for one of the variables: SAS (2.56), Python (-2.45)

 

The remaining coefficients match when rounded to one decimal place.

 

Can anyone suggest whether any other parameter changes are required in Python?

 

Thanks in advance.

1 ACCEPTED SOLUTION
Bond007
Obsidian | Level 7

I used the following code, and the SAS and Python logistic regression results now match. The only change is the solver (newton-cg instead of lbfgs).

clf = LogisticRegression(penalty='none', solver='newton-cg', tol=1e-8)
clf.fit(X, y)
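
One plausible reading of this fix, not confirmed anywhere in the thread, is that the lbfgs run stopped short of the maximum-likelihood estimates (for example by reaching the default max_iter=100), while newton-cg got there. A solver-independent way to check is to plug each set of estimates into the score vector, i.e. the gradient of the unpenalized log-likelihood, which is zero at the MLE. The sketch below uses plain Python and a made-up eight-row dataset; score_vector is a hypothetical helper, not part of scikit-learn or SAS.

```python
import math

def score_vector(intercept, coefs, X, y):
    """Gradient of the unpenalized logistic log-likelihood.

    Returns [d/d(intercept), d/d(coef_1), ...]; every entry is ~0 at the MLE.
    """
    grad = [0.0] * (len(coefs) + 1)
    for row, target in zip(X, y):
        eta = intercept + sum(b * x for b, x in zip(coefs, row))
        p = 1.0 / (1.0 + math.exp(-eta))  # modelled P(y = 1 | row)
        resid = target - p
        grad[0] += resid
        for j, x in enumerate(row):
            grad[j + 1] += resid * x
    return grad

# Toy data (illustrative only): one 0/1 predictor,
# 1 event in 4 rows when x = 0 and 3 events in 4 rows when x = 1.
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
y = [1, 0, 0, 0, 1, 1, 1, 0]

# For this table the MLE is known in closed form: intercept = -ln 3, slope = 2 ln 3.
at_mle = score_vector(-math.log(3), [2 * math.log(3)], X, y)
# Same intercept, slope with the wrong sign: not a maximum of the likelihood.
off_mle = score_vector(-math.log(3), [-2 * math.log(3)], X, y)

print(max(abs(g) for g in at_mle))   # close to zero
print(max(abs(g) for g in off_mle))  # clearly non-zero
```

With real data, the same helper could be fed clf.intercept_[0] and clf.coef_[0] from the Python fit and the Estimate column from the SAS output; whichever set leaves the smaller score is closer to the MLE.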


6 REPLIES
Bond007
Obsidian | Level 7

I tried C=1e9 instead of penalty='none' and the output did not change. The issue is not resolved.

Reeza
Super User
STB displays the standardized estimate. If you remove STB from the SAS code, do your answers match?
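
For anyone comparing columns: the standardized estimate is a rescaling of the raw estimate, so it should not be compared with coef_ directly. A minimal sketch, assuming the definition given in the PROC LOGISTIC documentation for the logit link (raw estimate times the sample standard deviation of the predictor, divided by pi/sqrt(3)); standardized_estimate is a hypothetical helper and the numbers are made up.

```python
import math
import statistics

def standardized_estimate(beta, values):
    """Rescale a raw logit coefficient the way STB is assumed to.

    beta   : raw (unstandardized) coefficient for one predictor
    values : observed values of that predictor
    """
    s_x = statistics.stdev(values)        # sample standard deviation (n - 1)
    s_logistic = math.pi / math.sqrt(3)   # std. dev. of the standard logistic distribution
    return beta * s_x / s_logistic

# Made-up example: predictor values with sample standard deviation 2.
std_est = standardized_estimate(1.0, [0, 2, 4])
print(round(std_est, 4))
```

The raw Estimate column, not the Standardized Estimate column, is the one that corresponds to clf.coef_ and clf.intercept_.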
Bond007
Obsidian | Level 7

The SAS logistic regression output gives both the Estimate and Standardized Estimate columns, along with a few other columns (Standard Error, Chi-Square, Pr > ChiSq).
I am trying to replicate the SAS model exactly in Python, so in this case the SAS model cannot be edited.

Ksharp
Super User
You got opposite signs, SAS (2.56) versus Python (-2.45),
which suggests you are modelling different levels of Y in SAS and Python.
Try removing the DESCENDING option from PROC LOGISTIC,
or try target_var(event='0') or target_var(event='1').
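
A quick way to see what a change of event level does: it negates the intercept and every coefficient, not just one of them, so a single flipped sign points elsewhere. The sketch below shows this with a closed-form fit for a single 0/1 predictor on made-up data; table_mle is a hypothetical helper. For comparison, scikit-learn models the probability of clf.classes_[1], which is 1 for a 0/1 target, and DESCENDING should make PROC LOGISTIC model the higher level as well.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def table_mle(x, y, event):
    """Closed-form logistic MLE for one 0/1 predictor, modelling P(y == event)."""
    hits = [1 if yi == event else 0 for yi in y]
    rate = {}
    for level in (0, 1):
        group = [h for xi, h in zip(x, hits) if xi == level]
        rate[level] = sum(group) / len(group)  # observed event rate at this level of x
    intercept = logit(rate[0])
    slope = logit(rate[1]) - logit(rate[0])
    return intercept, slope

# Toy data (illustrative only).
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 0, 0, 0, 1, 1, 1, 0]

b0_event1, b1_event1 = table_mle(x, y, event=1)
b0_event0, b1_event0 = table_mle(x, y, event=0)

print(b0_event1, b1_event1)  # estimates when the modelled event is y = 1
print(b0_event0, b1_event0)  # same magnitudes with opposite signs when the event is y = 0
```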