Hi,

I have researched how to replicate SAS proc logistic with LogisticRegression (scikit-learn) in Python and have come up with the following observations:

1. SAS fits an unpenalized regression, while Python uses penalty='l2' by default, so I changed it to penalty='none'.
2. SAS has a default convergence criterion of GCONV=1E-8. Python's default convergence tolerance is tol=1e-4, so I changed it to 1e-8.
3. I used solver='lbfgs' (the default in Python).

According to the articles I found online, changes 1 and 2 should make the intercept and the coefficients of all model variables agree to a few decimal places between the SAS and Python implementations.

SAS code:

proc logistic DESCENDING data=dataset_name;
  model target_var=&modelvariables / selection=none CLPARM=WALD lackfit RSQ STB;
  output out=result p=pred;
run;

Python code:

clf = LogisticRegression(penalty='none', solver='lbfgs', tol=0.00000001)
clf.fit(X, y)

When I compare the SAS and Python outputs, the intercepts differ by about 2 and one of the variables has a coefficient with the opposite sign. Example:

Intercept: SAS (-7.03), Python (-5.21)
Coefficient for one of the variables: SAS (2.56), Python (-2.45)

The remaining coefficients match only if I round them to one decimal place.

Can anyone suggest whether any other parameter changes are required in Python?

Thanks in advance.
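For reference, here is a minimal sketch of an independent cross-check, not something from the runs above. It fits the unpenalized maximum-likelihood estimate directly with Fisher scoring, which is the default optimization technique in proc logistic, and stops on a relative gradient criterion loosely modeled on GCONV. The function name fit_logit_fisher, its gconv and max_iter arguments, and the synthetic data are all hypothetical and only there for illustration.

```python
import numpy as np


def fit_logit_fisher(X, y, gconv=1e-8, max_iter=50):
    """Unpenalized logistic regression via Fisher scoring (hypothetical helper).

    Returns (beta, converged); beta[0] is the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(max_iter):
        eta = A @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        grad = A.T @ (y - p)                       # score vector
        info = (A * (p * (1.0 - p))[:, None]).T @ A  # Fisher information
        step = np.linalg.solve(info, grad)
        # Log-likelihood written in a numerically stable form.
        loglik = np.sum(y * eta - np.logaddexp(0.0, eta))
        # Relative gradient check, assumed here to mimic GCONV:
        # g' I^-1 g / (|loglik| + 1e-6) < gconv
        if grad @ step / (abs(loglik) + 1e-6) < gconv:
            return beta, True
        beta = beta + step
    return beta, False


# Synthetic data only; replace with the real X and y (y coded as 0/1).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
true_beta = np.array([-1.0, 2.0, -0.5, 0.0])
eta_true = true_beta[0] + X @ true_beta[1:]
y = (rng.random(2000) < 1.0 / (1.0 + np.exp(-eta_true))).astype(float)

beta_hat, converged = fit_logit_fisher(X, y)
print("converged:", converged)
print("intercept and coefficients:", np.round(beta_hat, 4))
```

The resulting vector can be compared against clf.intercept_ and clf.coef_ from the fit above, and against the SAS estimates. It may also be worth comparing clf.n_iter_ with max_iter (default 100): with tol=1e-8, lbfgs can hit the iteration cap before reaching the tolerance, especially on unscaled inputs.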