Proc HPLogistic and Logistic end up with different parameter estimates

Occasional Contributor
Posts: 6

I decided to port my "normal" model build code to proc hplogistic. After wrestling with all the options, I think I managed to do this; however, in the process I noticed something odd.

When fed the same settings (same data set, same target variable, same variable list), proc logistic and proc hplogistic end up with ever so slightly different estimates for the coefficients. This is true even if I specify the same technique in proc logistic as in proc hplogistic (technique=newton). The differences are very small; however, I am worried I have missed something. Any idea what it could be?

Estimate differences

Variable name   Proc logistic estimate   Proc hplogistic estimate
Intercept       -0.40017041503038        -0.40016018866442
Variable A      -0.86355666216567        -0.86355818846804
Variable B      -0.00421933005198        -0.00421943244787

Proc logistic code

proc logistic data=&modelin. descending outest=lib.modparams;
  weight &weightvar.;
  model &targetvar.= &modnum.
        / selection=stepwise slentry=&entry. slstay=&exit. RSQ lackfit outroc=Roc technique=newton MAXITER=50;
  output out=&modelout. p=pred;
run;

Proc hplogistic code

proc hplogistic data=&modelin. technique=NRRIDG;
  model &targetvar. (descending) = &modnum. / lackfit rsquare;
  selection method=stepwise(slentry=&entry. slstay=&exit.) details=all;
  output out=&modelout. p=pred copyvar=(&keyCol. &weightvar.);
run;

Any ideas what the difference is?

Super User
Posts: 17,818

Why doesn't your PROC HPLOGISTIC have a WEIGHT statement, similar to your logistic model?

I'm also assuming you're familiar with the caveats of using the WEIGHT statement in PROC LOGISTIC.
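
For a like-for-like comparison, the weight variable can be passed to both procedures. The following is only a minimal sketch: it assumes the macro variables from the original post are already defined and that the installed SAS/STAT release supports the WEIGHT statement in PROC HPLOGISTIC. The fit-statistic options and the OUTPUT statement from the original code are left out for brevity.

```sas
/* Sketch: the HPLOGISTIC call from the question with the weight variable added. */
/* &modelin., &weightvar., &targetvar., &modnum., &entry. and &exit. are the     */
/* macro variables used in the original post and are assumed to be defined.      */
proc hplogistic data=&modelin. technique=NRRIDG;
  weight &weightvar.;
  model &targetvar. (descending) = &modnum.;
  selection method=stepwise(slentry=&entry. slstay=&exit.);
run;
```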

Occasional Contributor
Posts: 6

Excellent point; however, in both cases the weight is set to 1 for each observation (for proc logistic I added a column to the dataset with a value of 1 for each row).

I reran the code and I see the same result with or without the WEIGHT statement.

For reference this PROC SQL is used to create a dataset that is then fed into PROC LOGISTIC/PROC HPLOGISTIC.

proc sql;
  create table build as select *, 1 as w from l.data where isDue = 1;
quit;

Super Contributor
Posts: 287

You should not worry about such small differences. The estimates calculated by almost all procedures (including proc logistic and proc hplogistic) are found by numerical maximization of a likelihood function. The difference you see is caused either by different methods of maximizing the function, or by different tolerances for how close the solution must be to the maximum before the procedure returns its results.

If you are curious, you can play a bit with the tolerance parameters. Try, for instance, setting the GCONV option to a smaller value. Btw, setting maxiter to 50 just means that up to 50 iterations are allowed; the procedure stops as soon as it reaches the convergence criterion.

Good luck.

proc hplogistic data=&modelin. technique=NRRIDG GCONV=1E-6;
  model &targetvar. (descending) = &modnum. / lackfit rsquare;
  selection method=stepwise(slentry=&entry. slstay=&exit.) details=all;
  output out=&modelout. p=pred copyvar=(&keyCol. &weightvar.);
run;

proc logistic data=&modelin. descending outest=lib.modparams;
  weight &weightvar.;
  model &targetvar.= &modnum.
        / selection=stepwise slentry=&entry. slstay=&exit. RSQ lackfit outroc=Roc technique=newton MAXITER=50 GCONV=1E-6;
  output out=&modelout. p=pred;
run;
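
One way to check whether the stopping tolerance explains the gap is to tighten it in both procedures and print the estimates with many decimals. The sketch below makes several assumptions: it fits the full variable list without stepwise selection so that both procedures fit the same model, it uses 1E-12 purely as an illustrative value, and the data set names work.pe_hp and work.pe_logistic are made up for this example. PROC LOGISTIC documents a default of GCONV=1E-8, so 1E-6 is looser than that default; the HPLOGISTIC defaults are worth checking in its documentation. Other convergence criteria may still stop the iterations first.

```sas
/* Sketch: tighten the relative gradient criterion in both procedures. */
/* The macro variables are those from the original post (assumed defined). */
ods output ParameterEstimates=work.pe_hp;
proc hplogistic data=&modelin. technique=NRRIDG gconv=1E-12;
  model &targetvar. (descending) = &modnum.;
run;

ods output ParameterEstimates=work.pe_logistic;
proc logistic data=&modelin. descending;
  model &targetvar. = &modnum. / technique=newton gconv=1E-12 maxiter=50;
run;

/* Print both tables with enough decimals to see differences beyond the 5th digit. */
proc print data=work.pe_hp;
  format Estimate 20.14;
run;

proc print data=work.pe_logistic;
  format Estimate 20.14;
run;
```

If the two sets of estimates move closer together as the tolerance shrinks, the stopping rule is the likely explanation rather than a difference in the model being fitted.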

Occasional Contributor
Posts: 6

Thanks for your reply. I appreciate that the difference is small (and in practice the difference is meaningless), however I would like to understand what is the underlying source of the difference. Is this difference caused by settings or difference in underlying implementation of logistic regression?

Btw, using GCONV=1E-6 for both generates slightly different estimates from the original run, and there is still a small difference between PROC LOGISTIC and PROC HPLOGISTIC.

Super Contributor
Posts: 287

Maybe, if you change to TECHNIQUE=NEWRAP in hplogistic, there is a chance that it will take exactly the same path from the starting point to the maximum. If not, then don't think more about it :-)
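
A minimal sketch of that change, assuming the macro variables from the original post are defined. Note that NEWRAP in the high-performance procedures is described as Newton-Raphson with line search, so it may still not follow exactly the same path as PROC LOGISTIC with technique=newton.

```sas
/* Sketch: request Newton-Raphson (with line search) instead of the ridged variant. */
proc hplogistic data=&modelin. technique=NEWRAP;
  model &targetvar. (descending) = &modnum.;
  selection method=stepwise(slentry=&entry. slstay=&exit.);
run;
```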

Respected Advisor
Posts: 2,655

The previous poster has identified the difference, I believe. HPLOGISTIC is using Newton-Raphson with ridging (the default method, but also specified in your code). LOGISTIC uses straight Newton-Raphson, with no ridging, and there is the source of the trivial differences.

Steve Denham
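
To make the distinction concrete, here is a generic sketch of the two updates. It is not taken from the SAS documentation: the symbols are assumptions for illustration, with \ell the log-likelihood, g_k its gradient and H_k its Hessian at the current estimate \beta^{(k)}, and \lambda_k \ge 0 a ridge value chosen by the optimizer.

```latex
\begin{align*}
\text{Newton-Raphson:} \quad
  \beta^{(k+1)} &= \beta^{(k)} - H_k^{-1}\, g_k \\
\text{Ridged Newton-Raphson:} \quad
  \beta^{(k+1)} &= \beta^{(k)} - \left(H_k - \lambda_k I\right)^{-1} g_k,
  \qquad \lambda_k \ge 0
\end{align*}
```

With \lambda_k = 0 the two updates coincide. Whenever the steps differ, the two algorithms approach the maximum along different paths, and because each stops when a convergence criterion is met rather than at the exact maximum, they can return estimates that differ in the later decimals.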

Super User
Posts: 9,676

Steve,

By default, proc logistic uses Fisher scoring for maximum likelihood estimation, not Newton-Raphson. Maybe there is a small difference between the two.

BTW, which module is HPLOGISTIC in? I haven't found it in SAS/STAT yet.

Best

Xia Keshan

Super Contributor
Posts: 287

HPLOGISTIC is in SAS/STAT, at least as of SAS/STAT 12.3. The high-performance procedures are described in their own section of the SAS/STAT documentation.

It is correct that Fisher scoring is used by default in proc logistic, but here the option "technique=newton" is applied.
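
As a side note, for a binary logit model the choice between Fisher scoring and Newton-Raphson should not change the steps at all. A short derivation, using assumed notation (weights w_i, responses y_i in {0,1}, covariate vectors x_i, and fitted probabilities p_i):

```latex
\begin{align*}
p_i &= \frac{1}{1 + e^{-x_i^\top \beta}} \\
\ell(\beta) &= \sum_i w_i \left[ y_i\, x_i^\top \beta - \log\!\left(1 + e^{x_i^\top \beta}\right) \right] \\
g(\beta) &= \sum_i w_i \,(y_i - p_i)\, x_i \\
H(\beta) &= -\sum_i w_i \, p_i (1 - p_i)\, x_i x_i^\top
\end{align*}
```

The Hessian does not involve y_i, so the observed information -H equals its expectation. Fisher scoring (expected information) and Newton-Raphson (observed information) therefore take identical steps for the logit link, which suggests the technique option in proc logistic is unlikely to be the source of the difference here.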

Super User
Posts: 9,676

Jacob,

Thanks. But I can't run hplogistic in SAS University Edition. Why?

Respected Advisor
Posts: 2,655

Xia,

I don't think any of the high performance stat procs are available in UE, but I may be mistaken.

Steve Denham
