It is possible. What is the proper calculation +/- 1 DF?
Also, can you take the variables that belong in the best-fit model, execute the process using only those variables in your model using proc logistic and review the results? Does it give you different point estimates?
So, after some investigation: select=sl means that selection uses the Score value. I changed this to select=BIC and received the same result. The selection details still show the Score output, but it appears that the method for selection did indeed change. Getting the same results shouldn't be much of a shock, seeing how Wald, Score, and LLR should all give approximately the same values.
The issue with changing select from SL is that I appear to lose control over the p-value threshold at which variables enter the model. This is an option I would like to retain. I would feel more comfortable if the calculations were done by LLR rather than Score. If this is an option, please show me where in the code this can be changed.
Secondly, the result for the second variable to be included is not what I would expect. The first variable is X30, as expected. The second variable that SAS wants to include is X16. However, when I choose X16 for inclusion in JMP, the LLR p-value is 0.0931. The variable I expect to be added is X42, which JMP shows has a p-value of 0.0442. Based on these values, X42 provides greater explanatory power to the model than X16 would. SAS, on the other hand, provides the following Score p-values: X16=0.0584 and X42=0.1627. I understand why SAS would choose X16 over X42, but LLR should be more accurate than Score.
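For anyone wanting to reproduce the JMP-style LLR comparison in SAS, here is a rough sketch of doing it by hand with PROC LOGISTIC: fit the base model and each candidate model, then difference the -2 Log L values. The data set name (work.mydata), the response name (y), and its event level ('1') are placeholders that do not come from this thread; only X30, X16, and X42 do. The 1 DF assumes each candidate adds a single parameter (a continuous or two-level effect).

```sas
/* Sketch only: work.mydata, y, and event='1' are hypothetical placeholders. */
%macro fit(vars, out);
  /* Capture the fit statistics table (includes -2 Log L) for each model */
  ods output FitStatistics=&out;
  proc logistic data=work.mydata;
    model y(event='1') = &vars;
  run;
%mend fit;

%fit(X30,     fit_base);   /* model after step 1 */
%fit(X30 X16, fit_x16);    /* candidate: add X16 */
%fit(X30 X42, fit_x42);    /* candidate: add X42 */

data lr_tests;
  /* Keep only the -2 Log L row from each fit and line them up side by side */
  merge fit_base(where=(Criterion='-2 Log L')
                 rename=(InterceptAndCovariates=m2ll_base))
        fit_x16 (where=(Criterion='-2 Log L')
                 rename=(InterceptAndCovariates=m2ll_x16))
        fit_x42 (where=(Criterion='-2 Log L')
                 rename=(InterceptAndCovariates=m2ll_x42));

  /* LR chi-square = drop in -2 Log L; 1 DF assumed for each added effect */
  lr_x16 = m2ll_base - m2ll_x16;
  lr_x42 = m2ll_base - m2ll_x42;
  p_x16  = 1 - probchi(lr_x16, 1);
  p_x42  = 1 - probchi(lr_x42, 1);
  keep lr_x16 p_x16 lr_x42 p_x42;
run;

proc print data=lr_tests;
run;
```

This is the "refit every candidate" approach, so it only scales to a handful of variables, but it lets you compare the LLR p-values directly against the Score p-values in the selection output.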
At a minimum, I would suggest posting the code used for both processes that compute the chi-squares. Someone with experience in both JMP and the SAS HP procedures may recognize an option that was used or left out, or different defaults.
Yes, HPLOGISTIC (and HPGENSELECT), like PROC LOGISTIC, use the score statistic. Likelihood ratio tests require refitting each candidate model, which can become very costly (time-consuming) in larger problems, and as you note, the score, Wald, and likelihood ratio tests are asymptotically equivalent.
Thank you for the reply. I was working through a known data set to familiarize myself with SAS and increase my confidence in using an automated system. I was frustrated to see SAS not providing the solution I expected. My known data set has high correlation in it, so when one variable is selected over another, the whole solution changes. I've looked at it a bit more and the SAS solution appears to be acceptable, although it is different from how I manually recreated the models in JMP. Like you said, each model needed to be refitted, which is very time-consuming.
The data set only had 52 rows, so perhaps the sample size isn't large enough for the tests to be similar.