Mr_Nobody
Calcite | Level 5

Hi,

I want to learn how the GCONV= option of the MODEL statement works. I think it rounds the data, but how? Its default value is 1E-8. What does that mean? Could anyone explain it to me with simple examples?

Thanks.

Here is a PROC LOGISTIC output:

The LOGISTIC Procedure

Model Information

Data Set                     TMP1.HSB2
Response Variable            ses
Number of Response Levels    3
Number of Observations       200
Model                        cumulative logit
Optimization Technique       Fisher's scoring

Response Profile

Ordered Value    ses    Total Frequency
1                3      58
2                2      95
3                1      47

Probabilities modeled are cumulated over the lower Ordered Values.

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Score Test for the Proportional Odds Assumption

Chi-Square    DF    Pr > ChiSq
2.1498        3     0.5419

Model Fit Statistics

Criterion    Intercept Only    Intercept and Covariates
AIC          425.165           399.605
SC           431.762           416.096
-2 Log L     421.165           389.605

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio    31.5604       3     <.0001
Score               28.9853       3     <.0001
Wald                29.0022       3     <.0001

Analysis of Maximum Likelihood Estimates

Parameter      DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept 3    1     -5.1055     0.9226            30.6238            <.0001
Intercept 2    1     -2.7547     0.8607            10.2431            0.0014
science        1      0.0300     0.0159             3.5838            0.0583
socst          1      0.0532     0.0149            12.7778            0.0004
female         1     -0.4824     0.2785             3.0004            0.0832

Odds Ratio Estimates

Effect     Point Estimate    95% Wald Confidence Limits
science    1.030             0.999    1.063
socst      1.055             1.024    1.086
female     0.617             0.358    1.066

Association of Predicted Probabilities and Observed Responses

Percent Concordant    68.1     Somers' D    0.368
Percent Discordant    31.3     Gamma        0.370
Percent Tied           0.6     Tau-a        0.235
Pairs                 12701    c            0.684

SteveDenham
Jade | Level 19

GCONV specifies a relative gradient (a quadratic form involving the Hessian matrix) convergence criterion.  The formula can be found in the Shared Concepts and Topics>NLOPTIONS Statement documentation.  Essentially, once the relative change in the gradient of the likelihood function stabilizes (change is less than 1e-8 for the default setting), the iterative process stops, and final estimates, tests, etc. are computed.
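To make the stopping rule concrete, here is a minimal Python sketch (not SAS code, and not PROC LOGISTIC's actual implementation). It fits a binary logistic model with one predictor by Newton/Fisher scoring and stops when the quadratic form g'H⁻¹g, divided by (|log likelihood| + 1E-6), falls below `gconv`. That form of the criterion is an assumption here and should be checked against the documentation; the function name `fit_logistic` and the tiny data set are hypothetical. Note that `gconv` only decides when the iteration stops; the data are never touched.

```python
import math

def fit_logistic(x, y, gconv=1e-8, max_iter=25):
    """Fit logit(p) = b0 + b1*x by Newton/Fisher scoring (hypothetical helper)."""
    b0 = b1 = 0.0
    for iteration in range(1, max_iter + 1):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        loglik = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                     for yi, pi in zip(y, p))
        # Gradient of the log likelihood with respect to (b0, b1).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Negative Hessian (information matrix), a symmetric 2x2 matrix.
        w = [pi * (1 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step s = H^{-1} g, using the closed-form 2x2 inverse.
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        # Assumed form of the relative gradient criterion: g' H^{-1} g / (|l| + 1e-6).
        rel_grad = (g0 * s0 + g1 * s1) / (abs(loglik) + 1e-6)
        if rel_grad < gconv:
            return b0, b1, iteration, rel_grad
        b0 += s0
        b1 += s1
    raise RuntimeError("convergence criterion not satisfied")

# Hypothetical data, chosen so that the two outcomes overlap (no separation).
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 1, 0, 1, 1, 0, 1]
b0, b1, n_iter, rel_grad = fit_logistic(x, y)
print(f"b0={b0:.4f} b1={b1:.4f} iterations={n_iter} criterion={rel_grad:.3e}")
```

A larger `gconv` stops the search earlier with slightly less precise estimates; a smaller one asks for more iterations. Either way, only the parameter estimates move between iterations.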

Steve Denham

Mr_Nobody
Calcite | Level 5

Thanks for the reply. I have read the documentation, but I didn't understand it.

If the value is less than 1e-8, is it then changed? If so, how is it changed?

SteveDenham
Jade | Level 19

If the gradient is zero, then the response surface is at a stationary point (minimum, maximum or saddle-point).  The fit of the model is not improved by moving in any direction in the parameter space, and the iterative process stops.  No changes or rounding of data.

Steve Denham

Mr_Nobody
Calcite | Level 5

I am so sorry for my bad English. Please help me one last time.

Suppose I have two rows of data. If real_score_col = 0, then

model_score_col

0,006828647300000000

0,001962600300000000

and if real_score_col = 1, then

model_score_col

0,012321732800000000

0,049357227400000000

Then these values are compared:

0,012321732800000000 < or = or > 0,006828647300000000

0,012321732800000000 < or = or > 0,001962600300000000

0,049357227400000000 < or = or > 0,006828647300000000

0,049357227400000000 < or = or > 0,001962600300000000

If GCONV = 1e-8 (the default), how is this comparison done? How will the data be changed? I am sorry, I didn't understand your explanation, so I wanted to give an example.

SteveDenham
Jade | Level 19

The data will never be "changed".  GCONV has nothing to do with the values you presented, which I see as score values from the logistic regression.  It has to do with the derivative with respect to the parameters of the log likelihood function.  I think at this point you need to familiarize yourself with the maximum likelihood algorithm and how it works.

Steve Denham

Mr_Nobody
Calcite | Level 5

Hi again. I looked into the maximum likelihood algorithm.

I want to share a paper about this problem.

http://support.sas.com/resources/papers/proceedings11/343-2011.pdf

On page 7, the author explains why the c statistic has different values in a basic calculation and in PROC LOGISTIC. He says: "The PROC LOGISTIC may round the probabilities in a higher decimal position during pairing and counting."

Maybe I was wrong to associate this with the GCONV option.

My understanding is that PROC LOGISTIC rounds the data before comparing (for example, a value such as 0,012321732800000000).

I want to know how PROC LOGISTIC rounds the data. Why does the basic calculation give a different result from PROC LOGISTIC?

Thank you so much.

SteveDenham
Jade | Level 19

This moves outside of my experience, so I'll defer to those with more practical experience with ROC curves.

Steve Denham

bobderr
SAS Employee

This is getting into the ROC computations, not the optimization.

First divide [0,1] into 500 equal-sized bins. By default, PROC LOGISTIC computes the c-statistic (an approximation of the area under the ROC curve) by taking the model-predicted probabilities and putting them into the appropriate bins, then it makes the concordance calculations described in the documentation. Essentially you're rounding the probabilities to the nearest 0.002. You can change the size of the bins with the BINWIDTH= option in the MODEL statement, which will change the value of "c" because of more or fewer ties. If you happen to get one observation per bin, then that will give you the true value of c. If you specify BINWIDTH=0, then instead of binning the predicted probabilities, the actual AUC computation is performed (see the "ROC Computations" section of the Details in the documentation for the equation).
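The effect of binning can be illustrated with a small Python sketch (not SAS code). It assumes a binary response, assigns each probability to a bin with floor(p / binwidth), and uses c = (nc + 0.5 × ties) / pairs; the exact bin-edge convention inside PROC LOGISTIC may differ, and the function name `c_statistic` and the four probabilities are hypothetical. Setting `binwidth=0` compares the raw probabilities.

```python
import math

def c_statistic(event_probs, nonevent_probs, binwidth=0.002):
    """c = (concordant + 0.5 * tied) / pairs over all event/non-event pairs."""
    def key(p):
        # binwidth=0 means no binning: compare the raw probabilities.
        return p if binwidth == 0 else math.floor(p / binwidth)

    concordant = discordant = 0
    for pe in event_probs:
        for pn in nonevent_probs:
            if key(pe) > key(pn):
                concordant += 1   # the event has the higher (binned) probability
            elif key(pe) < key(pn):
                discordant += 1   # the non-event has the higher (binned) probability
    pairs = len(event_probs) * len(nonevent_probs)
    tied = pairs - concordant - discordant
    return (concordant + 0.5 * tied) / pairs

# Hypothetical predicted probabilities.
events = [0.0123, 0.0494]      # observations with response 1
nonevents = [0.0068, 0.0128]   # observations with response 0

c_binned = c_statistic(events, nonevents, binwidth=0.002)
c_exact = c_statistic(events, nonevents, binwidth=0)
print(c_binned, c_exact)
```

Here 0.0123 and 0.0128 fall into the same bin, so a pair that is discordant on the raw values is counted as a tie after binning. That single pair is why the binned and exact values of c differ.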

As Steve explained, GCONV only deals with the optimization.  Since you have to search for the maximum likelihood estimator, GCONV is one way to tell when your parameter estimates are "close enough" to the optimum so you can stop the search.

Mr_Nobody
Calcite | Level 5

Five minutes ago I found that the BINWIDTH option does this, and then you wrote it here. Thank you so much. I was searching for how BINWIDTH works, and I will look at the documentation you suggested. If you know the link to the document, could you share it with me?
