Dealing With Significant Deviance and Pearson ChiSquare P-Value

Francios · Posted 02-27-2017 10:06 PM

Hello Good People!

I am analysing students' academic performance based on the number of years spent in high school. My main predictor variable is years, coded 1 for 5 years in high school and 2 for 4 years. The response variable is students' performance in national exams, which has been categorised based on their success at entering university. It is coded 1-5, with 5 being the best grade, corresponding to complete success at university admission. Because the response variable is ordinal, I am using ordinal logistic regression. I also have other possible predictors of performance: school type (top tier or lower tier), gender, and location of school district.

After running the logistic regression, I find that the Score Test for the Proportional Odds Assumption rejects the assumption. I did a further check (empirical plots) of the parallelism of all predictors with the response variable, and the curves look very parallel. So I think I can use this visualization to accept that the proportional odds assumption is met.

Score Test for the Proportional Odds Assumption
Chi-Square	DF	Pr > ChiSq
3384.2110	1	<.0001
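For readers who want to see what an empirical parallelism check computes, here is a minimal Python sketch. It is not the poster's analysis: the counts are hypothetical, invented only for illustration, and the group labels simply mirror the years coding described above. The idea is that under proportional odds the gap between the two groups' empirical cumulative logits should be roughly the same at every cut point.

```python
import math

# Hypothetical counts of grades 1..5 for each group (NOT the poster's data).
counts = {
    "5 years": [120, 200, 300, 250, 130],
    "4 years": [200, 260, 280, 180, 80],
}

def cumulative_logits(row):
    """Empirical logit of P(grade <= k) for each cut point k = 1..4."""
    total = sum(row)
    logits = []
    running = 0
    for c in row[:-1]:  # the last category has cumulative probability 1
        running += c
        p = running / total
        logits.append(math.log(p / (1 - p)))
    return logits

logits = {group: cumulative_logits(row) for group, row in counts.items()}

# Under proportional odds these gaps should be roughly constant across cut points.
gaps = [a - b for a, b in zip(logits["4 years"], logits["5 years"])]

for k, gap in enumerate(gaps, start=1):
    print(f"cut point {k}: logit difference = {gap:.3f}")
```

Plotting the two sets of logits against the cut point gives the kind of picture described in the post: roughly parallel lines support the assumption, while lines that converge or cross do not.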

However, the deviance and Pearson p-values are both significant. See below:

Deviance and Pearson Goodness-of-Fit Statistics
Criterion	Value	DF	Value/DF	Pr > ChiSq
Deviance	3485.9780	3	1161.993	<.0001
Pearson	3376.3413	3	1125.447	<.0001

My question is: is there anything I can do to continue with this analysis? Can I just continue with the analysis and ignore the significant Deviance and Pearson p-values?

I will appreciate your help.

Francios

Ksharp · Posted 02-27-2017 10:21 PM

No, you should not.

The Deviance and Pearson Goodness-of-Fit Statistics say that your model does not fit well.

Value/DF should be near 1 if your model fits the data well.
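To make the Value/DF check concrete, here is a minimal Python sketch that recomputes the ratio from the statistics posted above. Python is used purely for illustration; the variable names are my own and nothing here comes from the SAS output beyond the four numbers.

```python
# Statistics copied from the posted goodness-of-fit table.
fit_stats = {
    "Deviance": {"value": 3485.9780, "df": 3},
    "Pearson": {"value": 3376.3413, "df": 3},
}

# Value/DF is the statistic divided by its degrees of freedom.
ratios = {name: s["value"] / s["df"] for name, s in fit_stats.items()}

for name, ratio in ratios.items():
    print(f"{name}: Value/DF = {ratio:.3f}")
```

Both ratios are in the thousands rather than near 1, which is what the reply above is pointing at.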

Rick_SAS · Posted 02-28-2017 05:31 AM

How many observations are in your data? For very large data sets, the goodness-of-fit statistics will almost always reject the null hypothesis, because even small departures from the model become statistically significant. You can use other statistics (ROC curves, accuracy of predictions on a hold-out sample, ...) to assess the fit.
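The sample-size effect can be sketched with a toy calculation in Python. The proportions below are hypothetical and chosen only to show the mechanism: the same one-percentage-point departure from the expected proportions is tested once with 1,000 observations and once with 77,000. With two categories the Pearson statistic has 1 degree of freedom, so its upper-tail p-value can be computed with `math.erfc`.

```python
import math

def pearson_chisq(observed_props, expected_props, n):
    """Pearson chi-square for n observations with the given proportions."""
    return n * sum((o - e) ** 2 / e for o, e in zip(observed_props, expected_props))

def p_value_df1(chisq):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chisq / 2))

# Hypothetical proportions: the observed split is off by one percentage point.
expected = [0.50, 0.50]
observed = [0.51, 0.49]

p_small = p_value_df1(pearson_chisq(observed, expected, 1000))
p_large = p_value_df1(pearson_chisq(observed, expected, 77000))

print(f"n = 1000:  p = {p_small:.4f}")
print(f"n = 77000: p = {p_large:.2e}")
```

The discrepancy is identical in both cases, but the statistic grows in proportion to the sample size, so the larger sample rejects the null hypothesis decisively while the smaller one does not.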

Francios · Posted 02-28-2017 07:47 PM

Hi Rick,

Thank you for your comment. I have a very large dataset of about 77,000 observations. I will implement the ROC curve to see what I get. I will get back to you for further assistance.

Thank you very much!

Best,

Francois

StatDave · Posted 02-28-2017 10:38 AM

The question of sample size here is important. As discussed in this note, the test for proportional odds is known to be liberal with small sample sizes. Your graphical assessment might be more important. Also, as discussed in this note and in the "Details: Overdispersion: Rescaling the Covariance Matrix" section of the LOGISTIC documentation, the Pearson and deviance statistics require replication within the subpopulations in order to be valid. If there is suitable replication, then the similarity of the two statistics suggests they are providing a reasonable test of fit and their significance could be due to overdispersion or an incorrectly specified model. You might want to try adding complexity to the model (interactions, quadratic terms, splines, etc.) as seems reasonable to try to achieve a correctly specified model. If these statistics are still significant, then you might have a problem with overdispersion. The second note mentioned above discusses this.
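As a rough illustration of what rescaling the covariance matrix means, here is a minimal Python sketch. It assumes the common Pearson-based adjustment, in which the dispersion is estimated as the Pearson chi-square divided by its degrees of freedom and each standard error is multiplied by the square root of that estimate. The unadjusted standard error below is hypothetical; only the Pearson statistic and its DF come from the posted output.

```python
import math

# From the posted goodness-of-fit table.
pearson_chisq = 3376.3413
df = 3

# Pearson-based dispersion estimate (assumed form of the adjustment).
dispersion = pearson_chisq / df

# Standard errors are inflated by the square root of the dispersion.
se_inflation = math.sqrt(dispersion)

# Hypothetical unadjusted standard error for one coefficient.
unadjusted_se = 0.02
adjusted_se = unadjusted_se * se_inflation

print(f"dispersion estimate = {dispersion:.3f}")
print(f"standard error inflation factor = {se_inflation:.2f}")
print(f"adjusted SE = {adjusted_se:.3f}")
```

An inflation factor this large would be unusual for ordinary overdispersion, which is consistent with the advice above to first check the replication requirement and the model specification before rescaling.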

Francios · Posted 02-28-2017 07:50 PM

Hello

Thank you very much for the detailed comments. I will look at this, and if I have questions I will get back to you.

Best,

Francios!
