How to Address Significant Proportional Odds, Deviance and Pearson in Ordinal Logistics Regression

Francios · Posted 03-12-2017 07:32 PM

Hello

I am trying to predict high school students' success in being admitted to university based on the number of years spent in high school (either three or four). Number of years in high school is dichotomized: 1 = three years and 2 = four years.

The response variable is ordinal, coded 1 to 5, with 5 being the highest level of success in being admitted to a university.

I also have another variable, School Type attended, which is dichotomized as top tier = 1 and lower tier = 2, and gender (male and female). The sample size is over 56,000 students from 24 schools, which were randomly selected from 400 schools. So this is what I have:

Predictor = number of years (coded 1 and 0)

Outcome = Admissibility to university (coded 1 – 5)

Gender = (Female and Male, coded 1 and 2)

School Type (A and B, coded 1 and 2)

I ran an ordinal logistic regression (SAS code below, output attached):

ods listing close;
ods graphics on;
ods rtf file='\\Client\C$\SHS_DATA_CURRENT\LOGIT_YEARS.RTF';

/* DESCENDING models the probability of the higher ACCEPT levels */
proc logistic data=SHS descending
              plots(only)=(effect(polybar) oddsratio(range=clip));
   where YEARS ne '3N4YEARS';
   class YEARS(param=ref ref="4YEARS");
   /* SCALE=NONE AGGREGATE requests the Deviance and Pearson goodness-of-fit tests */
   model ACCEPT = YEARS / scale=none aggregate covb;
   /* ODDSRATIO accepts model effects only, so the response ACCEPT is not listed */
   oddsratio YEARS;
   output out=PREDICTED2 pred=PRED;
   title 'PREDICTING STUDENT ADMISSIBILITY TO UNIV. BASED ON YEARS IN HIGH SCHOOL';
run;

ods rtf close;
ods graphics off;
ods listing;

The problem I am having is that the proportional odds assumption does not hold, and the Deviance and Pearson goodness-of-fit tests are both significant (please see my printout). I have also tried other suggested techniques, such as an empirical check of parallelism for my variables (see the sample code below and the output in the attachment). Since the empirical plot suggests the lines are parallel, should I continue with the analysis? Is there anything anyone would suggest I do to make sure that I am doing the right thing?

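/* Cell counts of ACCEPT by YEARS */
proc freq data=SHS;
   where not missing(ACCEPT);
   table ACCEPT*YEARS / out=os;
run;

proc sort data=os;
   by YEARS;
run;

/* One row per YEARS group; col1-col5 hold the counts of the ACCEPT levels */
/* in sorted order (this relies on every level occurring in every group)   */
proc transpose data=os(where=(YEARS ne '3N4YEARS')) out=tran;
   by YEARS;
   var count;
run;

/* Empirical cumulative logits at each of the four cut points */
data a;
   set tran;
   const=0;
   c1=log((sum(of col1-col1)+const)/(sum(of col2-col5)+const));
   c2=log((sum(of col1-col2)+const)/(sum(of col3-col5)+const));
   c3=log((sum(of col1-col3)+const)/(sum(of col4-col5)+const));
   c4=log((sum(of col1-col4)+const)/(sum(of col5-col5)+const));
run;

ods rtf file='\\Client\C$\SHS_DATA_CURRENT\LOGIT_YEARS.RTF';
title 'EMPIRICAL PLOTS OF ACCEPT ON YEARS';

/* One line per cut point; roughly parallel lines support proportional odds */
proc sgplot data=a;
   series y=c1 x=YEARS;
   series y=c2 x=YEARS;
   series y=c3 x=YEARS;
   series y=c4 x=YEARS;
   yaxis values=(-6 to 6);
   xaxis integer;
run;

/* Close the RTF destination so the file is written */
ods rtf close;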
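For reference, the quantities being plotted can be written out. This is only a sketch of what the DATA step above computes, under the assumption that col1 to col5 hold the counts n_1(x), ..., n_5(x) of ACCEPT = 1, ..., 5 within YEARS group x and that const = 0; the symbols n_k, alpha_j and beta are notation introduced here, not taken from the output.

```latex
% Empirical cumulative logit at cut point j for YEARS group x
\[
c_j(x) \;=\; \log\frac{\sum_{k=1}^{j} n_k(x)}{\sum_{k=j+1}^{5} n_k(x)}
\;=\; \operatorname{logit}\,\widehat{\Pr}\bigl(\mathrm{ACCEPT} \le j \mid \mathrm{YEARS} = x\bigr),
\qquad j = 1,\dots,4 .
\]

% Proportional odds: one common slope beta across all four cut points
\[
\operatorname{logit}\,\Pr\bigl(\mathrm{ACCEPT} \le j \mid x\bigr) = \alpha_j + \beta x
\quad\Longrightarrow\quad
c_j(x_1) - c_j(x_2) \;\approx\; \beta\,(x_1 - x_2) \quad \text{for every } j .
\]
```

With only two YEARS groups, each plotted line is a single segment between two points, so "parallel" here means the four differences c_j(three years) - c_j(four years) are about equal.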

My question is: what do I do next? Is this the end of my analysis? Should I continue interpreting my results on the grounds that the lines are parallel?

Any help will be appreciated.

StatDave · Posted 07-03-2017 02:15 PM

Statistical tests become more powerful at detecting small effects as the sample size increases. With a sample this large, it is possible that these tests are detecting departures from fit that are trivially small for your practical purposes. You might want to assess how well the model does at the observation level by seeing how well it classifies observations. The graphical method you used, based on this note, is also a good way to assess the proportional odds assumption without using a test.
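One possible way to look at classification, sketched here as an illustration rather than as the specific approach meant in the reply above: the PREDPROBS=INDIVIDUAL option of the OUTPUT statement in PROC LOGISTIC adds the variables _FROM_ (observed level) and _INTO_ (level with the largest predicted probability), which can then be cross-tabulated. The data set name CLASSIFIED is a hypothetical name chosen for this example; SHS, ACCEPT and YEARS are taken from the original post.

```sas
/* Sketch: observed vs. predicted response level for the ordinal model. */
/* CLASSIFIED is a hypothetical output data set name.                    */
proc logistic data=SHS descending;
   where YEARS ne '3N4YEARS';
   class YEARS(param=ref ref="4YEARS");
   model ACCEPT = YEARS;
   /* Adds _FROM_, _INTO_ and the individual predicted probabilities (IP_ variables) */
   output out=CLASSIFIED predprobs=individual;
run;

/* Rows are observed levels, columns are predicted levels */
proc freq data=CLASSIFIED;
   tables _FROM_*_INTO_ / nopercent nocol;
run;
```

Because the model has a single two-level predictor, there are only two distinct sets of predicted probabilities, so _INTO_ can take at most two values. The table becomes more informative once further predictors such as gender and school type are added.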
