Logistic Regression - Training and validation score exactly the same.

Fae — Tue, 29 Oct 2019 14:20:01 GMT

I tried to use the all-subsets selection macros (assess and fitandscore) from the Predictive Modeling Using Logistic Regression course notes.

But for some reason, my training and validation scores are basically the same (graph below) and the profit plot is a horizontal line. Any advice on why this is happening? I double-checked the code and the training/validation datasets and I don't see any issues.

My only possible source of concern is that all my input variables are either binary or categorical (converted to dummy variables with a reference level). Would that be a concern?

Code:

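/* Relies on macro variables pi0, pi1, rho0, rho1 and PF11, PF10, PF01, PF00 being set beforehand. */
%macro assess(data=,inputcount=,inputsinmodel=,index=);

   /* Sort the scored data by descending predicted event probability. */
   proc sort data=scored&data;
      by descending p_1;
   run;

   data assess;
      attrib DATAROLE length=$5;
      retain sse 0 csum 0 DATAROLE "&data";
      /* n[t,d]: weighted counts by actual class t and decision d. */
      array n[0:1,0:1] _temporary_ (0 0 0 0);
      /* w[t]: weights that adjust the sample proportions (rho) to the priors (pi). */
      array w[0:1] _temporary_
         (%sysevalf(&pi0/&rho0) %sysevalf(&pi1/&rho1));
      keep DATAROLE INPUT_COUNT INDEX
           TOTAL_PROFIT OVERALL_AVG_PROFIT ASE C;

      set scored&data end=last;

      /* Expected profit of decision 1 (d1) and of decision 0 (d0). */
      d1=&PF11*p_1+&PF01*p_0;
      d0=&PF10*p_1+&PF00*p_0;

      t=(strip(ischurn)="1");   /* actual class */
      d=(d1>d0);                /* decision with the higher expected profit */

      n[t,d] + w[t];
      sse + (ischurn-p_1)**2;
      /* Running sum used for the c statistic. */
      csum + ((n[1,1]+n[1,0])*(1-t)*w[0]);

      /* Write one summary row after the last observation. */
      if last then do;
         INPUT_COUNT=&inputcount;
         TOTAL_PROFIT =
            sum(&PF11*n[1,1],&PF10*n[1,0],&PF01*n[0,1],&PF00*n[0,0]);
         OVERALL_AVG_PROFIT =
            TOTAL_PROFIT/sum(n[0,0],n[1,0],n[0,1],n[1,1]);
         ASE = sse/sum(n[0,0],n[1,0],n[0,1],n[1,1]);
         C = csum/(sum(n[0,0],n[0,1])*sum(n[1,0],n[1,1]));
         index=&index;
         output;
      end;
   run;

   /* Accumulate one row per model and data role. */
   proc append base=results data=assess force;
   run;

%mend assess;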

/* Relies on macro variables lastindx, inputs1..inputsN, ic1..icN and pi1 being set beforehand. */
%macro fitandscore();

   /* Start from an empty results data set. */
   proc datasets
      library=work
      nodetails
      nolist;
      delete results;
   run;

   %do model_indx=1 %to &lastindx;
      %let im=&&inputs&model_indx;
      %let ic=&&ic&model_indx;

      /* Fit on the training data, then score training and validation data. */
      proc logistic data=churnmod.imputed2 des NAMELEN=50;
         model ischurn=&im;
         score data=churnmod.imputed2
               out=scoredtrain(keep=ischurn p_1 p_0)
               priorevent=&pi1;
         score data=churnmod.valid2
               out=scoredvalid(keep=ischurn p_1 p_0)
               priorevent=&pi1;
      run;

      %assess(data=train,
              inputcount=&ic,
              inputsinmodel=&im,
              index=&model_indx);
      %assess(data=VALID,
              inputcount=&ic,
              inputsinmodel=&im,
              index=&model_indx);

   %end;
%mend fitandscore;

Re: Logistic Regression - Training and validation score exactly the same.

unison — Sat, 16 Nov 2019 18:58:48 GMT

To help us assist you better, please provide sample data that would yield these results.

Thanks,

-unison

Re: Logistic Regression - Training and validation score exactly the same.

PaigeMiller — Sat, 16 Nov 2019 22:03:35 GMT

Clearly, you have created TRAIN and VALID incorrectly.

The first debugging tool for you to try is to actually look at, with your own eyes, the two different data sets named TRAIN and VALID and see if they actually are different. We can't do that for you, because we don't have those data sets.

Also, you need to examine how these data sets were created to make sure you haven't somehow done something that would cause this result.
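
A minimal sketch of that first check, assuming the training and validation data are the churnmod.imputed2 and churnmod.valid2 data sets named in the posted PROC LOGISTIC step (adjust the names if yours differ). It prints the row counts and churn rates of each data set and then compares the two observation by observation. If PROC COMPARE reports no unequal values, the two data sets are copies of each other.

```sas
/* Sketch only: data set and variable names are taken from the posted code. */

/* 1. Row counts and event rates in each data set. */
title "Training data: churnmod.imputed2";
proc freq data=churnmod.imputed2;
   tables ischurn / missing;
run;

title "Validation data: churnmod.valid2";
proc freq data=churnmod.valid2;
   tables ischurn / missing;
run;
title;

/* 2. Observation-by-observation comparison of the two data sets. */
proc compare base=churnmod.imputed2
             compare=churnmod.valid2
             listall briefsummary;
run;
```

Identical counts and churn rates in the two PROC FREQ tables would already be a strong hint that the split did not produce two different samples.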
