Hi all,
I'm trying to assess non-linearity between a continuous predictor (fer) and the logit of a binary outcome variable (par).
I understand that one way to visualize this is through an empirical logit plot. First off, I'm having trouble finding the exact syntax for producing these plots (I'm using SAS University Edition, so I think I can't use macros?).
To compute the empirical logit, my understanding is that you have to bin your continuous variable. Then I would run a PROC MEANS step to get, for each bin, the number of target event cases (where par=1) and the total number of cases.
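Written out, the quantity I'm trying to compute for each bin is the following. The symbols are my own notation, not from the video: M_i is the number of events (par=1) in bin i, i.e. the PROC MEANS sum of par, and n_i is the number of cases in bin i, i.e. _FREQ_.

```latex
\[
\operatorname{elogit}_i
  = \ln\!\left(
      \frac{M_i + \sqrt{n_i}/2}{\,n_i - M_i + \sqrt{n_i}/2\,}
    \right)
\]
```

As far as I understand it, the sqrt(n_i)/2 term is there so that bins with zero events or zero non-events don't produce log(0) or a division by zero.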
This is the code I have so far:
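/* Working copy of the input data */
data want;
    set have;
run;

/* Predictor to examine */
%global var;
%let var=fer;

/* Split the predictor into 100 rank-based bins (bin = 0 to 99) */
proc rank data=want groups=100 out=work.ranks;
    var &var;
    ranks bin;
run;

/* Quick look at the bin assignments */
title1 "Checking fer by bin";
proc print data=work.ranks(obs=10);
    var &var bin;
run;

/* Per bin: number of events (sum of par) and mean of the predictor; */
/* _FREQ_ in the output holds the number of cases in the bin         */
proc means data=work.ranks noprint nway;
    class bin;
    var par &var;
    output out=work.bins sum(par)=par mean(&var)=&var;
run;

/* Empirical logit for each bin */
data work.bins;
    set work.bins;
    elogit = log((par + (sqrt(_FREQ_)/2)) /
                 (_FREQ_ - par + (sqrt(_FREQ_)/2)));
run;

/* Empirical logit against the mean predictor value in each bin */
title1 "Empirical Logit against &var";
proc sgplot data=work.bins;
    reg y=elogit x=&var /
        curvelabel="Linear Relationship?"
        curvelabelloc=outside
        lineattrs=(color=ligr);
    series y=elogit x=&var;
run;

/* Empirical logit against the bin number */
title1 "Empirical Logit against Binned &var";
proc sgplot data=work.bins;
    reg y=elogit x=bin /
        curvelabel="Linear Relationship?"
        curvelabelloc=outside
        lineattrs=(color=ligr);
    series y=elogit x=bin;
run;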
My issue is that when I get to the PROC MEANS step, the statement "var par &var;" raises the error "Variable par in list does not match type prescribed for this list". I've checked the frequencies of my par variable and it only contains the values "1" and "0". I'm not sure how to continue with creating the empirical logit plots.
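One thing I haven't ruled out is that par is stored as a character variable even though its values look like 1 and 0. As far as I know, this error message is what PROC MEANS gives when a character variable appears in the VAR statement. Here is a minimal sketch of how that could be checked and worked around, assuming that is the cause. The names par_num and work.ranks_num are hypothetical and not from the video.

```sas
/* Check how par is stored: Type = Char in this listing would explain the error */
proc contents data=work.ranks;
run;

/* Assumption: par is character. Build a numeric copy of it.      */
/* par_num and work.ranks_num are hypothetical names.             */
data work.ranks_num;
    set work.ranks;
    par_num = input(par, 8.);   /* character "1"/"0" -> numeric 1/0 */
run;

/* Same summary as before, but using the numeric copy of par.     */
/* The output variable is still called par, so the later steps    */
/* (elogit and the plots) would not need to change.               */
proc means data=work.ranks_num noprint nway;
    class bin;
    var par_num &var;
    output out=work.bins sum(par_num)=par mean(&var)=&var;
run;
```

If PROC CONTENTS shows par as numeric already, then this isn't the problem and the conversion step shouldn't be used.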
Again, I'm trying to create a plot where the y-axis is LOGIT(par) and the x-axis is the continuous variable (fer). If you have any other suggestions, tips, or methods for creating a plot that checks for this non-linearity, they'd be greatly appreciated.
Thanks
P.S. I'm using the "Demo: Creating Empirical Logit Plots" video created by SAS as the reference for all of the code above:
https://www.coursera.org/lecture/sas-predictive-modeling-using-logistic-regression/demo-creating-empirical-logit-plots-ouMSH?utm_source=link&utm_medium=page_share&utm_content=vlp&utm_campaign=top_button