Hi all,
I'm trying to assess non-linearity between a continuous predictor (fer) and the logit of a binary outcome variable (par).
I understand that one way to visualize this is through an empirical logit plot. First off, I'm having trouble finding the exact syntax for producing these plots (I'm using SAS University Edition, so I think I can't use macros?).
To compute the empirical logit, my understanding is that you have to bin your continuous variable. Then, I would run a PROC MEANS procedure to produce the number of target event cases (where par=1), and the number of total cases in each bin.
This is the code I have so far:
data want;
set have;
run;
%global var;
%let var=fer;
proc rank data = want groups=100 out=work.ranks;
var &var;
ranks bin;
run;
title1 "Checking fer by bin";
proc print data=work.ranks(obs=10);
var &var bin;
run;
proc means data=work.ranks noprint nway;
class bin;
var par &var;
output out=work.bins sum(par)=par mean(&var)=&var;
run;
data work.bins;
set work.bins;
elogit=log((par+(sqrt(_FREQ_)/2))/(_FREQ_-par+(sqrt(_FREQ_)/2)));
run;
title1 "Empirical Logit against &var";
proc sgplot data=work.bins;
reg y=elogit x=&var / curvelabel="Linear Relationship?" curvelabelloc=outside lineattrs=(color=ligr);
series y=elogit x=&var;
run;
title1 "Empirical Logit against Binned &var";
proc sgplot data=work.bins;
reg y=elogit x=bin / curvelabel="Linear Relationship?" curvelabelloc=outside lineattrs=(color=ligr);
series y=elogit x=bin;
run;
My issue is that when I get to the PROC MEANS step, the line "var par &var;" produces the error "Variable par in list does not match type prescribed for this list". I've checked the frequencies of my "par" variable and it only contains the values "1" and "0". I'm not sure how to continue with creating the empirical logit plots.
Again, I'm trying to create a plot where the y-axis is the LOGIT(par) and the x-axis is the continuous (fer) variable. If you have any other suggestions / tips / other methods on creating a plot that tests for this non-linearity, it'd be greatly appreciated.
Thanks
P.S. I'm using the "Demo: Creating Empirical Logit Plots" video created by SAS as reference to all of the code I'm using:
Would the effect plot from a univariate PROC LOGISTIC be similar enough or provide the same information? It's continuous rather than binned but otherwise seems the same... though I could be missing something.
This thread shows an example of how to get an effect plot:
https://communities.sas.com/t5/New-SAS-User/How-to-Modify-Effect-Plots-In-Proc-Logistic/td-p/595037
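As a rough sketch of that suggestion, assuming the data set is named have as in the original post, a one-predictor PROC LOGISTIC with an EFFECTPLOT statement might look like the code below. The LINK option requests the linear predictor (the logit) on the y-axis instead of the predicted probability. Keep in mind that with only a linear term for fer, the fitted line on the logit scale is straight by construction, so this plot shows the fitted model rather than checking the linearity assumption.

```sas
ods graphics on;

/* Assumption: input data set is HAVE, with outcome par and predictor fer */
proc logistic data=have;
   model par(event='1') = fer;
   /* Fitted curve against fer; LINK plots the logit instead of probability */
   effectplot fit(x=fer) / link;
run;
```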
Hi @Reeza! Yes, I thought of that first, but I'm not sure it's the same thing, since I want the logit of the outcome on the y-axis rather than the predicted probabilities. This is the sort of graph that I'm trying to replicate using my data:
Again, my outcome is binary (0,1) and my variable is continuous. I checked the plots=all option in PROC LOGISTIC and can't seem to find a graph or a way to do this.
I'm assuming I could do this manually, but I'm not sure how to logit-transform just my binary outcome variable and then plot logit(par) against the continuous fer, separately from running PROC LOGISTIC.
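On the manual route: the error "Variable par in list does not match type prescribed for this list" usually means that par is stored as a character variable, since the VAR statement in PROC MEANS accepts only numeric variables. A raw 0/1 outcome also can't be logit-transformed row by row (that would mean taking the log of 0), which is why the empirical logit is computed per bin. The sketch below assumes par holds the text values '0' and '1'; par_num is a hypothetical name for the numeric copy.

```sas
/* Check how par is stored (Type column shows Char or Num) */
proc contents data=have;
run;

/* Assumption: par is character with values '0' and '1'.
   par_num is a hypothetical name for the numeric copy. */
data want;
   set have;
   par_num = input(par, 8.);
run;

proc rank data=want groups=100 out=work.ranks;
   var fer;
   ranks bin;
run;

/* Events and mean of fer per bin; _FREQ_ holds the bin size */
proc means data=work.ranks noprint nway;
   class bin;
   var par_num fer;
   output out=work.bins sum(par_num)=par mean(fer)=fer;
run;
```

Because work.bins still contains par, fer, bin and _FREQ_, the elogit data step and the two PROC SGPLOT steps from the original post should run against it unchanged. If the data set is small, fewer groups than 100 may give more stable bins.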