This topic is solved and locked.
asgee
Obsidian | Level 7

Hi all,

 

I'm trying to assess non-linearity between a continuous predictor (fer) and the logit of a binary outcome variable (par).

 

I understand that one way to visualize this is through an empirical logit plot. First off, I'm having trouble finding the exact syntax to produce these plots (I'm using SAS University Edition, so I think I can't use macros?).

 

To compute the empirical logit, my understanding is that you have to bin your continuous variable. Then I would run PROC MEANS to get the number of target event cases (where par=1) and the total number of cases in each bin.

 

This is the code I have so far: 

data want;
set have;
run;

%global var;
%let var=fer;

proc rank data = want groups=100 out=work.ranks;
	var &var;
	ranks bin;
run;

title1 "Checking fer by bin";
proc print data=work.ranks(obs=10);
	var &var bin;
run;

proc means data=work.ranks noprint nway;
	class bin;
	var par &var;
	output out=work.bins sum(par)=par mean(&var)=&var;
run;
data work.bins;
    set work.bins;
    elogit=log((par+(sqrt(_FREQ_)/2))/
       (_FREQ_ -par+(sqrt(_FREQ_)/2)));
run;

title1 "Empirical Logit against &var";
proc sgplot data=work.bins;
   reg y=elogit x=&var / 
      curvelabel="Linear Relationship?"
      curvelabelloc=outside
      lineattrs=(color=ligr);
   series y=elogit x=&var;
run;

title1 "Empirical Logit against Binned &var";
proc sgplot data=work.bins;
   reg y=elogit x=bin/
      curvelabel="Linear Relationship?"
      curvelabelloc=outside
      lineattrs=(color=ligr);
   series y=elogit x=bin;
run;

 

My issue is that when I get to the PROC MEANS step, the line "var par &var;" raises the error "Variable par in list does not match type prescribed for this list". I've checked the frequencies of my "par" variable and it contains only 1s and 0s. I'm really confused about how to continue with creating the empirical logit plots.

 

Again, I'm trying to create a plot where the y-axis is the LOGIT(par) and the x-axis is the continuous (fer) variable. If you have any other suggestions / tips / other methods on creating a plot that tests for this non-linearity, it'd be greatly appreciated. 

 

Thanks 

 

P.S. I'm using the "Demo: Creating Empirical Logit Plots" video created by SAS as reference to all of the code I'm using: 

https://www.coursera.org/lecture/sas-predictive-modeling-using-logistic-regression/demo-creating-emp...

4 REPLIES
Reeza
Super User

Would the effect plot from a univariate PROC LOGISTIC be similar enough or provide the same information? It's continuous rather than binned but otherwise seems the same....but I could be missing something. 

 

This thread shows an example of how to get an effect plot:

https://communities.sas.com/t5/New-SAS-User/How-to-Modify-Effect-Plots-In-Proc-Logistic/td-p/595037
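
A minimal sketch of the univariate effect plot suggested here, assuming the data set is named have as in the original post and that par is coded 0/1. The plots(only)=effect request draws the fitted probability of par=1 against fer. The documented link suboption, effect(link), puts the linear predictor (the logit) on the y-axis instead.

```sas
/* Sketch only: assumes data set HAVE with outcome PAR (0/1) and
   continuous predictor FER, as named in the original post. */
ods graphics on;

/* Effect plot on the probability scale */
proc logistic data=have plots(only)=effect;
   model par(event='1') = fer;
run;

/* Same model, with the logit (linear predictor) on the y-axis */
proc logistic data=have plots(only)=effect(link);
   model par(event='1') = fer;
run;
```

Note that with fer entered as a single linear term, the fitted logit is a straight line by construction, so this plot shows the fitted model rather than the binned empirical logits.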

 



asgee
Obsidian | Level 7

Hi @Reeza! Yes, I thought of that at first, but I'm not sure it's the same, since I want the y-axis to be the logit of the outcome rather than the predicted probabilities. This is the sort of graph that I'm trying to replicate using my data:

 

[Image: asgee_0-1604530764236.png]

 

Again, my outcome is binary (0,1) and my variable is continuous. I checked the PLOTS=ALL option in PROC LOGISTIC and can't seem to find a graph or a way to do this...

 

I'm assuming I can just do this manually, but I'm not sure how to logit-transform just my binary outcome variable and then plot logit(par) against the continuous fer, separately from running PROC LOGISTIC...

 

Reeza
Super User
I don't believe there's a default way in SAS, so you need to bin your data and then graph it manually, as you're trying to do. You can use macros in SAS UE, by the way; not sure why you think you can't.

Your error is because one of your variables (par/fer) is character when SAS expects a numeric variable to be summarized. Your variable type is the issue. I'm guessing it's with the variable par ...
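
One way to confirm and fix the type problem, sketched under assumptions: the input data set is have, and par is stored as the character values '0' and '1'. The name par_char is hypothetical and is introduced only for this illustration.

```sas
/* Step 1: check how PAR and FER are stored
   (the Type column of the output shows Char or Num). */
proc contents data=have;
run;

/* Step 2 (only if PAR is character): rebuild it as a numeric variable.
   PAR_CHAR is a hypothetical temporary name. The ?? modifier suppresses
   invalid-data messages; unreadable values become missing. */
data want;
   set have(rename=(par=par_char));
   par = input(par_char, ?? 8.);
   drop par_char;
run;
```

If want is built this way and fer is already numeric, the PROC RANK and PROC MEANS steps from the original post should run unchanged, since "var par &var;" then lists only numeric variables.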
asgee
Obsidian | Level 7
Hi @Reeza! Thanks for your reply. Yeah, I completely missed that seemingly small yet very important detail of checking whether my variable was character or numeric. Seems like that was definitely the issue.

I ended up continuing on with the code I created and it seemed to create the plots I want. Thanks again for your help!!
