TakakuraMD
Calcite | Level 5

Hi! 

 

I am having a difficult time plotting the logit of an outcome against a continuous variable (or rather, finding the syntax to do so).

I am doing this to check for linearity in a logistic regression. I have a binary outcome that I modeled against a continuous variable and another binary variable.

 

Please help!

 

Thanks!

PeterClemmensen
Tourmaline | Level 20

Welcome to the SAS communities. Do you simply want to plot the logit function? 

 

If so, try something like this:

 

data logit;
   do p=0.01 to 0.99 by 0.01;
      logit=log(p/(1-p));
      output;
   end;
run;

title "Plotting the logit function";
proc sgplot data=logit;
   series x=p y=logit;
   xaxis grid;
   yaxis grid label="logit(p)";
run;
title;
TakakuraMD
Calcite | Level 5
Hi,


Thank you for your response. I wanted to do this for my data set.

My data set consists of a binary outcome (dead or alive),

one continuous variable (age),

and one binary variable (heart disease: yes or no).


I want to plot the logit of the outcome (dead or alive) against age and check for linearity. I think some call this checking that the model is "linear in the logit".


Thanks!
PeterClemmensen
Tourmaline | Level 20

Try a Google search for "sas empirical logit plot". Quite a few examples. 

 

Post example data if you want a usable code answer 🙂
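
For reference, here is a minimal sketch of an empirical logit plot. It assumes a dataset named HAVE with a numeric 0/1 variable DEAD and a numeric variable AGE; these names, the choice of 10 bins, and the 0.5 continuity correction are assumptions for illustration, not something taken from the poster's data. The idea is to bin age, compute the log odds of death within each bin, and plot those against the mean age per bin.

```sas
/* Bin AGE into 10 groups of roughly equal size (the number of bins is an arbitrary choice) */
proc rank data=have groups=10 out=_ranked;
   var age;
   ranks agebin;
run;

/* Per bin: number of subjects, number of deaths, mean age */
proc means data=_ranked noprint nway;
   class agebin;
   var dead age;
   output out=_bins(drop=_type_ _freq_)
          n(dead)=n sum(dead)=events mean(age)=mean_age;
run;

/* Empirical logit; adding 0.5 keeps it finite when a bin has 0 or n events */
data _elogit;
   set _bins;
   elogit=log((events+0.5)/(n-events+0.5));
run;

/* Roughly linear points suggest that AGE is linear in the logit */
proc sgplot data=_elogit;
   scatter x=mean_age y=elogit;
   reg x=mean_age y=elogit / nomarkers;
   xaxis label="Mean age in bin";
   yaxis label="Empirical logit of death";
run;
```

Note that this sketch ignores the second predictor (heart disease). One option would be to repeat the plot within each level of that variable.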

TakakuraMD
Calcite | Level 5
Thank you,


Found this very helpful. You guys are the best.
PeterClemmensen
Tourmaline | Level 20

Anytime. If you found your answer, please mark the thread as accepted to help other users navigate the forum 🙂

 

Otherwise, please post in this thread again if you have questions.

Ksharp
Super User

@Rick_SAS might give you a hand. He wrote a blog post about this before.

 

 

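data have;
   set have;
   /* 1 = dead, 0 = alive; assumes OUTCOME is a character variable */
   good_bad=ifn(outcome='dead',1,0);
run;

/* The loess smooth of the 0/1 outcome estimates P(dead) as a function of age (probability scale) */
proc sgplot data=have;
   loess x=age y=good_bad;
run;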

FreelanceReinh
Jade | Level 19

Hi @TakakuraMD,

 

One method described in Hosmer/Lemeshow: Applied Logistic Regression, 3rd ed., p. 95 f. (in the 2nd edition: p. 99) involves creating a 4-level categorical version of the continuous variable (based on the quartiles) and using this in place of the original variable in a logistic regression model including the other model variables ("heart disease" in your example).

/* Create test data for demonstration */

data have;
call streaminit(31415926);
do subjid=1 to 500;
  age=int(rand('uniform',18,75));
  heartdis=rand('bern',age/100-0.15);
  p=logistic(-4.56+0.0567*age+1.23*heartdis);
  dead=rand('bern',p);
  output;
end;
run;

Here is an implementation of this method for variable AGE in the above dataset HAVE:

/* Compute age quartiles */

proc summary data=have;
var age;
output out=_qtls(drop=_type_ _freq_) min=_min q1=_q1 median=_q2 q3=_q3 max=_max;
run;

/* Create a categorical version of AGE with four levels */

data _tmpana(rename=(agecat=age));
if _n_=1 then set _qtls;
set have;
if age>_q3 then agecat=4;
else if age>_q2 then agecat=3;
else if age>_q1 then agecat=2;
else if age>.   then agecat=1;
else agecat=.;
drop age;
run;

/* Create a model using the new categorical variable AGE in place of the continuous original */

ods output ParameterEstimates=est(keep=Variable ClassVal0 Estimate where=(Variable="age"));
proc logistic data = _tmpana desc;
class heartdis(ref='0') age(ref='1') / param=ref;
model dead = heartdis age;
run;

/* Combine quartile midpoints and corresponding model coefficients */

data _midp;
set est(drop=Variable);
if _n_=1 then do;
  set _qtls;
  age=(_min+_q1)/2;
  _coeff=0;
  output;
end;
select(ClassVal0);
  when('2') do;
              age=(_q1+_q2)/2;
              _coeff=Estimate;
              output;
            end;
  when('3') do;
              age=(_q2+_q3)/2;
              _coeff=Estimate;
              output;
            end;
  when('4') do;
              age=(_q3+_max)/2;
              _coeff=Estimate;
              output;
            end;
  otherwise;
end;
keep _coeff age;
run;

/* Plot coefficients vs. quartile midpoints to check the linearity assumption */

proc sgplot data=_midp;
series x=age y=_coeff / markers;
run;

The resulting plot supports the assumption that the model is linear in the logit for variable AGE:

[Figure: linearity_check.png, model coefficients plotted against the age quartile midpoints]
