Hi,
I am trying to plot the observed probability and the logistic fit (mean predicted probability) using PROC TEMPLATE with a REGRESSIONPLOT statement, but I cannot get the plot to come out correctly. With PROC LOGISTIC I get the expected results:
proc logistic data = in_data plots=effect;
model binary_variable = continuous_variable ;
run;
But when I use PROC TEMPLATE with REGRESSIONPLOT, the plot looks incorrect. Also, what is the option to plot the probability? My code is below:
proc template;
scatterplot x=continuous_variable y=binary_variable;
regressionplot x=continuous_variable y=binary_variable / name='line';
run;
Please suggest what is going wrong here and how to plot the probability.
You can get the graphs to look more similar by modeling the '1' response instead of the '0' response:
proc logistic data = in_data plots=effect;
model binary_variable(event='1') = continuous_variable ;
run;
However, the REGRESSIONPLOT statement performs linear least-squares regression, not logistic regression. Thus, you will never completely duplicate the plot by fitting the raw data with REGRESSIONPLOT.
You can reproduce the plot by using the predicted values from PROC LOGISTIC. The following example uses the OUTPUT statement to write the predicted probabilities to a SAS data set, sorts that data set by the X variable so the curve is drawn in order, and then plots the results:
proc logistic data = in_data plots=effect;
model binary_variable(event='1') = continuous_variable ;
output out=LogiOut Pred=Pred;
run;
proc sort data=LogiOut;
by continuous_variable;
run;
proc sgplot data=LogiOut;
scatter x=continuous_variable y=binary_variable;
series x=continuous_variable y=Pred;
run;
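Since the original question asked about PROC TEMPLATE, here is a minimal sketch of the same plot written in GTL. It assumes the sorted LogiOut data set created above and a numeric 0/1 binary_variable. The template name LogiFit, the legend label, and the axis label are made up for illustration. The key change from the code in the question is that SERIESPLOT draws the predicted probabilities from PROC LOGISTIC instead of REGRESSIONPLOT fitting a straight line.

```sas
proc template;
  define statgraph LogiFit;   /* LogiFit is a hypothetical template name */
    begingraph;
      layout overlay / yaxisopts=(label='Probability');
        /* observed 0/1 responses */
        scatterplot x=continuous_variable y=binary_variable;
        /* predicted probabilities written by the OUTPUT statement */
        seriesplot x=continuous_variable y=Pred /
                   name='fit' legendlabel='Predicted probability';
        discretelegend 'fit';
      endlayout;
    endgraph;
  end;
run;

/* render the template against the sorted output data set */
proc sgrender data=LogiOut template=LogiFit;
run;
```

The data must stay sorted by continuous_variable; otherwise SERIESPLOT connects the points in data order and the curve zigzags.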
The two plots show the same data, except that the Y axis is reversed on the second one.
The correct plot is the one from PROC LOGISTIC. I don't think you can get this plot directly from the raw data, as other plotting routines will not take into account the logit link, log(p/(1-p)), that PROC LOGISTIC uses.
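For reference, the relationship between the logit and the plotted probability can be written out as follows. This assumes a single continuous predictor x; the symbols for the intercept and slope are introduced here for illustration only.

```latex
\[
\operatorname{logit}(p) = \log\frac{p}{1-p} = \beta_0 + \beta_1 x
\quad\Longleftrightarrow\quad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}
\]
```

The curve in the PROC LOGISTIC effect plot is p as a function of x, an S-shaped curve bounded between 0 and 1, which is why a straight least-squares line through the 0/1 values cannot match it.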