I provided the data from the following function to Proc Logistic:
P = 1 / [ 1 + e^(- 0.005 * x) ]
The sigmoid probability curve has the following form:
P(event) = 1 / [ 1 + e^(-(a + bx)) ], where x is the predictor variable.
Therefore, a = 0 and b = 0.005.
Then, I simulated binary response values as follows:
if P >= 0.6 then y = 1; else y = 0.
I used Proc Logistic to model y( event = '1') = x.
The Logistic procedure confidently returned the following parameters:
a = -182.5 and b = 2.2401.
Why are they different from what I provided?
One difference is that you'll need to specify the NOINT option to fit a model without an intercept.
EDIT (added):
What are you trying to do overall here? If the goal is to fit an exponential model, there are other ways to do that, PROC UNIVARIATE for example. I'm guessing that how the 0/1 is being assigned doesn't align with logistic regression somehow, but I can't remember enough of the math to say why.
Thank you, here is the code:
data have;
   set have (keep=x);
   /* logistic curve with a = 0 and b = 0.005 */
   P = 1/(1+2.71828182846**(-0.005*x));
   /* assign y by a fixed cutoff on P (no randomness) */
   if P >= 0.6 then y = 1;
   else y = 0;
   drop P;
run;

proc logistic data=have alpha=0.05
      plots(only)=(effect oddsratio);
   model y(event='1')=x / clodds=pl;
run;
After using NOINT, b = 0.002. This is pretty good, considering that I had to taint the probability distribution a little to avoid the perfect prediction, which prevented the model from converging.
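A short derivation may help explain the perfect prediction mentioned here. It is only a sketch, and it assumes the cutoff 0.6 and the slope 0.005 from the posts above:

```latex
\begin{align*}
P \ge 0.6
&\iff \frac{1}{1 + e^{-0.005x}} \ge 0.6 \\
&\iff e^{-0.005x} \le \tfrac{2}{3} \\
&\iff x \ge \frac{\ln 1.5}{0.005} \approx 81.09
\end{align*}
```

Under that rule, y is a step function of x: every observation with x below roughly 81.09 gets y = 0 and every observation at or above it gets y = 1. The two groups do not overlap, so when an intercept is fitted the likelihood keeps improving as the slope grows and no finite maximum likelihood estimate exists. That seems consistent with the estimates reported earlier, where -a/b = 182.5/2.2401 is about 81.5, close to the cutoff.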
Hi @pink_poodle,
@Reeza is right in that "how the 0/1 is being assigned doesn't align with logistic regression." You missed the (crucial) randomness in the assignment:
y=rand('bern',P);
Actually, you don't need to create a variable P (you drop it anyway), nor type the digits of e or the formula of the logistic function. Simply define
y=rand('bern',logistic(0.005*x));
Then, you may decide not to use the NOINT option and let PROC LOGISTIC find out -- input data permitting -- that the intercept (parameter a) must be (close to) zero. The results are likely similar with the NOINT option because, of course, a is zero (but you know b=0.005 in advance as well, so this shouldn't be the only reason for using NOINT).
Edit: Call CALL STREAMINIT in the DATA step that uses the RAND function in order to create reproducible results.
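Putting the suggestions above together, a minimal sketch might look like the following. The data set name sim, the seed, the sample size and the range of x are all assumptions made for illustration (the original x values are not shown in this thread), so the estimates will not match anyone else's exactly.

```sas
/* Sketch: simulate Bernoulli responses from a logistic model
   with a = 0 and b = 0.005, then fit it with PROC LOGISTIC.   */
data sim;                                 /* hypothetical data set name  */
   call streaminit(27182);                /* arbitrary seed              */
   do i = 1 to 10000;                     /* assumed sample size         */
      x = -1000 + 2000*rand('uniform');   /* assumed range of x          */
      y = rand('bern', logistic(0.005*x));   /* random 0/1 with P(y=1) = logistic(0.005*x) */
      output;
   end;
   drop i;
run;

proc logistic data=sim;
   model y(event='1') = x;   /* add / noint to force a = 0 */
run;
```

Because y is now drawn at random with probability P rather than cut at a fixed threshold, the 0 and 1 responses overlap across x and the fit should converge, with the intercept near 0 and the slope near 0.005 if the sample is large enough.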
Using your method, y = rand('bern',logistic(0.005*x)),
without NOINT option: intercept = 0.00160, slope = 0.00492;
with NOINT option: slope = 0.00494.
Thank you!