Hi!
I want to fit a nonlinear regression to the data below. My question is: how do I identify the parameter values?
data accidents;
input number time intervention time_af_int ;
datalines;
17 1 0 0
10 2 0 0
15 3 0 0
14 4 0 0
26 5 0 0
9 6 0 0
11 7 0 0
17 8 0 0
10 9 0 0
15 10 0 0
21 11 0 0
11 12 0 0
14 13 0 0
16 14 0 0
9 15 0 0
11 16 0 0
13 17 1 1
10 18 1 2
12 19 1 3
5 20 1 4
11 21 1 5
7 22 1 6
10 23 1 7
7 24 1 8
9 25 1 9
6 26 1 10
6 27 1 11
6 28 1 12
10 29 1 13
8 30 1 14
11 31 1 15
7 32 1 16
8 33 1 17
5 34 1 18
6 35 1 19
;
run;
proc nlin data=accidents outest=est;
parameters b0=
b1=
b2=
b3=
;
model number=b0+(b1*time**2)+(b2*(intervention**2))+(b3*(time_af_int**2));
run;
This model is linear in the parameters, so it is a linear regression and you do not need PROC NLIN. You can use PROC GLM to obtain the parameter estimates:
proc glm data=accidents;
model number= time*time intervention*intervention time_af_int*time_af_int;
run;
Also, because the response variable is a count, Poisson regression estimated by maximum likelihood is more appropriate. Incidentally, that is a log-linear model, in which the mean is a nonlinear function of the parameters. For example:
proc genmod data=accidents;
model number=time*time intervention*intervention time_af_int*time_af_int / dist=poisson;
run;
Sorry, the model should be:
number=b0*(time**b1)*(intervention**b2)*(time_af_int**b3)
Take the log of both sides and you get a linear model for log(number).
But in case you come back with yet another model, the answer to your question is that you make an educated guess. When possible, you can use a reduced model to obtain an initial guess. You can also use a grid search to find initial parameter values for regression models.
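A grid search can be requested directly in the PARAMETERS statement of PROC NLIN by giving each parameter a range instead of a single value. The sketch below applies this to the model from the original post; the grid ranges are illustrative guesses chosen only to show the syntax, not recommended values.

```sas
/* Sketch: grid search for starting values in PROC NLIN.          */
/* The ranges below are assumptions for illustration only.        */
proc nlin data=accidents;
   parameters b0 = 5 to 20 by 5
              b1 = -0.01 to 0.01 by 0.01
              b2 = -5 to 5 by 5
              b3 = -0.02 to 0.02 by 0.02;
   /* NLIN evaluates the residual sum of squares at every grid    */
   /* point and starts the iterations from the best combination.  */
   model number = b0 + b1*time**2 + b2*intervention**2 + b3*time_af_int**2;
run;
```

Keep the grid coarse: the number of evaluations is the product of the number of values per parameter (here 4 x 3 x 3 x 3 = 108).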
The Poisson model I showed earlier using PROC GENMOD is essentially that model. The only difference is that you are modeling log(mean), as Rick suggests, rather than the mean directly, and GENMOD provides estimates of those parameters.
I used the Poisson model in PROC GENMOD to estimate the parameters and then used them in PROC NLIN.
There are errors in the output. It is a segmented regression with a breakpoint at t=16.
NLIN does not make the same assumption about the response distribution as when you specify DIST=POISSON in PROC GENMOD, and the two procedures do not use the same estimation method. I suggest you use the results from PROC GENMOD and don't use NLIN.
If you are determined to fit this using a nonlinear method, you may want to consider NLMIXED. There you can specify various distributions in the MODEL statement. However, I will warn you that NLMIXED, while very powerful and flexible, is one of the most difficult PROCs to implement correctly. Be sure to check all of the Examples in the documentation (including the two in the Getting Started section) and especially Example 82.4 Poisson-Normal Model with Count Data.
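As a starting point, a Poisson model in PROC NLMIXED might look like the sketch below. It assumes the same squared terms as the PROC GENMOD example earlier in this thread, so the estimates should be comparable. The starting values are assumptions: b0=2.6 is roughly the log of the pre-intervention mean count (about 14), and the other parameters start at zero.

```sas
/* Sketch: Poisson regression with a log link in PROC NLMIXED.    */
/* Starting values are illustrative assumptions, not tuned.       */
proc nlmixed data=accidents;
   parms b0=2.6 b1=0 b2=0 b3=0;
   /* Linear predictor on the log scale */
   eta = b0 + b1*time**2 + b2*intervention**2 + b3*time_af_int**2;
   /* Poisson mean must be positive, hence the exponential */
   lambda = exp(eta);
   model number ~ poisson(lambda);
run;
```

There is no RANDOM statement here, so this is a fixed-effects model estimated by maximum likelihood.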
SteveDenham