I am trying to get starting values for a non-linear regression problem. When I run my code, I get an error from the LOG calculation that feeds PROC REG, but I need that step to run to get my starting values.
DATA MUSSELS;
INPUT Y X;
X = X;
YI = -LOG(Y - 47.1);
CARDS;
29.2 4
28.6 4
29.4 5
33.0 5
28.2 6
33.9 6
33.1 6
33.2 7
31.4 7
37.8 7
36.9 8
40.2 8
39.2 9
40.6 9
35.2 10
43.3 10
42.3 13
41.4 13
45.2 14
47.1 18
;
PROC REG;
MODEL YI = X;
I am using the Gauss-Newton procedure for non-linear regression, and this initial code is meant to find starting values for the parameters so that I can fit the non-linear model. The model is Y = Alpha - EXP[-(Beta + Gamma*X)]. You first have to pick a starting value for Alpha from the data, because the equation can't be linearized while Alpha is unknown. As X gets large the equation approaches Y = Alpha, so you use the largest Y value for Alpha. With Alpha fixed, you linearize the rest of the equation and run PROC REG on it; the intercept and the slope on X should then give starting values for the other two parameters.
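For reference, the linearization described above can be written out as follows. This is only a sketch: it assumes Alpha is fixed at a trial value \alpha_0 (a symbol introduced here for illustration) and that \alpha_0 - Y > 0 for every observation.

```latex
\begin{align*}
Y &= \alpha - \exp\{-(\beta + \gamma X)\} \\
\alpha_0 - Y &\approx \exp\{-(\beta + \gamma X)\} \\
-\log(\alpha_0 - Y) &\approx \beta + \gamma X
\end{align*}
```

Under that assumption, regressing -LOG(\alpha_0 - Y) on X gives an intercept that serves as a starting value for Beta and a slope that serves as a starting value for Gamma.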
Here is the code for the in-class example:
DATA MINING;
INPUT W D Y;
X = W/D;
YB = LOG(1 - Y/35);
CARDS;
610 550 33.6
450 500 22.3
450 520 22.0
430 740 18.7
410 800 20.2
500 230 31.0
500 235 30.0
500 240 32.0
450 600 26.6
450 650 15.1
480 230 30.0
475 1400 13.5
485 615 26.8
474 515 25.0
485 700 20.4
600 750 15.0
;
PROC REG;
MODEL YB = X;
PROC NLIN DATA = MINING METHOD = GAUSS;
MODEL Y = A*(1-EXP((-1)*B*X));
PARMS A = 35 B = .8672;
OUTPUT OUT=TWO P=PRED R=RESID;
PROC PRINT;
VAR Y PRED RESID;
The prof ran the first PROC REG to get the B parameter before coding the rest.
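My reading of why the in-class example uses YB = LOG(1 - Y/35) is sketched below. It assumes A is fixed at the trial value 35 used in the code, written here as A_0 (a symbol introduced for illustration).

```latex
\begin{align*}
Y &= A\,\bigl(1 - e^{-BX}\bigr) \\
1 - \frac{Y}{A_0} &\approx e^{-BX} \\
\log\!\left(1 - \frac{Y}{A_0}\right) &\approx -B\,X
\end{align*}
```

If that reading is right, the slope of YB on X estimates -B, so the starting value for B is the fitted slope with its sign flipped.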
So I hadn't solved the equation correctly: it should have been Alpha - Y, i.e. (47.1 - Y). That gave me values, but the code dropped an observation, and I'm not sure why.
DATA MUSSELS;
INPUT Y X;
YB = -LOG(47.1 - Y);
CARDS;
29.2 4
28.6 4
29.4 5
33.0 5
28.2 6
33.9 6
33.1 6
33.2 7
31.4 7
37.8 7
36.9 8
40.2 8
39.2 9
40.6 9
35.2 10
43.3 10
42.3 13
41.4 13
45.2 14
47.1 18
;
PROC REG;
MODEL YB = X;
PROC NLIN DATA = MINING METHOD = MARQUARDT NOHALVE;
MODEL Y = A - EXP((-1)*(B + C*X));
PARMS A = 47.1 B = -3.72144 C = 0.18315;
OUTPUT OUT=TWO P=PRED R=RESID;
PROC PRINT;
VAR Y PRED RESID;
Figured it out: a simple miscoding of the data call :/. It works now. The problem was definitely the order of the Y and the 47.1. Thank you!
That is because you have 47.1 at the last record.
DATA MUSSELS;
INPUT Y X;
YB = -LOG(47.1 - Y);
CARDS;
29.2 4
28.6 4
29.4 5
33.0 5
28.2 6
33.9 6
33.1 6
33.2 7
31.4 7
37.8 7
36.9 8
40.2 8
39.2 9
40.6 9
35.2 10
43.3 10
42.3 13
41.4 13
45.2 14
47.1 18
;
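LOG(47.1 - 47.1) ==> LOG(0), which is not valid: the argument of LOG must be greater than zero. SAS sets YB to missing for that record, and PROC REG leaves out observations with missing values.
You need to change it so that the argument stays positive for every record.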
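One possible way to keep every record is sketched below. It is not the only fix: the value 47.2 is an assumed starting value for alpha, picked only because it is slightly above the largest Y (47.1), and the data set name MUSSELS2 and the variable A0 are hypothetical names used for illustration.

```sas
/* Sketch only: A0 = 47.2 is an assumed trial value for alpha,   */
/* chosen slightly above the largest Y so that A0 - Y > 0.       */
DATA MUSSELS2;                 /* hypothetical data set name      */
  SET MUSSELS;                 /* reuse the data read in above    */
  A0 = 47.2;
  IF A0 - Y > 0 THEN YB = -LOG(A0 - Y);  /* overwrite YB safely  */
RUN;

PROC REG DATA = MUSSELS2;
  MODEL YB = X;                /* intercept -> B, slope -> C      */
RUN;
QUIT;
```

With a trial alpha above every Y, no record produces LOG(0), so PROC REG should use all 20 observations. The resulting intercept and slope are only starting values for PROC NLIN, and they will shift with the trial alpha you choose.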