Hi,
I have a problem estimating a nonlinear function on weighted data, which basically comes down to the problem below.
Consider the following data, where clearly y = x² and z are some weights. I adapted the standard PROC NLIN call to include the weights:
data STARTDATA;
input x y z ;
datalines ;
1 2 0.1
2 4 0.2
3 9 0.2
4 16 0.5
;
/* Expose STARTDATA through a view */
PROC SQL;
CREATE VIEW WORK.SORTTempTableSorted AS
SELECT T.y, T.x, T.z
FROM WORK.STARTDATA as T;
QUIT;
/* Weighted nonlinear least squares: z is the observation weight */
PROC NLIN DATA=WORK.SORTTempTableSorted
MAXITER=100
CONVERGE=1E-05
SINGULAR=1E-08
MAXSUBIT=30;
_WEIGHT_ = z;
MODEL y = x ** b;
PARMS
b=0.01;
RUN;
The parameter b is estimated correctly as 2. However, the minimized SSE equals the first weight, 0.1, and does not decrease further to 0.
As a result, the R² (which should be 1 here) is estimated incorrectly.
How can I correct my example so that the SSE can become smaller than the first weight?
Thanks in advance
Why do you think the results are wrong? I just ran your program with plots turned on, and everything looks fine for the fit. Taking out the weight statement gives a different confidence region, different SE, etc., as expected. I think the "0.1" is just a coincidence. For instance, I ran:
data b; set startdata;
pred = x**2;
SS = (y - pred)**2;
WSS = z*SS;
run;
proc means data=b sum;
var SS WSS;
run;
This computes the weighted and unweighted sums of squares manually, based on your parameter estimate. The weighted SS is 0.1, just like the NLIN output. If you run NLIN without the weight, you get SS = 1.0, which again matches the manual calculation above. I think everything is fine. By the way, put a plots=all option in your PROC NLIN statement for good graphs.
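The same check can be written out by hand. This is only a sketch, assuming the fitted value b = 2 reported in the question and reusing the names SS and WSS from the data step above:

```latex
\begin{align*}
\mathrm{SS}  &= \sum_i \left(y_i - x_i^{b}\right)^2
              = (2-1)^2 + (4-4)^2 + (9-9)^2 + (16-16)^2 = 1.0 \\
\mathrm{WSS} &= \sum_i z_i \left(y_i - x_i^{b}\right)^2
              = 0.1 \cdot 1 + 0.2 \cdot 0 + 0.2 \cdot 0 + 0.5 \cdot 0 = 0.1
\end{align*}
```

Note that 1**b = 1 for every b, so the residual of the first observation (x = 1, y = 2) is 1 whatever the estimate is. The weighted sum therefore cannot drop below the first weight, 0.1, for these data.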
I forgot to mention above: your SS should not be 0, because your y values are not exactly equal to x**2 (the first y should be 1, not 2, if you want an exact match).