Hi everybody!
I have a dataset as attached. The problem is to find parameters a and b that minimize the weighted error.
I used Excel Solver as illustrated and it worked like a charm. But when I tried PROC HPNLMOD in SAS, as shown below, it stopped at a point where the error value is very far from Excel's result. I don't know how to get the optimal value from PROC HPNLMOD.
PROC HPNLMOD DATA=TEMP01;
PARAMETERS A=0 B=0 LAMBDA=0;
PRED=EXP(A+B*RANKING);
LL=WEIGHT*ABS(PRED-TARGET);
MODEL Y ~ GENERAL(-LL);
ODS OUTPUT PARAMETERESTIMATES=TEMP_PARAM(KEEP=ESTIMATE);
RUN;
Your syntax has a few errors (for example, the PARAMETERS statement declares LAMBDA, but the model never uses it). You should always check the log to see what errors and warnings are present.
With such a small data set, you can use NLMIXED, which performs maximum likelihood estimation. I think the following code is self-explanatory, but I would be happy to clarify any steps that might be confusing:
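data Temp01;
input Pct Ranking ABR Target;
Weight = Pct / 100; /* convert percent to a proportion */
Y = 0; /* placeholder response: MODEL needs one, but GENERAL(-LL) does not use it */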
datalines;
5.74 1 -7.080829537 0.01
7.32 2 -6.224511583 0.04
8.01 3 -5.368193629 0.07
11.15 4 -4.511875675 0.06
13.57 5 -3.655557721 0.10
13.43 6 -2.799239766 0.15
12.69 7 -1.942921812 0.27
9.34 8 -1.086603858 0.48
6.68 9 -0.230285904 1.02
4.32 10 0.62603205 2.10
2.86 11 1.482350004 4.59
1.74 12 2.338667959 8.87
1.14 13 3.194985913 18.40
2.01 14 4.051303867 57.47
run;
/* fit the model to the data */
PROC NLMIXED DATA=TEMP01;
PARAMETERS A=0 B=1;
PRED=EXP(A+B*RANKING);
LL=WEIGHT*ABS(PRED-TARGET);
MODEL Y ~ GENERAL(-LL);
ODS OUTPUT PARAMETERESTIMATES=TEMP_PARAM(KEEP=ESTIMATE);
RUN;
proc print data=Temp_Param; run;
/* visualize the fit */
data Pred;
set Temp01;
/* parameter estimates copied from the PROC NLMIXED output */
eta = -7.5314 + 0.8273 * Ranking;
Pred = exp(eta);
run;
proc sgplot data=Pred;
scatter x=Ranking y=Target;
series x=Ranking y=Pred;
xaxis grid integer;
yaxis grid;
run;
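Rather than copying the estimates into a DATA step by hand, one option is to let the procedure write the predicted values itself. This is a minimal sketch, assuming the PREDICT statement of PROC NLMIXED with its OUT= option. The data set name PredOut is only an illustrative name, and the predicted values should land in a variable named Pred:

```sas
/* sketch: same model as above, with predictions written by the procedure */
proc nlmixed data=Temp01;
   parameters A=0 B=1;
   Pred = exp(A + B*Ranking);
   LL = Weight*abs(Pred - Target);           /* weighted absolute error */
   model Y ~ general(-LL);
   predict exp(A + B*Ranking) out=PredOut;   /* PredOut is a hypothetical name */
run;

proc sgplot data=PredOut;
   scatter x=Ranking y=Target;
   series x=Ranking y=Pred;                  /* Pred is created by the PREDICT statement */
   xaxis grid integer;
   yaxis grid;
run;
```

This way the plot always reflects the estimates from the most recent run, so nothing needs to be retyped if the starting values or the data change.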
> Moreover, I still wonder what difference between PROC HPNLMOD and PROC NLMIXED caused the situation I encountered. Does HPNLMOD use one specific method and NLMIXED another?
Different procedures have different default options and methods (see the doc), but you could use HPNLMOD if you prefer. Take my example and replace "NLMIXED" with "HPNLMOD":
PROC HPNLMOD DATA=TEMP01;
PARAMETERS A=0 B=1;
PRED=EXP(A+B*RANKING);
LL=WEIGHT*ABS(PRED-TARGET);
MODEL Y ~ GENERAL(-LL);
ODS OUTPUT PARAMETERESTIMATES=TEMP_PARAM(KEEP=ESTIMATE);
RUN;
proc print data=Temp_Param; run;
/* visualize the fit */
data Pred;
set Temp01;
/* parameter estimates copied from the PROC HPNLMOD output */
eta = -9.9997 + 1.0036 * Ranking;
Pred = exp(eta);
run;
proc sgplot data=Pred;
scatter x=Ranking y=Target;
series x=Ranking y=Pred;
xaxis grid integer;
yaxis grid;
run;
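One way to see how much the optimizer matters is to request the same optimization technique in both procedures. The sketch below assumes that both PROC statements accept a TECHNIQUE= option and that your release accepts the NMSIMP (Nelder-Mead simplex) value for each. NMSIMP is picked here only because it does not rely on derivatives, which can behave poorly for an absolute-value objective. Whether the two runs then agree is something to check in your own output, not a guarantee:

```sas
/* sketch: request the same derivative-free optimizer in both procedures */
proc nlmixed data=Temp01 technique=nmsimp;
   parameters A=0 B=1;
   Pred = exp(A + B*Ranking);
   LL = Weight*abs(Pred - Target);   /* weighted absolute error */
   model Y ~ general(-LL);
run;

proc hpnlmod data=Temp01 technique=nmsimp;
   parameters A=0 B=1;
   Pred = exp(A + B*Ranking);
   LL = Weight*abs(Pred - Target);   /* weighted absolute error */
   model Y ~ general(-LL);
run;
```

If the estimates still differ, comparing the final objective values and the convergence notes in the log for each run is a reasonable next step.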
I am not sure what target_new(i) means, but perhaps it means the predicted values of the model. There is no reason to expect sum(weight[i]*Target[i]) to equal sum(weight[i]*Pred[i]). You are minimizing sum(weight[i]*abs(Target[i]-Pred[i])), which does not require the two weighted sums to be equal.
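To see this on the example data, the two weighted sums and the objective can be computed directly. This is a small sketch, assuming the Pred data set created above (with variables Weight, Target, and Pred). The column names WtSumTarget, WtSumPred, and WtAbsError are made up for illustration:

```sas
/* sketch: compare the weighted sums and the objective for the fitted curve */
proc sql;
   select sum(Weight*Target)             as WtSumTarget,  /* weighted sum of observed values  */
          sum(Weight*Pred)               as WtSumPred,    /* weighted sum of predicted values */
          sum(Weight*abs(Pred - Target)) as WtAbsError    /* the quantity being minimized     */
   from Pred;
quit;
```

The first two columns can differ even at the optimum, because only the third column is the quantity the procedure minimizes.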
Thank you for your great help ^^