You will probably see better performance if you omit the z_CON constraints (which seem to make the problem infeasible) and rewrite the objective function in a way that leads to a sparse Hessian:
*min MSE = sum{r in RGN_I}
(z
var Error {RGN_I};
con Error_CON {r in RGN_I}:
Error
min MSE = sum{r in RGN_I}
Error
BTW, how do I mark this question answered?
Instructions can be found here:
Hi Rob and others,
I extended the Rubix with more dimensions...
data TABLE1;
input RGN $7. MTH $4. CNTRY $3. CLS $3. MKT $7. MKTTYPE $7. DMD VALUE PRICE;
datalines;
GGGGGG APR JJ RR AAACCC OTHOTH 4.072487387 5073.712404 1245.85098
GGGGGG APR LL RR AAACCC OTHOTH 0.759277544 527.855155 695.2071207
ROMANN APR SS RR ARNSSS TTTTTT 13.6940138 20710.54229 1512.379248
GGGGGG APR BB RR BAADDD OTHOTH 3.858234121 1971.730438 511.044788
BUHBUH APR QQ RR CAAHHH TTTTTT 670.3483179 607903.4554 906.8471407
GGGGGG APR KK RR DEEKKK TTTTTT 22.93870547 11473.72103 500.1904334
LIONNN APR UG RR EEEFFF OTHOTH 1.314052632 1776.681947 1352.063002
ROMANN APR IT RR DDMFCC OTHOTH 18.18630498 34948.34188 1921.684582
NOODLE APR AA RR MMMMUC TTTTTT 38.07123286 107524.5245 2824.298464
ROMANN APR IT WW BBRNNN OTHOTH 6.842336232 3142.693913 459.301298
NOODLE FEB PP RR GGGIII OTHOTH 1.240267277 3507.439478 2827.970665
BUHBUH JUL QQ RR AEEPPP UNDISC 0.610535011 282.9336958 463.419281
;
proc sql;
create table TABLE_RGN as
select unique RGN from TABLE1; quit;
proc sql;
create table TABLE_MN as
select unique MTH from TABLE1; quit;
proc sql;
create table TABLE_PS as
select unique CNTRY from TABLE1; quit;
proc sql;
create table TABLE_CM as
select unique CLS from TABLE1; quit;
proc sql;
create table TABLE_ND as
select unique MKT from TABLE1; quit;
proc sql;
create table TABLE_OT as
select unique MKTTYPE from TABLE1; quit;
/*****************************************************************************************/
/* Dummy data set only to provide constant targets */
proc sql;
create table RGN_TGT as
select unique RGN,
sum(VALUE)*1.3 as RGN_VALUE_TGT format dollar16.0,
0.7*(sum(VALUE)/sum(DMD)) as RGN_PRICE_TGT format dollar16.0,
sum(DMD) as RGN_DMD
from TABLE1
group by RGN;
quit;
------ Now add another dimension to the Rubix -------
proc sql;
create table MTH_TGT as
select unique MTH,
sum(VALUE)*1.3 as MTH_VALUE_TGT format dollar16.0,
0.7*(sum(VALUE)/sum(DMD)) as MTH_PRICE_TGT format dollar16.0,
sum(DMD) as MTH_DMD
from TABLE1
group by MTH;
quit;
data MTH_TGT;
set MTH_TGT;
if MTH in ('APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP') then MTH_VOL_TGT = MTH_DMD*1.05;
else MTH_VOL_TGT = MTH_DMD*0.95;
run;
/*****************************************************************************************/
options missing=0;
proc optmodel;
/* declare index sets and read values from data sets*/
set <str> RGN_I;
read data TABLE_RGN nomiss into RGN_I = [RGN];
set <str> MTH_I;
read data TABLE_MN nomiss into MTH_I = [MTH];
set <str> CNTRY_I;
read data TABLE_PS nomiss into CNTRY_I = [CNTRY];
set <str> CLS_I;
read data TABLE_CM nomiss into CLS_I = [CLS];
set <str> MKT_I;
read data TABLE_ND nomiss into MKT_I = [MKT];
set <str> MKTTYPE_I;
read data TABLE_OT nomiss into MKTTYPE_I = [MKTTYPE];
/* declare parameters and read from data sets */
num DMD {RGN_I, MTH_I, CNTRY_I, CLS_I, MKT_I, MKTTYPE_I}init .;
read data TABLE1 nomiss into [RGN MTH CNTRY CLS MKT MKTTYPE] DMD;
num PRICE {RGN_I, MTH_I, CNTRY_I, CLS_I, MKT_I, MKTTYPE_I}init .;
read data TABLE1 nomiss into [RGN MTH CNTRY CLS MKT MKTTYPE] PRICE;
num RGN_VALUE_TGT {RGN_I}init .;
read data RGN_TGT nomiss into RGN_I =[RGN] RGN_VALUE_TGT;
num RGN_PRICE_TGT {RGN_I}init .;
read data RGN_TGT nomiss into RGN_I =[RGN] RGN_PRICE_TGT;
num MTH_VOL_TGT {MTH_I}init .;
read data MTH_TGT nomiss into MTH_I =[MTH] MTH_VOL_TGT;
/* test that the data is read*/
print RGN_VALUE_TGT RGN_PRICE_TGT;
print MTH_VOL_TGT;
/*print DMD PRICE; */
/* declare variables */
var x {RGN_I, MTH_I, CNTRY_I, CLS_I, MKT_I, MKTTYPE_I} >= 0;
for {r in RGN_I, m in MTH_I, p in CNTRY_I, c in CLS_I, k in MKT_I, t in MKTTYPE_I: PRICE[r,m,p,c,k,t] = .}
fix x[r,m,p,c,k,t] = 0;
var y {RGN_I, MTH_I, CNTRY_I, CLS_I, MKT_I, MKTTYPE_I} >= 0;
for {r in RGN_I, m in MTH_I, p in CNTRY_I, c in CLS_I, k in MKT_I, t in MKTTYPE_I: PRICE[r,m,p,c,k,t] = .}
fix y[r,m,p,c,k,t] = 0;
impvar z {rc in RGN_I}
= sum{r in RGN_I, m in MTH_I, p in CNTRY_I, c in CLS_I, k in MKT_I, t in MKTTYPE_I}
x[r,m,p,c,k,t]*y[r,m,p,c,k,t];
impvar mn {mt in MTH_I}
= sum{r in RGN_I, m in MTH_I, p in CNTRY_I, c in CLS_I, k in MKT_I, t in MKTTYPE_I}
x[r,m,p,c,k,t];
print x y z mn;
/* define constraints */
con x_CON {r in RGN_I, m in MTH_I, p in CNTRY_I, c in CLS_I, k in MKT_I, t in MKTTYPE_I: PRICE[r,m,p,c,k,t] ne .}:
DMD[r,m,p,c,k,t]*0.7 <= x[r,m,p,c,k,t] <= DMD[r,m,p,c,k,t]*1.3;
con y_CON {r in RGN_I, m in MTH_I, p in CNTRY_I, c in CLS_I, k in MKT_I, t in MKTTYPE_I: PRICE[r,m,p,c,k,t] ne .}:
PRICE[r,m,p,c,k,t]*0.7 <= y[r,m,p,c,k,t] <= PRICE[r,m,p,c,k,t]*1.3;
con mn_CON {m in MTH_I}:
MTH_VOL_TGT
/* objective function */
var Error {RGN_I};
con Error_CON {r in RGN_I}:
Error
min MSE = sum{r in RGN_I}
Error
solve;
/*print x y mn z;*/
quit;
Why am I getting infeasibility? I thought the constraints have wide enough margin.
And also all I seek in the objective function is min MSE.
Why are all the variables ending up at 0.
Do you foresee a performance issue if I scale this is upto 500,000 rows and add 3-4 more dimensions?
Let me know.
Rgds
Praveen
Hi Praveen,
You can try the IIS option. If there are linear constraints that are inconsistent, the solver will find them. In your case, the output I get from:
solve with NLP / iis=on;
expand / iis;
is
...
Constraint mn_CON[APR]: 802.70742685 <= mn[APR] <= 835.47099529
Constraint mn_CON[JUL]: 0.6282405263 <= mn[JUL] <= 0.6538829968
I also added new constants so that you can control the deltas over several solves and see what the minimal feasible range is, e.g.:
num tgtDelta init .02;
con mn_CON {m in MTH_I}:
MTH_VOL_TGT
<= mn
<= MTH_VOL_TGT
;
I have also introduced two intermediary sets so that your code can be a little bit shorter and you only define X and Y variables with the price is non-missing:
set RMCCMM = {RGN_I, MTH_I, CNTRY_I, CLS_I, MKT_I, MKTTYPE_I};
...
set RMCCMMwithPRICE = {<r,m,p,c,k,t> in RMCCMM: price[r,m,p,c,k,t] ~= .};
num dmdAndPriceDelta init 0.3;
var x {<r,m,p,c,k,t> in RMCCMMwithPRICE}
>= DMD[r,m,p,c,k,t] * (1 - dmdAndPriceDelta)
<= DMD[r,m,p,c,k,t] * (1 + dmdAndPriceDelta);
var y {<r,m,p,c,k,t> in RMCCMMwithPRICE}
>= PRICE[r,m,p,c,k,t] * (1 - dmdAndPriceDelta)
<= PRICE[r,m,p,c,k,t] * (1 + dmdAndPriceDelta);
Here is the new complete (optmodel part only) listing:
proc optmodel;
/* declare index sets and read values from data sets*/
set <str> RGN_I;
read data TABLE_RGN nomiss into RGN_I = [RGN];
set <str> MTH_I;
read data TABLE_MN nomiss into MTH_I = [MTH];
set <str> CNTRY_I;
read data TABLE_PS nomiss into CNTRY_I = [CNTRY];
set <str> CLS_I;
read data TABLE_CM nomiss into CLS_I = [CLS];
set <str> MKT_I;
read data TABLE_ND nomiss into MKT_I = [MKT];
set <str> MKTTYPE_I;
read data TABLE_OT nomiss into MKTTYPE_I = [MKTTYPE];
set RMCCMM = {RGN_I, MTH_I, CNTRY_I, CLS_I, MKT_I, MKTTYPE_I};
/* declare parameters and read from data sets */
num DMD {RMCCMM}init .;
read data TABLE1 nomiss into [RGN MTH CNTRY CLS MKT MKTTYPE] DMD;
num PRICE {RMCCMM}init .;
read data TABLE1 nomiss into [RGN MTH CNTRY CLS MKT MKTTYPE] PRICE;
num RGN_VALUE_TGT {RGN_I}init .;
read data RGN_TGT nomiss into RGN_I =[RGN] RGN_VALUE_TGT;
num RGN_PRICE_TGT {RGN_I}init .;
read data RGN_TGT nomiss into RGN_I =[RGN] RGN_PRICE_TGT;
num MTH_VOL_TGT {MTH_I}init .;
read data MTH_TGT nomiss into MTH_I =[MTH] MTH_VOL_TGT;
/* test that the data is read*/
print RGN_VALUE_TGT RGN_PRICE_TGT;
print MTH_VOL_TGT;
/*print DMD PRICE; */
set RMCCMMwithPRICE = {<r,m,p,c,k,t> in RMCCMM: price[r,m,p,c,k,t] ~= .};
/* declare variables */
num dmdAndPriceDelta init 0.3;
var x {<r,m,p,c,k,t> in RMCCMMwithPRICE}
>= DMD[r,m,p,c,k,t] * (1 - dmdAndPriceDelta)
<= DMD[r,m,p,c,k,t] * (1 + dmdAndPriceDelta);
var y {<r,m,p,c,k,t> in RMCCMMwithPRICE}
>= PRICE[r,m,p,c,k,t] * (1 - dmdAndPriceDelta)
<= PRICE[r,m,p,c,k,t] * (1 + dmdAndPriceDelta);
impvar z {rc in RGN_I}
= sum{<r,m,p,c,k,t> in RMCCMMwithPRICE} x[r,m,p,c,k,t] * y[r,m,p,c,k,t];
impvar mn {mt in MTH_I}
= sum{<r,m,p,c,k,t> in RMCCMMwithPRICE} x[r,m,p,c,k,t];
print x y z mn;
/* define constraints */
num tgtDelta init .02;
con mn_CON {m in MTH_I}:
MTH_VOL_TGT
<= mn
<= MTH_VOL_TGT
;
/* objective function */
var Error {RGN_I};
con Error_CON {r in RGN_I}:
Error
min MSE = sum{r in RGN_I}
Error
solve with nlp/iis=on;
expand/iis;
/*print x y mn z;*/
quit;
As for how this would scale, it is always difficult to tell. The best thing to do is experiment. When the problem was feasible the small version was pretty quick, but 500K rows could be different.
When I changed your impvar z and mn declarations to slice by RGN and MTH, I get feasible solutions. See if this makes sense in your application:
impvar z {r in RGN_I}
= sum{<(r),m,p,c,k,t> in RMCCMMwithPRICE} x[r,m,p,c,k,t] * y[r,m,p,c,k,t];
impvar mn {m in MTH_I}
= sum{<r,(m),p,c,k,t> in RMCCMMwithPRICE} x[r,m,p,c,k,t];
I was able to solve but as I scale up I have Out of memory errors !
I even removed a dimension.
How to manage memory for large scale optimizations?
Log:
NOTE: PROCEDURE PRINTTO used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
17
18 proc sql;
19 create table TABLE1
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Table WORK.TABLE1 created, with 396,971 rows and 8 columns.
29 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 1.48 seconds
cpu time 2.07 seconds
30
31 proc sql;
32 create table TABLE_RGN as
33 select unique RGN from TABLE1;
NOTE: Table WORK.TABLE_RGN created, with 7 rows and 1 columns.
33 ! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.13 seconds
cpu time 0.20 seconds
34
35 proc sql;
36 create table TABLE_MN as
2 The SAS System 14:01 Tuesday, December 23, 2014
37 select unique MTH from TABLE1;
NOTE: Table WORK.TABLE_MN created, with 12 rows and 1 columns.
37 ! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.11 seconds
cpu time 0.15 seconds
38
39 proc sql;
40 create table TABLE_PS as
41 select unique CNTRY from TABLE1;
NOTE: Table WORK.TABLE_PS created, with 92 rows and 1 columns.
41 ! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.19 seconds
cpu time 0.23 seconds
41 !
42
43 proc sql;
44 create table TABLE_CM as
45 select unique CLS from TABLE1;
NOTE: Table WORK.TABLE_CM created, with 2 rows and 1 columns.
45 ! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.11 seconds
cpu time 0.14 seconds
46
47 proc sql;
48 create table TABLE_ND as
49 select unique MKT from TABLE1;
NOTE: Table WORK.TABLE_ND created, with 6952 rows and 1 columns.
49 ! quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.26 seconds
cpu time 0.35 seconds
50
51 /*****************************************************************************************/
52 /* Dummy data set only to provide constant targets */
53
54 proc sql;
55 create table RGN_TGT as
56 select unique RGN,
57 sum(VALUE)*1.1 as RGN_VALUE_TGT format dollar16.0,
58 1.05*(sum(VALUE)/sum(DMD)) as RGN_PRICE_TGT format dollar16.0,
59 sum(DMD) as RGN_DMD
60 from TABLE1
61 group by RGN;
3 The SAS System 14:01 Tuesday, December 23, 2014
NOTE: Table WORK.RGN_TGT created, with 7 rows and 4 columns.
62 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.11 seconds
cpu time 0.10 seconds
63
64 proc sql;
65 create table MTH_TGT as
66 select unique MTH,
67 sum(VALUE) as MTH_VALUE_TGT format dollar16.0,
68 (sum(VALUE)/sum(DMD)) as MTH_PRICE_TGT format dollar16.0,
69 sum(DMD) as MTH_DMD
70 from TABLE1
71 group by MTH;
NOTE: Table WORK.MTH_TGT created, with 12 rows and 4 columns.
72 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.11 seconds
cpu time 0.21 seconds
73
74 data MTH_TGT;
75 set MTH_TGT;
76 if MTH in ('APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP') then MTH_VOL_TGT = MTH_DMD*1.05;
77 else MTH_VOL_TGT = MTH_DMD*0.95;
78 run;
NOTE: There were 12 observations read from the data set WORK.MTH_TGT.
NOTE: The data set WORK.MTH_TGT has 12 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
78 !
79 /*****************************************************************************************/
80
81 options missing=0;
82 proc optmodel/*presolver=none*/;
83
84 /* declare index sets and read values from data sets*/
85
86 set <str> RGN_I;
87 read data TABLE_RGN nomiss into RGN_I = [RGN];
NOTE: There were 7 observations read from the data set WORK.TABLE_RGN.
88
89 set <str> MTH_I;
90 read data TABLE_MN nomiss into MTH_I = [MTH];
NOTE: There were 12 observations read from the data set WORK.TABLE_MN.
91
92 set <str> CNTRY_I;
93 read data TABLE_PS nomiss into CNTRY_I = [CNTRY];
NOTE: There were 92 observations read from the data set WORK.TABLE_PS.
4 The SAS System 14:01 Tuesday, December 23, 2014
94
95 set <str> CLS_I;
96 read data TABLE_CM nomiss into CLS_I = [CLS];
NOTE: There were 2 observations read from the data set WORK.TABLE_CM.
97
98 set <str> MKT_I;
99 read data TABLE_ND nomiss into MKT_I = [MKT];
NOTE: There were 6952 observations read from the data set WORK.TABLE_ND.
100
101 set RMCCM = {RGN_I, MTH_I, CNTRY_I, CLS_I, MKT_I};
102
103 /* declare parameters and read from data sets */
104
105 num DMD {RMCCM}init .;
106 read data TABLE1 nomiss into [RGN MTH CNTRY CLS MKT] DMD;
ERROR: Out of memory during expression evaluation.
NOTE: There were 1 observations read from the data set WORK.TABLE1.
107
108 num PRICE {RMCCM}init .;
109 read data TABLE1 nomiss into [RGN MTH CNTRY CLS MKT] PRICE;
ERROR: The symbol 'RMCCM' has no value at line 108 column 12.
NOTE: There were 1 observations read from the data set WORK.TABLE1.
110
111 num RGN_VALUE_TGT {RGN_I}init .;
112 read data RGN_TGT nomiss into RGN_I =[RGN] RGN_VALUE_TGT;
NOTE: There were 7 observations read from the data set WORK.RGN_TGT.
113
114 num RGN_PRICE_TGT {RGN_I}init .;
115 read data RGN_TGT nomiss into RGN_I =[RGN] RGN_PRICE_TGT;
NOTE: There were 7 observations read from the data set WORK.RGN_TGT.
116
117 num MTH_VOL_TGT {MTH_I}init .;
118 read data MTH_TGT nomiss into MTH_I =[MTH] MTH_VOL_TGT;
NOTE: There were 12 observations read from the data set WORK.MTH_TGT.
119
120 /* test that the data is read*/
121
122 print RGN_VALUE_TGT RGN_PRICE_TGT;
14:01 Tuesday, December 23, 2014 1
124 /*print DMD PRICE; */
125
126 set RMCCMMwithPRICE = {<r,m,p,c,k> in RMCCM: price[r,m,p,c,k] ~= .};
127
128 /* declare variables */
129
130 num dmdAndPriceDelta init 0.3;
131
132 var x {<r,m,p,c,k> in RMCCMwithPRICE}
_______________
537 526
ERROR 537-782: The symbol 'RMCCMwithPRICE' is unknown.
ERROR 526-782: An indexing set subexpression may not be a number.
133 >= DMD[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_ _
537 653
ERROR 537-782: The symbol 'm' is unknown.
ERROR 653-782: Subscript 1 must be a string, found a number.
5 The SAS System 14:01 Tuesday, December 23, 2014
133 ! >= DMD[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_ _
537 653
ERROR 537-782: The symbol 'p' is unknown.
ERROR 653-782: Subscript 2 must be a string, found a number.
133 ! >= DMD[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_
537
ERROR 537-782: The symbol 'c' is unknown.
_
653
ERROR 653-782: Subscript 3 must be a string, found a number.
133 ! >= DMD[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_
537
ERROR 537-782: The symbol 'k' is unknown.
_
653
ERROR 653-782: Subscript 4 must be a string, found a number.
133 ! >= DMD[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_
653
ERROR 653-782: Subscript 5 must be a string, found a number.
134 <= DMD[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_ _
537 653
ERROR 537-782: The symbol 'm' is unknown.
ERROR 653-782: Subscript 1 must be a string, found a number.
134 ! <= DMD[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_ _
537 653
ERROR 537-782: The symbol 'p' is unknown.
ERROR 653-782: Subscript 2 must be a string, found a number.
134 ! <= DMD[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_
537
ERROR 537-782: The symbol 'c' is unknown.
_
653
ERROR 653-782: Subscript 3 must be a string, found a number.
134 ! <= DMD[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_
537
ERROR 537-782: The symbol 'k' is unknown.
6 The SAS System 14:01 Tuesday, December 23, 2014
_
653
ERROR 653-782: Subscript 4 must be a string, found a number.
134 ! <= DMD[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_
653
ERROR 653-782: Subscript 5 must be a string, found a number.
135
136 var y {<r,m,p,c,k> in RMCCMwithPRICE}
_______________
537 526
ERROR 537-782: The symbol 'RMCCMwithPRICE' is unknown.
ERROR 526-782: An indexing set subexpression may not be a number.
137 >= PRICE[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_ _
537 653
ERROR 537-782: The symbol 'm' is unknown.
ERROR 653-782: Subscript 1 must be a string, found a number.
137 ! >= PRICE[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_ _
537 653
ERROR 537-782: The symbol 'p' is unknown.
ERROR 653-782: Subscript 2 must be a string, found a number.
137 ! >= PRICE[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_
537
ERROR 537-782: The symbol 'c' is unknown.
_
653
ERROR 653-782: Subscript 3 must be a string, found a number.
137 ! >= PRICE[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_
537
ERROR 537-782: The symbol 'k' is unknown.
_
653
ERROR 653-782: Subscript 4 must be a string, found a number.
137 ! >= PRICE[r,m,p,c,k] * (1 - dmdAndPriceDelta)
_
653
ERROR 653-782: Subscript 5 must be a string, found a number.
138 <= PRICE[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_ _
537 653
ERROR 537-782: The symbol 'm' is unknown.
ERROR 653-782: Subscript 1 must be a string, found a number.
7 The SAS System 14:01 Tuesday, December 23, 2014
138 ! <= PRICE[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_ _
537 653
ERROR 537-782: The symbol 'p' is unknown.
ERROR 653-782: Subscript 2 must be a string, found a number.
138 ! <= PRICE[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_
537
ERROR 537-782: The symbol 'c' is unknown.
_
653
ERROR 653-782: Subscript 3 must be a string, found a number.
138 ! <= PRICE[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_
537
ERROR 537-782: The symbol 'k' is unknown.
_
653
ERROR 653-782: Subscript 4 must be a string, found a number.
138 ! <= PRICE[r,m,p,c,k] * (1 + dmdAndPriceDelta);
_
653
ERROR 653-782: Subscript 5 must be a string, found a number.
139
140 impvar z {r in RGN_I}
141 = sum{<(r),m,p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k] * y[r,m,p,c,k];
_______________ _
537 526 650
637
828
ERROR 537-782: The symbol 'RMCCMwithPRICE' is unknown.
ERROR 526-782: An indexing set subexpression may not be a number.
ERROR 650-782: The subscript count does not match array 'x', 5 NE 1.
ERROR 637-782: The right operand of 'SLICE' must be a set, found a number.
ERROR 828-782: An index tuple must specify at least one dummy parameter.
141 ! = sum{<(r),m,p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k] * y[r,m,p,c,k];
_ _ _
537 537 650
ERROR 537-782: The symbol 'm' is unknown.
ERROR 650-782: The subscript count does not match array 'y', 5 NE 1.
141 ! = sum{<(r),m,p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k] * y[r,m,p,c,k];
_ _
537 537
ERROR 537-782: The symbol 'p' is unknown.
8 The SAS System 14:01 Tuesday, December 23, 2014
141 ! = sum{<(r),m,p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k] * y[r,m,p,c,k];
_ _
537 537
ERROR 537-782: The symbol 'c' is unknown.
141 ! = sum{<(r),m,p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k] * y[r,m,p,c,k];
_ _
537 537
ERROR 537-782: The symbol 'k' is unknown.
142
143 impvar mn {m in MTH_I}
144 = sum{<r,(m),p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k];
_______________ _
537 526 650
637
ERROR 537-782: The symbol 'RMCCMwithPRICE' is unknown.
ERROR 526-782: An indexing set subexpression may not be a number.
ERROR 650-782: The subscript count does not match array 'x', 5 NE 1.
ERROR 637-782: The right operand of 'SLICE' must be a set, found a number.
144 ! = sum{<r,(m),p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k];
_
537
ERROR 537-782: The symbol 'p' is unknown.
144 ! = sum{<r,(m),p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k];
_
537
ERROR 537-782: The symbol 'c' is unknown.
144 ! = sum{<r,(m),p,c,k> in RMCCMwithPRICE} x[r,m,p,c,k];
_
537
ERROR 537-782: The symbol 'k' is unknown.
145
146 /* define constraints */
147 num tgtDelta init .05;
148
149 con mn_CON {m in MTH_I}:
150 MTH_VOL_TGT
151 <= mn
152 <= MTH_VOL_TGT
153
154 /* objective function */
155
156 var Error {RGN_I};
157 con Error_CON {r in RGN_I}:
158 Error
159
160 min MSE = sum{r in RGN_I}
161 Error
162
9 The SAS System 14:01 Tuesday, December 23, 2014
163 /*solve;*/
164
165 solve with nlp/iis=on;
NOTE: The OPTMODEL presolver is disabled when the IIS= option is enabled.
NOTE: Problem generation will use 4 threads.
NOTE: Previous errors might cause the problem to be resolved incorrectly.
ERROR: The implicit variable 'z' has an incomplete declaration.
ERROR: The implicit variable 'mn' has an incomplete declaration.
NOTE: The problem has 7 variables (7 free, 0 fixed).
NOTE: The problem uses 19 implicit variables.
NOTE: The problem has 0 linear constraints (0 LE, 0 EQ, 0 GE, 0 range).
NOTE: The problem has 19 nonlinear constraints (0 LE, 7 EQ, 0 GE, 12 range).
NOTE: Unable to create problem instance due to previous errors.
NOTE: Previous errors might cause the problem to be resolved incorrectly.
14:01 Tuesday, December 23, 2014 2
The OPTMODEL Procedure
Solution Summary
Solver NLP
Objective Function Unknown
Solution Status Semantic Error
Iterations 0
Presolve Time 0.00
Solution Time 0.00
166 /*expand/iis;*/
167
168 print x y mn z;
ERROR: The symbol 'x' has an incomplete declaration at line 168 column 7.
169
170 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: The PROCEDURE OPTMODEL printed pages 1-2.
NOTE: PROCEDURE OPTMODEL used (Total process time):
real time 16.42 seconds
cpu time 16.34 seconds
171
172
173 proc printto;
174 run;
As I tried to reduce the dimensions by dropping MKT MKTTYPE, resulting in just 2000+ rows, it's executing but reaching iteration limit.
How do I increase the limit?
I am currently on a PC Win7, 64bit, 8GB RAM, 400 GB hard-disk.
The OPTMODEL Procedure
Solution Summary
Solver NLP
Algorithm Interior Point
Objective Function MSE
Solution Status Iteration Limit Reached
Objective Value 1.3104746E25
Optimality Error 1.8482552E17
Infeasibility 6922978.3345
Iterations 5000
Presolve Time 0.00
Solution Time 663.41
Thank you for your help.
BRgds
Praveen
To save memory, the first thing I would try is to read table1 directly. That way you will read in ~400K items instead of the ~100M you get by the set cross product. It also makes your code simpler as you can dispose of the withPRICE variant:
data TABLE1;
input RGN $7. MTH $4. CNTRY $3. CLS $3. MKT $7. MKTTYPE $7. DMD VALUE PRICE;
datalines;
GGGGGG APR JJ RR AAACCC OTHOTH 4.072487387 5073.712404 1245.85098
GGGGGG APR LL RR AAACCC OTHOTH 0.759277544 527.855155 695.2071207
ROMANN APR SS RR ARNSSS TTTTTT 13.6940138 20710.54229 1512.379248
GGGGGG APR BB RR BAADDD OTHOTH 3.858234121 1971.730438 511.044788
BUHBUH APR QQ RR CAAHHH TTTTTT 670.3483179 607903.4554 906.8471407
GGGGGG APR KK RR DEEKKK TTTTTT 22.93870547 11473.72103 500.1904334
LIONNN APR UG RR EEEFFF OTHOTH 1.314052632 1776.681947 1352.063002
ROMANN APR IT RR DDMFCC OTHOTH 18.18630498 34948.34188 1921.684582
NOODLE APR AA RR MMMMUC TTTTTT 38.07123286 107524.5245 2824.298464
ROMANN APR IT WW BBRNNN OTHOTH 6.842336232 3142.693913 459.301298
NOODLE FEB PP RR GGGIII OTHOTH 1.240267277 3507.439478 2827.970665
BUHBUH JUL QQ RR AEEPPP UNDISC 0.610535011 282.9336958 463.419281
;
proc optmodel;
set<str,str,str,str,str,str> RMCCMM;
num dmd{RMCCMM}, price{RMCCMM};
read data table1 into RMCCMM=[RGN MTH CNTRY CLS MKT MKTTYPE] dmd price;
print dmd price;
quit;
You can change the iteration limit, as well as experiment with other options using these solver options.
Hi,
I tried the above.
For 12 rows it solved in less than a second and just 185 interations.
However when I use 400K rows it returns out of memory.
Below is the log. Anything I can do to improve?
Would be glad if someone could help.
Rgds
Praveen
NOTE: PROCEDURE PRINTTO used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
NOTE: Table WORK.TABLE1 created, with 396971 rows and 8 columns.
29 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 1.61 seconds
cpu time 2.16 seconds
30
31 /* Dummy data set only to provide constant targets */
32
33 proc sql;
34 create table RGN_TGT as
35 select unique RGN,
36 sum(VALUE)*1.1 as RGN_VALUE_TGT format dollar16.0,
37 1.05*(sum(VALUE)/sum(DMD)) as RGN_PRICE_TGT format dollar16.0,
38 sum(DMD) as RGN_DMD
39 from TABLE1
40 group by RGN;
NOTE: Table WORK.RGN_TGT created, with 7 rows and 4 columns.
41 quit;
NOTE: PROCEDURE SQL used (Total process time):
2 The SAS System 14:44 Wednesday, December 24, 2014
real time 0.07 seconds
cpu time 0.06 seconds
42
43 proc sql;
44 create table MTH_TGT as
45 select unique MTH,
46 sum(VALUE) as MTH_VALUE_TGT format dollar16.0,
47 (sum(VALUE)/sum(DMD)) as MTH_PRICE_TGT format dollar16.0,
48 sum(DMD) as MTH_DMD
49 from TABLE1
50 group by MTH;
NOTE: Table WORK.MTH_TGT created, with 12 rows and 4 columns.
51 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.15 seconds
cpu time 0.28 seconds
52
53 data MTH_TGT;
54 set MTH_TGT;
55 if MTH in ('APR', 'MAY', 'JUN', 'JUL', 'AUG', 'SEP') then MTH_VOL_TGT = MTH_DMD*1.05;
56 else MTH_VOL_TGT = MTH_DMD*0.95;
57 run;
NOTE: There were 12 observations read from the data set WORK.MTH_TGT.
NOTE: The data set WORK.MTH_TGT has 12 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
57 !
58 /*****************************************************************************************/
59
60 proc optmodel;
61
62 set<str,str,str,str,str> RMCCM;
63 num dmd{RMCCM}, price{RMCCM};
64 read data table1 into RMCCM=[RGN MTH CNTRY CLS MKT] dmd price;
NOTE: There were 396971 observations read from the data set WORK.TABLE1.
65
66 /* print dmd price; */
67 set <str> RGN_I;
68 num RGN_VALUE_TGT {RGN_I}init .;
69 read data RGN_TGT nomiss into RGN_I =[RGN] RGN_VALUE_TGT;
NOTE: There were 7 observations read from the data set WORK.RGN_TGT.
70 num RGN_PRICE_TGT {RGN_I}init .;
71 read data RGN_TGT nomiss into RGN_I =[RGN] RGN_PRICE_TGT;
NOTE: There were 7 observations read from the data set WORK.RGN_TGT.
72
73 set <str> MTH_I;
74 num MTH_VOL_TGT {MTH_I}init .;
75 read data MTH_TGT nomiss into MTH_I =[MTH] MTH_VOL_TGT;
NOTE: There were 12 observations read from the data set WORK.MTH_TGT.
3 The SAS System 14:44 Wednesday, December 24, 2014
76
77 /* test that the data is read*/
78
79 /*print RGN_VALUE_TGT RGN_PRICE_TGT;
80 print MTH_VOL_TGT;
81 print DMD PRICE; */
82
83 /* declare variables */
84
85 num dmdAndPriceDelta init 0.3;
86
87 var x {<r,m,p,c,k> in RMCCM}
88 >= DMD[r,m,p,c,k] * (1 - dmdAndPriceDelta)
89 <= DMD[r,m,p,c,k] * (1 + dmdAndPriceDelta);
90
91 var y {<r,m,p,c,k> in RMCCM}
92 >= PRICE[r,m,p,c,k] * (1 - dmdAndPriceDelta)
93 <= PRICE[r,m,p,c,k] * (1 + dmdAndPriceDelta);
94
95 impvar z {r in RGN_I}
96 = sum{<(r),m,p,c,k> in RMCCM} x[r,m,p,c,k] * y[r,m,p,c,k];
97
98 impvar mn {m in MTH_I}
99 = sum{<r,(m),p,c,k> in RMCCM} x[r,m,p,c,k];
100
101 /* define constraints */
102 num tgtDelta init .05;
103
104 con mn_CON {m in MTH_I}:
105 MTH_VOL_TGT
106 <= mn
107 <= MTH_VOL_TGT
108
109 /* objective function */
110
111 var Error {RGN_I};
112 con Error_CON {r in RGN_I}:
113 Error
114
115 min MSE = sum{r in RGN_I}
116 Error
117
118 solve;
NOTE: Problem generation will use 4 threads.
NOTE: The problem has 793949 variables (7 free, 0 fixed).
NOTE: The problem uses 19 implicit variables.
NOTE: The problem has 12 linear constraints (0 LE, 0 EQ, 0 GE, 12 range).
NOTE: The problem has 396971 linear constraint coefficients.
NOTE: The problem has 7 nonlinear constraints (0 LE, 7 EQ, 0 GE, 0 range).
NOTE: The OPTMODEL presolver removed 0 variables, 0 linear constraints, and 0 nonlinear constraints.
14:44 Wednesday, December 24, 2014 1
The OPTMODEL Procedure
Problem Summary
Objective Sense Minimization
Objective Function MSE
Objective Type Quadratic
Number of Variables 793949
Bounded Above 0
Bounded Below 0
Bounded Below and Above 793942
Free 7
Fixed 0
Number of Constraints 19
Linear LE (<=) 0
Linear EQ (=) 0
Linear GE (>=) 0
Linear Range 12
Nonlinear LE (<=) 0
Nonlinear EQ (=) 7
Nonlinear GE (>=) 0
Nonlinear Range 0
Performance Information
Execution Mode Single-Machine
Number of Threads 4
NOTE: Using analytic derivatives for objective.
NOTE: Using analytic derivatives for nonlinear constraints.
NOTE: Using 4 threads for nonlinear evaluation.
NOTE: The NLP solver is called.
NOTE: The Interior Point algorithm is used.
ERROR: Out of memory in solver.
14:44 Wednesday, December 24, 2014 2
The OPTMODEL Procedure
Solution Summary
Solver NLP
Algorithm Interior Point
Objective Function MSE
Solution Status Out of Memory
Objective Value 0
Optimality Error 1822716549.5
Infeasibility 1822716549.5
Iterations 0
Presolve Time 0.15
Solution Time 0.84
119
120 /*solve with nlp/iis=on;*/
4 The SAS System 14:44 Wednesday, December 24, 2014
121 /*expand/iis;*/
122
123 /*print x y mn z;*/
124
125 quit;
NOTE: The PROCEDURE OPTMODEL printed pages 1-2.
NOTE: PROCEDURE OPTMODEL used (Total process time):
real time 7.65 seconds
cpu time 7.66 seconds
126
127 proc printto;
128 run;
Further to the response, as I removed one of the dimensions, thus reducing the rows to 2,200 it solved in 12 mins and stopped at iteration limit which already is a very decent solution. How do I solve 400K rows?
Thanks in anticipation.
Praveen
I am glad that you are happy with the quality of the solution for 2200 rows. Going from 2200 to 400,000 rows can be difficult. You can try some of the solver options, including some that invoke completely different search strategies, and you can also set stopping conditions with which you would be satisfied. You can also try to get more RAM, but it is often difficult to predict the amount needed.
Beyond option tuning, the finest way to improve performance is to carefully tune the model, ideally with some understanding of the underlying business problem. SAS has a team that specializes in that service. Please contact me via pm if you would like to be put in touch with the right people, or you could reach out to your usual SAS point of contact.
Cheers, Leo.
Hi,
Thanks for your response.
My happiness was short lived.
As I started analyzing the results I realized that x & y had -ve values. NOT when I used the smaller data set but in the full scale 400 K input. (Which cannot be posted here.)
The original input of DMD & PRICE are always > 0, so the original variable declaration itself should have prevented -ves.
-------- Previous declaration -----------------------------------------------------
num dmdAndPriceDelta init 0.3;
var x {<r,m,p,c,k> in RMCCM}
>= DMD[r,m,p,c,k] * (1 - dmdAndPriceDelta)
<= DMD[r,m,p,c,k] * (1 + dmdAndPriceDelta);
var y {<r,m,p,c,k> in RMCCM}
>= PRICE[r,m,p,c,k] * (1 - dmdAndPriceDelta)
<= PRICE[r,m,p,c,k] * (1 + dmdAndPriceDelta);
--------------------------------------------------------------------------
I even tried forcing a constraint for x & y
-----------------------------------------------------------------------------
proc optmodel;
set<str,str,str,str> RMCC;
num dmd{RMCC}, price{RMCC};
read data table1 into RMCC=[RGN MTH CNTRY CLS] dmd price;
print dmd price;
set <str> RGN_I;
num RGN_VALUE_TGT {RGN_I}init .;
read data RGN_TGT nomiss into RGN_I =[RGN] RGN_VALUE_TGT;
num RGN_PRICE_TGT {RGN_I}init .;
read data RGN_TGT nomiss into RGN_I =[RGN] RGN_PRICE_TGT;
set <str> MTH_I;
num MTH_VOL_TGT {MTH_I}init .;
read data MTH_TGT nomiss into MTH_I =[MTH] MTH_VOL_TGT;
/* declare variables */
num dmdAndPriceDelta init 0.3;
var x {<r,m,p,c> in RMCC} >= 0;
con x_CON {<r,m,p,c> in RMCC}:
max(DMD[r,m,p,c] * (1 - dmdAndPriceDelta),0)
<= x[r,m,p,c]
<= DMD[r,m,p,c] * (1 + dmdAndPriceDelta);
var y {<r,m,p,c> in RMCC} >= 0;
con y_CON {<r,m,p,c> in RMCC}:
max(PRICE[r,m,p,c] * (1 - dmdAndPriceDelta),0)
<= y[r,m,p,c]
<= PRICE[r,m,p,c] * (1 + dmdAndPriceDelta);
impvar z {r in RGN_I}
= sum{<(r),m,p,c> in RMCC} x[r,m,p,c] * y[r,m,p,c];
impvar mn {m in MTH_I}
= sum{<r,(m),p,c> in RMCC} x[r,m,p,c];
/* define constraints */
num tgtDelta init .05;
con mn_CON {m in MTH_I}:
MTH_VOL_TGT
<= mn
<= MTH_VOL_TGT
/* objective function */
var Error {RGN_I};
con Error_CON {r in RGN_I}:
Error
min MSE = sum{r in RGN_I}
Error
solve/*with nlp/maxiter=10000*/;
/*solve with nlp/iis=on;*/
/*expand/iis;*/
print mn z;
create data resdata from [r m p c]= {<r,m,p,c> in RMCC}x y;
quit;
sample result .....................
x y
2232.35341 | 1599681.4415 |
150190.96533 | 172977.48667 |
128.60057752 | 45299.764902 |
1440.1465745 | 184144.91105 |
13805.955422 | 198567.11325 |
202.04127674 | 21097.559601 |
-92411.95867 | -149976.2581 |
1438.7534535 | 1328218.4977 |
7092.6645369 | 455946.06376 |
1165.2612806 | 98285.518642 |
4866.909865 | 87093.409671 |
676.20959556 | 181475.3109 |
5845.9530754 | 73603.47642 |
66.724159884 | 32526.662853 |
503.07296274 | 196763.72642 |
187.35800758 | 22049.033852 |
281.00044872 | 49734.016545 |
1478.6442372 | -116693.3107 |
16059.923883 | 658422.89405 |
171.18035859 | 48034.748391 |
811.16022955 | 186996.83904 |
166.46531054 | 44548.480668 |
654.15825652 | 231850.25408 |
19802.054651 | 245196.44001 |
9975.8473189 | 128623.8133 |
1076.450784 | 97583.072647 |
3365.0226586 | 105558.49572 |
1640.9166064 | 167176.97321 |
5901.7805195 | 148139.6536 |
859.07957479 | 1408674.6063 |
3350.4378749 | -561854.0849 |
1522.8057028 | 152951.54845 |
16396.024953 | 132977.4923 |
1999.6101672 | 87987.233455 |
-14847.33627 | 627388.89381 |
Any idea how this can happen?
Thanks for your help.
Rgds
Praveen
You might need to check the solver termination status. It should be some variation on infeasible, or some limit exceeded. You can try focusing on feasibility and giving the solver more time.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.