SashiK
Calcite | Level 5

I have a working optimization model that I want to run on a very large data set (more than 10 million observations), but I am getting an out-of-memory error. I was hoping to export the model in MPS format so that I can try running it in other software, but I can't find a way to create a .mps file from SAS.

 

Can you help me out?

 

Thanks,

Sashi

RobPratt
SAS Super FREQ

Are you using PROC OPTMODEL?  Can you please show the code and log for the error?

SashiK
Calcite | Level 5

Yes, I'm using PROC OPTMODEL. The code below is for a sample of 400K customers. The full data set has 30MM customers.

 

Here's the code; I have attached the log file:

 

proc optmodel;
/* declare sets and data indexed by sets */

set <string> Email;
set <string> Brand;
set <number> Custs;
num Circulation{Email};
num Priority{Email};
num Percentage{Brand};
num ExpectedSpend{Custs,Email,Brand};

/* declare the variable */

var Pick{Custs,Email} binary init 0;

/* maximize objective function (Total Spend) */

maximize TotalSpend = sum{i in Custs, j in Email, k in Brand} ExpectedSpend[i,j,k] * Pick[i,j] * Priority[j];

/* subject to constraints */

con CircLimit {j in Email} : sum {i in Custs} Pick[i,j] <= Circulation[j];
con indvMailconstraint {i in Custs}: sum {j in Email} Pick[i,j] = 1;
con brandSalesLimit {k in Brand}: sum {i in Custs, j in Email} ExpectedSpend[i,j,k] * Pick[i,j] * Priority[j] >= Percentage[k] * TotalSpend / 100;

/* abstract algebraic model that captures the structure of the */
/* optimization problem has been defined without referring */
/* to a single data constant */

/* populate model by reading in the specific data instance */

read data Email into Email=[Email] Circulation Priority;
read data Brand into Brand=[Brand] Percentage;
read data Customer into Custs=[Customer];
read data ExpSpend into [Customer Email Brand] ExpectedSpend[Customer, Email, Brand]=ExpSpend;

/* solve LP using primal simplex solver */
solve with lp / solver = primal_spx;

/* display solution */
print TotalSpend;
create data results.solution from [customer email]={Custs, Email} Assignment=Pick;
quit;

RobPratt
SAS Super FREQ

I see that you are using SAS/OR 12.1, which is four years old.  We have had several releases since then, with performance improvements in each release.  The latest release is SAS/OR 14.1.

 

You declare binary variables but then use the LP solver instead of the MILP solver.  Was this intended?  Also, is there a particular reason that you specify the primal simplex algorithm rather than the default dual simplex algorithm?  In fact, the interior point algorithm might be best for this large-scale instance.

 

Can you please set the SAS FULLSTIMER option, rerun PROC OPTMODEL, and show the new log?

 

option fullstimer;

SashiK
Calcite | Level 5

Thanks Rob... we forced it to use primal simplex for no particular reason. We were reusing older code and didn't think it made a difference. Now that you called it out, I've taken it out and just replaced it with a solve statement with the default solver. See attached log with the fullstimer option enabled.

 

What options do I give if I need to make it use the interior point algorithm as you suggested? I tried the following statement but it said that the IPNLP solver does not allow integer variables.

 

Solve with IPNLP / Solver = IPKRYLOV;

 

Can you tell me what resources my system needs in order to solve a problem this big? This run is on a sample (400K customers), so what would the full data set of 30MM customers require?

 

Can SAS generate an MPS file for my model that I can load into other solvers in the cloud to take advantage of more system resources (Amazon AWS, for example)? Does SAS have an equivalent?

 

Let me know if you have any other suggestions that I can try out.

 

Thanks so much for your help.

 

Sashi

RobPratt
SAS Super FREQ

LP algorithm choice is documented here:

http://support.sas.com/documentation/cdl/en/ormpug/66107/HTML/default/viewer.htm#ormpug_lpsolver_syn...

 

For example, to relax the integer variables and use the interior point algorithm:

solve with LP relaxint / algorithm=interiorpoint;

 

You might also try the default LP algorithm (dual simplex):

solve with LP relaxint;

 

 

Or network simplex:

solve with LP relaxint / algorithm=network;
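
To show where these SOLVE statements sit in a complete program, here is a minimal sketch on a made-up instance with three customers and two emails. The data values and the names Spend and OneEmail are hypothetical stand-ins, not the model from the question, and the ALGORITHM= option assumes a SAS/OR release that accepts it.

```sas
proc optmodel;
   /* Hypothetical toy data: 3 customers, 2 emails */
   set Custs = 1..3;
   set Email = 1..2;
   num Circulation{Email} = [2 2];
   num Spend{Custs, Email} = [10 8
                               6 9
                               7 7];

   var Pick{Custs, Email} binary;

   maximize TotalSpend = sum{i in Custs, j in Email} Spend[i,j] * Pick[i,j];

   con CircLimit{j in Email}: sum{i in Custs} Pick[i,j] <= Circulation[j];
   con OneEmail{i in Custs}: sum{j in Email} Pick[i,j] = 1;

   /* LP relaxation solved with the interior point algorithm */
   solve with LP relaxint / algorithm=interiorpoint;
   print TotalSpend;

   /* Same relaxation with the default LP algorithm */
   solve with LP relaxint;
   print TotalSpend;

   /* Integer solve: Pick stays binary */
   solve with MILP;
   print TotalSpend Pick;
quit;
```

Comparing the relaxed and integer objective values on a small sample like this is one way to judge whether the relaxation is good enough before committing to the full data set.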

 

 

For information about memory requirements, see:

http://support.sas.com/documentation/cdl/en/ormpug/68156/HTML/default/viewer.htm#ormpug_concepts_sec...

 

You can save an MPS data set as described here:

http://support.sas.com/documentation/cdl/en/ormpug/68156/HTML/default/viewer.htm#ormpug_optmodel_syn...
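
As a rough sketch of that approach, the SAVE MPS statement in PROC OPTMODEL writes the current model to an MPS-format SAS data set. The toy data and the data set name work.MailMPS below are assumptions made for illustration.

```sas
proc optmodel;
   /* Hypothetical toy data: 3 customers, 2 emails */
   set Custs = 1..3;
   set Email = 1..2;
   num Circulation{Email} = [2 2];
   num Spend{Custs, Email} = [10 8
                               6 9
                               7 7];

   var Pick{Custs, Email} binary;

   maximize TotalSpend = sum{i in Custs, j in Email} Spend[i,j] * Pick[i,j];

   con CircLimit{j in Email}: sum{i in Custs} Pick[i,j] <= Circulation[j];
   con OneEmail{i in Custs}: sum{j in Email} Pick[i,j] = 1;

   /* Save the model as an MPS-format SAS data set instead of solving it */
   save mps work.MailMPS;
quit;

/* Inspect the first rows of the saved model */
proc print data=work.MailMPS(obs=20);
run;
```

Note that the result is a SAS data set, not a .mps text file, so loading it into other software would still need a separate export step.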

 

SAS/OR does support parallel processing on a single machine or on a grid:

http://support.sas.com/documentation/cdl/en/ormpug/68156/HTML/default/viewer.htm#ormpug_lpsolver_det...

http://support.sas.com/documentation/cdl/en/ormpug/68156/HTML/default/viewer.htm#ormpug_milpsolver_d...

 

You should also consider trying the decomposition algorithm:

http://support.sas.com/documentation/cdl/en/ormpug/68156/HTML/default/viewer.htm#ormpug_decomp_toc.h...
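
For a model shaped like the one in this thread, one possible way to set up the decomposition algorithm is to treat each customer's constraint as its own block and leave the circulation limits as linking constraints. The sketch below assumes a SAS/OR release that supports the .block constraint suffix and the DECOMP= option; the toy data and the names Spend and OneEmail are hypothetical.

```sas
proc optmodel;
   /* Hypothetical toy data: 3 customers, 2 emails */
   set Custs = 1..3;
   set Email = 1..2;
   num Circulation{Email} = [2 2];
   num Spend{Custs, Email} = [10 8
                               6 9
                               7 7];

   var Pick{Custs, Email} binary;

   maximize TotalSpend = sum{i in Custs, j in Email} Spend[i,j] * Pick[i,j];

   /* Linking constraints: no block assigned, so they stay in the master problem */
   con CircLimit{j in Email}: sum{i in Custs} Pick[i,j] <= Circulation[j];

   /* One constraint per customer: each is assigned to its own block */
   con OneEmail{i in Custs}: sum{j in Email} Pick[i,j] = 1;
   for {i in Custs} OneEmail[i].block = i;

   /* Ask the MILP solver to decompose using the user-defined blocks */
   solve with MILP / decomp=(method=user);
   print TotalSpend Pick;
quit;
```

In the full model the brand sales constraints would also be linking constraints. With hundreds of thousands of customers, grouping customers into a smaller number of larger blocks may be worth trying.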

 

Note that some of this functionality has been added since SAS/OR 12.1, so please upgrade to take advantage of the added features and numerous performance improvements.

 

