opti_miser
Calcite | Level 5

I am optimizing using Proc IML, but I think my optimization would be faster in Proc OPTMODEL.  It is a concern because the data set I am using is quite large, much larger than the files I have attached.

I have attached data sets and code for the optimization I run in Proc IML.  I would like to be able to do the same optimization in Proc OPTMODEL, but I can’t find much help from examples or from the documentation.  If you could point me in the right direction, I would appreciate it.

My data set is made up of stocks and dates.  In Proc OPTMODEL, I would like to sum over all stocks on each date, and then sum over all dates.  The number of stocks can change from date to date.

I have attached a paper that describes the methodology.  I am estimating theta in equation 6 of the paper.

Any ideas on how to get started would be great!

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

Is the data so large that the CHAR, MKT, RET, and STOCKS variables don't fit into memory?  If they do fit, then read those variables once, include them as global variables to the opt() module, and index into them inside the DO loop over dates.

RobPratt
SAS Super FREQ

Please see the attached file for a way to solve this problem using PROC OPTMODEL.

Rob Pratt

opti_miser
Calcite | Level 5

Thank you, Rob!

Here is the code if you only have SAS 9.2. 

Rick_SAS
SAS Super FREQ

One reason that your IML code is slow is that you are re-reading the wgtfracsas and equalsas data sets at every iteration of the optimization, even though those values are unchanged during the optimization.

opti_miser
Calcite | Level 5

I noticed that when I wrote the code, but I couldn't figure out a way to get around it.  Any suggestions?  I feel more comfortable working in IML than I do in OPTMODEL, but speed is of the essence.

Rick_SAS
SAS Super FREQ

Is the data so large that the CHAR, MKT, RET, and STOCKS variables don't fit into memory?  If they do fit, then read those variables once, include them as global variables to the opt() module, and index into them inside the DO loop over dates.
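
As a rough illustration of this advice (not the poster's attached code), the sketch below reads the data once, hands the vectors to the module through a GLOBAL clause, and uses LOC to pick out each date's rows inside the DO loop. The data set work.panel, the variable names chr and mktwt, the toy values, and the objective itself are placeholders invented for the example. The real objective is equation 6 of the attached paper, which is not reproduced in this thread.

```sas
/* Toy panel: hypothetical names and values, for illustration only */
data work.panel;
   input date stock $ ret chr mktwt;
   datalines;
1 A  0.02  0.5 0.6
1 B -0.01 -0.5 0.4
2 A  0.01  0.3 0.5
2 B  0.03 -0.1 0.3
2 C -0.02 -0.2 0.2
;

proc iml;
/* Read the data ONCE, outside the objective module */
use work.panel;
read all var {"date" "ret" "chr" "mktwt"} into X;
close work.panel;

dt = X[,1];            /* date of each observation        */
r  = X[,2];            /* return                          */
c  = X[,3];            /* characteristic                  */
m  = X[,4];            /* benchmark (market) weight       */
uDates = unique(dt);   /* row vector of distinct dates    */

/* Placeholder objective, NOT equation 6 of the paper.
   The GLOBAL clause gives the module access to the vectors
   read above, so nothing is re-read during the optimization. */
start opt(theta) global(dt, r, c, m, uDates);
   total = 0;
   do t = 1 to ncol(uDates);
      idx = loc(dt = uDates[t]);           /* rows for date t          */
      n   = ncol(idx);                     /* stocks can vary by date  */
      w   = m[idx] + theta[1] * c[idx] / n;
      total = total + sum(w # r[idx]);     /* sum over stocks          */
   end;
   return (total);                         /* summed over all dates    */
finish;

f0 = opt({0});   /* evaluate once at a starting value */
print f0;
quit;
```

The module can then be passed by name to whichever NLP routine the program already uses. Only the vectors need to fit in memory; the data set itself is touched a single time.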

opti_miser
Calcite | Level 5

Thank you for the suggestions.  The slow (old) and fast (new) versions of the code are attached.

opti_miser
Calcite | Level 5

I ran both the IML code and the OPTMODEL code on the larger data set. 

In one run, the IML code returned an objective function equal to -0.218421215, compared to the OPTMODEL objective function of -0.218421214.  The IML procedure took 4.13 seconds of real time and 3.99 seconds of CPU time, compared to OPTMODEL, which took 48.82 seconds of real time and 48.43 seconds of CPU time.

In another run, the IML code returned an objective function equal to -0.218605342, compared to the OPTMODEL objective function of -0.218605307.  The IML procedure took 3.77 seconds of real time and 3.77 seconds of CPU time, compared to OPTMODEL, which took 51.44 seconds of real time and 51.43 seconds of CPU time.

So, the gains from using OPTMODEL appear small in terms of getting a better objective function, and there is a big loss in time: OPTMODEL took roughly 12 to 14 times as long as IML in these two runs.  I will have to find out why I was told that there were such large gains to using OPTMODEL.

The output files and log files are attached in pdf format.

RobPratt
SAS Super FREQ

You should see some performance gains from using IMPVAR (available starting in 9.22) and the new NLP solvers (available in 9.3).

RobPratt
SAS Super FREQ

Even in 9.2, you might try moving the declarations of DATES and STOCKS to after the READ DATA statement.
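
For readers without the attachments, here is a minimal sketch of what such a layout might look like; it is not the code posted in this thread. It reads the (date, stock) pairs first and only then declares DATES and STOCKS as sets derived from them. The data set work.panel, the names DATE_STOCK, chr, mktwt, and w, the bounds on theta, and the objective are all assumptions made for the example. IMPVAR requires 9.22 or later; in 9.2 the expression would have to be written directly in the objective.

```sas
/* Toy panel: hypothetical names and values, for illustration only */
data work.panel;
   input date stock $ ret chr mktwt;
   datalines;
1 A  0.02  0.5 0.6
1 B -0.01 -0.5 0.4
2 A  0.01  0.3 0.5
2 B  0.03 -0.1 0.3
2 C -0.02 -0.2 0.2
;

proc optmodel;
   /* (date, stock) pairs that actually occur in the data */
   set <num,str> DATE_STOCK;
   num ret   {DATE_STOCK};
   num chr   {DATE_STOCK};
   num mktwt {DATE_STOCK};

   read data work.panel into DATE_STOCK=[date stock] ret chr mktwt;

   /* Derived sets, declared AFTER the READ DATA statement */
   set DATES = setof {<d,s> in DATE_STOCK} d;
   set <str> STOCKS {d in DATES} = slice(<d,*>, DATE_STOCK);

   /* Bounds are an assumption that keeps this toy problem bounded */
   var theta >= -1 <= 1 init 0;

   /* Placeholder weights, NOT equation 6 of the paper */
   impvar w {<d,s> in DATE_STOCK} =
      mktwt[d,s] + theta * chr[d,s] / card(STOCKS[d]);

   /* Sum over the stocks present on each date, then over dates */
   max obj = sum {d in DATES} sum {s in STOCKS[d]} w[d,s] * ret[d,s];

   solve;
   print theta;
quit;
```

Because STOCKS is indexed by date, the inner sum automatically covers a different number of stocks on each date.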

opti_miser
Calcite | Level 5

Thank you for your suggestions. I didn't get any gains from moving the statements around.  I guess I will have to wait for the new version of SAS.  I have attached the code and here is a link to the large data set, if you are still curious about the gains from moving to 9.3.

https://bearspace.baylor.edu/xythoswfs/webui/_xy-7387784_1-t_uQw6Wj58

RobPratt
SAS Super FREQ

Thanks for providing the larger data set.  I found that your OPTMODEL code runs roughly 8 times as fast in 9.3 as compared to 9.2.  Although the NLP solvers have improved significantly in 9.3, this problem has only 3 variables, and you will typically not see much gain from the solvers when there is a small number of variables and constraints.  Instead, the 8-fold speedup here is due to OPTMODEL problem generation improvements.

opti_miser
Calcite | Level 5

That is great to hear!  I hope I can get 9.3 soon then. I'll have to see what the holdup on getting it for us is.

Rick_SAS
SAS Super FREQ

Just for reference, how much improvement was there from the "slow" version to the "fast" version of the IML program? The slow version took ____ seconds; the fast one takes 4 seconds.

opti_miser
Calcite | Level 5

The slow version took 20 seconds.  I have attached code, output, a log file, and the extra data set needed.  As above, the other large data set can be found here: https://bearspace.baylor.edu/xythoswfs/webui/_xy-7387784_1-t_uQw6Wj58 .

If I run a simulation, I could be doing this 20,000 or even 100,000 times.  So, this makes a big difference.
