I have a simulation code (ATTACHED) that is functional and works well, but is very slow. We are talking 4-5 days of continuous run time.

I think there is an input/output bottleneck in appending matrices vertically. Is there anything I could do to reduce run time or memory use, such as writing to a permanent library instead of WORK or avoiding temporary data sets, that might make this code run faster?

I am saving qualifying records as vectors and appending them to a matrix called RESULT. At the end of the grid loop, I am appending the matrix to a permanent SAS data set for further analysis.

I have done all I can with my limited knowledge of IML and I am desperate for help.

Many thanks,

Jamil


Rick_SAS
SAS Super FREQ

1) Move all allocations outside the DO loops. You only need to allocate X once at the top of the program, not inside four nested DO loops.

2) You don't need to allocate zm, T, or any other arrays that you are going to assign. If you execute "zm = equation", do not allocate zm first, since zm just gets overwritten.

3) You don't need x1 and x2. Just send X into the RAND function to fill up both columns.

4) Get rid of the lines

  COV_INV_Z=inv (COV_Z);

  T=  Z * COV_INV_Z *t(Z);

and use the SOLVE function instead as shown in

http://blogs.sas.com/content/iml/2011/08/10/do-you-really-need-to-compute-that-matrix-inverse/

and

http://blogs.sas.com/content/iml/2011/08/17/solving-linear-systems-which-technique-is-fastest/

5) The line that is REALLY killing your performance is

RESULT=RESULT // R;

Read Section 2.9 on p. 34-35 of my book Statistical Programming with SAS/IML Software. That chapter is available as a free download.

Start with (5); the performance will improve incredibly.
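The advice about allocating X once and filling both columns in a single call can be sketched as follows. This is only an illustration, not the attached program: the sample size, the seed, the replication count, and the use of the RANDGEN subroutine with a standard normal distribution are all assumptions.

```sas
proc iml;
call randseed(12345);              /* arbitrary seed, assumed for reproducibility */
n = 5;                             /* hypothetical sample size */
X = j(n, 2, .);                    /* allocate X once, outside every loop */

do rep = 1 to 3;                   /* stands in for the nested simulation loops */
   call randgen(X, "Normal");      /* refills both columns of X in one call */
   xbar = X[:, ];                  /* 1 x 2 row vector of column means */
end;

print xbar;
quit;
```

Because X is reused, nothing is reallocated inside the loop, and there is no need for separate x1 and x2 vectors.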

jhzeinab_yahoo_com
Calcite | Level 5

Thank you, Rick. I will work on these changes and see if performance improves.

jhzeinab_yahoo_com
Calcite | Level 5

Rick, by the way, I have your book, which is excellent. I, on the other hand, am a novice in IML.

How would the SOLVE function work in this case? I have an original X (n x p) matrix and a Z matrix (1 x p).

Is the following remotely correct as a replacement for the above? I need to calculate T (a scalar) from the quadratic form T = Z * COV_INV_Z * t(Z).

T= SOLVE (X,Z); ???

IanWakeling
Barite | Level 11

I haven't run your program, but I believe that you are dealing with a 2x2 matrix, so the time taken for the matrix inversion is not going to be significant no matter how you do it. Have you addressed Rick's comment in bold? I think it is in bold for a good reason. For every iteration you make, you are effectively declaring a new data structure Result and copying all of the old results to it as well as the most recent result. So as the iteration count rises, larger and larger amounts of data are shifted around, killing the performance. Perhaps try to structure the program so that you declare a matrix at the start to hold the results for every iteration of your innermost loop, and then add each result to the matrix with a statement like Result[m,]=R; After the innermost loop has finished, you could dump the contents of Result to a SAS data set.
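A minimal sketch of the two patterns, assuming a result row with 4 columns and a known iteration count (nIter and the placeholder values in R are hypothetical, not taken from the attached program):

```sas
proc iml;
nIter = 10000;                        /* hypothetical number of inner-loop iterations */

/* Slow pattern: every pass allocates a larger matrix and copies all old rows.
   do m = 1 to nIter;
      Result = Result // R;
   end;
*/

/* Faster pattern: allocate once, then assign into row m */
Result = j(nIter, 4, .);              /* 4 columns assumed; filled with missing values */
do m = 1 to nIter;
   R = m || m##2 || sqrt(m) || log(m);   /* placeholder 1 x 4 result row */
   Result[m, ] = R;                   /* overwrite row m in place, no reallocation */
end;

print (Result[1:3, ]);                /* inspect the first few rows */
quit;
```

With concatenation the total amount of copying grows roughly with the square of the iteration count, whereas the preallocated version writes each row once.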

Rick_SAS
SAS Super FREQ

Good one! I didn't even notice these were 2x2 matrices!  (But it's good to learn how to use SOLVE anyway!)

jhzeinab_yahoo_com
Calcite | Level 5

Ian,

Does it matter where I declare the matrix in this case? Should I declare it outside of the first loop or just before the innermost loop?

Thanks,

Jamil

IanWakeling
Barite | Level 11

Jamil,

Outside the first loop, declare something like RESULT=j(15000,4); then move your append to just after the innermost loop. Of course, you could make RESULT big enough to contain all the results from all the loops, but I think you might then run into memory problems.
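One way this layout might look is sketched below. The data set name SimOut, the column names, the lambda and K grids, and the small chunk size are all hypothetical, and the sketch assumes every row of RESULT is filled before each append.

```sas
proc iml;
nRows   = 100;                           /* hypothetical chunk size (e.g. 10000 or 15000) */
lambdas = {0.1 0.2};                     /* hypothetical grid values */
ks      = {1 2 3};
varNames = {"lambda" "k" "rep" "T"};     /* hypothetical column names */

Result = j(nRows, 4, .);                 /* declared once, outside the first loop */
create SimOut from Result[colname=varNames];   /* open the output data set once */

do i = 1 to ncol(lambdas);
   do j = 1 to ncol(ks);
      do m = 1 to nRows;                 /* innermost loop fills one row per pass */
         T = lambdas[i] * ks[j] * m;     /* placeholder for the real statistic */
         Result[m, ] = lambdas[i] || ks[j] || m || T;
      end;
      append from Result;                /* one write per (lambda, K) combination */
   end;
end;

close SimOut;
quit;
```

RESULT is reused for every (lambda, K) combination, so memory use stays at one chunk while the data set grows on disk.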

Ian.

jhzeinab_yahoo_com
Calcite | Level 5

Ian,

I will try it. I think each iteration will have 10000 rows. The inner most loop generates one and only one row once T > h.

I have a lot of memory on this machine, 64GB, but you never really know with SAS.

Thank you for your suggestions and help.

Jamil

jhzeinab_yahoo_com
Calcite | Level 5

I made the changes and it seems to be running faster, although I am not sure whether I have the optimal append location.

I am appending every 10,000 rows for each lambda and K combination.

I will post another update and my final code soon.

Thanks again.

Rick_SAS
SAS Super FREQ

Try T = z*solve(COV_Z, z`);
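As a quick check that the two forms agree, here is a small sketch with made-up numbers (the values of COV_Z and z are assumptions chosen so the arithmetic is easy to verify by hand):

```sas
proc iml;
COV_Z = {2 0.5,
         0.5 1};                      /* hypothetical 2 x 2 covariance matrix */
z = {1 2};                            /* hypothetical 1 x 2 row vector */

T_inv   = z * inv(COV_Z) * t(z);      /* original form: explicit inverse */
T_solve = z * solve(COV_Z, z`);       /* solve COV_Z*w = z`, then T = z*w */

print T_inv T_solve;                  /* both equal 4, up to rounding */
quit;
```

SOLVE(COV_Z, z`) returns the vector w that satisfies COV_Z*w = z`, which is the same as inv(COV_Z)*z` without forming the inverse explicitly.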

jhzeinab_yahoo_com
Calcite | Level 5

Thank you, folks. I will make these changes and see if overall performance gets better.
