05-23-2012 08:49 PM

I have a simulation code (ATTACHED) that is functional and works well, but is very slow. We are talking 4-5 days of continuous run time.

I think there is an input/output bottleneck in appending matrices (vertically). Is there something that could be done to reduce time or memory use, such as writing to a permanent library instead of the temporary WORK library, that might make this code run faster?

I am saving qualifying records as vectors and appending them to a matrix called RESULT. At the end of loop grid, I am appending the matrix to a permanent SAS data set for further analysis.

I have done all I can with my limited knowledge of IML and I am desperate for help.

Many thanks,

Jamil

Accepted Solutions

Solution

Posted in reply to jhzeinab_yahoo_com

05-24-2012 06:06 AM

**all** of the old results to it as well as the most recent result. So as the iteration count rises, larger and larger amounts of data are shifted around, killing the performance. Perhaps try to structure the program so that you declare a matrix at the start to hold the results for every iteration of your innermost loop, and then add each result to the matrix with a statement like Result[m,]=R; After the innermost loop has finished, you could then dump the contents of Result to a SAS data set.
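A minimal sketch of the preallocation idea described above, not the original poster's program. The iteration count `nIter`, the four-column layout, the placeholder row `R`, and the data set name `SimOut` are all assumptions made for illustration.

```sas
proc iml;
nIter  = 10000;                  /* hypothetical number of inner-loop iterations */
Result = j(nIter, 4, .);         /* allocate once, filled with missing values */

do m = 1 to nIter;
   R = m || 0 || 0 || 0;         /* placeholder 1 x 4 row; the real code computes R */
   Result[m, ] = R;              /* overwrite row m in place; old rows are not copied */
end;

/* write the filled matrix to a SAS data set after the loop has finished */
create SimOut from Result[colname={"c1" "c2" "c3" "c4"}];
append from Result;
close SimOut;
quit;
```

With `Result = Result // R`, each iteration builds a new matrix one row longer and copies every existing row into it, so the total work grows roughly with the square of the number of rows. Assigning into a preallocated row avoids that copying.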

All Replies

Posted in reply to jhzeinab_yahoo_com

05-23-2012 09:28 PM

1) Move all allocations outside the DO loops. You only need to allocate X at the top of the program, not inside four nested DO loops.

2) You don't need to allocate zm, T, or any other arrays that you are going to assign. If you execute "zm = equation", do not allocate zm first, since zm just gets overwritten

3) You don't need x1 and x2. Just send X into the RAND function to fill up both columns.

4) Get rid of the lines

COV_INV_Z=inv (COV_Z);

T= Z * COV_INV_Z *t(Z);

and use the SOLVE function instead as shown in

http://blogs.sas.com/content/iml/2011/08/10/do-you-really-need-to-compute-that-matrix-inverse/

and

http://blogs.sas.com/content/iml/2011/08/17/solving-linear-systems-which-technique-is-fastest/

5) The line that is **REALLY** killing your performance is

RESULT=RESULT // R;

Read Section 2.9 on p. 34-35 of my book *Statistical Programming with SAS/IML Software.* That chapter is available as a free download.

Start with (5); the performance will improve incredibly.
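The first three tips can be sketched roughly as follows. The attached program is not shown in the thread, so the sample size `n`, the seed, the normal distribution, and the column-mean formula for `zm` are assumptions. In SAS/IML, a whole matrix is filled with random values through the RANDGEN subroutine.

```sas
proc iml;
call randseed(12345);            /* hypothetical seed */
n = 100;                         /* hypothetical sample size */
X = j(n, 2);                     /* allocate X once, before any DO loop */

do rep = 1 to 3;                 /* stands in for the nested simulation loops */
   call randgen(X, "Normal");    /* refills both columns of X in one call */
   zm = X[:, ];                  /* column means; zm is assigned, never preallocated */
end;

print zm;
quit;
```

Because `zm` is created by the assignment itself, a statement such as `zm = j(1, 2, 0);` in front of it would only add work.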

Posted in reply to Rick_SAS

05-23-2012 10:14 PM

Thank you, Rick. I will work on these changes and see if performance improves.

Posted in reply to Rick_SAS

05-23-2012 10:33 PM

Rick, by the way, I have your book, which is excellent. I, on the other hand, am a novice in IML.

How would the SOLVE function work in this case? I have an original X (n x p) matrix and a Z matrix (1 x p).

Is the following remotely correct as a replacement for the above? I need to calculate T (a scalar) from the quadratic form T= Z * COV_INV_Z *t(Z)

T= SOLVE (X,Z); ???

Posted in reply to IanWakeling

05-24-2012 06:10 AM

Good one! I didn't even notice these were 2x2 matrices! (But it's good to learn how to use SOLVE anyway!)

Posted in reply to IanWakeling

05-24-2012 09:39 AM

Ian,

Does it matter where I declare the matrix in this case? Should I declare it outside of the first loop or just before the innermost loop?

Thanks,

Jamil

Posted in reply to jhzeinab_yahoo_com

05-24-2012 10:03 AM

Jamil,

Outside the first loop, declare something like RESULT=j(15000,4); Then move your append to just after the innermost loop. Of course, you could make RESULT big enough to contain all the results from all the loops, but I think then you might run into memory problems.
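One way this layout might look, as a sketch under stated assumptions rather than the poster's actual program: the parameter grids `lambdaGrid` and `Kgrid`, the block size `nRep`, the column names, and the data set name `SimOut` are hypothetical. Note that SAS/IML names are case-insensitive, so a loop counter called `k` would collide with a matrix called `K`.

```sas
proc iml;
nRep       = 10000;              /* assumed rows produced per parameter combination */
lambdaGrid = {0.1 0.2};          /* hypothetical parameter grids */
Kgrid      = {1 2 3};
Result     = j(nRep, 4, .);      /* allocated once, outside every loop */

/* open the output data set once; its columns are taken from Result */
create SimOut from Result[colname={"lambda" "K" "rep" "T"}];

do i = 1 to ncol(lambdaGrid);
   do c = 1 to ncol(Kgrid);
      do m = 1 to nRep;
         T = 0;                  /* placeholder for the simulated statistic */
         Result[m, ] = lambdaGrid[i] || Kgrid[c] || m || T;
      end;
      append from Result;        /* one write of nRep rows per combination */
   end;
end;

close SimOut;
quit;
```

Reusing the same `nRep`-row buffer for every combination keeps memory use fixed, while the data set on disk grows by one block per pass.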

Ian.

Posted in reply to IanWakeling

05-24-2012 10:16 AM

Ian,

I will try it. I think each iteration will have 10000 rows. The inner most loop generates one and only one row once T > h.

I have a lot of memory on this machine, 64GB, but you never really know with SAS.

Thank you for your suggestions and help.

Jamil

Posted in reply to IanWakeling

05-24-2012 12:11 PM

I made the changes and it seems to be running faster, although I am not sure if I have the optimal append location.

I am appending every 10,000 rows for each *lambda* and *K* combination.

I will post another update and my final code soon.

Thanks again.

Posted in reply to jhzeinab_yahoo_com

05-24-2012 06:07 AM

Try T=z*solve(COV_Z, z`);
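A small check of this suggestion against the original INV-based formula. The 2 x 2 matrix `COV_Z` and the row vector `z` below are made-up values chosen only for illustration.

```sas
proc iml;
COV_Z = {2 0.5, 0.5 1};              /* hypothetical 2 x 2 covariance matrix */
z     = {1 -1};                      /* hypothetical 1 x 2 row vector */

T_inv   = z * inv(COV_Z) * t(z);     /* original approach: explicit inverse */
T_solve = z * solve(COV_Z, t(z));    /* solves COV_Z * w = t(z), then forms z * w */

print T_inv T_solve;                 /* the two values should agree up to rounding */
quit;
```

SOLVE takes the coefficient matrix first and the right-hand side second, so the quadratic form needs `COV_Z` and the transposed `z`, not the raw data matrix `X`.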

Posted in reply to jhzeinab_yahoo_com

05-24-2012 09:23 AM

Thank you, folks. I will make these changes and see if overall performance gets better.