In recent years, several improvements to the SAS/IML language have made the underlying routines faster without requiring a separate procedure. For example, older single-threaded linear algebra routines have been replaced by multithreaded routines that take advantage of multicore processors for large matrices. These changes should be transparent to most users and require no changes to existing programs.
As for the future, I'll turn the question back to you. What applications do you have in mind? What are the matrix operations that you would like to see "HP-ized"? Do you have data distributed across several machines, or do you have a large amount of data on a single machine? Do you have a grid of separate machines that you want to use for computations, or do you want to take advantage of multiple CPUs on a single machine? Are there features that are available in competing languages that you would like to see in the SAS/IML language?