Hi~~
I'm a statistics-focused risk manager working on many financial models. I was glad to find the IML community, because I have a lot of questions about using IML.
I'm working with a very large matrix, 1,000,000-by-20 (rows by columns). For ease of explanation, call it XXX. The 1st column is the number of defaults, the 2nd column is grade (1, 2, 3, ..., 20), the 3rd column is region number (100, 200, 300, 400), the 4th column is corporate size (1: large company, 2: small and medium, 3: small, etc.), and so on.
When displayed, the matrix looks like this:
8 2 300 2 ....
15 1 100 3 ....
3 4 100 1 ....
2 1 200 1 ....
....
I'd like to get the sums of defaults by grade, region number, and corporate size, which is usually done with PROC MEANS or PROC SUMMARY on data sets. For example,
proc summary data=XXX;
var defaults;
class grade region_number corporate_size;
output out=getsum sum=;
run;
Then I get many sub-sums for the various combinations of the class variables. I found an IML equivalent, the SUMMARY statement, which works almost the same way as PROC SUMMARY.
The problem is that the SUMMARY statement works on a data set, so I first have to write the matrix XXX out as a data set before I can use it.
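To make the workaround concrete, here is a minimal sketch of what I mean, using only the four sample rows and the first four columns shown above. The data set name work.tmp and the variable names are placeholders I made up for illustration, and I have not tuned any of this; it is just the "matrix to data set to SUMMARY" route.

```sas
proc iml;
/* toy stand-in for XXX: the four sample rows, first four columns only */
XXX = {  8 2 300 2,
        15 1 100 3,
         3 4 100 1,
         2 1 200 1 };

/* placeholder variable names for the data set columns */
varNames = {"defaults" "grade" "region_number" "corporate_size"};

/* step 1: write the matrix out as a data set (the slow part when repeated) */
create work.tmp from XXX[colname=varNames];
append from XXX;
close work.tmp;

/* step 2: reopen the data set and sum defaults by the class variables */
use work.tmp;
summary class {grade region_number corporate_size}
        var   {defaults}
        stat  {sum}
        opt   {save};   /* keep the results in matrices as well */
close work.tmp;
quit;
```

Step 1 is the part I would like to avoid, since it has to be repeated for every temporary matrix.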
My question is: is there any way to get the sub-sums of the first column for each combination of values in columns 2, 3, and 4 without first turning the matrix XXX into a data set? The point is that a huge number of big temporary matrices are generated, the sub-sums are extracted from each one, and then the matrices are discarded. A lot of time is lost creating the data sets along the way. I could tolerate it if the number of these operations were small, but it could be up to tens of thousands, which makes this approach unworkable.
I've been searching the internet for possible solutions, but in vain....
Thank you in advance~