What worked and what didn’t

Using the SQL approach did not “work” with the large dataset. Using one quarter of actual data (2643 investors and 3685 stocks), the code ran for six hours before I stopped it. Given that I have 50 quarters of data, it would just take too long. I also tried my own (actually one of my co-authors’) “old school” code, where we ran loops for every manager with every other manager. That, too, took way too long (my estimate was that it would take 37 days for the code to run).

I finally went back to IML. That really seemed to work well; thanks to both Ksharp and Rick. It took about 3 minutes to handle the matrix calculations with the large dataset. Also, using the IML ncol (number of columns), vecdiag (creates a vector out of the diagonal of the matrix), and sum functions allowed me to easily compute the sum of the elements of the matrix and subtract the sum of the diagonal. The average off-diagonal element is then [sum(all)-sum(diagonal)]/(N*(N-1)), where N is the number of columns (investors).

What I can’t seem to get to work is outputting the results to a SAS dataset. I tried inserting the code suggested by Rick, but with Ksharp’s macro, SAS was unhappy (the create statements are commented out below). I’d like to be able to do this because, in my real data, I’ll set a loop around this whole thing and run it for each of the 50 quarters of data.

Thanks again for all your help and happy 2012!
Rick

options replace;
options nocenter;
proc datasets kill;

data a;
  input investor $ companyID $ wt;
  cards;
A IBM 0.5
A MSFT 0.2
A GOOG 0.1
A GRPN 0
A F 0.2
B IBM 0.4
B MSFT 0.5
B GOOG 0
B GRPN 0
B F 0.1
C IBM 0.5
C MSFT 0
C GOOG 0
C GRPN 0.5
C F 0
D IBM 0.1
D MSFT 0.25
D GOOG 0.25
D GRPN 0.1
D F 0.3
;
run;

proc sort data=a; by companyID; run;

proc transpose data=a out=temp(drop=_name_);
  by companyid;
  id investor;
  var wt;
run;

data _null_;
  set temp;
  call symputx(cats('list',_n_), catx(' ', of _numeric_));
  call symputx('end', _n_);
run;

%macro across;
proc iml;
x = {
  %do i=1 %to &end;
    &&list&i
    %if &i ne &end %then %do; , %end;
  %end;
};
G = x / sqrt(x[##,]);   /* standardize */
m = G`*G;               /* cross products */
number_columns = ncol(m);
d = vecdiag(m);
sum_all = sum(m);
sum_diag = sum(d);
ave_off_diag = (sum_all-sum_diag)/(number_columns*(number_columns-1));
print number_columns sum_all sum_diag ave_off_diag;
/*
create MyData var {number_columns sum_all sum_diag};
append;
close MyData;
*/
quit;
%mend across;

%across
run;
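
For reference, the off-diagonal average described above can also be computed and saved without the macro layer. The following is a minimal, untested sketch and not the code suggested in this thread. It assumes the transposed data set temp from the code above holds one row per stock and one numeric column per investor, and it reads that data set into a matrix with USE/READ instead of building the matrix literal from macro variables. The output data set name OffDiag and the variable name N are hypothetical.

```sas
proc iml;
   /* Assumption: TEMP has one row per stock and one numeric column per investor. */
   /* _NUM_ reads only the numeric columns, so the character companyid is skipped. */
   use temp;
   read all var _NUM_ into x;
   close temp;

   G = x / sqrt(x[##,]);        /* scale each investor column to unit length */
   m = G` * G;                  /* investor-by-investor cross products */

   N = ncol(m);                 /* number of investors */
   sum_all  = sum(m);           /* sum of every element of m */
   sum_diag = sum(vecdiag(m));  /* sum of the diagonal elements */
   ave_off_diag = (sum_all - sum_diag) / (N*(N-1));

   /* OffDiag is a hypothetical name; one variable is created per scalar listed. */
   create OffDiag var {"N" "sum_all" "sum_diag" "ave_off_diag"};
   append;
   close OffDiag;
quit;
```

In the toy data every investor has a row for every stock. In data where an investor has no row for a stock, the transposed value would be missing and would probably need to be set to 0 before this step. For the 50 quarters, one option would be to run a block like this once per quarter and stack the one-row results with PROC APPEND; that part is not shown here.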