First, months may have 30 to 31 calendar days,but they typically have only 20 trading days. Three years of trading with daily data is actually closer to 3*12*20=720 daily records, not 1116.
Second you only want PERMNO's with three years or more of trading. Don't use the daily returns with the array to find out which permno's qualify. Go to the other CRSP datasets. There's one out there that has the earliest and latest trading dates for each permno. You can use that to identify the ones you want to keep.
Or use PROC SUMMARY on your returns data to get the date range and count of records for each PERMNO, as in
proc summary data=mydata noprint n min max;
class permno;
var date;
output n(date)=ndates min(date)=mindate max(date)=maxdate out=permno_stats;
run;
Then you can merge that with your data and filter which permno's to process using conditions on NDATES or on MINDATE/MAXDATE.
But most important - don't drop the view technique. Instead drop the brute force calculation of correlations by providing complete windows of original returns data. Instead use proc expand to create rolling windows of SSCP (preceded by a very helpful data set view). From the sscp data, it's easy to calculate correlations.
My example using sashelp.stocks shows my recommendation. It has 5 steps:
PROC SORT: The data in my example is monthly, sorted by STOCK (like PERMNO) and DATE.
PROC SUMMARY: The proc summary generates the dataset stock_stats, with one record per stock, and variables NDATES,MINDATE,MAXDATE. This will allow you to filter either on date count (ndates), or on years covered.(mindate, maxdate).
DATA NEED/VIEW=NEED. Because proc expand can't directly generate rolling cross products, a preceding data set VIEW generates record-by-record cross products for proc expand to make rolling window SSCP's. And this is where I put a filter on which stocks to process.
PROC EXPAND: generate rolling windows (12 months size in my case) to make all the sscp values, means, and std's to calculate correlations. Note each window is identified by stock and date, where date is the last date of the window. Change all the 12's in my proc expand to whatever window size you want to analyse.
DATA CORRDATA: Calculate correlations.
This should eliminate a lot of calculations - primarily because proc expand will generate rolling sscp's by adding the latest squares and cross-products while subtracting the 12-month old equivalents. Not by reprocessing all the records in the new window. The larger the window size, the more beneficial this will be.
/* Get monthly stock data for ibm, intel,msft */
proc sort data=sashelp.stocks (keep=stock date open close)
out=stocks (rename=(open=x close=y));
by stock date;
run;
/* Get the profiles of each stock */
proc summary data=sashelp.stocks noprint n min max nway;
class stock;
var date;
output n(date)=ndates min(date)=mindate max(date)=maxdate
out=stock_stats (keep=stock ndates mindate maxdate);
run;
/* Prepare to use PROC EXPAND to generate rolling SSCP values */
/* and simultaneously only keep stock with the desired date range */
data need /view=need;
merge stocks stock_stats;
by stock;
if ndates>200;
xx=x*x; xy=x*y; yx=y*x; yy=y*y;
run;
/* Now make rolling 12-month windows with sums of square and cross products*/
/* and also rolling means and std */
proc expand data=need out=sscpdata (where=(_n=12)) method= none;
by stock;
id date;
convert y=_n / transformin=(*0) transformout=(+1 MOVSUM 12);
convert x=xsum y=ysum xy=xysum xx=xxsum yy=yysum / transformout=(MOVSUM 12);
convert x=xmean y=ymean / transformout=(MOVAVE 12);
convert x=xstd y=ystd / transformout=(MOVSTD 12);
run;
/* Now get the correlation values */
data corrdata;
set sscpdata;
corrxy=(xysum-_n*xmean*ymean)/((_n-1)*xstd*ystd);
run;
... View more