🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Lapis Lazuli | Level 10

Calculating correlation/covariance for stocks

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year?

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
Super User

Re: Calculating correlation/covariance for stocks

Unfortunately you'll need to restructure your dataset to a wide type format first, i.e.

Date Stock1 Stock2 Stock3 .....

Then you can use PROC CORR to get the correlations and covariances.
The PROC CORR step is easier.

To reformat your data set you can use PROC TRANSPOSE. You'll like first need to sort your data. See a fully worked example below, it should run fine on your machine. I'll leave the final formatting/customization to you, you can use PROC TRANSPOSE as well to reformat it to the long format you were requesting.

``````*Sort and create stock data set to work with;
proc sort data=sashelp.stocks out=stocks_long;
by date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
by date;
id stock;
var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
var stock_:;
run;``````

@somebody wrote:

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year?

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

3 REPLIES 3
Super User

Re: Calculating correlation/covariance for stocks

Unfortunately you'll need to restructure your dataset to a wide type format first, i.e.

Date Stock1 Stock2 Stock3 .....

Then you can use PROC CORR to get the correlations and covariances.
The PROC CORR step is easier.

To reformat your data set you can use PROC TRANSPOSE. You'll like first need to sort your data. See a fully worked example below, it should run fine on your machine. I'll leave the final formatting/customization to you, you can use PROC TRANSPOSE as well to reformat it to the long format you were requesting.

``````*Sort and create stock data set to work with;
proc sort data=sashelp.stocks out=stocks_long;
by date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
by date;
id stock;
var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
var stock_:;
run;``````

@somebody wrote:

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year?

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

Lapis Lazuli | Level 10

Re: Calculating correlation/covariance for stocks

thanks. your code pretty much does the job. just one last thing, how can I calculate the CORR and COV between stocks for each year?

SAS Super FREQ

Re: Calculating correlation/covariance for stocks

Just put a BY YEAR statement everywhere:

``````data Stock1;
set sashelp.stocks;
year = year(date);  /* create year variable in data set */
run;

*Sort and create stock data set to work with;
proc sort data=stock1 out=stocks_long;
by year date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
by year date;
id stock;
var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
by year;
var stock_:;
run;

proc print data=want; run;``````
Discussion stats
• 3 replies
• 2381 views
• 2 likes
• 3 in conversation