Solved
Contributor
Posts: 65

# Calculating correlation/covariance for stocks

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year?

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

Accepted Solutions
Solution
‎01-07-2018 10:36 PM
Super User
Posts: 24,014

## Re: Calculating correlation/covariance for stocks

[ Edited ]
Posted in reply to ducman1611

Unfortunately you'll need to restructure your dataset to a wide type format first, i.e.

Date Stock1 Stock2 Stock3 .....

Then you can use PROC CORR to get the correlations and covariances.
The PROC CORR step is easier.

To reformat your data set you can use PROC TRANSPOSE. You'll like first need to sort your data. See a fully worked example below, it should run fine on your machine. I'll leave the final formatting/customization to you, you can use PROC TRANSPOSE as well to reformat it to the long format you were requesting.

``````*Sort and create stock data set to work with;
proc sort data=sashelp.stocks out=stocks_long;
by date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
by date;
id stock;
var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
var stock_:;
run;``````

ducman1611 wrote:

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year?

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

All Replies
Solution
‎01-07-2018 10:36 PM
Super User
Posts: 24,014

## Re: Calculating correlation/covariance for stocks

[ Edited ]
Posted in reply to ducman1611

Unfortunately you'll need to restructure your dataset to a wide type format first, i.e.

Date Stock1 Stock2 Stock3 .....

Then you can use PROC CORR to get the correlations and covariances.
The PROC CORR step is easier.

To reformat your data set you can use PROC TRANSPOSE. You'll like first need to sort your data. See a fully worked example below, it should run fine on your machine. I'll leave the final formatting/customization to you, you can use PROC TRANSPOSE as well to reformat it to the long format you were requesting.

``````*Sort and create stock data set to work with;
proc sort data=sashelp.stocks out=stocks_long;
by date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
by date;
id stock;
var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
var stock_:;
run;``````

ducman1611 wrote:

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year?

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

Contributor
Posts: 65

## Re: Calculating correlation/covariance for stocks

thanks. your code pretty much does the job. just one last thing, how can I calculate the CORR and COV between stocks for each year?

SAS Super FREQ
Posts: 4,277

## Re: Calculating correlation/covariance for stocks

Posted in reply to ducman1611

Just put a BY YEAR statement everywhere:

``````data Stock1;
set sashelp.stocks;
year = year(date);  /* create year variable in data set */
run;

*Sort and create stock data set to work with;
proc sort data=stock1 out=stocks_long;
by year date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
by year date;
id stock;
var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
by year;
var stock_:;
run;

proc print data=want; run;``````
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
• 3 replies
• 265 views
• 2 likes
• 3 in conversation