BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
somebody
Lapis Lazuli | Level 10

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year? 

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Unfortunately you'll need to restructure your dataset to a wide type format first, i.e. 

 

Date Stock1 Stock2 Stock3 .....

 

Then you can use PROC CORR to get the correlations and covariances. 
The PROC CORR step is easier. 

 

To reformat your data set you can use PROC TRANSPOSE. You'll like first need to sort your data. See a fully worked example below, it should run fine on your machine. I'll leave the final formatting/customization to you, you can use PROC TRANSPOSE as well to reformat it to the long format you were requesting.

 

*Sort and create stock data set to work with;
proc sort data=sashelp.stocks out=stocks_long;
    by date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
    by date;
    id stock;
    var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
    var stock_:;
run;

 

 


@somebody wrote:

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year? 

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

 


 

View solution in original post

3 REPLIES 3
Reeza
Super User

Unfortunately you'll need to restructure your dataset to a wide type format first, i.e. 

 

Date Stock1 Stock2 Stock3 .....

 

Then you can use PROC CORR to get the correlations and covariances. 
The PROC CORR step is easier. 

 

To reformat your data set you can use PROC TRANSPOSE. You'll like first need to sort your data. See a fully worked example below, it should run fine on your machine. I'll leave the final formatting/customization to you, you can use PROC TRANSPOSE as well to reformat it to the long format you were requesting.

 

*Sort and create stock data set to work with;
proc sort data=sashelp.stocks out=stocks_long;
    by date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
    by date;
    id stock;
    var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
    var stock_:;
run;

 

 


@somebody wrote:

I have a large dataset that contains daily returns for a many stocks. The dataset has 3 columns: date, stocks, and returns. How can I calculate correlation and covariance between stocks for each quarter or year? 

Instead of having a covariance matrix for each period, my desired output would have 5 columns: period ( which can be quarter or year), stock1, stock 2, correlation, and covariance.

Thank you very much for your help!

 


 

somebody
Lapis Lazuli | Level 10

thanks. your code pretty much does the job. just one last thing, how can I calculate the CORR and COV between stocks for each year?

Rick_SAS
SAS Super FREQ

Just put a BY YEAR statement everywhere:

 

data Stock1;
set sashelp.stocks;
year = year(date);  /* create year variable in data set */
run;

*Sort and create stock data set to work with;
proc sort data=stock1 out=stocks_long;
    by year date;
run;

*transpose to a wide format;
proc transpose data=stocks_long out=stocks_wide prefix=STOCK_;
    by year date;
    id stock;
    var Open;
run;

*Run a correlation on the stocks, include only variables of interest with the STOCK_: from the prefix above;
proc corr data=stocks_wide cov outp=want;
    by year;
    var stock_:;
run;

proc print data=want; run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 2375 views
  • 2 likes
  • 3 in conversation