BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi all,

Firstly, I'm slightly new to SAS. I would like to compute correlation coefficients on a large data set X with a single common variable Y.

The data set X is now sorted such that daily observations run down the table, stock names run across the top, with returns in the table. i.e.,

stock_a stock_b stock_c
day 1 . . .
day 2 . . .
day 3 . . .

The difficulty I'm having is that I would like to compute the correlation between the X's and the Y on a monthly basis (where I have daily data). In Matlab for instance I would do this by looping over the rows then the columns and filling up containers then computing the correlations for each individual month.

Does SAS have an easy way to do this given my X is large (60 million observations) and contains missing data.

Thank a bundle. I'm learning fast and I'm liking SAS so far.

Message was edited by: thepowertoknow? Message was edited by: thepowertoknow?
4 REPLIES 4
art297
Opal | Level 21
You have to tell us more! If you have 60 million records, one for each day and with each row containing some kind of values for a series of stocks, then you have 164,271 years worth of data. I didn't know that stocks have been around that long.

You can build loops into SAS code but, more likely, you will want to use something like proc summary to calculate the desired averages for each month and, if necessary, transpose the file.

However, for anyone to help, they would have to know what your data really are, and whic variables you want to obtain correlations for.

Art
> Hi all,
>
> Firstly, I'm slightly new to SAS. I would like to
> compute correlation coefficients on a large data set
> X with a single common variable Y.
>
> The data set X is now sorted such that daily
> observations run down the table, stock names run
> across the top, with returns in the table. i.e.,
>
> stock_a stock_b stock_c
> . . .
> day 2 . . .
> day 3 . . .
>
> The difficulty I'm having is that I would like to
> compute the correlation between the X's and the Y on
> a monthly basis (where I have daily data). In Matlab
> for instance I would do this by looping over the rows
> then the columns and filling up containers then
> computing the correlations for each individual month.
>
>
> Does SAS have an easy way to do this given my X is
> large (60 million observations) and contains missing
> data.
>
> Thank a bundle. I'm learning fast and I'm liking SAS
> so far.
>
> Message was edited by: thepowertoknow?
>
> Message was edited by: thepowertoknow?
Ksharp
Super User
Proc corr can give you correlation coefficients ( Pearson or Spearman ).




Ksharp
ballardw
Super User
If I understand it sounds like you are wanting to use your date in a BY group.
If the "day" variable is a SAS date variable and the data is sorted by that variable, then try:

PROC CORR DATA=;
by date;
var x y;
format date monyy7.;
run;

This will create one correlation output table for each month and year that appears in the data. You may want to direct the output to a data set.
deleted_user
Not applicable
Hi,

yes - this is exactly how I did it in the end. Thanks a lot.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1574 views
  • 0 likes
  • 4 in conversation