<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic correlation on large data set in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33780#M8189</link>
    <description>Hi all, &lt;BR /&gt;
&lt;BR /&gt;
Firstly, I'm slightly new to SAS. I would like to compute correlation coefficients between the columns of a large data set X and a single common variable Y. &lt;BR /&gt;
&lt;BR /&gt;
The data set X is now arranged so that daily observations run down the table and stock names run across the top, with returns in the cells, i.e., &lt;BR /&gt;
&lt;BR /&gt;
          stock_a   stock_b   stock_c&lt;BR /&gt;
day 1     .               .                . &lt;BR /&gt;
day 2     .               .                .&lt;BR /&gt;
day 3     .               .                .&lt;BR /&gt;
&lt;BR /&gt;
The difficulty I'm having is that I would like to compute the correlation between the X's and the Y on a monthly basis (where I have daily data). In Matlab, for instance, I would do this by looping over the rows and then the columns, filling up containers, and then computing the correlations for each individual month. &lt;BR /&gt;
&lt;BR /&gt;
Does SAS have an easy way to do this, given that my X is large (60 million observations) and contains missing data? &lt;BR /&gt;
&lt;BR /&gt;
Thanks a bundle. I'm learning fast and I'm liking SAS so far.</description>
    <pubDate>Sat, 19 Mar 2011 17:52:40 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2011-03-19T17:52:40Z</dc:date>
    <item>
      <title>correlation on large data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33780#M8189</link>
      <description>Hi all, &lt;BR /&gt;
&lt;BR /&gt;
Firstly, I'm slightly new to SAS. I would like to compute correlation coefficients between the columns of a large data set X and a single common variable Y. &lt;BR /&gt;
&lt;BR /&gt;
The data set X is now arranged so that daily observations run down the table and stock names run across the top, with returns in the cells, i.e., &lt;BR /&gt;
&lt;BR /&gt;
          stock_a   stock_b   stock_c&lt;BR /&gt;
day 1     .               .                . &lt;BR /&gt;
day 2     .               .                .&lt;BR /&gt;
day 3     .               .                .&lt;BR /&gt;
&lt;BR /&gt;
The difficulty I'm having is that I would like to compute the correlation between the X's and the Y on a monthly basis (where I have daily data). In Matlab, for instance, I would do this by looping over the rows and then the columns, filling up containers, and then computing the correlations for each individual month. &lt;BR /&gt;
&lt;BR /&gt;
Does SAS have an easy way to do this, given that my X is large (60 million observations) and contains missing data? &lt;BR /&gt;
&lt;BR /&gt;
Thanks a bundle. I'm learning fast and I'm liking SAS so far.</description>
      <pubDate>Sat, 19 Mar 2011 17:52:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33780#M8189</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2011-03-19T17:52:40Z</dc:date>
    </item>
    <item>
      <title>Re: correlation on large data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33781#M8190</link>
      <description>You have to tell us more!  If you have 60 million records, one for each day, with each row containing some kind of values for a series of stocks, then you have 164,271 years' worth of data.  I didn't know that stocks have been around that long.&lt;BR /&gt;
&lt;BR /&gt;
You can build loops into SAS code but, more likely, you will want to use something like proc summary to calculate the desired averages for each month and, if necessary, transpose the file.&lt;BR /&gt;
&lt;BR /&gt;
However, for anyone to help, they would have to know what your data really are, and which variables you want to obtain correlations for.&lt;BR /&gt;
&lt;BR /&gt;
Art</description>
      <pubDate>Sun, 20 Mar 2011 16:43:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33781#M8190</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2011-03-20T16:43:55Z</dc:date>
    </item>
    <item>
      <title>Re: correlation on large data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33782#M8191</link>
      <description>PROC CORR can give you correlation coefficients (Pearson or Spearman).&lt;BR /&gt;
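&lt;BR /&gt;
As a rough sketch of such a call (the data set name returns and the variables stock_a, stock_b, stock_c, and y are placeholders, not names from the original post):&lt;BR /&gt;
&lt;BR /&gt;
```sas
/* Hypothetical names: RETURNS, stock_a, stock_b, stock_c, and y are placeholders. */
proc corr data=returns pearson spearman;
   var stock_a stock_b stock_c;   /* the stock return columns (the X's) */
   with y;                        /* the single common variable */
run;
```
&lt;BR /&gt;
By default PROC CORR handles missing values pair by pair, so each coefficient uses every observation where both variables are present. Adding the NOMISS option to the PROC CORR statement instead drops any observation that has a missing value in any listed variable.&lt;BR /&gt;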
&lt;BR /&gt;
Ksharp</description>
      <pubDate>Mon, 21 Mar 2011 01:50:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33782#M8191</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2011-03-21T01:50:44Z</dc:date>
    </item>
    <item>
      <title>Re: correlation on large data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33783#M8192</link>
      <description>If I understand correctly, it sounds like you want to use your date in a BY group.&lt;BR /&gt;
If the "day" variable is a SAS date variable and the data are sorted by that variable, then try: &lt;BR /&gt;
&lt;BR /&gt;
PROC CORR DATA=&lt;your data set&gt;;&lt;BR /&gt;
   by date;&lt;BR /&gt;
   var x y;&lt;BR /&gt;
   format date monyy7.;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
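A slightly fuller sketch of the same step, assuming hypothetical names (returns, date, stock_a, stock_b, stock_c, y, monthly_corr) and using a WITH statement so that only the stock-versus-y coefficients are computed:&lt;BR /&gt;
&lt;BR /&gt;
```sas
/* Hypothetical names throughout: RETURNS, date, stock_a, stock_b, stock_c, y, MONTHLY_CORR. */

/* BY processing needs the data in date order; skip this step if already sorted. */
proc sort data=returns;
   by date;
run;

/* OUTP= saves the Pearson statistics to a data set;
   rows with _TYPE_ = 'CORR' hold the coefficients for each BY group. */
proc corr data=returns outp=monthly_corr;
   by date;
   format date monyy7.;           /* BY groups are formed on the formatted month-year value */
   var stock_a stock_b stock_c;   /* the stock return columns */
   with y;                        /* the single common variable */
run;
```
&lt;BR /&gt;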
This will create one correlation output table for each month and year that appears in the data. You may want to direct the output to a data set.</description>
      <pubDate>Mon, 21 Mar 2011 14:57:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33783#M8192</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2011-03-21T14:57:12Z</dc:date>
    </item>
    <item>
      <title>Re: correlation on large data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33784#M8193</link>
      <description>Hi, &lt;BR /&gt;
&lt;BR /&gt;
Yes, this is exactly how I did it in the end. Thanks a lot.</description>
      <pubDate>Mon, 21 Mar 2011 18:26:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/correlation-on-large-data-set/m-p/33784#M8193</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2011-03-21T18:26:25Z</dc:date>
    </item>
  </channel>
</rss>

