Calculating covariances with "cov" - memory management "proc corr" versus "proc iml"

Contributor
Posts: 50

Dear community,

 

I would be glad if someone could help me understand the following issue (and, ideally, reassure me that nothing dubious is going on behind the scenes that I am simply not aware of).

 

I have a data set of monthly returns for roughly 8,000 stocks.

I would like to calculate their covariance matrix.

Using "proc corr", I receive an "ERROR: The SAS System stopped processing this step because of insufficient memory." (64 GB RAM).

Out of desperation, I loaded the data into a matrix in "proc iml" and used its "cov" function; in less than two seconds I got a covariance matrix of about 500 MB.

Using two stocks only, "proc corr" and "proc iml" yield the same results (de facto, "proc iml" displays one decimal place more).
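
For readers who want to reproduce the "proc iml" route described above, here is a minimal sketch. The data set names work.returns and work.CovIML are hypothetical (the thread does not show this code), and the sketch assumes every numeric variable in the input is a return series with no missing values.

```sas
proc iml;
   /* work.returns is a hypothetical name: one row per month, one numeric column per stock */
   use work.returns;
   read all var _NUM_ into X[colname=varNames];   /* all numeric columns into matrix X */
   close work.returns;

   S = cov(X);   /* sample covariance matrix, one row and one column per stock */

   /* write the result back to a SAS data set, keeping the stock names as column names */
   create work.CovIML from S[colname=varNames];
   append from S;
   close work.CovIML;
quit;
```

If the input also holds a numeric date variable, it would have to be excluded, for example by listing the return variables explicitly instead of using _NUM_.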

 

When using "proc corr", I specify "noprint" and, just in case, also suppress ODS with the aid of this wonderful tool that I am deeply grateful for (or rather, that I am "trying to use").

Further, I only use stocks with non-missing data over the whole considered time span.

 

In the process, using "proc iml" means a detour.

Is this detour correct? If so, why is "proc iml" capable of handling the data while "proc corr" is not?

 

Yours sincerely,

Sinistrum



All Replies
Super User
Posts: 23,315


Posted in reply to Sinistrum

What's your MEMSIZE set to?

Although you have 64 GB of RAM do you have SAS set up to use it?

 

*display memsize option in the log;
proc options option=memsize;
run;

 

How big is your data, number of rows/cols? 

You say 8000 monthly stock returns so that's 8000 columns with months in the rows? How many months?

 




 

 

SAS Super FREQ
Posts: 4,172


Posted in reply to Sinistrum

You didn't show the code, but I suspect the issue is that PROC CORR is trying to display the huge correlation and covariance matrices in ODS.  You don't say what you are trying to do with these huge matrices, but I'm guessing you don't need them displayed to the screen, so use OUTP=dataset to save the correlation/covariances to a SAS data set and use the NOPRINT option to suppress the output:

 

 

proc corr data=sashelp.class cov outp=CorrCov noprint;
run;

You can use WHERE clauses such as 

where _Type_ = "COV";

 

to access only the relevant data.

 

The above code saves both the CORR and COV rows. If you do not want the correlation rows (this keeps the COV rows along with the MEAN, STD, and N summary rows), use

 

proc corr data=sashelp.class cov NOPRINT
        outp=CorrCov(where=(_TYPE_^="CORR"));
run;

PS. Glad you liked my macro to suppress ODS output!

SAS Super FREQ
Posts: 4,172


Posted in reply to Sinistrum

Also, if you know the stocks have nonmissing data, you can speed up the computation by using the NOMISS option:

 

proc corr nomiss noprint data=...;

Contributor
Posts: 50


Thank you for the quick responses!

 

Reeza wrote:

What's your MEMSIZE set to?

Although you have 64 GB of RAM do you have SAS set up to use it?

 

SAS says " MEMSIZE=2147483648".

Reeza wrote:

How big is your data, number of rows/cols? 

I started playing with only six months to write the program and see how it works, as later on I want to employ daily data in this interval. Thus, the data set I used is only 1.1 MB in size. The number of rows is 6, and the number of columns is roughly 8,000.

 

 

Rick_SAS wrote:

You didn't show the code, but I suspect the issue is that PROC CORR is trying to display the huge correlation and covariance matrices in ODS.  You don't say what you are trying to do with these huge matrices, but I'm guessing you don't need them displayed to the screen

 

You are completely right. I need the estimated covariances as input parameters to calculate portfolio variances, given different sets of weights.

It would be convenient to have the whole covariance matrix, so that for each portfolio whose variance I need to compute, I would "just" need to pick from the matrix the elements relevant to that particular portfolio.

Alternatively, I could loop over all the different portfolios, that is, pick from the list of stock returns only those stocks actually included in the respective portfolio, and then calculate the covariance matrix in that particular loop step.

But as there are 52 points in time where portfolios are built, and roughly 500 portfolios at each point in time, I would like to avoid that.

With what I have in mind, I would only need to estimate 52 covariance matrices.
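
To make the "pick the relevant elements" idea concrete, here is a minimal SAS/IML sketch of a portfolio variance computed as w` * S * w on a submatrix of the full covariance matrix. The 4x4 matrix, the index vector, and the weights are made-up toy values for illustration only, not data from this thread.

```sas
proc iml;
   /* toy covariance matrix for 4 hypothetical stocks (illustrative values only) */
   S = {0.04 0.01 0.00 0.02,
        0.01 0.09 0.03 0.00,
        0.00 0.03 0.16 0.01,
        0.02 0.00 0.01 0.25};

   idx = {1 3 4};           /* positions of the stocks held in this portfolio */
   w   = {0.5, 0.3, 0.2};   /* column vector of portfolio weights */

   /* portfolio variance: weights times the matching covariance submatrix */
   portVar = w` * S[idx, idx] * w;
   print portVar;
quit;
```

With one full covariance matrix per point in time, each of the roughly 500 portfolios then only needs its own idx and w, rather than a fresh covariance estimate.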

 

My code was this (thank you for posting yours):

 

 

%ODSoff;
proc corr data=in noprint cov;
   ods output cov=cov;
run;
%ODSon;

 

 

I am deeply grateful for your blog and stunned that I am indeed in a conversation with you.

How much your "proc iml" guidance helped me to get into this facility is hard to quantify.

Solution
‎02-09-2018 02:02 PM
Super User
Posts: 23,315


Posted in reply to Sinistrum

SAS says " MEMSIZE=2147483648".

 

That's only 2 GB. If you have a system with 64 GB and you want to use more, you need to increase your MEMSIZE option.

I don't think you can do that via an OPTIONS statement, though; you need to modify it in the config file.
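
As a rough sketch of what that change might look like: MEMSIZE is set when SAS starts, so the line below would go into the SAS configuration file (typically sasv9.cfg for SAS 9, an assumption about this installation) or be passed on the command line. The value 16G is only an example, not a figure from this thread.

```sas
/* In the SAS configuration file (assumed here to be sasv9.cfg): */
/* raise the memory limit from the 2 GB reported above; 16G is an example value */
-MEMSIZE 16G
```

After restarting SAS, the PROC OPTIONS step shown earlier in the thread (proc options option=memsize; run;) can be used to confirm the new value in the log.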

Contributor
Posts: 50


Thank you, I have tried it immediately and it worked.

SAS Super FREQ
Posts: 4,172


Posted in reply to Sinistrum

If you are using NOPRINT, you don't need to use %ODSOFF.

The ODS OUTPUT statement is not doing anything because no output is produced.

Use the OUTP= option, as I showed.
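
Putting the suggestions from this thread together for the data set named "in" in the code above, a sketch might look like the following. The output name CovOut is hypothetical, and NOMISS is only appropriate because the stocks are stated to have no missing returns.

```sas
/* NOPRINT suppresses all displayed output, so no %ODSoff / ODS OUTPUT is needed */
proc corr data=in cov nomiss noprint
          outp=CovOut(where=(_TYPE_="COV"));   /* keep only the covariance rows */
run;
```

The resulting data set has one row per variable, with the variable name in _NAME_ and the covariances in the remaining columns.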

Contributor
Posts: 50


Yes, sorry, I posted the wrong code. I was not actually running with "NOPRINT" on.

Your "OUTP=" solution is much quicker than suppressing ODS and using "ods output" (the latter takes six times longer in this particular setting).

Still, it requires raising MEMSIZE, while "proc iml" does not.

 

Thank you very much indeed, both of you!

SAS Super FREQ
Posts: 4,172


Posted in reply to Sinistrum

Yes. As I say in my article "What is the best way to suppress ODS output in SAS?", "the NOPRINT option is the most efficient way to suppress output."

☑ This topic is solved.
