I'm trying to create a dataset that lists the correlation between pairs of company's returns and the identifies the companies. Basically, I compute the correlation matrix of three companies' (A, B, and C) returns and can output the data, but I don't know how to later identify which correlation is which pair of companies. Any ideas on how to identify each pair? Here's an example:
data companies;
infile cards;
input permno $ date $ ret;
cards;
A 199901 0.10
A 199902 0.20
A 199903 -0.10
A 199904 -0.20
A 199905 0.15
A 199906 -0.15
A 199907 0.12
A 199908 -0.12
A 199909 0.05
A 199910 -0.05
B 199901 -0.05
B 199902 -0.20
B 199903 0.09
B 199904 -0.10
B 199905 0.05
B 199906 0.05
B 199907 -0.07
B 199908 0.07
B 199909 0.01
B 199910 -0.01
C 199901 0.20
C 199902 0.10
C 199903 -0.05
C 199904 -0.05
C 199905 0.19
C 199906 -0.11
C 199907 0.12
C 199908 -0.12
C 199909 0.05
C 199910 -0.05
;
proc contents; run;
proc sort data=companies; by date;
proc transpose data=companies out=want; id permno;
by date;
var ret;
run;
proc iml;
use want;
read all var _NUM_ into x[colname=permno];
close want;
corr=corr(x);
mattrib corr rowname=permno colname=permno;
print corr;
create rj var {corr};
append ;
close rj;
quit;
run;
data junk; set rj; proc print; run;
/**********
data rj lists all 9 correlations but I can't identify the individual pairs.
Specifically, I get
Obs CORR
1 1.00000
2 -0.43578
3 0.88046
4 -0.43578
5 1.00000
6 -0.37909
7 0.88046
8 -0.37909
9 1.0000
And I'd like to get
A A 1
A B -0.43578
A C 0.88046
B A -0.43578
B B 1
B C -0.379087
C A 0.8804574
C B -0.379087
C C 1
And really, I'd just like the bottom half of the matrix (i.e., just the 3 unique correlations)
A B -0.43578
A C 0.88046
B C -0.379087
Any ideas? Thanks!
You have lots of choices:
1) use PROC TRANSPOSE again
2) use the DATA step
3) write a loop in SAS/IML
4) Use the ideas in http://blogs.sas.com/content/iml/2012/02/29/defining-banded-and-triangular-matrices/
Depending upon how you intend to use the correlations, you might want to output a TYPE=CORR data set:
create R(type=corr) from Corr[r=permno c=permno];
append from Corr[r=permno];
close R;
If you really want the lower triangular data in a "flat" format, use the SYMSQR function .
Thanks Rick -
Not quite what I need, however. I want to use the correlations as dependent variables in a later regression (hence, I need it in "flat" format - each correlation is an observation). So I need to have each correlation as a single row with the identifiers of the two companies. I also tried symsqr, but again can't seem to figure out how to identify the pairs.
e.g., I'd like the output to look like this:
firm1_firm2 corr
A_B -0.43578
A_C 0.88046
B_C -0.379087
Thanks,
Rick
You have lots of choices:
1) use PROC TRANSPOSE again
2) use the DATA step
3) write a loop in SAS/IML
4) Use the ideas in http://blogs.sas.com/content/iml/2012/02/29/defining-banded-and-triangular-matrices/
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.