BookmarkSubscribeRSS Feed
awesome_opossum
Obsidian | Level 7

Hi, I'm not sure there's an easy answer to this one, but let's see.  I'm trying to generate a correlation matrix like that one that comes from proc corr outp=, with the full (mirrored) correlations of the variables and 1's on the diagonal:  

awesome_opossum_1-1663696654700.png

 

However, I'm trying to get there from a column list as such:

awesome_opossum_0-1663696513074.png

 

I can't wrap my head around a relatively simple or painless way to do this.  Does anyone have advice? 

2 REPLIES 2
ballardw
Super User

For your consideration, an example with a small data set created in a datastep.

data have;
   input varx $ with $ corr;
datalines;
var1 var2  0.096
var1 var3  -0.062
var2 var3  -0.41
;

data need;
   set have;
   length temp $ 10;
   output;
   temp=var;
   var=with;
   with=temp;
   output;
   drop temp;
run;

proc tabulate data=need;
   class varx with;
   var corr;
   table varx='',
         with=''*corr=''*max=''
         /misstext='1'
   ;
run;

First, please provide data in the form of a data step, such as shown above, or at least as text. We cannot code against pictures of values and I'm very unwilling to retype much.

 

Familiarity with reporting procedures like Tabulate says that if I have Class variable holding a row label and another with a column label I can get statistics of a numeric variable. So that tells me, since you have the same values of the row and column labels that you need a data set with all the values in the row label with a corresponding column. So the Need data step duplicates the data with the row/column positions switched to allow the "mirror".

The Proc Tabulate option Misstext allows me to show what I want when the cell would be missing such as when the row and column labels are the same. The Tabulate approach could likely use other statistics but I chose max. Note that the =' ' bits are to suppress the variable names from appearing in the output.

 

If you actually want the word "corr" that's another issue.

Rick_SAS
SAS Super FREQ

Sounds like you want to start with correlations in "long form" and convert to a symmetric correlation matrix in "wide form". If you have access to SAS/IML software, you can

  • figure out the number of variables from the number of rows in the data set (solve the equation p*(p-1)/2 = N, where N is the number of rows)
  • allocate a p x p matrix
  • Find the names of the variables
  • Fill in the upper- and lower-triangular portions of the matrix from the CORR values
  • Optionally, write the information as a TYPE=CORR data set
data Have;
length Var WithVar $32;
input Var WithVar Corr;
datalines;
var1 var2  0.09
var1 var3 -0.62
var1 var4  0.1
var2 var3 -0.41
var2 var4 -0.003
var3 var4 0.07
;

proc iml;
use Have;
read all var {'Var' 'WithVar' 'Corr'};
close;

/* how many vars? Solve the equation 
   p*(p-1) = N for p, where N is the number of obs
*/
N = nrow(Corr);
p = (sqrt(8*N+1)+1)/2;
varNames = j(1, p, BlankStr(32));

/* fill up the varNames */
varNames[1] = Var[1];
varNames[2:p] = WithVar[1:(p-1)];

/* convert the numbers into a symmetric array */
C = j(p,p,0);       /* create p x p matrix  */
upperIdx = loc(row(C)<col(C));  /* assign lower-triangula elements */
C[upperIdx] = Corr;
C = C + C`;                     /* assign upper-triangular elements */
diagIdx = loc(row(C)=col(C));
C[diagIdx] = 1;                 /* put 1s on the diagonal */

print C[r=VarNames c=VarNames];

/* optional: write to data set in TYPE=CORR format */
_TYPE_ = j(p, 1, "CORR");
_NAME_ = colvec(varNames);
/* The following statement is for 9.4M6 and later.
   For earlier releases, use two CREATE statements: 
   One for the _TYPE_ and _NAME_ vars and the other for the corr matrix */
create Want from _TYPE_ _NAME_ C[c=({'_TYPE_' '_NAME_'}||varNames)];
append from _TYPE_ _NAME_ C;
close;
quit;

proc print data=Want; run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1050 views
  • 3 likes
  • 3 in conversation