sas_kms
Calcite | Level 5

I have four data sets, D1, D2, D3, and D4, that all share the same variable names. My goal is to compute correlations between the variables of one data set and those of another.

Example: data set D1 has variables: var1 var2 var3

and dataset D2 has variables: var1 var2 var3

In order to correlate var1 through var3 of D1 with the variables of D2, I renamed the variables of D2 to var1_d2, var2_d2, and var3_d2. I then merged D1 and D2 into a final data set, D_New.
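The rename-and-merge step described above could be sketched as follows. This is only an illustration: it assumes D1 and D2 have the same number of observations in the same order (so no BY variable is needed) and that the variables really are named var1, var2, and var3, as in the example.

```sas
/* Sketch: suffix D2's variables and pair the rows of D1 and D2.        */
/* Assumption: both data sets have the same number of rows, and row i   */
/* of D1 corresponds to row i of D2.                                    */
data D_New;
   merge D1
         D2(rename=(var1=var1_d2 var2=var2_d2 var3=var3_d2));
   /* No BY statement: a one-to-one merge matches observations by position. */
run;
```

If the data sets share a key variable instead, sorting both by that key and adding a BY statement would be the safer way to line up the observations.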

Then I run:

proc corr data = D_New;

   var var1;

   with var1_d2 var2_d2 var3_d2;

run;

Can anyone please suggest any other way of obtaining correlations across data sets?

2 REPLIES 2
Rick_SAS
SAS Super FREQ

Did you intend to post this to the SAS/IML Support Community?

I assume the data sets all have the same number of observations and they are all ordered the same way so that the i_th observation of each data set is related to the i_th of the other data sets. If so, what you are doing is fine.

You can do the analysis in a single call: rename the variables, merge the data sets into one, and then call PROC CORR. The resulting correlation matrix is a block matrix. The diagonal blocks contain the correlations within each data set. The off-diagonal blocks contain the correlations between data sets.
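As a rough sketch of that single-call approach, assuming the variables are named var1 through var3 as in the question and that all four data sets are aligned row by row (the name D_All and the _d1 to _d4 suffixes are made up for illustration):

```sas
/* Sketch: give every data set its own suffix, then merge by position. */
data D_All;
   merge D1(rename=(var1=var1_d1 var2=var2_d1 var3=var3_d1))
         D2(rename=(var1=var1_d2 var2=var2_d2 var3=var3_d2))
         D3(rename=(var1=var1_d3 var2=var2_d3 var3=var3_d3))
         D4(rename=(var1=var1_d4 var2=var2_d4 var3=var3_d4));
run;

/* One call produces the full 12 x 12 block correlation matrix.        */
/* NOSIMPLE and NOPROB suppress the descriptive statistics and the     */
/* p-values so that only the correlations are printed.                 */
proc corr data=D_All nosimple noprob;
   var _numeric_;
run;
```

With this variable order, rows 1 to 3 and columns 4 to 6 of the printed matrix form the D1-versus-D2 block.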

If you use PROC IML, you can just horizontally concatenate the data and take the correlation. You don't even need to rename the variables:

proc iml;

use D1;

read all var _NUM_ into X[colname=varNames];

close D1;

DSNames = {D2 D3 D4};

do i = 1 to ncol(DSNames);

   use (DSNames[i]);

   read all var _NUM_ into D[colname=v];

   close (DSNames[i]);

   X = X || D;

   varNames = varNames || v;

end;

C = corr(X);

print C[c=varNames r=varNames];


stat_sas
Ammonite | Level 13

This seems tough to do with PROC CORR. Try PROC ANOVA or PROC GLM.
