08-12-2014 02:28 PM

I have four data sets, D1, D2, D3, and D4, that share the same variable names. The goal is to compute correlations between the variables of one data set and those of another.

Example: data set D1 has variables: var1 var2 var3

and dataset D2 has variables: var1 var2 var3

In order to correlate var1 through var3 of D1 with those of D2, I renamed the variables of D2 to var1_d2, var2_d2, and var3_d2. I then merged D1 and D2 into a final data set, D_New, and ran:

proc corr data=D_New;
   var var1;
   with var1_d2 var2_d2 var3_d2;
run;

Can anyone please suggest any other way of obtaining correlations across data sets?

08-12-2014 03:12 PM

Did you intend to post this to the SAS/IML Support Community?

I assume the data sets all have the same number of observations and they are all ordered the same way so that the i_th observation of each data set is related to the i_th of the other data sets. If so, what you are doing is fine.

You can do the analysis in a single call: rename the variables, merge the data sets into one, and then call PROC CORR. The resulting correlation matrix is a block matrix. The diagonal blocks contain the correlations within each data set. The off-diagonal blocks contain the correlations between data sets.
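As a rough sketch of that single-call approach, assuming each data set contains only var1 through var3 and that the rows are already aligned by position, something like the following might work. The data set name D_All and the _d2, _d3, _d4 suffixes are placeholders chosen for illustration.

```sas
/* One-to-one merge (no BY statement): rows are matched by position, */
/* so all four data sets must have the same order and length.        */
/* D_All and the suffixed names are hypothetical.                    */
data D_All;
   merge D1
         D2(rename=(var1=var1_d2 var2=var2_d2 var3=var3_d2))
         D3(rename=(var1=var1_d3 var2=var2_d3 var3=var3_d3))
         D4(rename=(var1=var1_d4 var2=var2_d4 var3=var3_d4));
run;

/* With no VAR statement, PROC CORR uses every numeric variable, */
/* which produces the full block correlation matrix.             */
proc corr data=D_All;
run;
```

If only the correlations between D1 and the other data sets are needed, a VAR statement listing the D1 variables and a WITH statement listing the renamed ones restricts the output to that rectangular block.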

If you use PROC IML, you can just horizontally concatenate the data and take the correlation. You don't even need to rename the variables:

proc iml;
/* read all numeric variables of D1 */
use D1;
read all var _NUM_ into X[colname=varNames];
close D1;

/* append the numeric variables of the remaining data sets */
DSNames = {D2 D3 D4};
do i = 1 to ncol(DSNames);
   use (DSNames[i]);
   read all var _NUM_ into D[colname=v];
   close (DSNames[i]);
   X = X || D;                 /* horizontal concatenation */
   varNames = varNames || v;
end;

C = corr(X);
print C[c=varNames r=varNames];
quit;
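To pull out just the between-data-set correlations, you can subscript the off-diagonal block of the full matrix. The following is a minimal sketch on made-up matrices X1 and X2 (hypothetical stand-ins for the numeric columns of two data sets), not on the poster's data:

```sas
proc iml;
/* toy data: 4 observations, 2 variables per "data set" (made up) */
X1 = {1 2, 2 1, 3 4, 4 3};
X2 = {2 5, 1 7, 4 6, 3 8};

C = corr(X1 || X2);               /* full block correlation matrix */
p = ncol(X1);                     /* number of variables in first set */
C12 = C[1:p, (p+1):ncol(C)];      /* rows: X1 variables, cols: X2 variables */
print C12;
quit;
```

The same subscripting applies to the matrix C computed from D1 through D4, with the row and column ranges chosen according to how many numeric variables each data set contributes.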

08-12-2014 03:14 PM

This seems tough to do using PROC CORR. Try PROC ANOVA or PROC GLM.