08-11-2014 01:03 AM
I have four data sets, D1, D2, D3 and D4, which all share the same variable names. The goal is to compute correlations between the variables of one data set and those of another.
Example: data set D1 has variables: var1 var2 var3
and data set D2 has variables: var1 var2 var3
In order to correlate var1 to var3 of D1 with those of D2, I renamed the variables of D2 to var1_d2, var2_d2 and var3_d2. Then I merged D1 and D2 into a final data set, D_New.
Then I run:
proc corr data = D_New;
with var1_d2 var2_d2 var3_d2;
run;
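For reference, the rename-and-merge step described above could be sketched in a single DATA step by using the RENAME= data set option on D2, so no separate renaming pass is needed. This is only a sketch: the thread does not say how the rows of D1 and D2 are matched, so it assumes they line up one-to-one in the same order. If the data sets share a key, a BY statement on that key (after sorting both data sets) would be needed; the key name `id` in the comment is hypothetical.

```sas
/* Sketch: rename D2's variables while merging, producing D_New.   */
/* Assumes D1 and D2 are aligned row by row (one-to-one merge).    */
data D_New;
    merge D1
          D2(rename=(var1=var1_d2 var2=var2_d2 var3=var3_d2));
    /* If a common key exists (hypothetical name: id), sort both   */
    /* data sets by it and add:  by id;                            */
run;
```

Without the rename, the one-to-one merge would overwrite D1's var1 to var3 with D2's values, which is why the variables need distinct names before PROC CORR can pair them.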
Can anyone please suggest any other way of obtaining correlations across data sets?
08-11-2014 11:09 AM
If the data are 3 GB or larger, the problem is that the code will take a long time to run. I was wondering how I could make it more efficient. Please advise if you have any suggestions.
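One possible way to cut down on I/O for large inputs, offered as a sketch rather than a tested recommendation: read only the variables that are needed with KEEP=, define the merged data as a DATA step view so it is not written to disk, and have PROC CORR write its results to a data set instead of printing them. The names `D_New_v` and `corr_out` are made up for illustration, and the same one-to-one row alignment between D1 and D2 is assumed. Whether this is actually faster depends on the data and the system.

```sas
/* Sketch: a view performs the merge on the fly when PROC CORR     */
/* reads it, so no 3 GB intermediate data set is stored.           */
/* KEEP= is applied before RENAME=, so it uses the original names. */
data D_New_v / view=D_New_v;
    merge D1(keep=var1 var2 var3)
          D2(keep=var1 var2 var3
             rename=(var1=var1_d2 var2=var2_d2 var3=var3_d2));
run;

/* NOPRINT suppresses the listing; OUTP= stores the Pearson        */
/* correlations in the (hypothetical) data set corr_out.           */
proc corr data=D_New_v noprint outp=corr_out;
    var  var1 var2 var3;
    with var1_d2 var2_d2 var3_d2;
run;
```

The VAR and WITH statements restrict the output to the cross-data-set pairs, which avoids computing the within-data-set correlations that are not of interest here.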
08-11-2014 04:21 PM
I agree with you, but if I have data whose background is not known, what would you recommend? Wouldn't it be better to compute correlations for all the variables?