Correlation with summed and log transformed variables

This topic is solved and locked.
Posted 08-09-2018 08:17 PM
Hi,

I'm using SAS Studio on a Mac and have a few questions about the Correlation task:

I have a total of 8 analysis variables (below) that I wish to correlate with several other variables.

MRI_mean

ser_lut1

ser_lut2

ser_lut3

ser_lut4

ser_lut5

ser_zea

diet_lut_zea

In addition, I also want to correlate the **sum** of ser_lut1-5, called "ser_total_lut", and the **sum** of ser_lut1 and ser_zea, called "ser_lut1_zea". After submitting the program for correlation analysis, I went to edit the code and tried the function

**ser_total_lut=sum (ser_lut1, ser_lut2, ser_lut3, ser_lut4, ser_lut5)** and

**ser_lut1_zea=sum (ser_lut1, ser_zea)**

after the **var** statement like below:

proc corr data=LIBREF.FILENAME pearson spearman rank plots=matrix(histogram);

var MRI_mean ser_lut1 ser_lut2 ser_lut3 ser_lut4 ser_lut5 **ser_total_lut=sum (ser_lut1, ser_lut2, ser_lut3, ser_lut4, ser_lut5) ser_lut1_zea=sum (ser_lut1, ser_zea)** ser_zea diet_lut_zea;

with OCL1 OCL2 OCL3 OCL4;

run;

I wasn't sure whether or not to add commas between the variables in the parentheses, so I ran it with and without and got the same error message (below). Any help on what I'm doing wrong?

ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, :, _ALL_, _CHARACTER_, _CHAR_, _NUMERIC_.

ERROR 200-322: The symbol is not recognized and will be ignored.

Lastly, I wish to log transform 2 variables (ser_lut1 and ser_lut2) before running the correlation analysis. How would I do that? Where does the code go? Sorry for all the questions... still a newbie here.

THANKS! J

You must create the new variables in a new dataset before calling proc corr. The syntax is

newVar = sum(var1, var2, var3);

or equivalently

newVar = sum(of var1 var2 var3);

PG

Thanks for the assistance. I tried putting the syntax

**ser_total_lut=sum (ser_lut1, ser_lut2, ser_lut3, ser_lut4, ser_lut5);**

**ser_lut1_zea=sum (ser_lut1, ser_zea);**

before the proc corr step as seen below

ods noproctitle;

ods graphics / imagemap=on;

ser_total_lut=sum (ser_lut1, ser_lut2, ser_lut3, ser_lut4, ser_lut5);

ser_lut1_zea=sum (ser_lut1, ser_zea);

proc corr data=WORK.BGA_88 pearson spearman nosimple

plots=matrix(histogram);

var ser_total_lut ser_lut1_zea var1 var2;

with var9 var10;

run;

but there was an ERROR: Statement is not valid or it is used out of proper order.

?

You must create the new variables __in a new dataset__ before calling proc corr.

```
/* Create a new dataset from BGA_88 called BGA_88_new */
data BGA_88_new;
set BGA_88;
ser_total_lut=sum (ser_lut1, ser_lut2, ser_lut3, ser_lut4, ser_lut5);
ser_lut1_zea=sum (ser_lut1, ser_zea);
run;
/* Call proc corr with the new dataset */
proc corr data=BGA_88_new pearson spearman nosimple
plots=matrix(histogram);
var ser_total_lut ser_lut1_zea var1 var2;
with var9 var10;
run;
```
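
One detail worth knowing when building summed variables: the SUM function skips missing arguments, whereas the `+` operator returns a missing value if any operand is missing. The sketch below illustrates the difference on a hypothetical two-row dataset called `demo` (not from the original post); `total_fn`, `total_plus`, and `total_complete` are illustrative names only.

```sas
/* Hypothetical toy data: the second row has a missing ser_lut3 */
data demo;
    input ser_lut1-ser_lut5;
    /* SUM() skips missing arguments; "of ser_lut1-ser_lut5" is a numbered range list */
    total_fn = sum(of ser_lut1-ser_lut5);
    /* The + operator yields missing if any operand is missing */
    total_plus = ser_lut1 + ser_lut2 + ser_lut3 + ser_lut4 + ser_lut5;
    /* NMISS counts missing arguments, so this total is set only for complete rows */
    if nmiss(of ser_lut1-ser_lut5) = 0 then
        total_complete = sum(of ser_lut1-ser_lut5);
    datalines;
1 2 3 4 5
1 2 . 4 5
;
run;

proc print data=demo;
run;
```

For the second row, `total_fn` is 12 while `total_plus` and `total_complete` are missing. Which behavior is appropriate depends on whether a partial total is meaningful for the analysis.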

PG

Great, that worked, thank you! I'm still a novice at SAS and trying to learn all the terminology.

One more thing, could you assist with the syntax for log transforming variables to be used in correlation analysis?

ex:

`ser_lut1`

`ser_lut2`

In the same way, add them to your new dataset:

```
/* Create a new dataset from BGA_88 called BGA_88_new */
data BGA_88_new;
set BGA_88;
ser_total_lut=sum (ser_lut1, ser_lut2, ser_lut3, ser_lut4, ser_lut5);
ser_lut1_zea=sum (ser_lut1, ser_zea);
log_ser_lut1 = log(ser_lut1);
log_ser_lut2 = log(ser_lut2);
run;
```
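
A couple of points may help when using the transformed variables. In SAS, `log()` is the natural logarithm (`log10()` and `log2()` are also available), and the log of a zero or negative value is undefined. The following is a minimal sketch, assuming you prefer to leave such values as missing; `var9` and `var10` are the placeholder names used earlier in this thread, not real variables.

```sas
data BGA_88_new;
    set BGA_88;
    /* Only take the log of positive values; otherwise the new variable stays missing */
    if ser_lut1 > 0 then log_ser_lut1 = log(ser_lut1);
    if ser_lut2 > 0 then log_ser_lut2 = log(ser_lut2);
run;

/* Use the log-transformed variables in the correlation analysis */
proc corr data=BGA_88_new pearson spearman nosimple;
    var log_ser_lut1 log_ser_lut2;
    with var9 var10;
run;
```

Note that Spearman correlations are based on ranks, so a log transform of positive values leaves them unchanged; only the Pearson correlations are affected.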

PG

