Solved
New Contributor
Posts: 3

# Pearson correlation coefficient in SAS vs Excel

Hi all,

Using the same dataset with no missing values, I get a different value for the Pearson correlation coefficient from SAS Proc Corr than my colleague and I get using the Correl function in Excel.  Does anyone know what might be the reason?

Thanks

proc corr data=all pearson;
var bb;
with aa;

run;

Accepted Solutions
Solution
‎10-25-2016 11:18 AM
Super User
Posts: 11,752

## Re: Pearson correlation coefficient in SAS vs Excel

Does your Excel data have "missing" values? If so, how are they represented. Sometimes folks use a 0 or -99 or such in Excel which actually gets included in calculations.

All Replies
SAS Super FREQ
Posts: 3,831

## Re: Pearson correlation coefficient in SAS vs Excel

Can you provide an example? The following SAS data set gives the same results up to the format being used.

SAS Output

 0.99891

Excel Output

 0.998906

Except for the fact that Excel prints more digits, this is the same result.

SAS Code:

``````data all;
input bb aa;
datalines;
3	9
2	.
4	12
. 	15
6	17
;

proc corr data=all pearson;
var bb;
with aa;
run;``````
New Contributor
Posts: 3

## Re: Pearson correlation coefficient in SAS vs Excel

Thank you for looking into my issue.  We figured out from the below that there was a -99 in the excel file that was causing the problem.

Solution
‎10-25-2016 11:18 AM
Super User
Posts: 11,752

## Re: Pearson correlation coefficient in SAS vs Excel

Does your Excel data have "missing" values? If so, how are they represented. Sometimes folks use a 0 or -99 or such in Excel which actually gets included in calculations.

New Contributor
Posts: 3

## Re: Pearson correlation coefficient in SAS vs Excel

Thanks for the suggestion.  Upon further examination, indeed it was the -99 that was causing the problem.  Thanks for your help!

SAS Super FREQ
Posts: 3,831

## Re: Pearson correlation coefficient in SAS vs Excel

Although using -99 to indicate a missing value was popular in the 1970s, it is usually discouraged nowadays because of problems like this. If the spreadsheet contains empty cells, I think most Excel statistical functions correctly treat the empty cell as a missing value.

☑ This topic is solved.

Discussion stats
• 5 replies
• 419 views
• 2 likes
• 3 in conversation