BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JenniferB
Calcite | Level 5

Hi all,

Using the same dataset with no missing values, I get a different value for the Pearson correlation coefficient from SAS Proc Corr than my colleague and I get using the Correl function in Excel.  Does anyone know what might be the reason? 

Thanks

 

proc corr data=all pearson;
var bb;
with aa;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Does your Excel data have "missing" values? If so, how are they represented. Sometimes folks use a 0 or -99 or such in Excel which actually gets included in calculations.

View solution in original post

5 REPLIES 5
Rick_SAS
SAS Super FREQ

Can you provide an example? The following SAS data set gives the same results up to the format being used. 

SAS Output

0.99891

Excel Output

0.998906107

 

Except for the fact that Excel prints more digits, this is the same result.

Excel directions: https://support.office.com/en-us/article/CORREL-function-995dcef7-0c0a-4bed-a3fb-239d7b68ca92

SAS Code:

data all;
input bb aa;
datalines;
3	9
2	. 
4	12
. 	15
6	17
;

proc corr data=all pearson;
var bb;
with aa;
run;
JenniferB
Calcite | Level 5

Thank you for looking into my issue.  We figured out from the below that there was a -99 in the excel file that was causing the problem.

ballardw
Super User

Does your Excel data have "missing" values? If so, how are they represented. Sometimes folks use a 0 or -99 or such in Excel which actually gets included in calculations.

JenniferB
Calcite | Level 5

Thanks for the suggestion.  Upon further examination, indeed it was the -99 that was causing the problem.  Thanks for your help!

Rick_SAS
SAS Super FREQ

Glad you figured it out.

 

Although using -99 to indicate a missing value was popular in the 1970s, it is usually discouraged nowadays because of problems like this. If the spreadsheet contains empty cells, I think most Excel statistical functions correctly treat the empty cell as a missing value.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 2720 views
  • 2 likes
  • 3 in conversation