New Contributor
Posts: 3

# proc summary calculating mean

Hi all,

I'm wondering why I'm getting different values for the mean using proc summary. Example below:

1) When the input dataset is sorted by variable 'a', I get mean=0.15525

2) When the input dataset is not sorted, I get mean=0.15524999999999

```
data b;
format a best20.;
input a;
datalines;
0.187
0.171
0.183
0.08
;run;

/*proc sort data=b;by a;run;*/

proc summary data=b nway missing noprint ;
var a;
output out = out_b mean=mean ;
run;
```

Posts: 3,294

## Re: proc summary calculating mean

SAS stores numbers as 8-byte floating-point values with roughly 15 significant digits of precision, so these are effectively the same answer. You will drive yourself crazy trying to understand the effects of machine precision.
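
If the two results need to be treated as equal in a program, the usual approach is to compare them after rounding, or within a small tolerance. Here is a minimal sketch of that idea. The dataset name `check`, the variable names, and the tolerance of `1e-12` are illustrative choices and do not come from the original post, and the two literals are simply the values as displayed in the question.

```sas
data check;
  /* the two means as displayed in the original question */
  mean_sorted   = 0.15525;
  mean_unsorted = 0.15524999999999;

  /* a raw equality test is sensitive to the last few bits */
  exact_equal = (mean_sorted = mean_unsorted);

  /* rounding both sides to a multiple of 1e-12 removes that noise */
  rounded_equal = (round(mean_sorted, 1e-12) = round(mean_unsorted, 1e-12));

  /* alternative: accept any difference below an illustrative tolerance */
  within_tol = (abs(mean_sorted - mean_unsorted) < 1e-12);

  put exact_equal= rounded_equal= within_tol=;
run;
```

The right tolerance depends on the magnitude of the data, so `1e-12` should be read as a placeholder rather than a recommendation.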

--
Paige Miller
Posts: 1,849

## Re: proc summary calculating mean

The only difference between the two results is the format.

You sort dataset B, where you added the statement format a best20.;, which results in mean=0.15524999999999.

If you round that result to 8 characters, which is the default, you get the mean=0.155250 (= 0.15525)
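
One way to check how much a format contributes is to print a single stored value with several formats. This is only a sketch: the literal is the value as displayed in the question, used here for illustration, and the choice of formats is an assumption about what is useful to compare.

```sas
data _null_;
  /* illustrative value, copied from the displayed result in the question */
  x = 0.15524999999999;

  /* same stored value, shown with two different display widths */
  put x= best8.;
  put x= best20.;

  /* hex16. shows the underlying 8-byte floating-point representation */
  put x= hex16.;
run;
```

A format changes only how a value is displayed, not the value that is stored, so two results that differ under `hex16.` really are different numbers.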

New Contributor
Posts: 3

## Re: proc summary calculating mean

Thanks for the answer, Shmuel, but:

a) Which step rounds it to 8 characters, and why didn't this 'default' rounding apply in the first scenario?

b) Why do you think the format is different? Dataset B has the best20. format, it still has best20. after sorting, and the output that proc summary creates has best20. as well.

The final dataset 'out_b' has the same best20. format in both scenarios.

SAS Super FREQ
Posts: 508

## Re: proc summary calculating mean

Try this to see just how similar your two results are.  As others correctly pointed out, small floating point differences are common, expected, and do not indicate anything went wrong.

```
data b;
format a best20.;
input a;
datalines;
0.187
0.171
0.183
0.08
;

proc summary data=b nway missing noprint ;
var a;
output out = out_b mean=mean1;
run;

proc sort data=b;by a;run;

proc summary data=b nway missing noprint ;
var a;
output out = out_c mean=mean2;
run;

data all(drop=_:);
merge out_b out_c;
diff = mean1 - mean2;
format _numeric_ 20.18;
run;

proc print; run;
```

```
Obs                   mean1                   mean2                    diff

1     0.155249999999990000    0.155250000000000000    -.000000000000000028
```
New Contributor
Posts: 3

## Re: proc summary calculating mean

Posted in reply to WarrenKuhfeld

Thanks for the answer, WarrenKuhfeld.

I'm not saying that something went wrong or that the difference is huge.

The question is why sorting has any influence on small floating-point differences.

SAS Super FREQ
Posts: 508

## Re: proc summary calculating mean

It changes the order of the floating point arithmetic.  Try fiddling around with programs like this, and you will see that different orders give different results.

```
data x;
x = 100;
x = x + 1/10;
x = x + 1/3;
y = 1/10;
y = y + 1/3;
y = y + 100;
diff = x - y;
format _numeric_ 20.16;
run;

proc print; run;
```
Posts: 3,294

## Re: proc summary calculating mean

m491_2 wrote:

I'm not saying that something went wrong or that the difference is huge.

The question is why sorting has any influence on small floating-point differences.

I doubt SAS is going to release their underlying code to us so we can see how this happens. As I said, I think the whole idea of trying to figure out why machine precision gives one answer in one situation and a different answer in another situation is not worth the time and effort.

--
Paige Miller
Posts: 1,849

## Re: proc summary calculating mean

You are right.

It seems that the sort somehow changes the precision of the data, so that proc summary (and proc means too) calculates the mean as a round value.

By the way, I changed one value from 0.08 to 0.080001 and got the same mean (=0.15525025) before and after the sort.

I have no better answer.

SAS Super FREQ
Posts: 508

## Re: proc summary calculating mean

Intermediate results get stored for each sum.  They can change slightly depending on which numbers get added to which other numbers.  So yes, sorting affects the results.
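
The effect of those intermediate sums can be sketched with the four values from the question, accumulated once in input order and once in sorted order. This assumes plain left-to-right accumulation, which is only a guess at what proc summary does internally, so the step may not reproduce the exact difference shown earlier in the thread. The array and variable names are illustrative.

```sas
data _null_;
  /* the four values from the question: input order and sorted order */
  array unsorted[4] _temporary_ (0.187 0.171 0.183 0.08);
  array sorted[4]   _temporary_ (0.08 0.171 0.183 0.187);

  sum_u = 0;
  sum_s = 0;
  do i = 1 to 4;
    /* each addition rounds its result to the nearest 8-byte double */
    sum_u = sum_u + unsorted[i];
    sum_s = sum_s + sorted[i];
    /* hex16. exposes every bit of each intermediate sum */
    put i= sum_u= hex16. sum_s= hex16.;
  end;

  mean_u = sum_u / 4;
  mean_s = sum_s / 4;
  diff   = mean_u - mean_s;
  put mean_u= hex16. mean_s= hex16. diff= e12.;
run;
```

Because every partial sum is rounded before the next value is added, the two orders can pick up different rounding errors along the way even though the inputs are identical.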

SAS Super FREQ
Posts: 508

## Re: proc summary calculating mean

Posted in reply to WarrenKuhfeld

support.sas.com/resources/papers/proceedings11/275-2011.pdf

http://go.documentation.sas.com/?docsetId=lrcon&docsetTarget=p0ji1unv6thm0dn1gp4t01a1u0g6.htm&docset...

Here are some sources of more information. @PaigeMiller is right though; I would not spend a lot of time worrying about such things.