<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: proc summary calculating mean in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414852#M101680</link>
    <description>&lt;P&gt;Try this to see just how similar your two results are.&amp;nbsp; As others correctly pointed out, small floating point differences are common, expected, and do not indicate anything went wrong.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data b;
   format a best20.;
   input a;
   datalines;
0.187
0.171
0.183
0.08
;
 
proc summary data=b nway missing noprint ;
   var a;
   output out = out_b mean=mean1;
run;

proc sort data=b;by a;run;
 
proc summary data=b nway missing noprint ;
   var a;
   output out = out_c mean=mean2;
run;

data all(drop=_:);
   merge out_b out_c;
   diff = mean1 - mean2;
   format _numeric_ 20.18;
run;   

proc print; run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
  Obs                   mean1                   mean2                    diff

   1     0.155249999999990000    0.155250000000000000    -.000000000000000028
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Mon, 20 Nov 2017 14:37:16 GMT</pubDate>
    <dc:creator>WarrenKuhfeld</dc:creator>
    <dc:date>2017-11-20T14:37:16Z</dc:date>
    <item>
      <title>proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414821#M101672</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm wondering why I'm getting different values for the mean using proc summary - example below:&lt;/P&gt;&lt;P&gt;1) when the input dataset is sorted by variable 'a', I get mean=0.15525&lt;/P&gt;&lt;P&gt;2) when the input dataset is not sorted, I get mean=0.15524999999999&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data b;&lt;BR /&gt;format a best20.;&lt;BR /&gt;input a;&lt;BR /&gt;datalines;&lt;BR /&gt;0.187&lt;BR /&gt;0.171&lt;BR /&gt;0.183&lt;BR /&gt;0.08&lt;BR /&gt;;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/*proc sort data=b;by a;run;*/&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc summary data=b nway missing noprint ;&lt;BR /&gt;var a;&lt;BR /&gt;output out = out_b mean=mean ;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 13:01:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414821#M101672</guid>
      <dc:creator>m491_2</dc:creator>
      <dc:date>2017-11-20T13:01:28Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414836#M101676</link>
      <description>&lt;P&gt;SAS mathematically has about 14 digits of precision, so these are the same answers. You will drive yourself crazy trying to understand the effects of machine precision.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 14:48:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414836#M101676</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-11-20T14:48:51Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414837#M101677</link>
      <description>&lt;P&gt;The only difference between the two results is the format.&lt;/P&gt;
&lt;P&gt;You sort dataset B where you added the &lt;STRONG&gt;format a best20.;&amp;nbsp;&lt;/STRONG&gt;&amp;nbsp;- which results in&amp;nbsp;mean=0.15524999999999.&lt;/P&gt;
&lt;P&gt;If you round that result to 8 characters, which is the default, you get the&amp;nbsp;&lt;SPAN&gt;mean=0.155250 (= 0.15525)&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 13:56:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414837#M101677</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2017-11-20T13:56:19Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414846#M101678</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;P&gt;The only difference between the two results is the format.&lt;/P&gt;&lt;P&gt;You sort dataset B where you added the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;format a best20.;&amp;nbsp;&lt;/STRONG&gt;&amp;nbsp;- which results into&amp;nbsp;mean=0.15524999999999.&lt;/P&gt;&lt;P&gt;If you round that result to 8 characters, which is the default, you get the&amp;nbsp;&lt;SPAN&gt;mean=0.155250 (= 0.15525)&lt;/SPAN&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the answer, Shmuel, but:&lt;/P&gt;&lt;P&gt;a) which step rounds it to 8 characters, and why didn't this 'default' rounding work in the first scenario?&lt;/P&gt;&lt;P&gt;b) why do you think there is a different format? Dataset B has the best20. format,&amp;nbsp;after sorting it still has the best20. format, and when proc summary creates the output it is again best20.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The final dataset 'out_b' has the same best20. format in both scenarios.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 14:23:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414846#M101678</guid>
      <dc:creator>m491_2</dc:creator>
      <dc:date>2017-11-20T14:23:09Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414852#M101680</link>
      <description>&lt;P&gt;Try this to see just how similar your two results are.&amp;nbsp; As others correctly pointed out, small floating point differences are common, expected, and do not indicate anything went wrong.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data b;
   format a best20.;
   input a;
   datalines;
0.187
0.171
0.183
0.08
;
 
proc summary data=b nway missing noprint ;
   var a;
   output out = out_b mean=mean1;
run;

proc sort data=b;by a;run;
 
proc summary data=b nway missing noprint ;
   var a;
   output out = out_c mean=mean2;
run;

data all(drop=_:);
   merge out_b out_c;
   diff = mean1 - mean2;
   format _numeric_ 20.18;
run;   

proc print; run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
  Obs                   mean1                   mean2                    diff

   1     0.155249999999990000    0.155250000000000000    -.000000000000000028
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 20 Nov 2017 14:37:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414852#M101680</guid>
      <dc:creator>WarrenKuhfeld</dc:creator>
      <dc:date>2017-11-20T14:37:16Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414855#M101681</link>
      <description>&lt;P&gt;Thanks for the answer, WarrenKuhfeld.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm not saying that something went wrong or that the difference is huge.&lt;/P&gt;&lt;P&gt;The question is: why does sorting have an influence on small floating point differences?&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 14:55:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414855#M101681</guid>
      <dc:creator>m491_2</dc:creator>
      <dc:date>2017-11-20T14:55:12Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414857#M101682</link>
      <description>&lt;P&gt;Sorting changes the order of the floating point arithmetic.&amp;nbsp; Try fiddling around with programs like this, and you will see that different orders give different results.&lt;/P&gt;
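&lt;P&gt;A minimal sketch of the same idea using the four values from the original question. The dataset name order_demo and the variable names are made up for illustration, and the plain left-to-right sums are only an assumption about how the values get accumulated, not a description of what PROC SUMMARY does internally.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data order_demo;
   /* values added in the order they appear in the question */
   sum_unsorted = 0.187 + 0.171 + 0.183 + 0.08;
   /* same values added in ascending order, as after PROC SORT */
   sum_sorted = 0.08 + 0.171 + 0.183 + 0.187;
   mean_unsorted = sum_unsorted / 4;
   mean_sorted = sum_sorted / 4;
   /* difference between the two means, if any */
   diff = mean_unsorted - mean_sorted;
   format _numeric_ 20.18;
run;

proc print data=order_demo; run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Whether diff comes out nonzero here may depend on the platform. The program below shows the same effect with values of very different magnitudes.&lt;/P&gt;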
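&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data x;
   /* x: start with the large value, then add the small ones */
   x = 100;
   x = x + 1/10;
   x = x + 1/3;
   /* y: the same three terms, small values first */
   y = 1/10;
   y = y + 1/3;
   y = y + 100;
   /* any nonzero diff comes only from the order of the additions */
   diff = x - y;
   format _numeric_ 20.16;
run;

proc print; run;&lt;/CODE&gt;&lt;/PRE&gt;</description>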
      <pubDate>Mon, 20 Nov 2017 15:05:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414857#M101682</guid>
      <dc:creator>WarrenKuhfeld</dc:creator>
      <dc:date>2017-11-20T15:05:00Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414858#M101683</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/173924"&gt;@m491_2&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm not saying that something went wrong or difference is huge.&lt;/P&gt;
&lt;P&gt;Question is why sorting has influence on &amp;nbsp;small floating point differences?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I doubt SAS is going to release their underlying code to us so we can see how this happens. As I said, I think the whole idea of trying to figure out why machine precision gives one answer in one situation and a different answer in another situation is not worth the time and effort.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 15:06:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414858#M101683</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2017-11-20T15:06:01Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414859#M101684</link>
      <description>&lt;P&gt;You are right.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It seems that the sort somehow changes the precision of the data, so that proc summary (and proc means too)&lt;/P&gt;
&lt;P&gt;calculates the mean as a round value.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;By the way, I changed one value from 0.08 to 0.080001&lt;/P&gt;
&lt;P&gt;and got the &lt;STRONG&gt;same&lt;/STRONG&gt; mean value (=0.15525025) before and after the sort.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have no better answer.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 15:07:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414859#M101684</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2017-11-20T15:07:32Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414862#M101686</link>
      <description>&lt;P&gt;Intermediate results get stored for each sum.&amp;nbsp; They can change slightly depending on which numbers get added to which other numbers.&amp;nbsp; So yes, sorting affects the results.&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 15:15:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414862#M101686</guid>
      <dc:creator>WarrenKuhfeld</dc:creator>
      <dc:date>2017-11-20T15:15:45Z</dc:date>
    </item>
    <item>
      <title>Re: proc summary calculating mean</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414915#M101697</link>
      <description>&lt;P&gt;&lt;SPAN&gt;support.sas.com/resources/papers/proceedings11/275-2011.pdf&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;&lt;A href="http://go.documentation.sas.com/?docsetId=lrcon&amp;amp;docsetTarget=p0ji1unv6thm0dn1gp4t01a1u0g6.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_blank"&gt;http://go.documentation.sas.com/?docsetId=lrcon&amp;amp;docsetTarget=p0ji1unv6thm0dn1gp4t01a1u0g6.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Here are some sources of more information.&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;is right though; I would not spend a lot of time worrying about such things.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 20 Nov 2017 18:03:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-summary-calculating-mean/m-p/414915#M101697</guid>
      <dc:creator>WarrenKuhfeld</dc:creator>
      <dc:date>2017-11-20T18:03:37Z</dc:date>
    </item>
  </channel>
</rss>

