<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: calculate average in data step in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592438#M169868</link>
    <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data a;
input product$ sales day$9. week;
cards;
a 100 01-jan-19 1
a 20 01-jan-19 1
b 200 02-jan-19 1
h 100 09-jan-19 2
i 300 09-jan-19 2
;
run;

data want;
 do _n_=1 by 1 until(last.week);
  set a;
  by product week;
  sum=sum(sales,sum);
 end;
 avg=sum/_n_;
run;

proc print noobs;run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Sat, 28 Sep 2019 22:26:10 GMT</pubDate>
    <dc:creator>novinosrin</dc:creator>
    <dc:date>2019-09-28T22:26:10Z</dc:date>
    <item>
      <title>calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592430#M169862</link>
      <description>&lt;P&gt;Hi im calculating average using data step method but unable to do so.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data :&lt;/P&gt;&lt;P&gt;data a;&lt;BR /&gt;input product$ sales day$9. week;&lt;BR /&gt;cards;&lt;BR /&gt;a 100 01-jan-19 1&lt;BR /&gt;a 20 01-jan-19 1&lt;BR /&gt;b 200 02-jan-19 1&lt;BR /&gt;h 100 09-jan-19 2&lt;BR /&gt;i 300 09-jan-19 2&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;however i can do it using proc means:&lt;/P&gt;&lt;P&gt;proc means data=a nway;&lt;BR /&gt;var sales;&lt;BR /&gt;class product;&lt;BR /&gt;output out=s mean=avg;&lt;BR /&gt;;run;&lt;BR /&gt;Results:&lt;BR /&gt;product _type_ _freq_ avg&lt;BR /&gt;a 1 2 60&lt;BR /&gt;b 1 1 200&lt;BR /&gt;h 1 1 100&lt;BR /&gt;i 1 1 300&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to get this output using data step way:&lt;/P&gt;&lt;P&gt;proc sort data=a ; by product week; run;&lt;BR /&gt;data want;&lt;BR /&gt;set a;&lt;BR /&gt;by product week;&lt;BR /&gt;if first.product then sumsales=0;&lt;BR /&gt;sumsales+sales;&lt;BR /&gt;&lt;BR /&gt;if last.product;&lt;BR /&gt;avg=mean(sumsales);&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;the mean function is not working and returning the exact values of sumsales column&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Plz help!!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 28 Sep 2019 21:54:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592430#M169862</guid>
      <dc:creator>adi121</dc:creator>
      <dc:date>2019-09-28T21:54:41Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592433#M169865</link>
      <description>&lt;P&gt;The MEAN function averages variables, not rows. This should do it:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
set a;
by product week;
if first.product then do;
  sumsales=0;
  numsales = 0;
end;
sumsales+sales;
numsales+1;
if last.product then avg=sumsales/numsales;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 28 Sep 2019 22:01:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592433#M169865</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2019-09-28T22:01:40Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592438#M169868</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data a;
input product$ sales day$9. week;
cards;
a 100 01-jan-19 1
a 20 01-jan-19 1
b 200 02-jan-19 1
h 100 09-jan-19 2
i 300 09-jan-19 2
;
run;

data want;
 do _n_=1 by 1 until(last.week);
  set a;
  by product week;
  sum=sum(sales,sum);
 end;
 avg=sum/_n_;
run;

proc print noobs;run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 28 Sep 2019 22:26:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592438#M169868</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-09-28T22:26:10Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592439#M169869</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;Like. But to emulate the mean stat to a tee, i.e. ignore the missing values:&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;numsales &lt;SPAN class="token operator"&gt;+ N(sales) &lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;is more precise than:&lt;/P&gt;
&lt;LI-CODE lang="sas"&gt;numsales&lt;SPAN class="token operator"&gt;+&lt;/SPAN&gt;&lt;SPAN class="token number"&gt;1&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;Also, I find that tasks of this sort yield better to the DoW-loop, e.g.:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;                              
  do until (last.product) ;              
    set a ;                              
    by product ;                         
    numsales = sum (numsales, N(sales)) ;
    sumsales = sum (sumsales, sales) ;   
  end ;                                  
  mean = divide (sumsales, numsales) ;   
run ;                                    
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;because it (a) uses the implicit DATA step actions to reinitialize the aggregates before each BY group and (b) uses one tests for LAST.product instead of two for FIRST.product and LAST.product. The implicit initialization of NUMSALES to missing at the top of the step also guards against the division by 0 in case when all SALES in a BY group are missing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 28 Sep 2019 22:39:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592439#M169869</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-09-28T22:39:13Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592441#M169870</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/21262"&gt;@hashman&lt;/a&gt; - Agreed, but I'd still use PROC MEANS or SQL unless there was some extra requirement justifying doing it on-the-fly in a DATA step. &lt;/P&gt;</description>
      <pubDate>Sat, 28 Sep 2019 22:51:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592441#M169870</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2019-09-28T22:51:28Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592446#M169872</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;'Nuff said! especially since with those, one doesn't have to have missing values explicitly.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 28 Sep 2019 23:56:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592446#M169872</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-09-28T23:56:32Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592466#M169880</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;although all 3 answers are giving desired output but &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/21262"&gt;@hashman&lt;/a&gt;&amp;nbsp;code is simple and giving only 1 row per product .&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the help.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 29 Sep 2019 05:38:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592466#M169880</guid>
      <dc:creator>adi121</dc:creator>
      <dc:date>2019-09-29T05:38:30Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592467#M169881</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/211005"&gt;@adi121&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;'s code is basically the same (sans handling the missing values), and it also gives 1 row per product.&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;&amp;nbsp;code will also output 1 row per product if you add:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;  if last.product ;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;before the RUN statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&amp;nbsp; &amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 29 Sep 2019 06:00:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592467#M169881</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-09-29T06:00:52Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592476#M169882</link>
      <description>&lt;P&gt;Strongly agree with the above: do it in PROC MEANS or PROC SUMMARY instead of writing your own data step code. As we have seen many times in this forum, writing your own code can easily produce wrong results in the presence of missing values (although &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/21262"&gt;@hashman&lt;/a&gt; has avoided this issue, the answers by &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt; and &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt; do not handle missing values properly). PROC MEANS and PROC SUMMARY handle missing values properly, all the time, 24/7. Ice Ice Baby.&lt;/P&gt;</description>
      <pubDate>Sun, 29 Sep 2019 12:00:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592476#M169882</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-09-29T12:00:59Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592497#M169888</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/211005"&gt;@adi121&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;As&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;&amp;nbsp;pointed out, you cannot apply aggregate functions like MEAN across rows in the DATA step. The environment where they naturally work across rows is SQL, so If you want to do that, use it instead; for example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;                   
  input product :$1. sales ;  
cards;                        
A 100                         
A   .                         
A  20                         
B 200                         
B 300                         
B   .                         
B 500                         
H 100                         
H 200                         
I 300                         
;                             
run ;                         
                              
proc sql ;                    
  create table want_sql as    
  select product              
       , N    (sales) as N    
       , sum  (sales) as sum  
       , mean (sales) as mean 
  from   have group product   
  ;                           
quit ;                        
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With respect to the aggregate functions, SQL handles missing values as properly as proc MEANS.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 29 Sep 2019 19:04:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592497#M169888</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-09-29T19:04:31Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592498#M169889</link>
      <description>&lt;P&gt;Guru&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/21262"&gt;@hashman&lt;/a&gt;&amp;nbsp; Neat. Would you &lt;STRONG&gt;rank&amp;nbsp;&lt;/STRONG&gt;SQL approach slightly higher than datastep if you had SQL pass through ACCESS and also make it the gold standard? The reason being the performance is prolly not compromised either?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Trust me for people like you and&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;&amp;nbsp; the question might seem obvious having worked and seen tons but for the benefit of the folks&lt;EM&gt;(like me &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; )&lt;/EM&gt; who think they know everything yet only know to use SAS like a playstation/Xbox console and know nothing beyond a few notes will help.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PS Only when you have a moment and when you can. Cheers!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also I wanted to reach out offline for something else if you remember. I shall touch base on that later. Bye for now!&lt;/P&gt;</description>
      <pubDate>Sun, 29 Sep 2019 19:15:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592498#M169889</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-09-29T19:15:26Z</dc:date>
    </item>
    <item>
      <title>Re: calculate average in data step</title>
      <link>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592509#M169893</link>
      <description>&lt;P&gt;Disciple&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;:&lt;/P&gt;
&lt;P&gt;Naturally, SQL is the choice when getting data from external data bases and/or aggregating it. Normally I'd use explicit pass-through, though I've seen cases when the SAS engine working via implicit pass-through was smarter than me in terms of translating my SAS SQL into the data base specific SQL. I think it's also smart enough to translate simple DATA step logic as well. However, I'm leery of setting any "gold standards" when it comes to programming - there're too many confounding factors and specific circumstances.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Kind regards&lt;/P&gt;
&lt;P&gt;Paul D.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;p.s. Yes, I do remember; thanks for reminding me.&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 29 Sep 2019 22:37:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/calculate-average-in-data-step/m-p/592509#M169893</guid>
      <dc:creator>hashman</dc:creator>
      <dc:date>2019-09-29T22:37:36Z</dc:date>
    </item>
  </channel>
</rss>

