<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Averages within a data set in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92803#M26424</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;No, the OP's example data was already sorted.&amp;nbsp; However, to make coding even more complex, even if it weren't sorted, I would have to think that including a sort-via-hash would still be faster.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I'm not recommending the code-your-own approach, just answering the original question of whether data step processing is more efficient (processing-wise) than other methods.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Thu, 11 Oct 2012 22:57:12 GMT</pubDate>
    <dc:creator>art297</dc:creator>
    <dc:date>2012-10-11T22:57:12Z</dc:date>
    <item>
      <title>Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92792#M26413</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;I found the code below for calculating an average.&amp;nbsp; I'm a beginner and would like to know whether this code is optimal for data sources with over 4 million records.&amp;nbsp; The goal is to be able to write a data step that calculates the averages.&amp;nbsp; Again, this is my first time posting, so I might be in the wrong forum.&amp;nbsp; Sorry.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data have; &lt;/P&gt;&lt;P&gt;input id month packs @@;&lt;/P&gt;&lt;P&gt;datalines&lt;/P&gt;&lt;P&gt;1 1 10 1 2 20 1 8 99 1 5 30 1 1 30&lt;/P&gt;&lt;P&gt;2 1 100 2 3 200 2 7 999 2 3 300 2 8 888&lt;/P&gt;&lt;P&gt;3 10 999 3 11 999&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data must;&lt;/P&gt;&lt;P&gt;total = 0;&lt;/P&gt;&lt;P&gt;n = 0;&lt;/P&gt;&lt;P&gt;do until(last.id)&lt;/P&gt;&lt;P&gt;set have;&lt;/P&gt;&lt;P&gt; by id&lt;/P&gt;&lt;P&gt; total + month;&lt;/P&gt;&lt;P&gt; n+ month;&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;if n then average = total / n;&lt;/P&gt;&lt;P&gt; do until (last.id)&lt;/P&gt;&lt;P&gt; set have;&lt;/P&gt;&lt;P&gt; by id;&lt;/P&gt;&lt;P&gt;output;&lt;/P&gt;&lt;P&gt;end;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 19:16:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92792#M26413</guid>
      <dc:creator>SAS_dj1999</dc:creator>
      <dc:date>2012-10-11T19:16:18Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92793#M26414</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Is there a particular reason that you chose not to use PROC SUMMARY or PROC SQL?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Haikuo&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 19:24:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92793#M26414</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2012-10-11T19:24:28Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92794#M26415</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;It depends on whether your data is sorted and whether you need it remerged.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If the data is not sorted and you don't need it remerged, then this is not the most efficient way.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 19:29:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92794#M26415</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2012-10-11T19:29:16Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92795#M26416</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Aren't total and n holding the same value throughout the entire program?&lt;/P&gt;&lt;P&gt;Wouldn't he be better off using PROC MEANS with the MEAN statistic?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 19:52:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92795#M26416</guid>
      <dc:creator>Tal</dc:creator>
      <dc:date>2012-10-11T19:52:54Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92796#M26417</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hai.kuo,&lt;/P&gt;&lt;P&gt;I'd like to have the flexibility to view various variables like:&lt;/P&gt;&lt;TABLE&gt;&lt;TR&gt;&lt;TH&gt;Employee&lt;/TH&gt;&lt;TH&gt;Department&lt;/TH&gt;&lt;TH&gt;Salary&lt;/TH&gt;&lt;TH&gt;Type&lt;/TH&gt;&lt;TH&gt;Number of Employees&lt;/TH&gt;&lt;TH&gt;Employee Average Salary&lt;/TH&gt;&lt;TH&gt;Last year's Average Salary&lt;/TH&gt;&lt;TH&gt;Difference/Ratio change&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;John Doe&lt;/TD&gt;&lt;TD&gt;Finance&lt;/TD&gt;&lt;TD&gt;25000&lt;/TD&gt;&lt;TD&gt;Mid Manager&lt;/TD&gt;&lt;TD&gt;20&lt;/TD&gt;&lt;TD&gt;20000&lt;/TD&gt;&lt;TD&gt;17000&lt;/TD&gt;&lt;TD&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;/TABLE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Plus, using a data step allows me to double-check my work against a SQL procedure.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Vincent&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 21:03:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92796#M26417</guid>
      <dc:creator>SAS_dj1999</dc:creator>
      <dc:date>2012-10-11T21:03:38Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92797#M26418</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;As a beginner, an important lesson to learn is NOT to omit the necessary semicolons.&amp;nbsp; Your code wouldn't even run as posted.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Just as important, do you really want to calculate the average month, or are you really trying to calculate the average packs?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regardless, given that your data is already in ID order, yes, it will run quicker than any of the procs.&amp;nbsp; However, the caveat is that you have to type all of the code.&amp;nbsp; When you are doing multiple and/or more complex calculations, writing the code will probably take you longer than the running time you will save.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 21:06:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92797#M26418</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2012-10-11T21:06:52Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92798#M26419</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Arthur,&lt;/P&gt;&lt;P&gt;This was just an example.&amp;nbsp; I was mainly looking to see whether I could run code like this against a data source that contains millions of records.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 21:29:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92798#M26419</guid>
      <dc:creator>SAS_dj1999</dc:creator>
      <dc:date>2012-10-11T21:29:04Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92799#M26420</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Prior to posting my response, I expanded your example to have 4.5 million records.&amp;nbsp; Your code (with semicolons added) ran faster than proc means/summary and proc sql.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 21:34:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92799#M26420</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2012-10-11T21:34:32Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92800#M26421</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Arthur,&lt;/P&gt;&lt;P&gt;Just to clarify your response, the code above does run faster?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 21:42:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92800#M26421</guid>
      <dc:creator>SAS_dj1999</dc:creator>
      <dc:date>2012-10-11T21:42:44Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92801#M26422</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Yes, the code you originally proposed runs faster.&amp;nbsp; The problem is the time required to write it which, for anything complicated, could easily take longer than the time saved and introduce more opportunity for error.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 22:24:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92801#M26422</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2012-10-11T22:24:29Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92802#M26423</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Including a sort?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 22:39:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92802#M26423</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2012-10-11T22:39:28Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92803#M26424</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;No, the OP's example data was already sorted.&amp;nbsp; However, to make coding even more complex, even if it weren't sorted, I would have to think that including a sort-via-hash would still be faster.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I'm not recommending the code-your-own approach, just answering the original question of whether data step processing is more efficient (processing-wise) than other methods.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 22:57:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92803#M26424</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2012-10-11T22:57:12Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92804#M26425</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Is your data originally in a SAS data set, or are you first reading in a sorted raw data file?&amp;nbsp; If you are starting with the sorted raw data file, you could modify the first DATA step to produce your averages.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 11 Oct 2012 23:32:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92804#M26425</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2012-10-11T23:32:07Z</dc:date>
    </item>
    <item>
      <title>Re: Averages within a data set</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92805#M26426</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;First, Arthur, thanks!&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Astounding, my data is on a SQL server.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 12 Oct 2012 14:23:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Averages-within-a-data-set/m-p/92805#M26426</guid>
      <dc:creator>SAS_dj1999</dc:creator>
      <dc:date>2012-10-12T14:23:26Z</dc:date>
    </item>
  </channel>
</rss>

