<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Big Data Module 2 MapReducers in SAS Academy for Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591045#M466</link>
    <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/66330"&gt;@odesh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Not sure how the answer could be narrower. Hive Sort By results in a sort of rows within a reducer. If you've got more than one reducer then the data isn't sorted over the whole file but only within the chunks per reducer.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 23 Sep 2019 20:59:14 GMT</pubDate>
    <dc:creator>Patrick</dc:creator>
    <dc:date>2019-09-23T20:59:14Z</dc:date>
    <item>
      <title>Big Data Module 2 MapReducers</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/590670#M457</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;Please refer to the attached question.&lt;/P&gt;&lt;P&gt;I am not sure what the suggested answer means by "SORT BY provides reducer level sorting instead of job level sorting".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I know what ORDER BY means in the context of PROC SQL and DBMSs in general.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;Odesh.&lt;/P&gt;</description>
      <pubDate>Sat, 21 Sep 2019 23:36:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/590670#M457</guid>
      <dc:creator>odesh</dc:creator>
      <dc:date>2019-09-21T23:36:23Z</dc:date>
    </item>
    <item>
      <title>Re: Big Data Module 2 MapReducers</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/590680#M458</link>
      <description>&lt;P&gt;This is not SAS but Hive SQL syntax, so you would need to ask such a question in a Hadoop/Hive forum. But after a bit of Googling, here is what's&amp;nbsp;&lt;A href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy" target="_self"&gt;documented&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;&lt;EM&gt;Difference between Sort By and Order By&lt;/EM&gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;EM&gt;Hive supports SORT BY which sorts the data per reducer. The difference between "order by" and "sort by" is that the former guarantees total order in the output while the latter only guarantees ordering of the rows within a reducer. If there are more than one reducer, "sort by" may give partially ordered final results.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As far as I understand things, Hive SQL gets translated into MapReduce for execution. It appears that Hive SORT BY and ORDER BY will result in different MapReduce code logic.&lt;/P&gt;
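&lt;P&gt;To make the quoted difference concrete, here is a minimal sketch in plain Python (not Hive or MapReduce code) that mimics reducer-level sorting versus a total order. The function names, the modulo partitioning and the reducer count of 2 are assumptions made up for illustration; how Hive actually distributes rows to reducers is not modelled here.&lt;/P&gt;
&lt;PRE&gt;
```python
# Illustrative simulation only: plain Python, not Hive or MapReduce code.
# The modulo partitioning and the reducer count are assumptions for this sketch.


def partition(rows, num_reducers):
    """Distribute rows across reducers (hypothetical modulo partitioning)."""
    buckets = [[] for _ in range(num_reducers)]
    for row in rows:
        buckets[row % num_reducers].append(row)
    return buckets


def sort_by(rows, num_reducers):
    """Mimic SORT BY: each reducer sorts only the rows it received."""
    output = []
    for bucket in partition(rows, num_reducers):
        # Sorted within the reducer, then written out reducer by reducer.
        output.extend(sorted(bucket))
    return output


def order_by(rows):
    """Mimic ORDER BY: one total order over the whole result set."""
    return sorted(rows)


rows = [7, 2, 9, 4, 1, 8]
sort_by_result = sort_by(rows, 2)
order_by_result = order_by(rows)

print(sort_by_result)   # prints [2, 4, 8, 1, 7, 9]: ordered per reducer only
print(order_by_result)  # prints [1, 2, 4, 7, 8, 9]: ordered over all rows
```
&lt;/PRE&gt;
&lt;P&gt;With a single reducer, sort_by(rows, 1) returns the same list as order_by(rows), which matches the documentation's point that partially ordered results only show up when there is more than one reducer.&lt;/P&gt;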
&lt;P&gt;&lt;A href="https://stackoverflow.com/questions/29959845/understanding-the-mapper-and-reducer-in-a-hive-database" target="_self"&gt;Understanding the mapper and reducer in a HIVE database&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 22 Sep 2019 01:39:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/590680#M458</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2019-09-22T01:39:46Z</dc:date>
    </item>
    <item>
      <title>Re: Big Data Module 2 MapReducers</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591004#M463</link>
      <description>Thanks Patrick. Yes, Google does access the general SAS documentation. I was hoping for a more narrowly focused answer.&lt;BR /&gt;&lt;BR /&gt;But thanks again.&lt;BR /&gt;Odesh.&lt;BR /&gt;</description>
      <pubDate>Mon, 23 Sep 2019 17:32:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591004#M463</guid>
      <dc:creator>odesh</dc:creator>
      <dc:date>2019-09-23T17:32:32Z</dc:date>
    </item>
    <item>
      <title>Re: Big Data Module 2 MapReducers</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591045#M466</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/66330"&gt;@odesh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Not sure how the answer could be narrower. Hive Sort By results in a sort of rows within a reducer. If you've got more than one reducer then the data isn't sorted over the whole file but only within the chunks per reducer.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 23 Sep 2019 20:59:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591045#M466</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2019-09-23T20:59:14Z</dc:date>
    </item>
    <item>
      <title>Re: Big Data Module 2 MapReducers</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591147#M467</link>
      <description>Just making sure that I understand the difference between ORDER BY and SORT BY in HiveQL:&lt;BR /&gt;&lt;BR /&gt;1. ORDER BY sorts the entire result set (which can be very resource intensive with a large result set).&lt;BR /&gt;2. SORT BY sorts within each reducer, which should be more efficient in terms of processing time.&lt;BR /&gt;&lt;BR /&gt;Am I correct?&lt;BR /&gt;&lt;BR /&gt;Thanks.&lt;BR /&gt;Odesh.&lt;BR /&gt;</description>
      <pubDate>Tue, 24 Sep 2019 12:30:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591147#M467</guid>
      <dc:creator>odesh</dc:creator>
      <dc:date>2019-09-24T12:30:32Z</dc:date>
    </item>
    <item>
      <title>Re: Big Data Module 2 MapReducers</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591314#M469</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/66330"&gt;@odesh&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Just making sure that I understand the difference between ORDER BY and SORT BY in HiveQL:&lt;BR /&gt;&lt;BR /&gt;1. ORDER BY sorts the entire result set (which can be very resource intensive with a large result set).&lt;BR /&gt;2. SORT BY sorts within each reducer, which should be more efficient in terms of processing time.&lt;BR /&gt;&lt;BR /&gt;Am I correct?&lt;BR /&gt;&lt;BR /&gt;Thanks.&lt;BR /&gt;Odesh.&lt;BR /&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/66330"&gt;@odesh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Yes, that's how I understand what's explained under the links I've posted earlier.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Sep 2019 21:46:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-MapReducers/m-p/591314#M469</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2019-09-24T21:46:08Z</dc:date>
    </item>
  </channel>
</rss>

