<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Big Data Module 2 - HIVE QL Syntax  Select statement in SAS Academy for Data Science</title>
    <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-HIVE-QL-Syntax-Select-statement/m-p/583762#M427</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;Two questions about the general HiveQL SELECT statement (please see attachment):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Why SELECT ALL? Should this not be SELECT * ...?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. Is the usual ORDER BY now accounted for by a combination of three HiveQL constructs (CLUSTER BY, DISTRIBUTE BY, and SORT BY)?&lt;/P&gt;&lt;P&gt;Parsimony?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;Odesh&lt;/P&gt;</description>
    <pubDate>Sun, 25 Aug 2019 15:39:36 GMT</pubDate>
    <dc:creator>odesh</dc:creator>
    <dc:date>2019-08-25T15:39:36Z</dc:date>
    <item>
      <title>Big Data Module 2 - HIVE QL Syntax  Select statement</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-HIVE-QL-Syntax-Select-statement/m-p/583762#M427</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;Two questions about the general HiveQL SELECT statement (please see attachment):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. Why SELECT ALL? Should this not be SELECT * ...?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. Is the usual ORDER BY now accounted for by a combination of three HiveQL constructs (CLUSTER BY, DISTRIBUTE BY, and SORT BY)?&lt;/P&gt;&lt;P&gt;Parsimony?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;Odesh&lt;/P&gt;</description>
      <pubDate>Sun, 25 Aug 2019 15:39:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-HIVE-QL-Syntax-Select-statement/m-p/583762#M427</guid>
      <dc:creator>odesh</dc:creator>
      <dc:date>2019-08-25T15:39:36Z</dc:date>
    </item>
    <item>
      <title>Re: Big Data Module 2 - HIVE QL Syntax  Select statement</title>
      <link>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-HIVE-QL-Syntax-Select-statement/m-p/584016#M428</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/66330"&gt;@odesh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
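&lt;P&gt;&lt;STRONG&gt;SELECT ALL&lt;/STRONG&gt; is the default. It returns all matching rows, duplicates included. The alternative is to specify DISTINCT (SELECT DISTINCT ...), which removes duplicates. ALL qualifies the rows, not the columns, so it does not replace the *; because it is the default, you seldom see the keyword written out.&amp;nbsp;&lt;/P&gt;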
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;SELECT ALL * -- includes duplicate rows (same as SELECT *)
   FROM mytable;

SELECT DISTINCT * -- removes duplicate rows
   FROM mytable;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The syntax on the slide is the general SELECT syntax from the &lt;A href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select" target="_self"&gt;Hive language manual&lt;/A&gt;, which begins with an optional Common Table Expression (CTE) clause. Since Hive, and Hadoop, are weird, you have more choices for ordering the output.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;ORDER BY&lt;/STRONG&gt; - guarantees a total order across the whole result. If hive.mapred.mode=strict, the ORDER BY clause must be followed by a LIMIT clause; without one, you will get an error.&amp;nbsp;&lt;/P&gt;
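&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A minimal sketch of an ORDER BY that satisfies strict mode. The column names (id, amount) are made up for illustration; only mytable comes from the example above.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;-- id and amount are hypothetical columns
SET hive.mapred.mode=strict;

SELECT id, amount
   FROM mytable
   ORDER BY amount DESC -- total order over the whole result
   LIMIT 10;            -- required after ORDER BY in strict mode&lt;/CODE&gt;&lt;/PRE&gt;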
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
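&lt;P&gt;&lt;STRONG&gt;SORT BY&lt;/STRONG&gt; - similar to ORDER BY, but it sorts the rows before feeding them to each MapReduce reducer, so the rows are ordered only within a reducer. With more than one reducer, the final result may be only partially ordered.&lt;/P&gt;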
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy" target="_self"&gt;SORT BY vs. ORDER BY&lt;/A&gt;&lt;/P&gt;
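&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A sketch of a SORT BY query, with made-up column names (id, amount). Assuming the job runs with more than one reducer, each reducer's output is sorted, but the combined result need not be.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;-- id and amount are hypothetical columns
SELECT id, amount
   FROM mytable
   SORT BY amount DESC; -- ordered within each reducer only&lt;/CODE&gt;&lt;/PRE&gt;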
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
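&lt;P&gt;&lt;STRONG&gt;CLUSTER BY&lt;/STRONG&gt;&amp;nbsp;and &lt;STRONG&gt;DISTRIBUTE BY&lt;/STRONG&gt; - DISTRIBUTE BY controls which reducer each row is sent to (rows with the same DISTRIBUTE BY column values go to the same reducer), and CLUSTER BY is a shortcut for DISTRIBUTE BY plus SORT BY on the same columns. So these clauses do not replace ORDER BY, which is still available. &lt;A href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy" target="_self"&gt;May as well read the rest in the doc&lt;/A&gt;.&lt;/P&gt;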
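&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A rough sketch of how these clauses combine, with made-up column names (customer_id, amount) that you would swap for your own.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;-- customer_id and amount are hypothetical columns

-- All rows for a given customer_id go to the same reducer,
-- and each reducer sorts its rows by customer_id, then amount
SELECT customer_id, amount
   FROM mytable
   DISTRIBUTE BY customer_id
   SORT BY customer_id, amount;

-- Shortcut for DISTRIBUTE BY customer_id SORT BY customer_id
SELECT customer_id, amount
   FROM mytable
   CLUSTER BY customer_id;&lt;/CODE&gt;&lt;/PRE&gt;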
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As with all things Hive/Hadoop, there is nothing like practicing and suffering (unfortunately).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best wishes,&lt;/P&gt;
&lt;P&gt;Jeff&lt;/P&gt;</description>
      <pubDate>Tue, 27 Aug 2019 10:19:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Academy-for-Data-Science/Big-Data-Module-2-HIVE-QL-Syntax-Select-statement/m-p/584016#M428</guid>
      <dc:creator>JBailey</dc:creator>
      <dc:date>2019-08-27T10:19:35Z</dc:date>
    </item>
  </channel>
</rss>

