<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Odd Error When Running Hive Queries in SAS in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/Odd-Error-When-Running-Hive-Queries-in-SAS/m-p/461761#M14338</link>
    <description>Just go be clear, you mean explicit pass through, right? Since I dont see any ...from connection to/execute (...) by in your code. &lt;BR /&gt;If you do a create table in an execute block nothing really pulled to SAS. And nothing here has actually nothing to do with SAS.&lt;BR /&gt;If your end game is to pull data to SAS I don't see the benefit of tampering with formats in Hive. That is simply done in SAS at report time, and requires almost no resources.</description>
    <pubDate>Sat, 12 May 2018 07:21:09 GMT</pubDate>
    <dc:creator>LinusH</dc:creator>
    <dc:date>2018-05-12T07:21:09Z</dc:date>
    <item>
      <title>Odd Error When Running Hive Queries in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Odd-Error-When-Running-Hive-Queries-in-SAS/m-p/461734#M14337</link>
      <description>&lt;P&gt;&lt;STRONG&gt;Environment:&lt;/STRONG&gt;&lt;BR /&gt;I'm running&amp;nbsp;SAS (r) Proprietary Software Release 9.4&amp;nbsp; TS1M4.&amp;nbsp; I'm executing, as I understand it, on a Linux box hosted by another company.&amp;nbsp; I don't have command line access, so I can't do my usual tricks to determine what exactly I'm running on, but it's in the UNIX/Linux family.&amp;nbsp; When I check the server properties in SAS EG, I see OS=Linux OS Family=Lin x64 OS Version 3.10.0-693.17.1el7x86_64.&lt;BR /&gt;&lt;BR /&gt;I'm pulling data from a Hadoop Cluster using Hive QL executed as a pass through query from a SAS program running under SAS EG 7.1.5 HF3..&amp;nbsp; We're using the Cloudera distribution of Hadoop.&amp;nbsp; I want to format a number into a string.&amp;nbsp; For display purposes, I want to put commas in the number.&amp;nbsp; For example:&amp;nbsp; 999999999 becomes 999,999,999 when formatted.&amp;nbsp; Looking into the Apache Hadoop documentation (&lt;A href="https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF" target="_blank"&gt;https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF&lt;/A&gt;), I see a function called format_number.&amp;nbsp; Seems like a winner.&amp;nbsp; It wants a number and an integer as arguments.&amp;nbsp; The integer specifies the number of decimal places.&amp;nbsp; It returns a string, supposedly formatted with commas separating the digits in the standard fashion.&lt;BR /&gt;&lt;BR /&gt;However, when I use the function in a query, I can't get it to run.&amp;nbsp; I get the following "highly descriptive" error message:&lt;BR /&gt;&lt;FONT face="courier new,courier" color="#FF0000"&gt;ERROR: Execute error: Error while processing statement: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask&lt;/FONT&gt;&lt;BR /&gt;&lt;BR /&gt;The query is included, below.&amp;nbsp; The query runs fine if I remove the line that has the format_number function. &lt;BR /&gt;&lt;BR /&gt;My guess, and it's not much more than that, is that this is some kind of a memory issue, but I'm not really sure.&amp;nbsp; I've done a Google search on the error message, but it seems to be fairly generic and not indicative of any one specific issue if I'm understanding what I'm reading correctly.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;Has anyone encountered this before?&amp;nbsp; Can anyone shed any light on this?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Jim&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;	CREATE	TABLE	&amp;amp;Target_DB..&amp;amp;Prefix.&amp;amp;Source_Table._&amp;amp;Secondary_Table.&amp;amp;Version		AS
			SELECT	
				Primary.consumer_key
				,Primary.transaction_key
				,record_ident
				,record_len
				,hr_profiledate
				,hr_mkey_len
				,hr_mkey_text
				,tr_spccmt
				,tr_evaluation
				,tr_opendate
				,tr_statdate
				,tr_type
				,tr_terms
				,f_tr_trms
				,tr_amount1
				,format_number(f_tr_amt1_n,	0)	AS	f_tr_amt1
				,f_tr_amt1_n
				,tr_amount1_qual
			FROM	&amp;amp;Target_DB..&amp;amp;Prefix.&amp;amp;Source_Table.&amp;amp;Version				Primary
			FULL	JOIN	&amp;amp;Target_DB..&amp;amp;Prefix.&amp;amp;Secondary_Table.&amp;amp;Version	Secondary
				ON	Primary.consumer_key		=	Secondary.consumer_key
				AND	Primary.transaction_key		=	Secondary.transaction_key&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 12 May 2018 00:41:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Odd-Error-When-Running-Hive-Queries-in-SAS/m-p/461734#M14337</guid>
      <dc:creator>jimbarbour</dc:creator>
      <dc:date>2018-05-12T00:41:37Z</dc:date>
    </item>
    <item>
      <title>Re: Odd Error When Running Hive Queries in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Odd-Error-When-Running-Hive-Queries-in-SAS/m-p/461761#M14338</link>
      <description>Just go be clear, you mean explicit pass through, right? Since I dont see any ...from connection to/execute (...) by in your code. &lt;BR /&gt;If you do a create table in an execute block nothing really pulled to SAS. And nothing here has actually nothing to do with SAS.&lt;BR /&gt;If your end game is to pull data to SAS I don't see the benefit of tampering with formats in Hive. That is simply done in SAS at report time, and requires almost no resources.</description>
      <pubDate>Sat, 12 May 2018 07:21:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Odd-Error-When-Running-Hive-Queries-in-SAS/m-p/461761#M14338</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2018-05-12T07:21:09Z</dc:date>
    </item>
    <item>
      <title>Re: Odd Error When Running Hive Queries in SAS</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Odd-Error-When-Running-Hive-Queries-in-SAS/m-p/461777#M14340</link>
      <description>&lt;P&gt;Actually, I think you're right.&amp;nbsp; The best idea is to maintain the data as INT in Hive and then format it as needed only when it is to be displayed.&amp;nbsp; It makes little sense (to me) to store the data in both display format and as INT.&amp;nbsp;&amp;nbsp;It's poor practice.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, I've been asked to essentially replicate in Hive an existing SAS dataset&amp;nbsp;such that the new Hive data is "backward compatible" for old programs.&amp;nbsp; The idea that we can "unplug" an existing SAS program and then "plug" said program into Hive seems simplistic to me.&amp;nbsp; I think it's going to take some serious re-design.&amp;nbsp; Running 4.5 billion rows through a SAS/Access LIBNAME from Hive into a SAS program just isn't going to work, at least from what I can see.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'm still curious, though, why the format_number function shuts my query down.&amp;nbsp; I've all but verbatim copied the syntax from the manual.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Oh, and, yes.&amp;nbsp; By "pass through" I meant "explicit pass through".&amp;nbsp; I redacted the pass through "wrapper" since I know the CONNECT TO/EXECUTE/DISCONNECT wrapper is working fine.&amp;nbsp; It is the query itself that is causing the error and that only when I try to use the format_number function.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Jim&lt;/P&gt;</description>
      <pubDate>Sat, 12 May 2018 14:55:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Odd-Error-When-Running-Hive-Queries-in-SAS/m-p/461777#M14340</guid>
      <dc:creator>jimbarbour</dc:creator>
      <dc:date>2018-05-12T14:55:24Z</dc:date>
    </item>
  </channel>
</rss>

