Hi, I'm fairly new to querying in Hadoop. When I run a query using SQL pass-through, I get an out-of-memory error. Does anyone know of an option I could use to get around this, or have a suggestion other than breaking my query into separate parts?

There was no error when I limited the query time frame to 4 years (from date, to date). However, when I broadened it to 7 years, I got the out-of-memory error.

Below is the part of the code that scans through hundreds of millions of records. It is resource-intensive, and I believe it is what causes the error.

LEFT JOIN
(SELECT
CRNT_BAL_AMT,eff_from_dt ,eff_TO_dt,ACCT_ID ,PRD_CD
FROM TRANS_HIST
Where eff_to_dt>='2010-12-31' and eff_to_dt < '9999-12-31'
) g ON
A.ACCT_ID=g.ACCT_ID AND date_sub(A.txn_dt,1)=g.EFF_TO_DT
LEFT JOIN
(SELECT
CRNT_BAL_AMT,eff_from_dt ,eff_TO_dt,ACCT_ID ,PRD_CD
FROM TRANS_HIST
Where eff_to_dt>='2011-01-01' and eff_to_dt < '9999-12-31'
) h ON
A.ACCT_ID=h.ACCT_ID AND date_sub(g.eff_from_dt,1)=h.EFF_TO_DT
ORDER BY a.ACCT_ID)
;
DISCONNECT FROM hadcon;
quit;

Error message:

ERROR: Prepare error: Error while processing statement: FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1517179256891_507352_1_09,
diagnostics=[Task failed, taskId=task_1517179256891_507352_1_09_000003, diagnostics=[TaskAttempt 0 failed, info=[Error:
Failure while running task:java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)

Thanks in advance.