BookmarkSubscribeRSS Feed
sivaranjani
Fluorite | Level 6

Hello everyone,

I am trying to read into SAS from a Hadoop Apache Hive table and facing the following error.

ERROR: Fetch error: java.lang.RuntimeException: Error retrieving next row: This may be due to very long strings in the actual data
and the use of DBSASTYPE to limit string length. Consider using "SAS_HADOOP_HIVE1815_WORKAROUND=YES" to force Hive fetchSize
to 1. Contact SAS Technical Support for details.

 

Has anyone faced a similar error? The script and table definition is given below

 

The script I am running is this:

proc sql;
connect to hadoop (
uri='url'
server='server_name'
schema=schema_name
DBMAX_TEXT= 256
properties='hive.vectorized.execution.enabled=false'
user=&username
pw=&password
);

create table lib_name.table_1 (compress=yes) as
SELECT * FROM CONNECTION TO hadoop (
select * from table_name);
disconnect from hadoop;
quit;

 

The datatype of all the columns in the hive table is this:

request_idvarchar(50)
header_idvarchar(50)
header_namevarchar(50)
time_stampvarchar(50)
event_type_idvarchar(50)
event_type_namevarchar(150)
event_type_descvarchar(150)
business_namevarchar(150)
first_namevarchar(100)
last_namevarchar(100)
language_prefvarchar(50)
ban_idvarchar(50)
ban_typevarchar(50)
ban_sub_typevarchar(50)
alternative_contact_numbervarchar(50)
product_billing_typevarchar(50)
ported_tnvarchar(50)
tn_statusvarchar(50)
port_typevarchar(50)
port_statusvarchar(50)
line_numbervarchar(50)
desired_due_date_timevarchar(50)
npdivarchar(50)
product_typevarchar(50)
auto_actvarchar(50)
old_network_spvarchar(50)
old_network_sp_namevarchar(150)
new_network_spvarchar(50)
new_network_sp_namevarchar(150)
npqtyvarchar(50)
target_brandvarchar(50)
src_application_idvarchar(50)
src_agent_idvarchar(50)
month_numvarchar(6)

 

4 REPLIES 4
LinusH
Tourmaline | Level 20

I haven't seen this myself.

I wonder how to intgerpret the "in the actual data" part.

Can you confirm the datatypes by using a libref instead? (also, Iooking at your code, pass through is not required).

Data never sleeps
JBailey
Barite | Level 11

Hi @sivaranjani 

 

Hopefully, you are using a supported version of Hadoop (Cloudera, AWS EMR, Hortonworks, etc.). For more information on supported Hadoop distributions check out:

An Insider's Guide to SAS and Database Support Information 

 

The first thing to try is setting the SAS_HADOOP_HIVE1815_WORKAROUND=YES environment variable. 

You can set environment variables in the !SASROOT/bin/sasenv_local file, in the shell before you invoke SAS, or on the invocation by:
$> sas -set SAS_HADOOP_HIVE1815_WORKAROUND YES jobname.sas

Your SAS Administrator will likely need to do this, system wide.
 
Best wishes,
Jeff
sivaranjani
Fluorite | Level 6
Thank you! I will reach out to the SAS admin about this
Grzegorz_klos1
Calcite | Level 5

Have you solved this issue with proposed  -set SAS_HADOOP_HIVE1815_WORKAROUND YES setting? 

I have the same ERROR but when I set this variable it does not help me.  

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 5535 views
  • 2 likes
  • 4 in conversation