BookmarkSubscribeRSS Feed
sivaranjani
Fluorite | Level 6

Hello everyone,

I am trying to read into SAS from a Hadoop Apache Hive table and facing the following error.

ERROR: Fetch error: java.lang.RuntimeException: Error retrieving next row: This may be due to very long strings in the actual data
and the use of DBSASTYPE to limit string length. Consider using "SAS_HADOOP_HIVE1815_WORKAROUND=YES" to force Hive fetchSize
to 1. Contact SAS Technical Support for details.

 

Has anyone faced a similar error? The script and table definition is given below

 

The script I am running is this:

proc sql;
connect to hadoop (
uri='url'
server='server_name'
schema=schema_name
DBMAX_TEXT= 256
properties='hive.vectorized.execution.enabled=false'
user=&username
pw=&password
);

create table lib_name.table_1 (compress=yes) as
SELECT * FROM CONNECTION TO hadoop (
select * from table_name);
disconnect from hadoop;
quit;

 

The datatype of all the columns in the hive table is this:

request_idvarchar(50)
header_idvarchar(50)
header_namevarchar(50)
time_stampvarchar(50)
event_type_idvarchar(50)
event_type_namevarchar(150)
event_type_descvarchar(150)
business_namevarchar(150)
first_namevarchar(100)
last_namevarchar(100)
language_prefvarchar(50)
ban_idvarchar(50)
ban_typevarchar(50)
ban_sub_typevarchar(50)
alternative_contact_numbervarchar(50)
product_billing_typevarchar(50)
ported_tnvarchar(50)
tn_statusvarchar(50)
port_typevarchar(50)
port_statusvarchar(50)
line_numbervarchar(50)
desired_due_date_timevarchar(50)
npdivarchar(50)
product_typevarchar(50)
auto_actvarchar(50)
old_network_spvarchar(50)
old_network_sp_namevarchar(150)
new_network_spvarchar(50)
new_network_sp_namevarchar(150)
npqtyvarchar(50)
target_brandvarchar(50)
src_application_idvarchar(50)
src_agent_idvarchar(50)
month_numvarchar(6)

 

4 REPLIES 4
LinusH
Tourmaline | Level 20

I haven't seen this myself.

I wonder how to intgerpret the "in the actual data" part.

Can you confirm the datatypes by using a libref instead? (also, Iooking at your code, pass through is not required).

Data never sleeps
JBailey
Barite | Level 11

Hi @sivaranjani 

 

Hopefully, you are using a supported version of Hadoop (Cloudera, AWS EMR, Hortonworks, etc.). For more information on supported Hadoop distributions check out:

An Insider's Guide to SAS and Database Support Information 

 

The first thing to try is setting the SAS_HADOOP_HIVE1815_WORKAROUND=YES environment variable. 

You can set environment variables in the !SASROOT/bin/sasenv_local file, in the shell before you invoke SAS, or on the invocation by:
$> sas -set SAS_HADOOP_HIVE1815_WORKAROUND YES jobname.sas

Your SAS Administrator will likely need to do this, system wide.
 
Best wishes,
Jeff
sivaranjani
Fluorite | Level 6
Thank you! I will reach out to the SAS admin about this
Grzegorz_klos1
Calcite | Level 5

Have you solved this issue with proposed  -set SAS_HADOOP_HIVE1815_WORKAROUND YES setting? 

I have the same ERROR but when I set this variable it does not help me.  

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 7455 views
  • 2 likes
  • 4 in conversation