BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JanVanHeyden
Fluorite | Level 6

Hi,

 

I'm trying to use SAS Studio for data preparation within hadoop. I'm using the libname statement or pass-through querries as 'proc sql'. From my understanding SAS generates hiveql code that does the work within hadoop and simply delivers the answer back to SAS Studio.

 

For 'smaller' tables it works just fine. For 'bigger' datasets I need to do the job directly in HIVE or use a work around. e.g.

 

libname myhive hadoop subprotocol=hive2 port=10000 host="myhost" schema=default user=&u_name. pw=&u_pass.;

data test;

    set myhive.bigtable;  *the table has around 4m rows and 40 columns, which isn't that big really;

run;

 

 

For this statement I get the following error:

ERROR: Prepare error: Error while processing statement: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
SQL statement: SELECT * FROM `bigtable`
 
I guess its something to do with the size of the table. If I use an obs=1000000 statement and read-in the table in 4 datasets á 1m each it works.
 
I get the same error if I use any proc-step or proc-sql step on that data table. If I e.g. want to get some simple proc freq counts I get the same error.
 
Any help, ideas, work arounds would be highly appreciated!
 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
JanVanHeyden
Fluorite | Level 6

Hi,

 

we found the solution.

 

I got 2 slightly different Usernames for HUE and SasStudio. In the libname statement to connect to "myhive" I was using the SAS Studio User and PW. This is of course wrong. You need to set it to the User and PW for HUE. Strangely, however, the connection to hive still works if I used a "wrong" User and PW combination for smaller datasets.

 

View solution in original post

5 REPLIES 5
LinusH
Tourmaline | Level 20

Sounds like from the eror message that you have a problem on the Hive side.

Use

options msglevel=i sastrace=',,,d' sastraceloc=saslog nostsuffix;

to fins out more what SAS/ACCESS transforms to Hive, and how Hive reacts (try first with a small query so you get the error free version).

I take it that you use a commercial SAS license with no data volume restrictions?

Data never sleeps
JanVanHeyden
Fluorite | Level 6

Thanks for the quick answer.

 

I added your options statement and let this statement run again:

 

libname myhive hadoop subprotocol=hive2 port=10000 host="myhost" schema=myhive user=&u_name. pw=&u_pass.;
options msglevel=i sastrace=',,,d' sastraceloc=saslog nostsuffix;

data test; set myhive.bigtable; run;

 

Now I get the following messages:

 

HADOOP_1: Executed: on connection 3
USE `myhive`
 
HADOOP_2: Prepared: on connection 3
SHOW TABLES 'BIGTABLE'
 
HADOOP_3: Prepared: on connection 3
DESCRIBE FORMATTED BIGTABLE
 
HADOOP_4: Prepared: on connection 3
SELECT * FROM `BIGTABLE`
 
ERROR: Prepare error: Error while processing statement: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask
SQL statement: SELECT * FROM `BIGTABLE'
LinusH
Tourmaline | Level 20

What I was trying to say was have a successful execution first (obs=1000000), and see if you can identify any HiveQL issues.

Otherwise, I would try open a track to SAS tech support.

Data never sleeps
JanVanHeyden
Fluorite | Level 6

Looks like no HiveQL issues to me.

 

I guess I will open a track then.

 

Thanks for your help!

JanVanHeyden
Fluorite | Level 6

Hi,

 

we found the solution.

 

I got 2 slightly different Usernames for HUE and SasStudio. In the libname statement to connect to "myhive" I was using the SAS Studio User and PW. This is of course wrong. You need to set it to the User and PW for HUE. Strangely, however, the connection to hive still works if I used a "wrong" User and PW combination for smaller datasets.

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 2303 views
  • 1 like
  • 2 in conversation