HashimBasheer
Fluorite | Level 6

I am trying to integrate SAS with Hadoop. The details are below: the SAS interface to Hive, SAS 9.4, running on the Linux 2.6.32-358.2.1.el6.x86_64 (LIN X64) platform. When I try to create a Hive table using SAS, I get the error below.

libname hdplib hadoop server="XXXXXX" user="XXX" password="$XXXX" port=10001;

data hdplib.class;
  set sashelp.class(obs=10);
run;

HADOOP_10: Prepared: on connection 2

SHOW TABLE EXTENDED LIKE `CLASS`

HADOOP_11: Executed: on connection 2

CREATE TABLE `CLASS` (`Name` STRING,`Sex` STRING,`Age` DOUBLE,`Height` DOUBLE,`Weight` DOUBLE) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY '\012' STORED AS TEXTFILE TBLPROPERTIES ('SAS OS Name'='Linux','SAS Version'='9.04.01M0P06192013','SASFMT:Name'='CHAR(8)','SASFMT:Sex'='CHAR(1)')

ERROR: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hdfs.DistributedFileSystem

NOTE: Validate the contents of the Hadoop configuration file and ensure user permissions are correct.

ERROR: Unable to create stream service from /tmp/sasdata-2013-12-24-00-03-17-513-e-00001.dlv. Use the debug option for more information.

ERROR: Unable to create stream service from /tmp/sasdata-2013-12-24-00-03-17-513-e-00001.dlv. Use the debug option for more information.


Can someone assist in resolving this issue?


12 REPLIES
NaveenSrinivasan
Calcite | Level 5

Hi,

It looks like an installation and configuration issue. I would check with the local administrator first. You might get more and better responses if you run:

proc javainfo picklist 'hadoop/hdoopsasjars.txt';

run;

and post the results, so we can identify whether any JAR files are missing.
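If JAR files are the problem, it also helps to confirm which directory SAS is actually searching. A minimal sketch, assuming SAS_HADOOP_JAR_PATH is set in the environment SAS was started from:

/* Print the directory SAS_HADOOP_JAR_PATH points to. %SYSGET reads an    */
/* environment variable; it must be set before SAS starts, otherwise this */
/* writes a warning to the log.                                           */
%put SAS_HADOOP_JAR_PATH=%sysget(SAS_HADOOP_JAR_PATH);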

Thanks,

Naveen

HashimBasheer
Fluorite | Level 6

Naveen,

Here is the result.

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/sas.hadoop.hivehelper_904000.0.0.20130522190000_v940/sas.hadoop.hivehelper.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/Log4J_1.2.15.0_SAS_20121211183158/log4j.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_beanutils_1.8.2.0_SAS_20121211183319/commons-beanutils.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_collections_3.2.1.0_SAS_20121211183225/commons-collections.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_logging_1.1.1.0_SAS_20121211183202/commons-logging.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/jackson_1.9.7.0_SAS_20121211183158/jackson.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/slf4j_1.5.10.0_SAS_20121211183229/slf4j-api.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/slf4j_1.5.10.0_SAS_20121211183229/slf4j-log4j12.jar

Total URLs: 8

Is there anything missing?

HashimBasheer
Fluorite | Level 6

Also, the JAR files being used are:

guava.jar

hadoop-auth-0.23.1.jar

hadoop-common.jar

hadoop-core.jar

hadoop-hdfs-0.23.5.jar

hadoop-streaming-2.0.0-mr1-cdh4.3.1.jar

hive-exec-0.10.0.jar

hive-jdbc-0.10.0.jar

hive-metastore-0.10.0.jar

hive-service-0.8.1.jar

libfb303-0.7.0.jar

pig.jar

protobuf-java-2.4.1.jar

AhmedAl_Attar
Rhodochrosite | Level 12

Hashim,

While SAS Tech Support are away on holiday, I would suggest you review the following webcasts; they may shed some light on how to get around your issue.

- Getting Started with SAS® and Hadoop
   In this live webinar, SAS technical expert Jeff Bailey covers the basics of SAS and Hadoop, including a section on configuring SAS/ACCESS to Hadoop.

- SAS® Integration with Hadoop: Part II

  Tune in for part two of our series on Hadoop, learning more about SAS integration with Hadoop.

Hope this helps,

Ahmed

HashimBasheer
Fluorite | Level 6

Thanks Ahmed.

We have the connection established between SAS and Hadoop.

libname hdplib hadoop server="CSDFFGF" user="basheerh" password=XXXXXXXXXXXXX port=10001;

NOTE: Libref HDPLIB was successfully assigned as follows:


It seems it's an issue with the configuration.

RMP
SAS Employee

Well, actually, what you have done is make a connection to the Hive server, not to HDFS. I think the problem lies in the fact that SAS cannot connect to HDFS, hence the original error. When you run the original code

libname hdplib hadoop server="XXXXXX" user="XXX" password="$XXXX" port=10001;

data hdplib.class;
  set sashelp.class(obs=10);
run;


I think you will see that the temp file is being written to the local file system and not to HDFS. When Hive then attempts to move the temp file from HDFS into the Hive table, it fails, since the file is not available on the HDFS file system.
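If you want to see where that staging step lands, I believe the LIBNAME engine lets you name the HDFS temp directory explicitly. This is a sketch only; HDFS_TEMPDIR= is my assumption for this release, and the server, credentials, and path are placeholders:

/* Sketch: stage the .dlv temp file in a known HDFS directory so a failed */
/* LOAD DATA is easier to spot. HDFS_TEMPDIR= availability, plus all      */
/* credentials and paths here, are assumptions rather than a verified fix. */
libname hdplib hadoop server="XXXXXX" user="XXX" password="$XXXX" port=10001
        hdfs_tempdir='/tmp/sasstage';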


I see you are not pointing to a configuration file. This file tells SAS where to look for the HDFS and MapReduce components. Perhaps this is the issue.
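One way to hand the cluster configuration to SAS is an environment option; a sketch, assuming your release honors SAS_HADOOP_CONFIG_PATH, with the directory as a placeholder:

/* Sketch: point SAS at a directory holding the cluster's core-site.xml, */
/* hdfs-site.xml, and mapred-site.xml. Both the option's availability in */
/* this release and the path shown are assumptions.                      */
options set=SAS_HADOOP_CONFIG_PATH "/opt/sas/hadoopcfg";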


You will probably face the same issue if you use the FILENAME statement with the Hadoop access method, which emphasizes that this is an HDFS connectivity issue and not a Hive one:


filename out hadoop '/tmp/' user='sasdemo' pass='Orion123' recfm=v lrecl=32167 dir;

data _null_;
  file out(shoes);
  put 'write data to shoes file';
run;

438        filename out hadoop '/tmp' cfg='/tmp/richard.cfg'
439        user='sasinst' pass=XXXXXXXXXX recfm=v lrecl=32167 dir debug;
440        data _null_;
441        file out(shoes4) ;
442        put 'write data to shoes file';
443        run;

ERROR: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.hdfs.DistributedFileSystem


Sorry I can't help further; I am having the exact same issue and have decided to reinstall Hadoop to check whether that is the cause.



HashimBasheer
Fluorite | Level 6

RMP thanks for your response.

Please correct me if I am wrong: with the LIBNAME approach, do we need to point to a config file?

JBailey
Barite | Level 11

Hi Hashim,

The LIBNAME statement doesn't require that you provide a configuration file. You will need to know whether the cluster is running Hive or HiveServer2. If it is running HiveServer2, you will need to include SUBPROTOCOL=hive2 on your LIBNAME statement; getting this wrong results in the LIBNAME statement hanging. You aren't getting that far yet.
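For reference, a HiveServer2 connection would look something like this. A sketch only; the server, credentials, and port are placeholders (10000 is shown because it is the usual HiveServer2 default, not a value from this thread):

/* Sketch: LIBNAME against HiveServer2. Every value here is a placeholder. */
libname hdplib hadoop server="XXXXXX" user="XXX" password="XXXX"
        port=10000 subprotocol=hive2;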

Your JAR files do not match the ones that I use when accessing CDH 4.3.1. The error message you get from the LIBNAME statement leads me to believe this is the issue. Here is my list of JAR files (SAS_HADOOP_JAR_PATH points to a directory containing these JARs; see the sketch after the list for one way to set it).

guava-11.0.2.jar

hadoop-auth-2.0.0-cdh4.3.1.jar

hadoop-common-2.0.0-cdh4.3.1.jar

hadoop-core-2.0.0-mr1-cdh4.3.1.jar

hadoop-hdfs-2.0.0-cdh4.3.1.jar

hive-exec-0.10.0-cdh4.3.1.jar

hive-jdbc-0.10.0-cdh4.3.1.jar

hive-metastore-0.10.0-cdh4.3.1.jar

hive-service-0.10.0-cdh4.3.1.jar

libfb303-0.9.0.jar

pig-0.11.0-cdh4.3.1.jar

protobuf-java-2.4.0a.jar
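A minimal sketch of setting that variable from inside SAS follows; the directory is a placeholder, and many sites set it in the shell or the SAS configuration file before startup instead:

/* Sketch: tell SAS where the Hadoop client JARs live. The path is a */
/* placeholder for wherever you gather the JARs listed above.        */
options set=SAS_HADOOP_JAR_PATH "/opt/sas/hadoopjars";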

Most of the connection issues we see involve incorrect JAR files, Kerberos security (supported by SAS 9.4 only), and MapReduce 2 running on the cluster. We periodically see folks run into configuration issues when running HDFS on a separate machine.

HashimBasheer
Fluorite | Level 6

The issue is fixed. It was caused by the wrong JAR files. We used the JAR files below and everything is working fine.

guava-11.0.2.jar

hadoop-auth-2.0.0-cdh4.3.1.jar

hadoop-common-2.0.0-cdh4.3.1.jar

hadoop-core-2.0.0-mr1-cdh4.3.1.jar

hadoop-hdfs-2.0.0-cdh4.3.1.jar

hive-exec-0.10.0-cdh4.3.1.jar

hive-jdbc-0.10.0-cdh4.3.1.jar

hive-metastore-0.10.0-cdh4.3.1.jar

hive-service-0.10.0-cdh4.3.1.jar

libfb303-0.9.0.jar

pig-0.11.0-cdh4.3.1.jar

protobuf-java-2.4.0a.jar

JBailey
Barite | Level 11

Hi Hashim,

I am very happy you have this sorted out. Have fun!

Best wishes,

Jeff

Sangramjit
Calcite | Level 5

Hi,

I am also getting a similar issue.

The detailed issue is tracked in

ERROR: Error moving data from Hadoop to Hive (LOAD DATA failed).

Please help me sort out this issue.

Thanks,

Sangramjit

fatcat
Calcite | Level 5

Hi, I am trying to connect to Hortonworks using SAS/ACCESS Interface to Hadoop. I compared the JAR files and they look good (the versions are slightly different, as they were provided by the Hadoop admin). I am able to get a klist from the SAS client machine (server) and even SSH to the nodes. But when I run a LIBNAME from SAS EG, I am getting this:

 

libname zhdplib hadoop subprotocol='hive2' server='myserver' schema=myschema user='user123' pwd='test123';

 

ERROR: Unable to connect to the Hive server.
ERROR: Error trying to establish connection.
ERROR: Error in the LIBNAME statement.
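One test that should isolate the Hive JDBC connection is explicit SQL pass-through; this is a sketch only, where the server, credentials, schema, and table name are all placeholders:

/* Sketch: explicit pass-through exercises the same Hive JDBC connection */
/* as the LIBNAME. All identifiers below are placeholders; mytable is    */
/* hypothetical.                                                         */
proc sql;
  connect to hadoop (server='myserver' user='user123' password='test123'
                     schema='myschema' subprotocol='hive2');
  select * from connection to hadoop
    (select * from mytable limit 5);
  disconnect from hadoop;
quit;

If this fails with the same connection error, that would point at the JDBC/Hive layer rather than the library assignment itself.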

 

 

thanks, 

Alex
