SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Hadoop connectivity issues

Solved
Occasional Contributor LYT
Posts: 8

Hadoop connectivity issues


Has anyone successfully connected to Hadoop from SAS?

We are having connectivity problems, and a case opened with SAS Tech Support more than a week ago has had no luck so far.

Let me know if you can share your experience and offer some advice.

Thanks!



All Replies
SAS Employee
Posts: 203

Re: Hadoop connectivity issues

SAS/ACCESS to Hadoop connection problems are typically caused by missing JAR files and by not having the SAS_HADOOP_JAR_PATH= environment variable properly set. Can you send me your SAS Tech Support track number?
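For reference, a minimal sketch of what a working setup looks like. The directory, host name, and credentials below are placeholders for illustration, not values from this thread; SAS_HADOOP_JAR_PATH must be set in the environment before SAS starts (e.g., in the shell or sasenv_local).

```
/* Before SAS starts, in the launching shell (placeholder path):
     export SAS_HADOOP_JAR_PATH=/opt/sas/hadoopjars              */

/* Minimal LIBNAME against Hive; server, user, and password
   are placeholders.                                             */
libname myhdp hadoop
  server="hiveserver.example.com"
  port=10000
  schema=default
  user=myuser password=XXXXXXXX;
```

If the variable is set after SAS has started, the engine will not pick it up, which commonly produces the "Unable to create new Java object" errors seen below.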

PDK
N/A
Posts: 1

Re: Hadoop connectivity issues

Hi There,

Sorry, I just landed on this page. I am having issues connecting to Hortonworks Hadoop. I downloaded the JARs (hoping they are correct), but I am still unable to connect.

I downloaded the Hortonworks ODBC driver/connector and I am able to connect to Hive easily, but SAS/ACCESS for Hadoop is unable to connect. If you want me to post this in another area, please let me know.

1st error:

ERROR: java.io.IOException: The filename, directory name, or volume label syntax is incorrect

ERROR: Unable to create new Java object.

ERROR: Error trying to establish connection.

ERROR: Error in the LIBNAME statement.

2nd error:

3    libname hdp hadoop server="hadoop05.nfs.sde.rogersdigitalmedia.com" port=10000 schema=default

4       user=p_dogra password=XXXXXXXXX;

ERROR: java.lang.ClassNotFoundException: com.sas.access.hadoop.hive.HiveHelper

ERROR: Unable to create new Java object.

ERROR: Error trying to establish connection.

ERROR: Error in the LIBNAME statement.

SAS Employee
Posts: 203

Re: Hadoop connectivity issues

Hi,

Can you post a list of the JARs in your SAS_HADOOP_JAR_PATH= directory?

SAS Employee
Posts: 203

Re: Hadoop connectivity issues

Hi PDK,

I meant to add something to my response. I hit reply too quickly.

Is this a new SAS install, or did you upgrade a previous install?

Can you run this code and post the results?

proc javainfo picklist 'hadoop/hdoopsasjars.txt'; run;

And finally, post a list of the JARs in your SAS_HADOOP_JAR_PATH= directory.

Contributor
Posts: 30

Re: Hadoop connectivity issues

Hello,

I am also facing a similar issue. I have posted it here.

https://communities.sas.com/message/192123#192123

The results of proc javainfo picklist 'hadoop/hdoopsasjars.txt'; run; are given below.

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/sas.hadoop.hivehelper_904000.0.0.20130522190000_v940/sas.hadoop.hivehelper.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/Log4J_1.2.15.0_SAS_20121211183158/log4j.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_beanutils_1.8.2.0_SAS_20121211183319/commons-beanutils.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_collections_3.2.1.0_SAS_20121211183225/commons-collections.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_logging_1.1.1.0_SAS_20121211183202/commons-logging.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/jackson_1.9.7.0_SAS_20121211183158/jackson.jar

file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/slf4j_1.5.10.0_SAS_20121211183229/slf4j-api.jar

The JAR files are

guava.jar

hadoop-auth-0.23.1.jar

hadoop-common.jar

hadoop-core.jar

hadoop-hdfs-0.23.5.jar

hadoop-streaming-2.0.0-mr1-cdh4.3.1.jar

hive-exec-0.10.0.jar

hive-jdbc-0.10.0.jar

hive-metastore-0.10.0.jar

hive-service-0.8.1.jar

libfb303-0.7.0.jar

pig.jar

protobuf-java-2.4.1.jar

SAS Employee
Posts: 203

Re: Hadoop connectivity issues

Hashim's problem was getting the correct JAR files. This is a very common problem, and we are looking into making SAS/ACCESS to Hadoop much easier to configure. The solution is detailed here:

https://communities.sas.com/message/192123#192123

I have copied the pertinent information so it is easy to find.

Your JAR files do not match the ones that I use when accessing CDH 4.3.1. The error message you get from the LIBNAME statement leads me to believe this is the issue. Here is my list of JAR files (SAS_HADOOP_JAR_PATH points to a directory containing these JARs).

guava-11.0.2.jar

hadoop-auth-2.0.0-cdh4.3.1.jar

hadoop-common-2.0.0-cdh4.3.1.jar

hadoop-core-2.0.0-mr1-cdh4.3.1.jar

hadoop-hdfs-2.0.0-cdh4.3.1.jar

hive-exec-0.10.0-cdh4.3.1.jar

hive-jdbc-0.10.0-cdh4.3.1.jar

hive-metastore-0.10.0-cdh4.3.1.jar

hive-service-0.10.0-cdh4.3.1.jar

libfb303-0.9.0.jar

pig-0.11.0-cdh4.3.1.jar

protobuf-java-2.4.0a.jar

Most of the connection issues we see involve incorrect JAR files, Kerberos security (supported by SAS 9.4 only), and MapReduce2 running on the cluster. We periodically see folks have configuration issues when running HDFS on a separate machine.
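Once the JAR versions line up with the cluster, a quick smoke test is to assign the libref and list the Hive tables SAS can see. This is a sketch with placeholder connection values, not a specific poster's setup:

```
/* Placeholder server; adjust to your Hive host. */
libname hdp hadoop
  server="hive.example.com"
  port=10000
  schema=default;

/* If the LIBNAME assigns, listing the library members
   confirms the connection end to end. */
proc datasets lib=hdp;
quit;
```

A failure at the LIBNAME step points at JARs or environment; a failure only at the table-listing step points more toward the Hive service or permissions.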

SAS Employee
Posts: 2

Re: Hadoop connectivity issues

Can you please share the error or issues you are experiencing? 

Occasional Contributor LYT
Posts: 8

Re: Hadoop connectivity issues

Thanks so much for your response! The tracking number is [SAS 7611025753] Hadoop connectivity issue.

Here is my SAS_HADOOP_JAR_PATH=/app/sas/Hadoop_JAR_Files:

-bash-3.2$ ls -ltr /app/sas/Hadoop_JAR_Files
total 79352
-rwxr-x--- 1 sas sas 40560640 Jun 17 11:52 jarfiles.tar
drwxr-x--- 3 sas sas     4096 Jun 17 11:52 tmp
-rwxr-xr-x 1 sas sas   449818 Jun 17 11:52 protobuf-java-2.4.0a.jar
-rwxr-xr-x 1 sas sas 25056387 Jun 17 11:52 pig-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas   175982 Jun 17 11:52 libfb303.jar
-rwxr-xr-x 1 sas sas  1488583 Jun 17 11:52 hive-service-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas  3075237 Jun 17 11:52 hive-metastore-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas   116548 Jun 17 11:52 hive-jdbc-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas  4579420 Jun 17 11:52 hive-exec-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas  1643889 Jun 17 11:52 hadoop-hdfs-2.0.0-cdh4.2.0-tests.jar
-rwxr-xr-x 1 sas sas  2266173 Jun 17 11:52 hadoop-common-2.0.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas    46856 Jun 17 11:52 hadoop-auth-2.0.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas  1648200 Jun 17 11:52 guava-11.0.2.jar

I just found out that even my test Java script is not able to connect to Hadoop. Here is my Java connection string:

Connection con = DriverManager.getConnection("jdbc:hive2://kwahdmnc1d002.devlab.dev:10000/default;principal=hive/kwahdmnc1d002-priv.devlab.dev@KWAHDC1D.DEV.FINRA.ORG");

And here are the errors I am getting:

-bash-3.2$ ./test_with_localjars.sh
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
org.apache.thrift.transport.TTransportException: GSS initiate failed
        at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:221)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:297)
        at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
        at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:156)
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:96)
        at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
        at java.sql.DriverManager.getConnection(DriverManager.java:582)
        at java.sql.DriverManager.getConnection(DriverManager.java:207)
        at TestHiveServer2Jdbc.main(TestHiveServer2Jdbc.java:29)
Exception in thread "main" java.sql.SQLException: Could not establish connection to jdbc:hive2://kwahdmnc1d002.devlab.dev:10000/default;principal=hive/kwahdmnc1d002-priv.devlab.dev@KWAHDC1D.DEV.FINRA.ORG: GSS initiate failed
        at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:159)
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:96)
        at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
        at java.sql.DriverManager.getConnection(DriverManager.java:582)
        at java.sql.DriverManager.getConnection(DriverManager.java:207)
        at TestHiveServer2Jdbc.main(TestHiveServer2Jdbc.java:29)

FYI-The way Hadoop is set up here requires Kerberos ticket. We are not allowed to connect to the Name Node. We are asked to use the edge node.

1? proc javainfo;
2? run;
PFS_TEMPLATE = /app/sas/SASFoundation/9.3/misc/tkjava/qrpfstpt.xml
java.class.path = /app/sas/SASVersionedJarRepository/eclipse/plugins/sas.launcher.jar
java.class.version = 50.0
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_21-b06
java.security.auth.login.config = /app/sas/SASFoundation/9.3/misc/tkjava/sas.login.config
java.security.policy = /app/sas/SASFoundation/9.3/misc/tkjava/sas.policy
java.specification.version = 1.6
java.system.class.loader = com.sas.app.AppClassLoader
java.vendor = Sun Microsystems Inc.
java.version = 1.6.0_21
java.vm.name = Java HotSpot(TM) Server VM
java.vm.specification.version = 1.0
java.vm.version = 17.0-b16
sas.app.class.path = /app/sas/SASVersionedJarRepository/eclipse/plugins/tkjava.jar
sas.ext.config = /app/sas/SASFoundation/9.3/misc/tkjava/sas.java.ext.config
tkj.app.launch.config = /app/sas/SASVersionedJarRepository/picklist
user.country = US
user.language = en
NOTE: PROCEDURE JAVAINFO used (Total process time):
      real time          2.92 seconds
      cpu time           0.01 seconds
3? LIBNAME dhdlib HADOOP PORT=10000 SERVER="kwahdenc1d003.devlab.dev"

Let me know if you need additional info.

Solution
07-12-2013 03:52 PM
SAS Employee
Posts: 203

Re: Hadoop connectivity issues

This is the answer to LYT's issue.

SAS 9.3M2's SAS/ACCESS Interface to Hadoop supports only Hive, and Hive does not support Kerberos security. We will support HiveServer2 (which does support Kerberos) with SAS 9.4, which shipped this week.

From the text above.

"FYI-The way Hadoop is set up here requires Kerberos ticket. We are not allowed to connect to the Name Node. We are asked to use the edge node."
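For completeness, a hedged sketch of what a SAS 9.4 connection to a Kerberized HiveServer2 might look like. The host, schema, and principal are placeholders, hive_principal= is my recollection of the 9.4 option name (check the SAS/ACCESS documentation), and a valid Kerberos ticket (e.g., from kinit) is needed before SAS starts:

```
/* SAS 9.4 only: SUBPROTOCOL=HIVE2 selects HiveServer2, and
   HIVE_PRINCIPAL= supplies the Kerberos service principal.
   All values below are placeholders.                        */
libname hdp hadoop
  server="edgenode.example.com"
  port=10000
  schema=default
  subprotocol=hive2
  hive_principal="hive/edgenode.example.com@EXAMPLE.ORG";
```

Note that the server here would be the edge node the site directs users to, not the Name Node.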

Occasional Contributor
Posts: 5

Re: Hadoop connectivity issues

Hi,


The Hadoop cluster in my environment is Kerberos-secured, and SAS 9.4M4 is installed and configured in the environment. I am getting the error below while connecting to Hadoop from SAS Enterprise Guide. Kindly suggest.

libname ephive hadoop server='xxxxx.ffff.dddd' database=default subprotocol=hive2;
HADOOP: Connection to DSN=xxxxx.ffff.dddd failed. 894 1500636702 no_name 0 OBJECT_E
ERROR: java.sql.SQLException: Could not open client transport with JDBC Uri:
jdbc:hive2://xxxxx.ffff.dddd:10000/default;principal=hive/_HOST@xxxxx.dddd: GSS initiate failed
ACCESS ENGINE: Exiting DBICON with rc=0X801F9007 895 1500636702 no_name 0 OBJECT_E
ERROR: Error trying to establish connection.
ERROR: Error in the LIBNAME statement.

Community Manager
Posts: 486

Re: Hadoop connectivity issues

Hi swetawasthisas,

While your reply relates to this thread, it is a slightly different problem, and this thread is already solved. It's best to open a new message on the community so the entire topic stays focused on your scenario.

Many thanks,

Anna

Discussion stats
  • 11 replies
  • 6147 views
  • 0 likes
  • 7 in conversation