Has anyone successfully connected to Hadoop from SAS?
We are having problems with the connectivity and a case was opened with SAS tech support more than a week ago with no luck.
Let me know if you can share your experience and offer some advice.
thanks!
This is the answer to LYT's issue.
SAS 9.3M2's SAS/ACCESS Interface to Hadoop supports only Hive, and Hive does not support Kerberos security. HiveServer2 (which does support Kerberos) is supported starting with SAS 9.4, which shipped this week.
From the text above.
"FYI-The way Hadoop is set up here requires Kerberos ticket. We are not allowed to connect to the Name Node. We are asked to use the edge node."
SAS/ACCESS to Hadoop connection problems are typically caused by missing JAR files and by not having the SAS_HADOOP_JAR_PATH= environment variable properly set. Can you send me your SAS Tech Support track number?
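As a quick illustration (the directory and JAR names below are made up for the demo, not taken from any real install): SAS_HADOOP_JAR_PATH must be exported in the environment that launches SAS, and the directory it names must actually contain the client JARs.

```shell
# Sketch only: SAS reads SAS_HADOOP_JAR_PATH when its JVM starts, so set it
# in the environment that launches SAS (e.g. sasenv_local on UNIX).
# A throwaway directory stands in for a real JAR directory here.
JAR_DIR=$(mktemp -d)
touch "$JAR_DIR/hive-jdbc-0.10.0.jar"   # stand-in for a real client JAR
export SAS_HADOOP_JAR_PATH="$JAR_DIR"

# The common failure mode is the variable pointing at a path with no JARs.
count=$(ls "$SAS_HADOOP_JAR_PATH"/*.jar 2>/dev/null | wc -l)
echo "JARs found: $count"
```

If the count comes back 0 against your real directory, the LIBNAME engine has nothing to load and you will see the "Unable to create new Java object" style errors.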
Hi There,
Sorry, I just landed on this page. I am having issues connecting to Hortonworks Hadoop. I downloaded the JARs (hoping they are the correct ones), but I am still unable to connect.
I downloaded the Hortonworks ODBC driver/connector and am able to connect to Hive easily, but SAS/ACCESS to Hadoop is unable to connect. If you want me to post this in another area, please let me know.
1st error:
ERROR: java.io.IOException: The filename, directory name, or volume label syntax is incorrect
ERROR: Unable to create new Java object.
ERROR: Error trying to establish connection.
ERROR: Error in the LIBNAME statement.
2nd error:
3 libname hdp hadoop server="hadoop05.nfs.sde.rogersdigitalmedia.com" port=10000 schema=default
4 user=p_dogra password=XXXXXXXXX;
ERROR: java.lang.ClassNotFoundException: com.sas.access.hadoop.hive.HiveHelper
ERROR: Unable to create new Java object.
ERROR: Error trying to establish connection.
ERROR: Error in the LIBNAME statement.
Hi,
Can you post a list of the JARs in your SAS_HADOOP_JAR_PATH= directory?
Hi PDK,
I meant to add something to my response. I hit reply too quickly.
Is this a new SAS install, or did you upgrade a previous install?
Can you run this code and post the results?
proc javainfo picklist 'hadoop/hdoopsasjars.txt'; run;
And finally, post a list of the JARs in your SAS_HADOOP_JAR_PATH= directory.
Hello,
I am also facing a similar issue. I have posted it here.
https://communities.sas.com/message/192123#192123
Results of proc javainfo picklist 'hadoop/hdoopsasjars.txt'; run; are given below.
file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/sas.hadoop.hivehelper_904000.0.0.20130522190000_v940/sas.hadoop.hivehelper.jar
file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/Log4J_1.2.15.0_SAS_20121211183158/log4j.jar
file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_beanutils_1.8.2.0_SAS_20121211183319/commons-beanutils.jar
file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_collections_3.2.1.0_SAS_20121211183225/commons-collections.jar
file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/commons_logging_1.1.1.0_SAS_20121211183202/commons-logging.jar
file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/jackson_1.9.7.0_SAS_20121211183158/jackson.jar
file:/home/SASHome/SASVersionedJarRepository/eclipse/plugins/slf4j_1.5.10.0_SAS_20121211183229/slf4j-api.jar
The JAR files are:
guava.jar
hadoop-auth-0.23.1.jar
hadoop-common.jar
hadoop-core.jar
hadoop-hdfs-0.23.5.jar
hadoop-streaming-2.0.0-mr1-cdh4.3.1.jar
hive-exec-0.10.0.jar
hive-jdbc-0.10.0.jar
hive-metastore-0.10.0.jar
hive-service-0.8.1.jar
libfb303-0.7.0.jar
pig.jar
protobuf-java-2.4.1.jar
Hashim's problem was getting the correct JAR files. This is a very common problem and we are looking into making it much easier to configure SAS/ACCESS to Hadoop. The solution is detailed here:
https://communities.sas.com/message/192123#192123
I have copied the pertinent information so it is easy to find.
Your JAR files do not match the ones that I use when accessing CDH 4.3.1. The error message you get from the LIBNAME statement leads me to believe this is the issue. Here is my list of JAR files (SAS_HADOOP_JAR_PATH points to a directory containing these JARs).
guava-11.0.2.jar
hadoop-auth-2.0.0-cdh4.3.1.jar
hadoop-common-2.0.0-cdh4.3.1.jar
hadoop-core-2.0.0-mr1-cdh4.3.1.jar
hadoop-hdfs-2.0.0-cdh4.3.1.jar
hive-exec-0.10.0-cdh4.3.1.jar
hive-jdbc-0.10.0-cdh4.3.1.jar
hive-metastore-0.10.0-cdh4.3.1.jar
hive-service-0.10.0-cdh4.3.1.jar
libfb303-0.9.0.jar
pig-0.11.0-cdh4.3.1.jar
protobuf-java-2.4.0a.jar
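If it helps anyone comparing their directory against this list by hand, a shell sketch like the following reports which expected JARs are absent. The demo directory here is artificial (it contains only one JAR); point the check at your real SAS_HADOOP_JAR_PATH directory instead.

```shell
# Expected CDH 4.3.1 client JARs, taken from the list above.
expected="guava-11.0.2.jar
hadoop-auth-2.0.0-cdh4.3.1.jar
hadoop-common-2.0.0-cdh4.3.1.jar
hadoop-core-2.0.0-mr1-cdh4.3.1.jar
hadoop-hdfs-2.0.0-cdh4.3.1.jar
hive-exec-0.10.0-cdh4.3.1.jar
hive-jdbc-0.10.0-cdh4.3.1.jar
hive-metastore-0.10.0-cdh4.3.1.jar
hive-service-0.10.0-cdh4.3.1.jar
libfb303-0.9.0.jar
pig-0.11.0-cdh4.3.1.jar
protobuf-java-2.4.0a.jar"

# Demo directory with only one JAR present; substitute your real
# "$SAS_HADOOP_JAR_PATH" directory when running this for real.
JAR_DIR=$(mktemp -d)
touch "$JAR_DIR/guava-11.0.2.jar"

# Report every expected JAR that is not in the directory.
missing=""
for jar in $expected; do
    if [ ! -f "$JAR_DIR/$jar" ]; then
        missing="$missing $jar"
        echo "missing: $jar"
    fi
done
```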
Most of the connection issues we see involve incorrect JAR files, Kerberos security (supported by SAS 9.4 only), and MapReduce2 running on the cluster. We periodically see folks have configuration issues when running HDFS on a separate machine.
Can you please share the error or issues you are experiencing?
Thanks so much for your response! The tracking number is [SAS 7611025753] Hadoop connectivity issue.

Here is my SAS_HADOOP_JAR_PATH=/app/sas/Hadoop_JAR_Files:

-bash-3.2$ ls -ltr /app/sas/Hadoop_JAR_Files
total 79352
-rwxr-x--- 1 sas sas 40560640 Jun 17 11:52 jarfiles.tar
drwxr-x--- 3 sas sas 4096 Jun 17 11:52 tmp
-rwxr-xr-x 1 sas sas 449818 Jun 17 11:52 protobuf-java-2.4.0a.jar
-rwxr-xr-x 1 sas sas 25056387 Jun 17 11:52 pig-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas 175982 Jun 17 11:52 libfb303.jar
-rwxr-xr-x 1 sas sas 1488583 Jun 17 11:52 hive-service-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas 3075237 Jun 17 11:52 hive-metastore-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas 116548 Jun 17 11:52 hive-jdbc-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas 4579420 Jun 17 11:52 hive-exec-0.10.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas 1643889 Jun 17 11:52 hadoop-hdfs-2.0.0-cdh4.2.0-tests.jar
-rwxr-xr-x 1 sas sas 2266173 Jun 17 11:52 hadoop-common-2.0.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas 46856 Jun 17 11:52 hadoop-auth-2.0.0-cdh4.2.0.jar
-rwxr-xr-x 1 sas sas 1648200 Jun 17 11:52 guava-11.0.2.jar

I just found out that even my test Java script is not able to connect to Hadoop. Here is my Java connection string:

Connection con = DriverManager.getConnection("jdbc:hive2://kwahdmnc1d002.devlab.dev:10000/default;principal=hive/kwahdmnc1d002-priv.devlab.dev@KWAHDC1D.DEV.FINRA.ORG");

And here are the errors I am getting:

-bash-3.2$ ./test_with_localjars.sh
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
org.apache.thrift.transport.TTransportException: GSS initiate failed
  at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:221)
  at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:297)
  at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
  at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
  at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:156)
  at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:96)
  at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
  at java.sql.DriverManager.getConnection(DriverManager.java:582)
  at java.sql.DriverManager.getConnection(DriverManager.java:207)
  at TestHiveServer2Jdbc.main(TestHiveServer2Jdbc.java:29)
Exception in thread "main" java.sql.SQLException: Could not establish connection to jdbc:hive2://kwahdmnc1d002.devlab.dev:10000/default;principal=hive/kwahdmnc1d002-priv.devlab.dev@KWAHDC1D.DEV.FINRA.ORG: GSS initiate failed
  at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:159)
  at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:96)
  at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
  at java.sql.DriverManager.getConnection(DriverManager.java:582)
  at java.sql.DriverManager.getConnection(DriverManager.java:207)
  at TestHiveServer2Jdbc.main(TestHiveServer2Jdbc.java:29)

FYI - The way Hadoop is set up here requires a Kerberos ticket. We are not allowed to connect to the Name Node. We are asked to use the edge node.

1? proc javainfo;
2? run;
PFS_TEMPLATE = /app/sas/SASFoundation/9.3/misc/tkjava/qrpfstpt.xml
java.class.path = /app/sas/SASVersionedJarRepository/eclipse/plugins/sas.launcher.jar
java.class.version = 50.0
java.runtime.name = Java(TM) SE Runtime Environment
java.runtime.version = 1.6.0_21-b06
java.security.auth.login.config = /app/sas/SASFoundation/9.3/misc/tkjava/sas.login.config
java.security.policy = /app/sas/SASFoundation/9.3/misc/tkjava/sas.policy
java.specification.version = 1.6
java.system.class.loader = com.sas.app.AppClassLoader
java.vendor = Sun Microsystems Inc.
java.version = 1.6.0_21
java.vm.name = Java HotSpot(TM) Server VM
java.vm.specification.version = 1.0
java.vm.version = 17.0-b16
sas.app.class.path = /app/sas/SASVersionedJarRepository/eclipse/plugins/tkjava.jar
sas.ext.config = /app/sas/SASFoundation/9.3/misc/tkjava/sas.java.ext.config
tkj.app.launch.config = /app/sas/SASVersionedJarRepository/picklist
user.country = US
user.language = en
NOTE: PROCEDURE JAVAINFO used (Total process time):
      real time 2.92 seconds
      cpu time 0.01 seconds
3? LIBNAME dhdlib HADOOP PORT=10000 SERVER="kwahdenc1d003.devlab.dev"

Let me know if you need additional info.
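For what it's worth, "GSS initiate failed" generally means the connecting process holds no valid Kerberos ticket in its credential cache. A minimal sketch of the check, run as the same OS account that starts SAS (the principal shown is an example, not your site's):

```shell
# klist -s is silent and just sets the exit code: 0 if a valid,
# non-expired ticket exists in the credential cache, non-zero otherwise.
if klist -s 2>/dev/null; then
    msg="valid Kerberos ticket present"
else
    msg="no ticket - run kinit first, e.g.: kinit user@EXAMPLE.ORG"
fi
echo "$msg"
```

If the account has no ticket, the same GSS failure will appear in any JDBC client, not just SAS, which matches the standalone Java test failing too.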
Hi,
The Hadoop cluster in my environment is Kerberos-secured, and SAS 9.4M4 is installed and configured in the environment. I am getting the error below while connecting to Hadoop from SAS Enterprise Guide. Kindly suggest.
libname ephive hadoop server='xxxxx.ffff.dddd' database=default subprotocol=hive2;
HADOOP: Connection to DSN=xxxxx.ffff.dddd failed. 894 1500636702 no_name 0 OBJECT_E
ERROR: java.sql.SQLException: Could not open client transport with JDBC Uri:
jdbc:hive2://xxxxx.ffff.dddd:10000/default;principal=hive/_HOST@xxxxx.dddd: GSS initiate failed
ACCESS ENGINE: Exiting DBICON with rc=0X801F9007 895 1500636702 no_name 0 OBJECT_E
ERROR: Error trying to establish connection.
ERROR: Error in the LIBNAME statement.
Hi swetawasthisas,
While your reply relates to this thread, it is a slightly different problem and this thread is already solved. It's best to open up a New Message on the community to keep the entire topic focused on your scenario.
Many thanks,
Anna