mich1
Obsidian | Level 7

Does anyone know of users who install/configure Hadoop, Hive, and Pig directly from Apache and then successfully connect the SAS/ACCESS Interface to Hadoop to them? Can SAS connect to a stand-alone Hadoop cluster using PROC HADOOP? It seems SAS/ACCESS Interface to Hadoop requires Hive, but is that it?

Bread Crumbs and Circuses for All
SimonDawson
SAS Employee
It might work. I don't recommend it though.

You will only receive support from SAS for the SAS/ACCESS Interface to Hadoop when you are using a supported distribution.

https://support.sas.com/en/documentation/third-party-software-reference/9-4/support-for-hadoop.html
mich1
Obsidian | Level 7

Thanks - so, if I have an existing "home-brew" Hadoop cluster with Hive and Pig installed, it's possible that the SAS/ACCESS engine will be able to communicate? From what I understand, Hive is required in order to use the LIBNAME statement, but will PROC HDFS work without Hive? Has anyone out there built their own cluster and then gotten SAS to connect?

Bread Crumbs and Circuses for All
mich1
Obsidian | Level 7

PROC HADOOP is part of Base SAS and works from my PC. I created a Hadoop cluster on Ubuntu 16.04 LTS and ran the latest hadooptracer.py script from SAS, then copied the resulting config files to my PC. Note that the output configs and JARs are for a single-node, pseudo-distributed-mode configuration. The config files contain the JARs and XMLs that SAS needs to talk to the Hadoop cluster. I placed these on my PC and ran the following program (configs are at \\MY NETWORK SHARE\Hadoop). Note that the "Configured Hadoop User" has a .bashrc configured for Java and Hadoop on the Ubuntu cluster:

 

/* Note: no leading space inside the quoted paths, or SAS will not resolve them */
options set=SAS_HADOOP_JAR_PATH "\\MY NETWORK SHARE\Hadoop\lib";
options set=SAS_HADOOP_CONFIG_PATH "\\MY NETWORK SHARE\Hadoop\conf";

proc hadoop
   username='Configured Hadoop User on UBUNTU' password='user password';
   hdfs mkdir="/user/new";
run;

If you go to http://YOURCLUSTER FULL ADDRESS:50070/dfshealth.html#tab-overview > Utilities > Browse the file system and look under "user", you will see the new HDFS directory. So it appears to be possible to connect SAS to a home-brew cluster.

 

I also got my server to connect via PROC HDFS (the code is very similar). I'd recommend the above as a troubleshooting step to check for configuration issues with the cluster or SAS user permission issues. I'm going to try to connect the access engine to Hive on the cluster (I also installed Hive) and I'll post any results. LOL
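For that next step, a minimal sketch of what the SAS/ACCESS LIBNAME connection to Hive might look like is below. This is only an outline based on the general LIBNAME HADOOP syntax, not something I've tested against this cluster yet; the server name, port, schema, and credentials are all placeholders, and it assumes HiveServer2 is running on the cluster:

```sas
/* Sketch only: SAS/ACCESS Interface to Hadoop LIBNAME to Hive.
   Host, port, schema, and credentials below are placeholder values. */
options set=SAS_HADOOP_JAR_PATH "\\MY NETWORK SHARE\Hadoop\lib";
options set=SAS_HADOOP_CONFIG_PATH "\\MY NETWORK SHARE\Hadoop\conf";

libname hdp hadoop
   server='yourcluster.example.com'   /* host running HiveServer2 */
   port=10000                         /* HiveServer2 default port */
   user='Configured Hadoop User on UBUNTU'
   password='user password'
   schema=default;                    /* Hive database to use */

/* list the Hive tables visible through the engine */
proc datasets lib=hdp;
run;
```

If the LIBNAME assigns successfully, the Hive tables should show up like ordinary SAS data sets under the hdp libref.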

Bread Crumbs and Circuses for All

