Hi friends
Can anyone please advise me on the steps to install and configure SAS to Hadoop connectivity using SAS/ACCESS Interface to ODBC? We have SAS 9.4 M6 on Linux.
We have Integrated Windows Authentication with Kerberos.
Need your help here.
Thank you all in advance.
Typically one would need to install the ODBC client for your version of Hadoop on the SAS application server and configure the ODBC driver as per the Hadoop documentation.
Thereafter it will be possible to connect like any other ODBC database connection.
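For example, once the driver and a DSN are in place, a LIBNAME statement along these lines should work (just a sketch; the libref hdp, the DSN name hive_dsn and the schema are placeholders for whatever you define in odbc.ini):
/* Sketch only: assumes a DSN named hive_dsn exists in odbc.ini and a valid  */
/* Kerberos ticket is already available to the SAS session user.             */
libname hdp odbc datasrc=hive_dsn schema=default;
/* Quick check that the Hive tables are visible through the new libref */
proc datasets library=hdp;
quit;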
Have a look at the following. Your Hadoop vendor may also be of help:
https://www.progress.com/odbc/apache-hadoop-hive
https://www.cloudera.com/downloads/connectors/hive/odbc/2-6-1.html
https://www.microsoft.com/en-us/download/details.aspx?id=40886
https://hortonworks.com/wp-content/uploads/2013/04/Hortonworks-Hive-ODBC-Driver-User-Guide.pdf
Thanks for the links
Since we don't have a SAS/ACCESS Interface to Hadoop license, we will have to try with SAS/ACCESS Interface to ODBC.
Which version of the Hadoop ODBC driver do I need to install from the links you shared?
So the steps below would be what I need to follow to make the connection from SAS to Hadoop and vice versa, right?
> Install the Hadoop ODBC driver on the Unix compute server
> Configure the environment variables. Are there any other files, similar to the tnsnames.ora and odbc.ini files for Oracle and SQL Server, that I need to configure for Hadoop?
> Once the above steps are done, test with a LIBNAME statement. Do I need to register any Hadoop servers in SAS Management Console for this testing?
I would appreciate it if you could point me to a link with the steps for this setup via the ODBC driver.
A LIBNAME statement does not require any SMC server setup; that's only required if you are defining data connections in SAS metadata. Test first with LIBNAME, then implement any required ODBC data connections in SMC (including the ODBC server definition) to be shared with other users.
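For example, a quick test along these lines should confirm data access before anything is registered in metadata (a sketch only; your_dsn and some_table are placeholders for your own DSN and a table you know exists in Hive):
/* Explicit SQL pass-through sketch: the query in the inner parentheses is */
/* sent to Hive as-is through the ODBC driver.                             */
proc sql;
   connect to odbc (datasrc=your_dsn);
   select * from connection to odbc
      (select * from default.some_table limit 5);
   disconnect from odbc;
quit;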
It would be a good idea to raise a Tech Support track regarding the ODBC configuration so they can walk you through the required steps.
Installation of Hadoop ODBC drivers
This is something any administrator (Hadoop / Linux and possibly SAS) should be able to do.
I have provided some links in the earlier post. Have a look at them.
Your Hadoop vendor should be able to provide the ODBC drivers along with installation and configuration documentation.
After performing the steps below, everything went well, but I got stuck at the last step when connecting to the database.
I encountered the error below when testing connectivity to verify whether I am able to list the databases and access table data.
Can you advise me where I am going wrong?
[S1000][unixODBC][Hortonworks][Hardy] (76) Error connecting to ZooKeeper. Please ensure the ZooKeeper host(s) and other connection information are configured correctly.
[ISQL]ERROR: Could not SQLConnect
>Installed and configured the Hive driver and unixODBC with no issues
./configure --prefix=/opt/odbc/unixodbc --exec-prefix=/opt/odbc/unixodbc/oth --bindir=/opt/odbc/unixodbc/bin --sbindir=/opt/odbc/unixodbc/sbin --libexecdir=/opt/odbc/unixodbc/libexec --sysconfdir=/opt/odbc/unixodbc/etc --libdir=/opt/odbc/unixodbc/lib --includedir=/lib --includedir=/lib64
make
make install
>Ensure that the libraries installed (if UnixODBC was installed from source files) are included in LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/lib:/lib64:/opt/odbc/unixodbc/lib
>Copied the files from hive installation directory into this location
# cp /usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini /opt/odbc/unixodbc/etc/odbc.ini
# cp /usr/lib/hive/lib/native/hiveodbc/Setup/odbcinst.ini /opt/odbc/unixodbc/etc/odbcinst.ini
>Configured the odbc.ini file
[hivedb]
Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
HOST=node1.hortonworks.com
PORT=10000
HiveServerType=2
AuthMech=1
UID=hive
KrbHostFQDN=kerb.hortonworks.com
KrbServiceName=hive
KrbRealm=HWX.COM
>Exported the variables to ensure that only our configuration files are being used
export ODBCINI=/opt/odbc/unixodbc/odbc.ini
export ODBCSYSINI=/opt/odbc/unixodbc
export HORTONWORKSHIVEINI=/usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini
>Tested the connectivity
Got the Kerberos ticket for the Hive service principal using the keytab file.
# kinit -k -t /tmp/hive.service.keytab hive/node1.hortonworks.com@HWX.COM
# klist
Once the ticket was granted, I executed the following command to test connectivity:
# /opt/odbc/unixodbc/bin/isql hivedb -v
But I encountered the error below when testing connectivity to verify whether I am able to list the databases and access table data:
[S1000][unixODBC][Hortonworks][Hardy] (76) Error connecting to ZooKeeper. Please ensure the ZooKeeper host(s) and other connection information are configured correctly.
[ISQL]ERROR: Could not SQLConnect
I was able to connect to Hive with the ./isql command from the SAS server.
What would be the next step to accomplish SAS to Hadoop connectivity and pull data from Hadoop into SAS?
Can I run the LIBNAME statements below from SAS EG to accomplish this?
For the Hadoop servers we have multiple servers, with high availability for HiveServer2,
so how would the multiple servers fit into the LIBNAME statement below for server= ?
# /opt/odbc/unixodbc/bin/isql hivedb
+---------------------------------------+
| Connected!                            |
|                                       |
| sql-statement                         |
| help [tablename]                      |
| quit                                  |
|                                       |
+---------------------------------------+
SQL> show tables;
+---------------+
| tab_name      |
+---------------+
| csvinternal   |
| cvsexternal   |
| sample_07
Sorry, I didn't get you.
No, my question is:
how would the multiple servers fit into the LIBNAME statement below for server= ?
We have high availability for HiveServer2:
HOST=<pcnkadpnt01.abc-abc.com:3020>,<pcnkadpnt02.abc-abc.com:3020>,<pcnkadpnt03.abc-abc.com:3020>
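Just a sketch of one possibility, not something I have verified against your setup: with SAS/ACCESS to ODBC the host list normally lives in the DSN definition in odbc.ini rather than in the LIBNAME statement. For HiveServer2 high availability the Hortonworks driver's own documentation describes pointing the DSN at the ZooKeeper quorum (its service discovery settings) instead of a single host, so the LIBNAME itself would just keep referencing the same data source name:
/* Sketch: the HA host list stays inside the DSN (odbc.ini); the SAS side does not change */
libname hdp odbc datasrc=hivedb schema=default;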