We’re smarter together. Learn from this collection of community knowledge and add your expertise.

Successfully configuring an ODBC driver to access Redshift from SAS on Linux

by SAS Employee ErwanGranger on ‎07-03-2015 11:03 AM - edited on ‎02-03-2017 01:35 PM by Community Manager (9,648 Views)

As many SAS Administrators and Architects know, before any SAS/Access engine can access any database, the proper database clients or ODBC configurations have to be present and successfully tested. (If that is not the case, your SAS libnames will not work any better, and can actually mask the problems further.)
And yet, these are not things that we typically have to do on day to day basis.  I, for one, do it so seldom that I can never remember the steps from one time to the next.
So, for my own future benefits (and hopefully, your present benefit), I decided to write this how-to, in the hope that:

  • I can find it when I have to do this again in a few months
  • It helps others out there

 

Disclaimer

I am by no means an expert in these matters, so it's possible that there is a better way to do these things. If that is the case, feel free to comment below this post.
Secondly, using Redshift from SAS is a wider topic that deserves its own discussion. If that is of interest to you, I recommend my colleague James Ke Wang's paper, Step into the Cloud: Ways to Connect to Amazon Redshift with SAS/ACCESS® . My post only deals with how to successfully install the Redshift drivers onto a RedHat Linux server.

The situation

Here is the situation I am describing:

  • Running on RHEL
  • Using SAS Access to ODBC
  • Configuring a connection to AWS Redshift

What you need

For this, you will need:

  • A Driver Manager (because we are working on Linux)
  • A Driver for Redshift
  • A working Redshift
  • A Security Group in AWS that allows the SAS server to connect to Redshift

If you are missing ANY of these, you will not be successful!
Keep in mind that the Driver Manager and the Driver are 2 very different things!
The driver for Redshift is what allows you to connect to it.
Whereas, the Driver Manager , well, it manages my drivers! :smileyhappy:

In other words, if I was to pick different data sources (say, Redshift, and Impala), it would be the part that chooses the right driver for the right database.
Note that you still need the driver manager on Linux, even if you have a single data source.

Dealing with the Driver Manager

There are multiple choices for the driver manager. You can get driver managers for Linux from companies like DataDirect, for a fee, or you can use the free/opensource ones.

For our purposes here, we will use the free one called “unixodbc”.

Check it’s there

[ec2-user@ip-10-0-0-145 ~]$ sudo rpm -q unixODBC
package unixODBC is not installed

If not, install it

[ec2-user@ip-10-0-0-145 ~]$ sudo yum -y install unixODBC

Loaded plugins: amazon-id, rhui-lb

rhui-REGION-client-config-server-7 | 2.9 kB 00:00:00

rhui-REGION-rhel-server-releases | 3.7 kB 00:00:00

rhui-REGION-rhel-server-rh-common | 1.9 kB 00:00:00

(1/4): rhui-REGION-client-config-server-7/x86_64/primary_db | 4.3 kB 00:00:00

(2/4): rhui-REGION-rhel-server-rh-common/7Server/x86_64/updateinfo | 11 kB 00:00:00

(3/4): rhui-REGION-rhel-server-rh-common/7Server/x86_64/primary | 30 kB 00:00:00

(4/4): rhui-REGION-rhel-server-releases/7Server/x86_64/primary_db | 12 MB 00:00:00

(1/2): rhui-REGION-rhel-server-releases/7Server/x86_64/updateinfo | 549 kB 00:00:00

(2/2): rhui-REGION-rhel-server-releases/7Server/x86_64/group_gz | 133 kB 00:00:00

rhui-REGION-rhel-server-rh-common 131/131

Resolving Dependencies

--> Running transaction check

---> Package unixODBC.x86_64 0:2.3.1-10.el7 will be installed

--> Processing Dependency: libltdl.so.7()(64bit) for package: unixODBC-2.3.1-10.el7.x86_64

--> Running transaction check

---> Package libtool-ltdl.x86_64 0:2.4.2-20.el7 will be installed

--> Finished Dependency Resolution

Dependencies Resolved

================================================================================================================================================================================

Package Arch Version Repository Size

================================================================================================================================================================================

Installing:

unixODBC x86_64 2.3.1-10.el7 rhui-REGION-rhel-server-releases 413 k

Installing for dependencies:

libtool-ltdl x86_64 2.4.2-20.el7 rhui-REGION-rhel-server-releases 49 k

Transaction Summary

================================================================================================================================================================================

Install 1 Package (+1 Dependent package)

Total download size: 461 k

Installed size: 1.3 M

Downloading packages:

(1/2): libtool-ltdl-2.4.2-20.el7.x86_64.rpm | 49 kB 00:00:00

(2/2): unixODBC-2.3.1-10.el7.x86_64.rpm | 413 kB 00:00:00

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Total 1.1 MB/s | 461 kB 00:00:00

Running transaction check

Running transaction test

Transaction test succeeded

Running transaction

Installing : libtool-ltdl-2.4.2-20.el7.x86_64 1/2

Installing : unixODBC-2.3.1-10.el7.x86_64 2/2

Verifying : libtool-ltdl-2.4.2-20.el7.x86_64 1/2

Verifying : unixODBC-2.3.1-10.el7.x86_64 2/2

 

Installed:

unixODBC.x86_64 0:2.3.1-10.el7

Dependency Installed:

libtool-ltdl.x86_64 0:2.4.2-20.el7

Complete!

 

Dealing with the Redshift Driver

Note that once again, there are multiple choices. Datadirect offers some drivers, for a fee. Here, we are using the drivers provided for free by AWS.

Check if it’s there

[ec2-user@ip-10-0-0-145 ~]$ sudo rpm -q AmazonRedshiftODBC-64bit.x86_64

package AmazonRedshiftODBC-64bit.x86_64 is not installed

 

If not, install it

Make sure to check this page for the hyperlink to the latest .rpm of the driver. (mine below dates back to May).

 

[ec2-user@ip-10-0-0-145 ~]$ wget https://s3.amazonaws.com/redshift-downloads/drivers/AmazonRedshiftODBC-64bit-1.1.0.0000-1.x86_64.rpm
--2015-05-15 12:45:31-- https://s3.amazonaws.com/redshift-downloads/drivers/AmazonRedshiftODBC-64bit-1.1.0.0000-1.x86_64.rpm

Resolving s3.amazonaws.com (s3.amazonaws.com)... 54.231.9.88

Connecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.9.88|:443... connected.

HTTP request sent, awaiting response... 200 OK

Length: 10546318 (10M) [application/x-rpm]

Saving to: ‘AmazonRedshiftODBC-64bit-1.1.0.0000-1.x86_64.rpm’

100%[======================================================================================================================================>] 10,546,318 --.-K/s in 0.1s

2015-05-15 12:45:31 (70.7 MB/s) - ‘AmazonRedshiftODBC-64bit-1.1.0.0000-1.x86_64.rpm’ saved [10546318/10546318]

 

[ec2-user@ip-10-0-0-145 ~]$ sudo yum -y --nogpgcheck localinstall AmazonRedshiftODBC-64bit-1.1.0.0000-1.x86_64.rpm

Loaded plugins: amazon-id, rhui-lb

Examining AmazonRedshiftODBC-64bit-1.1.0.0000-1.x86_64.rpm: AmazonRedshiftODBC-64bit-1.1.0-1.x86_64

Marking AmazonRedshiftODBC-64bit-1.1.0.0000-1.x86_64.rpm to be installed

Resolving Dependencies

--> Running transaction check

---> Package AmazonRedshiftODBC-64bit.x86_64 0:1.1.0-1 will be installed

--> Finished Dependency Resolution

Dependencies Resolved

================================================================================================================================================================================

Package Arch Version Repository Size

================================================================================================================================================================================

Installing:

AmazonRedshiftODBC-64bit x86_64 1.1.0-1 /AmazonRedshiftODBC-64bit-1.1.0.0000-1.x86_64 30 M

Transaction Summary

================================================================================================================================================================================

Install 1 Package

Total size: 30 M

Installed size: 30 M

Downloading packages:

Running transaction check

Running transaction test

Transaction test succeeded

Running transaction

Installing : AmazonRedshiftODBC-64bit-1.1.0-1.x86_64 1/1

Verifying : AmazonRedshiftODBC-64bit-1.1.0-1.x86_64 1/1

 

Installed:

AmazonRedshiftODBC-64bit.x86_64 0:1.1.0-1

 

Complete!

 

The progressive configuration of ODBC

Copy the sample files to another location

First, we need to copy some files from their original locations. Although you could leave them in place, it’s not a very good idea. A future “yum update” may modify them. It’s better to have your own customized ones.

[ec2-user@ip-10-0-0-145 ~]$ cp /etc/odbcinst.ini /shared/odbc/

[ec2-user@ip-10-0-0-145 ~]$ cp /opt/amazon/redshiftodbc/Setup/odbc.ini /shared/odbc/

[ec2-user@ip-10-0-0-145 ~]$ cp /opt/amazon/redshiftodbc/lib/64/amazon.redshiftodbc.ini /shared/odbc

[ec2-user@ip-10-0-0-145 ~]$ ll /shared/odbc

total 12

-rwxr--r--. 1 ec2-user ec2-user 1181 May 15 12:56 amazon.redshiftodbc.ini

-rwxr--r--. 1 ec2-user ec2-user 1961 May 15 12:56 odbc.ini

-rw-r--r--. 1 ec2-user ec2-user 577  May 15 12:55 odbcinst.ini

 

Create some Environment Variables

For SAS to pick up the right locations, we will need to define some environment variables.
But first, let’s see which location is currently defined.
Before changing anything else, you should run:

[ec2-user@ip-10-0-0-145 ~]$ odbcinst -j

unixODBC 2.3.1

DRIVERS............: /etc/odbcinst.ini

SYSTEM DATA SOURCES: /etc/odbc.ini

FILE DATA SOURCES..: /etc/ODBCDataSources

USER DATA SOURCES..: /home/ec2-user/.odbc.ini

SQLULEN Size.......: 8

SQLLEN Size........: 8

SQLSETPOSIROW Size.: 8

[ec2-user@ip-10-0-0-145 ~]$

 

As you can see, it’s pointing to the wrong places (default paths).
To change that, we need to define some environment variables for it. Mine look like this:

[ec2-user@ip-10-0-0-145 ~]$ cat /sas/sashome/SASFoundation/9.4/bin/sasenv_local

# ...

export ODBCSYSINI=/shared/odbc

export ODBCINI=/shared/odbc/odbc.ini

export AMAZONREDSHIFTODBCINI=/shared/odbc/amazon.redshiftodbc.ini

export LD_LIBRARY_PATH=/usr/lib64/:$LD_LIBRARY_PATH

# ...

If I then source these environment variables (don’t forget the “dot-space” at the beginning of the line), things will look better:

[ec2-user@ip-10-0-0-145 ~]$ . /sas/sashome/SASFoundation/9.4/bin/sasenv_local

[ec2-user@ip-10-0-0-145 ~]$ odbcinst -j

unixODBC 2.3.1

DRIVERS............: /shared/odbc/odbcinst.ini

SYSTEM DATA SOURCES: /shared/odbc/odbc.ini

FILE DATA SOURCES..: /shared/odbc/ODBCDataSources

USER DATA SOURCES..: /shared/odbc/odbc.ini

SQLULEN Size.......: 8

SQLLEN Size........: 8

SQLSETPOSIROW Size.: 8

 

As you can see, it’s now picking up the new location. We are making progress!

ODBC configuration files

We will need to update the 3 files, to describe our driver, and our data sources. I advise you to do them in the order described here.

Check the drivers

To check which drivers are configured, I can run this:

[ec2-user@ip-10-0-0-145 ~]$ odbcinst -q -d

[PostgreSQL]

[MySQL]

 

These are not the droids drivers I am looking for.
I will remove them from the odbcinst.ini file, and add the Redshift one instead. (I am only setting up the 64 bit driver for Redshift, by the way).

odbcinst.ini

We need to edit the odbcinst.ini file to define our Redshift driver. The name in bracket is chosen by me.

[ec2-user@ip-10-0-0-145 ~]$ cat /shared/odbc/odbcinst.ini

#[Amazon Redshift (x64)]

[MyRedshiftDriver]

Description=Amazon Redshift ODBC Driver(64-bit)

Driver=/opt/amazon/redshiftodbc/lib/64/libamazonredshiftodbc64.so

 

now:

[ec2-user@ip-10-0-0-145 ~]$ odbcinst -q -d

[MyRedshiftDriver]

 

Success, so far.
Watch out: In your odbcinst.ini file, if you have one or more spaces at the beginning of the line with the driver name, the odbcinst -q -d  will not pick them up. I does not seem to read the line at all. I wasted quite a bit of time, because I had "indented" the file. So remember that there should be no leading space on the line before the [MyRedshiftDriver].

amazon.redshiftodbc.ini

The default file needs to be modified because we are using unixODBC. The parts I changed are in Red below.

[ec2-user@ip-10-0-0-145 ~]$ cat /shared/odbc/amazon.redshiftodbc.ini

## If you did not install in the default directory, replace '/opt/amazon/redshiftodbc'

## with the correct location.

 

## - Note that this default DriverManagerEncoding of UTF-32 is for iODBC.

## - unixODBC uses UTF-16 by default.

## - If unixODBC was compiled with -DSQL_WCHART_CONVERT, then UTF-32 is the correct value.

## Execute 'odbc_config --cflags' to determine if you need UTF-32 or UTF-16 on unixODBC

## - AmazonDM can be used with UTF-8 or UTF-16.

## The DriverUnicodeEncoding setting will cause AmazonDM to run in UTF-8 when set to 2 or UTF-16 when set to 1.

 

[Driver]

#EG# DriverManagerEncoding=UTF-32

DriverManagerEncoding=UTF-16

ErrorMessagesPath=/opt/amazon/redshiftodbc/ErrorMessages

LogLevel=0

LogPath=[LogPath]

 

## - Uncomment the ODBCInstLib corresponding to the Driver Manager being used.

## - Note that the path to your ODBC Driver Manager must be specified in LD_LIBRARY_PATH.

 

# Generic ODBCInstLib

# iODBC

#EG# ODBCInstLib=libiodbcinst.so

 

# AmazonDM / unixODBC

#ODBCInstLib=libodbcinst.so

 

# AIX specific ODBCInstLib

# iODBC

#ODBCInstLib=libiodbcinst.a(libiodbcinst.so.2)

 

# AmazonDM

#ODBCInstLib=libodbcinst.a(odbcinst.so)

 

# unixODBC

#EG# I uncomment the line below because we use unixODBC.

ODBCInstLib=libodbcinst.a(libodbcinst.so.1)

Here, these is no test I know of that we can run yet. we have to wait a bit more.

odbc.ini

This is where we will define our data sources. To see what is currently defined:

[ec2-user@ip-10-0-0-145 ~]$ odbcinst -q -s

[ODBC]

[Amazon Redshift ODBC DSN 32]

[Amazon Redshift ODBC DSN 64]

These are the default template ones. So we need to update the file with our “real” Redshift.
Here is my version of odbc.ini, with as little content as I can get away with:

[ec2-user@ip-10-0-0-145 ~]$ cat /shared/odbc/odbc.ini

ODBC Data Sources]

MyRealRedshift=MyRedshiftDriver

 

[MyRealRedshift]

# Driver: The location where the ODBC driver is installed to.

Driver=/opt/amazon/redshiftodbc/lib/64/libamazonredshiftodbc64.so

 

# Required: These values can also be specified in the connection string.

Server=abc

Port=1234

Database=mydb

locale=en-US

 

As you can see, the server info is bogus, but for now, that should be enough.
I chose the Server name to be "abc". Keep that in mind, it will be important in a minute.
Let’s test again:

[ec2-user@ip-10-0-0-145 ~]$ odbcinst -q -s

[MyRealRedshift]

 

Nice job! Getting closer.

 

Connection testing.

This is where connection problems (if they exist) are going to start surfacing. We need to start testing whether the ODBC connection works.
You could install an elaborate ODBC client like SQL Workbench on the server.
But I usually prefer to use the isql tool that comes with the Driver Manager on Linux.

As you can see below, when trying to connect to Redshift, this shows me there is an issue.

[ec2-user@ip-10-0-0-145 ~]$ isql -v MyRealRedshift

[S1000][unixODBC][Amazon][ODBC] (11560)

Unable to locate SQLGetPrivateProfileString function.

[ISQL]ERROR: Could not SQLConnect

 

After some googling, I discovered that the reason for this particular error is that the last line in my amazon.redshiftodbc.ini is wrong. After I fix it, it looks like:

[ec2-user@ip-10-0-0-145 ~]$ tail /shared/odbc/amazon.redshiftodbc.ini

#ODBCInstLib=libodbcinst.a(odbcinst.so)

 

# unixODBC

#EG# I uncomment the line below.

#EG# this does not work:

# ODBCInstLib=libodbcinst.a(libodbcinst.so.1)

#EG# But this does:

ODBCInstLib=/usr/lib64/libodbcinst.so

And now, things work slightly better (it's a process!):

[ec2-user@ip-10-0-0-145 ~]$ isql -v MyRealRedshift

[08001][unixODBC][Amazon][ODBC] (10380)

Unable to establish a connection with data source.

Missing settings: {[Password] [Username]}

[ISQL]ERROR: Could not SQLConnect

[ec2-user@ip-10-0-0-145 ~]$ isql -v MyRealRedshift dummy dummy

[S1000][unixODBC][Amazon][RedShift ODBC] (10)

Error occurred while trying to connect:

could not translate host name "abc" to address:

Name or service not known

[ISQL]ERROR: Could not SQLConnect

[ec2-user@ip-10-0-0-145 ~]$

 

We are getting closer. It’s complaining about the hostname “abc” which is understandable.

So, if I now put in a real value for hostname and port, I can try to connect.
These values are provided by the Redshift interface (or the person who has access to it):

ClusterProperties.png

 

[ec2-user@ip-10-0-0-145 ~]$ tail /shared/odbc/odbc.ini

 

[MyRealRedshift]

# Driver: The location where the ODBC driver is installed to.

Driver=/opt/amazon/redshiftodbc/lib/64/libamazonredshiftodbc64.so

 

# Required: These values can also be specified in the connection string.

Server=bluemoon.XXXXXXXXXXXXXXX.us-east-1.redshift.amazonaws.com

Port=5439

Database=bluemoon

locale=en-US

 

 

[ec2-user@ip-10-0-0-145 ~]$ isql -v "MyRealRedshift" redshiftuser redshiftpassword

[S1000][unixODBC][Amazon][RedShift ODBC] (10)

Error occurred while trying to connect:

could not connect to server: Connection timed out

Is the server running on host "bluemoon.XXXXXXXXXXXXXX.us-east-1.redshift.amazonaws.com" (54.81.XXX.216) and accepting TCP/IP connections on port 5439?

[ISQL]ERROR: Could not SQLConnect

[ec2-user@ip-10-0-0-145 ~]$

 

This particular error is likely due to a firewall issue. Redshift has an AWS Security Group that restricts access to it. That needs to be opened up (although probably not to the world like I am doing here):

SecurityGroup

[ec2-user@ip-10-0-0-145 ~]$ isql -v "MyRealRedshift" redshiftuser redshiftpassword

+---------------------------------------+

| Connected!                            |

|                                       |

| sql-statement                         |

| help [tablename]                      |

| quit                                  |

|                                       |

+---------------------------------------+

SQL> quit

[ec2-user@ip-10-0-0-145 ~]$

This shows a successful connection.
Since I had pre-loaded some data into Redshift, I can look at it:

[ec2-user@ip-10-0-0-145 ~]$ echo "select * from class" | isql -v "MyRealRedshift" redshiftuser redshiftpassword

+---------------------------------------+

| Connected!                            |

|                                       |

| sql-statement                         |

| help [tablename]                      |

| quit                                  |

|                                       |

+---------------------------------------+

SQL> select * from class

+---------+----+-----------+------------+------------+

| name    | sex| age       | height     | weight     |

+---------+----+-----------+------------+------------+

| Barbara | F  | 13        | 65.3       | 98         |

| Carol   | F  | 14        | 62.8       | 102.5      |

| James   | M  | 12        | 57.3       | 83         |

| John    | M  | 12        | 59         | 99.5       |

| Louise  | F  | 12        | 56.3       | 77         |

| Mary    | F  | 15        | 66.5       | 112        |

| Philip  | M  | 16        | 72         | 150        |

| Robert  | M  | 12        | 64.8       | 128        |

| Ronald  | M  | 15        | 67         | 133        |

| William | M  | 15        | 66.5       | 112        |

| Alfred  | M  | 14        | 69         | 112.5      |

| Alice   | F  | 13        | 56.5       | 84         |

| Henry   | M  | 14        | 63.5       | 102.5      |

| Jane    | F  | 12        | 59.8       | 84.5       |

| Janet   | F  | 15        | 62.5       | 112.5      |

| Jeffrey | M  | 13        | 62.5       | 84         |

| Joyce   | F  | 11        | 51.3       | 50.5       |

| Judy    | F  | 14        | 64.3       | 90         |

| Thomas  | M  | 11        | 57.5       | 85         |

+---------+----+-----------+------------+------------+

SQLRowCount returns 19

19 rows fetched

SQL>

 

If you are not fully convinced, a look at the Redshift Queries page will finish confirming our success:

red02.png

At this point,  we are satisfied that the drivers works properly outside of SAS.
Get the champagne out of the fridge! Almost time to celebrate!

 

The verification

Finally, to test it from SAS, you can start a new SAS Session (so it gets the environment variables you defined in the sasenv_local file).
You then assign a libname, like:

options sastrace= ',,,ds' sastraceloc=saslog nostsuffix sql_ip_trace=(note,source) msglevel=i;

options mprint  source details;

 

%put %quote(%sysget(LD_LIBRARY_PATH));

 

LIBNAME bluem ODBC DSN='MyRealRedshift' user=redshiftuser password=redshiftpassword;

 

If you see the following in the log, you are probably in business with Redshift:

NOTE: Libref BLUEM was successfully assigned as follows:

  Engine: ODBC

  Physical Name:MyRealRedshift

 

You should now be able interact with Redshift from SAS.

 

References

Here are some of the documents I used while working on this:

http://docs.aws.amazon.com/redshift/latest/mgmt/install-odbc-driver-linux.html

http://docs.aws.amazon.com/redshift/latest/mgmt/odbc-driver-configure-linux-mac.html

http://support.sas.com/documentation/installcenter/en/ikfdtnunxcg/66380/PDF/default/config.pdf

Your turn
Sign In!

Want to write an article? Sign in with your profile.