juanvg1972
Pyrite | Level 9

 

Hi,

 

I am working with Base SAS on a server and I need to connect to a Hadoop-based data lake to get some data from HDFS.

I am using SAS/ACCESS to Hadoop with a libname to Hive, and then I run a query in Hadoop. This type of connection works, but it is very slow: a Hive query that returns 10,000 rows from a 10-million-row file takes more than an hour to run.

Is there any other way to get data from Hadoop into SAS? Can I connect to Impala via libname? I also have Spark installed in the data lake.

 

Any advice will be greatly appreciated.

 

Thanks in advance

7 REPLIES
ChrisNZ
Tourmaline | Level 20

As I recall, Hive is very slow for querying and is best used to write data, while Impala is best used to query data.

juanvg1972
Pyrite | Level 9

Yes, I know, but how can I connect to Impala from SAS?

 

Thanks

juanvg1972
Pyrite | Level 9

I have found this:

 

https://www.sas.com/en_us/software/access-interface-impala.html

 

I have another question:

Is there any way to execute SAS code (data steps, PROC SQL and statistical procs) in Hadoop in a distributed way?

Is there any module to do this? If I convert this code to PROC DS2 or PROC FEDSQL, will it be possible?

 

Thanks in advance

ChrisNZ
Tourmaline | Level 20

 

1. SAS/ACCESS or ODBC ( https://www.cloudera.com/downloads/connectors/impala/odbc/2-5-37.html ) allows you to use Impala from SAS.

 

2. You need SAS Embedded Process to execute SAS code within the Hadoop cluster.

Otherwise you are limited to SQL-type queries.
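
As a rough illustration of both points, here is a minimal sketch. The first part assumes SAS/ACCESS Interface to Impala is licensed. The second part assumes the SAS Embedded Process and the SAS In-Database Code Accelerator for Hadoop are installed on the cluster, and that `hdp` is an already assigned Hadoop (Hive) libref. The host name, port, schema, credentials, table names and column names are all placeholders, not values from this thread.

```sas
/* 1. Impala through SAS/ACCESS Interface to Impala (assumed licensed). */
/*    Server, port, schema and credentials are placeholders.            */
libname imp impala server="impala-host.example.com" port=21050
        schema=default user=myuser password="XXXXXXXX";

/* Explicit pass-through: the inner query runs in Impala, so only the   */
/* selected rows travel back to SAS. big_table, id and amount are       */
/* hypothetical names.                                                  */
proc sql;
  connect to impala (server="impala-host.example.com" port=21050
                     schema=default user=myuser password="XXXXXXXX");
  create table work.sample as
    select * from connection to impala
      (select id, amount
         from big_table
        where amount > 1000
        limit 10000);
  disconnect from impala;
quit;

/* 2. DS2 running inside the cluster. This only runs in-database when   */
/*    the Embedded Process and Code Accelerator are available.          */
proc ds2 ds2accel=yes;
  thread calc_th / overwrite=yes;
    dcl double amount_adj;
    method run();
      set hdp.big_table;
      amount_adj = amount * 0.9;   /* illustrative calculation only */
    end;
  endthread;

  data hdp.big_table_out (overwrite=yes);
    dcl thread calc_th t;
    method run();
      set from t;                  /* read the rows produced by the thread */
    end;
  enddata;
run;
quit;
```

The SAS log should indicate whether the DS2 step actually ran inside Hadoop or fell back to the SAS server, so it is worth checking that before relying on it.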

 

sriramsahani
Fluorite | Level 6

Hello,

 

Did you find a way to connect to Impala using the Hadoop engine, without SAS/ACCESS to Impala?

 

Regards,

SS

ChrisNZ
Tourmaline | Level 20

The Hadoop engine connects to Hive.
You may be able to connect to Impala via ODBC.

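
If ODBC is the route, a connection could look roughly like the sketch below. It assumes SAS/ACCESS Interface to ODBC is licensed and that an ODBC data source has been configured on the SAS server with an Impala ODBC driver. The DSN name `ImpalaDSN`, the credentials and the table and column names are hypothetical.

```sas
/* Assumes SAS/ACCESS Interface to ODBC and a DSN called ImpalaDSN      */
/* (hypothetical) that points at the Impala ODBC driver.                */
libname impodbc odbc dsn="ImpalaDSN" schema=default
        user=myuser password="XXXXXXXX";

/* Explicit pass-through over ODBC: the inner query is sent to Impala   */
/* as written. big_table, id and amount are placeholder names.          */
proc sql;
  connect to odbc (dsn="ImpalaDSN" user=myuser password="XXXXXXXX");
  create table work.sample as
    select * from connection to odbc
      (select id, amount
         from big_table
        where amount > 1000
        limit 10000);
  disconnect from odbc;
quit;
```
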
sriramsahani
Fluorite | Level 6

Thank you, Chris, for confirming.

