BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jimbarbour
Meteorite | Level 14

I'm running SAS 9.4 M6 on a 64 bit Windows Server 2016 machine.  I have both Enterprise Guide (8.1) and Display Manager available to me, but I typically use EG.

 

I'm connecting to a Hadoop database.  I can create a Libname, or I can access the database via PROC SQL and explicit pass through.  Both methods are using ODBC.

 

ISSUE:  I can't get a list of Hadoop tables into my SAS session.  

 

If I know the table name in advance, I can successfully run a query, but it's darned inconvenient not to be able to get a list of tables.

 

In past shops, I've been able to see a list of the Hadoop tables after assigning a Libname in the SAS EG "Servers" panel.  In this case, I can see the Libname, but when I click on the drop down, there are no tables.  Display Manager behaves similarly.  See SAS code for Libname below.

 

I tried running a Proc SQL against DICTIONARY.TABLES, but I got no rows in return.

 

I also tried using PROC DATASETS.  In my log, I get:

 

29         PROC	DATASETS
30         	LIBRARY=HIVEDB
31         	;
ODBC: AUTOCOMMIT is YES for connection 1
ODBC: Called SQLTables with schema of NULL
WARNING: No matching members in directory.
32         RUN;

 

 

I can do a SHOW TABLES via an explicit pass through, but nothing gets returned to the SAS log.

 

Question 1:  I notice in my log from my Proc Datasets the message, "Called SQLTables with schema of NULL".  Is there a way to tell Proc Datasets that I want a particular schema?  I did specify a schema on my Libname, but apparently it's not coded correctly or isn't getting passed to Proc Datasets.  Here's my Libname:

 

libname hiveDB odbc noprompt="uid=jbarbour; pwd=XXXXXXXXX; dsn=OPSI_HIVE_STG1; 
host=dbms0502; port=10000;schema=edps; authmech=3";

 

 

Question 2:  Is there a way I can redirect the output of SHOW TABLES to the SAS log?  I have the following options in effect:

 

options SASTRACE=",,,ds" sastraceloc=saslog nostsuffix;
options source source2 mprint fullstimer notes fmterr;
options msglevel=i; 

 

 

Thank you in advance,

 

Jim

1 ACCEPTED SOLUTION

Accepted Solutions
r_behata
Barite | Level 11

Hello @jimbarbour 

 

If your site has licence for SAS Access to Hadoop, try this instead of ODBC Connection :

 

proc sql;
	connect to hadoop
		(READ_METHOD=HDFS                                       /* this attemps HDFS read which is faster than JDBC */
		server="<Serve-Name>"       /* the hiveserver we talk to */
		LOGIN_TIMEOUT=300
		schema=&hdp_schema.                                   /* this is the schema where you want to read or write data to */
		uri="<Server_uri>");

create table schema_hdp as select * from connection to hadoop (show tables); disconnect from hadoop; quit;

 

 

 

 

View solution in original post

13 REPLIES 13
Reeza
Super User
What does your explicit pass through code look like?
jimbarbour
Meteorite | Level 14

@Reeza ,

 

My query looks like:

PROC	SQL;
	CONNECT	TO	ODBC	AS	Hive_DB	
		(NOPROMPT	=	"uid=jbarbour; pwd=XXXXXXXX; dsn=OPSI_HIVE_STG1; 
						host=dbms0502; port=10000;schema=edps; authmech=3")
						;
	EXECUTE(Show Tables)	BY	Hive_DB;
	DISCONNECT	FROM	Hive_DB;
QUIT;

My log looks like:

2 The SAS System                                                                                15:44 Thursday, March 5, 2020

29         PROC	SQL;
30         	CONNECT	TO	ODBC	AS	Hive_DB	
31         		(NOPROMPT	=	XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
32         XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX)
33         						;
ODBC: AUTOCOMMIT is YES for connection 2
34         	EXECUTE(Show Tables)	BY	Hive_DB;
 
ODBC_1: Executed: on connection 2
Show Tables
 
 
Summary Statistics for ODBC are:
Total SQL execution seconds were:                   0.158019
Total seconds used by the ODBC ACCESS engine were     0.159900
 
35         	DISCONNECT	FROM	Hive_DB;
36         QUIT;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.61 seconds
      user cpu time       0.01 seconds
      system cpu time     0.03 seconds
      memory              365.28k
      OS Memory           20828.00k
      Timestamp           03/05/2020 03:44:38 PM
      Step Count                        35  Switch Count  0

Jim

 

 

 

 

Reeza
Super User
Google searches are showing a metastore.tbls table? Does that exist for you?
jimbarbour
Meteorite | Level 14

@Reeza ,

 

Could "metastore" be "metastor" or something?  I'm getting an 8 character message.

29         data	test2;
30         	set	METASTORE.TBLS;
ERROR: Libref 'METASTORE' exceeds 8 characters.
ERROR: File WORK.TBLS.DATA does not exist.
31         RUN;

Jim

r_behata
Barite | Level 11

Hello @jimbarbour 

 

If your site has licence for SAS Access to Hadoop, try this instead of ODBC Connection :

 

proc sql;
	connect to hadoop
		(READ_METHOD=HDFS                                       /* this attemps HDFS read which is faster than JDBC */
		server="<Serve-Name>"       /* the hiveserver we talk to */
		LOGIN_TIMEOUT=300
		schema=&hdp_schema.                                   /* this is the schema where you want to read or write data to */
		uri="<Server_uri>");

create table schema_hdp as select * from connection to hadoop (show tables); disconnect from hadoop; quit;

 

 

 

 

jimbarbour
Meteorite | Level 14

@r_behata ,

 

We do have SAS/ACCESS for ODBC.  Let me ask and see if I can get a URI.

---Base SAS Software                    
---SAS/STAT                             
---SAS/CONNECT                          
---SAS/Secure 168-bit                   
---SAS/Secure Windows                   
---SAS/ACCESS Interface to PC Files     
---SAS/ACCESS Interface to ODBC         
---SAS Workspace Server for Local Access
---High Performance Suite               

Jim

 

 

Reeza
Super User

Schema is usually called metastore?

 

libname hiveDB odbc noprompt="uid=jbarbour; pwd=XXXXXXXXX; dsn=OPSI_HIVE_STG1; 
host=dbms0502; port=10000;schema=metastore; authmech=3";

data want;
set hivedb.tbls;
run;

Or something like this, not sure if the schema may affect things here:

 

PROC	SQL;
	CONNECT	TO	ODBC	AS	Hive_DB	
		(NOPROMPT	=	"uid=jbarbour; pwd=XXXXXXXX; dsn=OPSI_HIVE_STG1; 
						host=dbms0502; port=10000;schema=edps; authmech=3")
	create table demo as select * from connection to HIVE_DB (select * from metastore.tbls)					
	DISCONNECT	FROM	Hive_DB;
QUIT;

These are just guesses based on how I know Hadoop functions, can't test unfortunately. Good Luck!

jimbarbour
Meteorite | Level 14

@Reeza ,

 

Thank you for following up.

 

I couldn't get it to work.  It errors out.  I ran the same code as I had before except that I changed the schema to metastore.

 

The error messages isn't particularly instructive, but I assume that it just can't find a schema named metastore.

ERROR: CLI error trying to establish connection: Unable to retrieve error message.

Jim

Reeza
Super User
Weird, thought the explicit sql would have worked for sure. Oh well, learned something 🙂
jimbarbour
Meteorite | Level 14

@Reeza ,

 

Well, the explicit pass through did in fact work, just not the schema "metastore".  I'm thinking that, while "metastore" may be common as a schema, it is unfortunately not universal.

 

Jim

jimbarbour
Meteorite | Level 14

@r_behata ,

 

OK!  That did the trick!  Thank you so much!!

 

The key was "SELECT * FROM CONNECTION TO Hive_DB(Show Tables)".  I had coded EXECUTE(Show Tables) BY Hive_DB which ran fine -- but returned nothing to SAS.  

 

Here's my working code:

 

PROC	SQL;
	CONNECT	TO	ODBC	AS	Hive_DB	
		(NOPROMPT	=	"uid=jbarbour; pwd=XXXXXXXX; dsn=OPSI_HIVE_STG1; 
						host=dbms0502; port=10000;schema=charttarget; authmech=3")
						;
	CREATE	TABLE	WORK.Hadoop_Table_List	AS
		SELECT	*	FROM	CONNECTION	TO	Hive_DB(SHOW TABLES);
	DISCONNECT	FROM	Hive_DB;
QUIT;

I thank you again,

 

Jim

 

Reeza
Super User
Out of curiosity does the metastore query work as well or does it not work?
jimbarbour
Meteorite | Level 14

@Reeza , @r_behata ,

 

Actually, there's an even better way to do this:  List the schema twice on a Libname.  As counter-intuitive and kludgey as that sounds, it works.  When done in this fashion, Proc Datasets works, SQL against DICTIONARY tables works, and, more importantly, the tables are listed in Display Manager or Enterprise Guide just as though they were SAS tables.

 

For example:

LIBNAME	HiveDB	ODBC	SCHEMA		=	ChartTarget
						NOPROMPT	=	"UID=jbarbour; PWD=&lPwd; HOST=DBMS0502; 
										DSN=OPSI_HIVE_STG1; SCHEMA=ChartTarget; 
										PORT=10000; AUTHMECH=3"
										;

Notice in the above example that I specify the schema twice, once outside the NOPROMPT and once inside.  The schema inside the NOPROMPT appears to be used for the actual connection to the databse.  The schema outside the NOPROMPT appears to be used by SAS to populate Proc Datasets, the SAS GUI's (DMS and EG), and the DICTIONARY tables.

 

The direct connection with a "Select * from connection to" syntax is still a very useful construct, so the original answer from @r_behata still stands as the solution.  

 

Jim

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 13 replies
  • 4253 views
  • 3 likes
  • 3 in conversation