Hi Guys,
I am trying to connect to a Cloudera cluster and run a query there by using an execute statement:
libname sas2 hadoop
uri='.........'
schema="data_analyst"
SERVER="1***.**.**.***"
TRANSCODE_FAIL=SILENT
DBMAX_TEXT=50;
libname vantage hadoop
uri='.........'
schema="vantage"
SERVER="***.**.**.***"
TRANSCODE_FAIL=SILENT
DBMAX_TEXT=50;
proc sql;
connect to HADOOP
(
SERVER="***.**.**.***"
uri='......'
);
execute(
create table sas.T0 as
SELECT
A.var1, A.var2,
B.var3,B.var4,
FROM
sas2.data_set_1 as A left join vantage.data_set_2 as B on A.column1=B.column2) by hadoop;
disconnect from hadoop;
quit;
The problem is that (since I want to "execute" in Cloudera) I cannot use the name sas2 as a Libref for the first schema. I must give it the name "data_analyst" which is not a valid SAS Libname... . So I cannot access/use the dataset sas2.data_set_1 in the query above. Any solution other than renaming the Cloudera schema "data_analyst" to something less than 9 characters ?
Hi LinusH and thank you for your help!
if I let "sas2" as the name of the "data_analyst" schema found in Cloudera, then upon using a dataset of that schema (sas2.dataset1) in an execute statement the following error is produced: "[Error 10072]: Database does not exist: sas2" or "["Error 10001]: Dataset does not exist: dataset1"
But these problems disappear when the name of the libref is the same as the name of the schema..
thanks again
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.