Hi Experts,
I tried to run this code and got an out-of-memory error (connect timeout exception):
proc sql;
   create table bda.temp as
   select a.*,
          (today() - input(substr(date, 1, 10), yymmdd10.)) / 365 as age
   from bda.aaa as a;
quit;
Now I want to try doing it with pass-through to Hadoop, building the age calculation with Hive functions.
Can someone share some example code with me? I'm pretty new to this kind of feature.
Best Regards
Before you invest too much time in it, can you connect at all? If you can't connect via a Libname, then it's unlikely that you will be able to connect using explicit pass-through SQL. I think the connection issue is the first priority. A timeout could just be a lack of resources, so connecting should be tried at different times.
That said, explicit pass-through SQL is often a better option with Hadoop than regular Proc SQL with a Libname, so let me dig up some code for you.
Jim
Hi,
Thanks for helping me 🙂
When I'm using the Libname with simple SQL (without the calculated age column shown in my previous example), it works fine.
Attached is the error from the Proc SQL query.
Best Regards
Thank you for posting your log. I was thinking that, since it was just text, you could copy it into your reply, but a photo works.
So, a simple SQL command works but a more complex one does not? Fascinating.
Do you see the "NOTE:"? It says "Validate the contents of the Hadoop configuration file and ensure user permissions are correct." I suspect that this is the problem, i.e., a permissions issue. You should definitely talk to your Data Architecture or Hadoop DBA group (or whatever it is called at your workplace), describe the problem, and have them look into it.
In the meantime, though, now that we've diagnosed the "connection" problem a bit (it actually looks more like a permissions problem now that I've seen your log), it's time to try the explicit pass-through SQL. Part of why I had you set your options to:
OPTIONS SASTRACE = ",,,ds";
OPTIONS SASTRACELOC = SASLOG;
OPTIONS NOSTSUFFIX;
OPTIONS MsgLevel = I;
was so that we could see what SAS was doing behind the scenes. If you scroll up a bit, you'll see that SAS is trying to create some kind of intermediate work table, and failing. The simple count query probably doesn't need any intermediate tables, so it runs fine, but when SAS determines that an intermediate table is required (as in this case), your query fails. This is consistent with a permissions problem.
Here is where explicit pass-through SQL comes in really handy. When using regular Proc SQL, we don't have much choice about what SAS passes to Hadoop. With explicit pass-through, however, we have a great deal more control. So now, please try the explicit pass-through code that I gave you. First, let's get the simple query I gave you working; then let's try using date functions. Note, however, that SAS date functions and Hadoop date functions may differ slightly. Even if we get the explicit pass-through code working, you should still get the permissions issue taken care of so that you can use regular Proc SQL and SAS functions instead of having to figure out Hadoop functions and adjust for their differences. SAS has already identified and compensated for the differences in date functions when using Proc SQL with a Hadoop Libname, so we will want to use that already tested and debugged code from SAS.
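To give you a head start on the date logic, here is a minimal sketch of what the age calculation might look like inside a pass-through query. Treat it as an assumption-laden illustration, not tested code: it assumes the &Connect macro variable from the code I posted, that the Hive table is named aaa in the schema you connect to, and that your date column is a string starting "yyyy-mm-dd" (mirroring your original SUBSTR/INPUT logic):
PROC SQL;
   CONNECT TO HADOOP AS Hdp_Conn (&Connect);
   /* DATEDIFF(end, start) returns the number of days between two dates. */
   /* CURRENT_DATE requires Hive 1.2+; on older releases, substitute     */
   /* TO_DATE(FROM_UNIXTIME(UNIX_TIMESTAMP())).                          */
   /* "date" is a reserved word in HiveQL, hence the backticks below.    */
   CREATE TABLE WORK.Temp AS
   SELECT *
   FROM CONNECTION TO Hdp_Conn (
      SELECT a.*,
             DATEDIFF(CURRENT_DATE, TO_DATE(SUBSTR(a.`date`, 1, 10))) / 365 AS age
      FROM aaa a
   );
   DISCONNECT FROM Hdp_Conn;
QUIT;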
Please try the explicit pass-through code and let me know how it goes.
Jim
OK, here's the first bit of code. This is a Libname statement. Yes, I know you want explicit pass-through code; I'll get to that in a minute. First, I'd like to see what's going on with your connection.
OPTION SASTRACE=',,,ds' SASTRACELOC=saslog nostsuffix;
%MACRO Allo_Hadoop(Libname, Host, Port, Schema, User, Password, Subprotocol=hive2);
   LIBNAME &Libname HADOOP
      SERVER      = &Host
      PORT        = &Port
      SCHEMA      = &Schema
      USER        = &User
      PASSWORD    = "&Password"
      SUBPROTOCOL = &Subprotocol
      PROPERTIES  = "mapred.job.queue.name=root.ctm_q1.dev_sq1";
%MEND Allo_Hadoop;
%Allo_Hadoop(&Lib, Host=apvrp817, Port=10486, Schema=Rsc1, User=&SysuserID, Password=&Password);
Can you post your Libname as well? And exactly which error are you getting? Would you please post the log?
And then some very simple SQL. Yes, this is very, very simple SQL, but if it doesn't work with simple SQL, it won't work with complex SQL. So please try running the SQL below (after you've allocated a Libname) and post the log with any errors. You'll need to specify the Libname and the table name. Also, please set these options before you run your Libname or SQL code:
OPTION SASTRACE=',,,ds' SASTRACELOC=saslog nostsuffix ;
%LET Lib = XXXXXX;
%LET Table = YYYYYY;
PROC SQL;
   SELECT COUNT(1) AS Row_Count FORMAT=COMMA22.
      INTO :Row_Count
   FROM &Lib..&Table;
QUIT;
%PUT NOTE: &Lib..&Table rows=%QCMPRES(&Row_Count);
Please try a Libname and the count SQL above, and post the logs for both. Let's see if we can figure out what's going on with the connection.
Now, let me go dig up some explicit pass-through code.
Jim
Now, here's some explicit pass-through code. Please get the Libname/regular Proc SQL code I posted previously working first, and then let's try this code:
OPTIONS SASTRACE = ",,,ds";
OPTIONS SASTRACELOC = SASLOG;
OPTIONS NOSTSUFFIX;
OPTIONS MsgLevel = I;
**------------------------------------------------------------------------------**;
%LET Conn = Hdp_Conn;
%LET SAS_Eng = HADOOP;
%LET Engine = MR;
*%LET Engine = Spark;
*%LET Engine = Tez;
%LET W_Queue = root.analyt_sq1;
%LET L_Queue = root.dev_sq1;
**------------------------------------------------------------------------------**;
%IF %QUPCASE(&SAS_Eng) = HADOOP %THEN
%DO;
%LET Type = Lnx_JDBC;
%LET Connect = %NRSTR(SERVER = &Host
PORT = &Port
SCHEMA = &Schema
USER = &SysuserID
PASSWORD = "&Password"
SUBPROTOCOL = Hive2);
%END;
%ELSE
%DO;
%LET Type = Win_ODBC;
%LET Connect = %NRSTR(NOPROMPT = "uid=&SysuserID;
pwd=&PASSWORD;
host=&host;
dsn=&DSN;
schema=&Schema;
port=&port;
authmech=3");
%END;
**------------------------------------------------------------------------------**;
PROC SQL;
CONNECT TO &SAS_Eng AS &Conn (&Connect);
EXECUTE(set hive.resultset.use.unique.column.names = FALSE) BY &Conn;
%IF %QUPCASE(&SAS_Eng) = HADOOP %THEN
%DO;
EXECUTE(set mapred.job.queue.name = &L_Queue) BY &Conn;
%END;
%ELSE
%DO;
EXECUTE(set mapred.job.queue.name = &W_Queue) BY &Conn;
%END;
EXECUTE(set hive.execution.engine = &Engine) BY &Conn;
EXECUTE(set mapred.job.name = &Type._Mems_Mem_Prov_Rel) BY &Conn;
CREATE TABLE WORK.Provs_Mem_Prov_Rel AS
SELECT *
FROM CONNECTION TO &Conn (
SELECT COUNT(1)
FROM &Schema..ALL_MEMBERS
);
DISCONNECT FROM &Conn;
QUIT;
The explicit pass-through code above may look a little complicated, but it's really not too bad. I connect from different machines, so I use macro variables to control the parameters depending on which machine I'm running on. Many of these macro variables can be left as is, but you'll need to add %LET statements (see the example after this list) to set the values of:
- Password
- Host
- Schema
- DSN
- Port
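For example, with placeholder values only to show the form (your DBA will supply the real ones):
%LET Password = XXXXXXXX;                      /* May be unnecessary if Kerberos handles authentication */
%LET Host     = your-hive-server.example.com;  /* HiveServer2 host name                                 */
%LET Schema   = Rsc1;                          /* Hive schema (database) to query                       */
%LET DSN      = YourHiveDSN;                   /* Only used by the Windows ODBC branch                  */
%LET Port     = 10000;                         /* 10000 is the common HiveServer2 default               */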
But first, let's get your connection working with a Libname and regular Proc SQL; then we'll try the explicit pass-through code.
Jim
Hi Jim,
Our DBA needs to send me those parameters first:
- Password
- Host
- Schema
- DSN
- Port
After that, I will run the simple query and let you know what's going on.
Thanks for helping me 🙂
You're welcome. Not every one of those parameters will be necessary in every environment. Some environments, for example, are set up not to use passwords but to rely on Kerberos (or something similar) for security.
Many of those parameters may already be on your Libname. If one of the parameters I listed is not on the Libname, you may not need it in your environment. If the DBA takes a long time to get back to you, you could try connecting with whatever parameters you have and see what happens; a minimal example follows below.
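For example, a bare-bones Libname along these lines (host, port, and schema are placeholders) is sometimes enough to test connectivity, particularly where Kerberos handles authentication and no user/password is needed:
LIBNAME TryHdp HADOOP
   SERVER      = "your-hive-server.example.com"
   PORT        = 10000
   SCHEMA      = Rsc1
   SUBPROTOCOL = hive2;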
Jim