Passing SAS Functions to Hadoop

Contributor
Posts: 23
Passing SAS Functions to Hadoop

Hi all,

I was wondering if anyone has any good links on what is pushed down to Hadoop when the Hadoop engine is used. I am using SAS 9.4.

Also, is the processing only carried out in the Hadoop cluster when PROC SQL is used?

Is anything pushed down from the DATA step?

Many thanks


Accepted Solutions
Solution
‎05-06-2014 08:39 AM
Super User
Posts: 5,441

Re: Passing SAS Functions to Hadoop

The functions that SAS/ACCESS can pass to Hadoop are listed in the documentation:

http://support.sas.com/documentation/cdl/en/acreldb/66787/HTML/default/viewer.htm#p010gv8z4k5kvin1om...

Conceptually, processing in Hadoop works much the same as with any other RDBMS accessed through SAS/ACCESS. This is described quite clearly in the documentation as well.

The same goes for the DATA step, where, in short, the following can be passed down:

  • The WHERE clause (any function used must be supported, as described above)
  • Implicit sorting, if BY is used

Use options sastrace=',,,d' sastraceloc=SASLOG; to see what SQL the ACCESS engine issues against Hive.
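
As a rough illustration of the tracing tip above, here is a minimal sketch. The server name, user, table (`sales`) and columns (`region`, `amount`) are made-up placeholders, and whether `UPCASE` is passed down in your release should be checked against the documented function list. `NOSTSUFFIX` only shortens the trace lines in the log.

```sas
/* Write the SQL that the SAS/ACCESS engine generates to the SAS log. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* Hypothetical connection values; replace with your own. */
libname myhdp hadoop server="myHadoop" user=myuser;

/* DATA step: the WHERE clause is a candidate for pushdown to Hive, */
/* provided every function in it is on the supported list.         */
data work.subset;
   set myhdp.sales;                  /* hypothetical Hive table */
   where upcase(region) = 'EMEA';    /* hypothetical column     */
run;

/* PROC SQL through the libref (implicit pass-through). */
proc sql;
   create table work.totals as
   select region, sum(amount) as total_amount
   from myhdp.sales
   group by region;
quit;

/* Switch tracing off again. */
options sastrace=off;
```

After running it, look in the log for the statements sent to Hive and check whether the WHERE and GROUP BY clauses appear there or only a plain SELECT.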

Data never sleeps


All Replies
Contributor
Posts: 23

Re: Passing SAS Functions to Hadoop

Hi LunusH,

That is great, many thanks.

Super User
Posts: 5,441

Re: Passing SAS Functions to Hadoop

I just found this about the DATA step (pre-production...):

SAS(R) 9.4 In-Database Products: User's Guide, Third Edition

Data never sleeps
Contributor
Posts: 23

Re: Passing SAS Functions to Hadoop

Hi, that is really useful.

SAS Employee
Posts: 215

Re: Passing SAS Functions to Hadoop

Hi,

You can make SAS tell you which functions are eligible for implicit pass-through (to Hive). Here is an example (off the top of my head so it may just be close):

libname myhdp hadoop server=myHadoop sql_functions_copy=work.hadoop_functions;

This will create a SAS data set (in WORK) that lists the functions available for implicit pass-through. It also shows which Hive functions they map to. Keep in mind that implicit pass-through must be invoked for the functions to be passed to Hadoop.
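
A small sketch of how the resulting data set might be inspected, assuming the LIBNAME statement above ran successfully and created `work.hadoop_functions`. The variable names in that data set are not assumed here, which is why the sketch lists them first.

```sas
/* List the variables in the function-list data set. */
proc contents data=work.hadoop_functions;
run;

/* Print the first rows to see the SAS functions and their Hive mappings. */
proc print data=work.hadoop_functions(obs=20);
run;
```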

When you are looking at your SASTRACE=',,,d' output, keep in mind that it matters most that the functions in a WHERE clause are passed down. If SAS has to process them itself (post-processing), the entire contents of the Hive table may be returned to SAS first.
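
One way to reduce that risk, sketched here under assumptions rather than as a recommended pattern: filter first with a simple comparison that can go to Hive, then apply any function that is not on the pass-through list to the smaller local copy. The `myhdp` libref is the one from the example above. The table `sales` and the columns `sale_year` and `customer_name` are hypothetical, and `PROPCASE` simply stands in for whatever function you need to run in SAS.

```sas
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* Step 1: a plain comparison in the WHERE clause, so the filter can run in Hive. */
data work.recent;
   set myhdp.sales;               /* hypothetical Hive table      */
   where sale_year >= 2014;       /* hypothetical numeric column  */
run;

/* Step 2: apply the function in SAS, on the reduced local data set only. */
data work.recent_clean;
   set work.recent;
   customer_name = propcase(customer_name);   /* hypothetical character column */
run;
```

The trace output for step 1 should show whether the WHERE clause actually reached Hive.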
