msd83
Calcite | Level 5

Hi all,

I was wondering if anyone has any good links on what is pushed down to Hadoop when the Hadoop engine is used. I am using SAS 9.4.

Also, is the processing only carried out in the Hadoop cluster when PROC SQL is used?

Is anything pushed down to Hadoop from the DATA step?

Many thanks

Accepted Solution
LinusH
Tourmaline | Level 20

The functions supported by SAS/ACCESS are listed in the documentation:

http://support.sas.com/documentation/cdl/en/acreldb/66787/HTML/default/viewer.htm#p010gv8z4k5kvin1om...

The concept of pushing processing down to Hadoop is pretty much the same as for any other RDBMS. The documentation describes this quite clearly as well.

The same goes for the DATA step. In short, what gets pushed down is:

  • The WHERE clause (any function used must be supported, as described above)
  • Implicit sorting if BY is used

Use options sastrace=',,,d' sastraceloc=SASLOG; to see what SQL the ACCESS engine issues against Hive.
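As a rough sketch of how the trace could be used with a DATA step: the server name, the libref myhdp, the table sales, and the column region below are all made up for illustration, and your LIBNAME statement will likely need further connection options for your site.

```sas
/* Write the SQL that the ACCESS engine generates to the SAS log. */
/* NOSTSUFFIX trims the trailing trace details to keep the log readable. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* Hypothetical connection; server name is an assumption. */
libname myhdp hadoop server=myHadoop;

/* The WHERE statement is a candidate for being passed to Hive. */
data work.subset;
   set myhdp.sales;          /* hypothetical Hive table */
   where region = 'EMEA';    /* hypothetical column */
run;

/* Switch tracing off again when done. */
options sastrace=off;
```

If the Hive query shown in the log contains the WHERE condition, the filtering was done in Hadoop rather than in SAS.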

Data never sleeps


5 REPLIES
msd83
Calcite | Level 5

Hi LinusH,

That is great, many thanks.

LinusH
Tourmaline | Level 20

Just found this about the DATA step (pre-production...):

SAS(R) 9.4 In-Database Products: User's Guide, Third Edition

msd83
Calcite | Level 5

Hi, that is really useful.

JBailey
Barite | Level 11

Hi,

You can make SAS tell you which functions are eligible for implicit pass-through (to Hive). Here is an example (off the top of my head so it may just be close):

libname myhdp hadoop server=myHadoop sql_functions_copy=work.hadoop_functions;

This will create a SAS data set (in WORK) that lists the functions available for implicit pass-through. It also shows which Hive functions they map to. Keep in mind that implicit pass-through must be invoked for the functions to be passed to Hadoop.
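To look at that data set, something like the following should do. It assumes the LIBNAME statement above has already run and created work.hadoop_functions; the sketch makes no assumption about the column names, so PROC CONTENTS is used first to show them.

```sas
/* Show the structure (column names and types) of the function list. */
proc contents data=work.hadoop_functions;
run;

/* List the functions eligible for implicit pass-through. */
proc print data=work.hadoop_functions;
run;
```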

When you are looking at your SASTRACE=',,,d' output, keep in mind that it matters most that functions in a WHERE clause pass down. If SAS has to post-process them (that is, evaluate the functions itself), the entire contents of the Hive table could be returned to SAS.
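Here is a sketch of how you might check this for one function. The server name, libref, table, and column are hypothetical, and whether UPCASE is actually passed down at your site is something to confirm in the trace or in the function list, not something this example guarantees.

```sas
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* Hypothetical connection; server name is an assumption. */
libname myhdp hadoop server=myHadoop;

/* Implicit pass-through: SAS tries to translate this query to Hive SQL. */
proc sql;
   create table work.emea as
   select *
   from myhdp.sales                  /* hypothetical Hive table */
   where upcase(region) = 'EMEA';    /* function used in the WHERE clause */
quit;

options sastrace=off;
```

If the query in the log still has the WHERE condition with the function in it, Hive did the filtering. If the log shows a plain SELECT with no WHERE, SAS is pulling the whole table back and filtering it locally.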
