
Hive UDF not required?

Occasional Contributor
Posts: 7
When accessing Hive tables through a LIBNAME statement, we can run a lot of computation-intensive tasks on the tables, such as PROC FREQ and PROC MEANS.

As I understand it, Hive traditionally doesn't have out-of-the-box functionality for all such tasks, so Hive developers create custom UDFs in Python or Java and register them in Hive to use them in their Hive queries. Does that mean that, as a SAS developer, I don't need to learn Python because creating UDFs is not required? I'm new to the Hadoop world, so apologies if my question doesn't make sense.


Accepted Solutions
Solution
2 weeks ago
SAS Employee
Posts: 31

Re: Hive UDF not required?

Posted in reply to akpattnaik

As a SAS developer, you can use the SAS language and, as shown in the course, SAS/ACCESS converts some of it into HiveQL so that the processing is done in Hadoop. However, if you need to do something that SAS/ACCESS does not convert into a HiveQL equivalent, and you need the processing to happen in Hadoop rather than having the data brought back to SAS, then you would need another technique that does not involve SAS (a Hive UDF, for example). In other words, some SAS language elements are converted into HiveQL, but many are not.
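To make the "another technique that does not involve SAS" case concrete, here is a minimal sketch of the Python route mentioned in the question. It uses Hive's streaming `TRANSFORM ... USING` mechanism rather than a registered Java UDF: Hive pipes rows to the script as tab-separated text and reads tab-separated rows back. The script name, the `customers` table, and the `id`/`email` columns are all made up for illustration, so treat this as a sketch under those assumptions rather than a recommended design.

```python
#!/usr/bin/env python3
"""Hive streaming script: emits (id, email_domain) for each (id, email) row.

Hypothetical usage from Hive (script, table and column names are made up):
    ADD FILE email_domain.py;
    SELECT TRANSFORM (id, email)
           USING 'python3 email_domain.py'
           AS (id, domain)
    FROM customers;
"""
import sys

HIVE_NULL = r"\N"  # how Hive represents NULL in streamed rows


def transform_line(line):
    """Turn one tab-separated input row into one tab-separated output row."""
    row_id, email = line.rstrip("\n").split("\t")
    if email == HIVE_NULL or "@" not in email:
        domain = HIVE_NULL  # pass NULL through for missing or malformed emails
    else:
        domain = email.rsplit("@", 1)[1].lower()
    return f"{row_id}\t{domain}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive feeds one row per line on stdin and collects rows from stdout.
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


# Only process input when rows are piped in, not when run interactively.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

The work here happens inside the Hadoop cluster because Hive launches the script next to the data. If the same logic can be written in a form that SAS/ACCESS passes down as HiveQL, no script like this is needed.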

 

Apart from the LIBNAME method (SAS/ACCESS to Hadoop), it is also possible to license other SAS technologies, such as SAS HPA, SAS LASR, and the Code Accelerator for Hadoop, that perform additional SAS-language-based processing in Hadoop. This goes beyond what you can do with the SQL-based processing of HiveQL and takes you into the realm of statistical modeling, data mining, machine learning, and DATA step-like processing (DS2) in Hadoop. An overview is given in chapter 6 of the course, and some of it is covered in other courses in the Data Science academy as well.




☑ This topic is solved.

