
Access data from Hadoop with SAS

Occasional Contributor
Posts: 18

Access data from Hadoop with SAS

Hi,

I read that SAS Data Loader for Hadoop includes the following four solutions:

  • SAS Data Loader
  • SAS/ACCESS Interface to Hadoop
  • SAS In-Database for Hadoop
  • SAS Data Quality Accelerator for Hadoop

My question is: to access and extract data from Hadoop, do we need all four components, or can we just use SAS Data Loader for Hadoop?

I ask because I'm studying how SAS can work with data from Hadoop, and I have already seen that I can:

  • extract directly from HDFS (using Base SAS)
  • use SAS Data Loader for Hadoop
  • use SAS Scalable Performance Data Server
  • use SAS LASR
  • use SAS In-Database Products

I don't know if I'm thinking correctly...



All Replies
Super User
Posts: 5,429

Re: Access data from Hadoop with SAS

Posted in reply to Rodgers_125

I think your scope is somewhat complex.

This means that you need to

  • explain exactly what license you have
  • what data you have
  • what you wish to do with it
  • and where

Bottom line is that you may need on-site guidance on how to architect your environment. I think SAS should assist you with this (especially on how to use a subset of the components in the "Loader" product compared to the separate modules). Or are you trying to get a second opinion?

Data never sleeps
Occasional Contributor
Posts: 18

Re: Access data from Hadoop with SAS

Hi LinusH,

I'm just doing research on how SAS could extract some insights from Hadoop. I don't have any SAS license at this time, because this is just a research program for my Master's thesis.
I have a large amount of data in Hadoop (some large files in HDFS), and I created some new tables with Hive to segment the data and reduce its volume.
What I want now is to evaluate the options for exploring my data with SAS to extract some insights. For that I need to access the data in Hadoop using SAS (that's why I asked all the questions above: I'm seeing a number of options with the same goal). I have already read that I can extract the data directly from HDFS to SAS (I don't know what the prerequisites are) or I can use SAS Data Loader for Hadoop.

If you have a large amount of data in HDFS, Hive or HBase, which solution do you use to extract it into SAS? Or is it a better option to read directly from Hadoop via SAS?

I hope I have explained it better.

Thanks for your help!

Solution
‎04-27-2016 07:22 AM
Super User
Posts: 5,429

Re: Access data from Hadoop with SAS

Posted in reply to Rodgers_125

Data Loader for Hadoop is a package, so there is some overlap in functionality.

Data Loader is a single-user environment in the current release, but that shouldn't be a problem for you?

If you have existing data in Hive, the basic license is SAS/ACCESS for Hadoop.

Then you can use SQL and other SAS tools to extract and analyze the data.

Then there's the question of how you wish to analyze/browse the data - these requirements are needed to choose the appropriate SAS tools.
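
As a rough illustration of the SAS/ACCESS route described above, a session could look something like the sketch below. Everything specific in it is a placeholder invented for the example (host name, port, schema, user, password, paths and the table name my_segment), and the exact options you need depend on your SAS release and Hadoop distribution, so treat it as a starting point rather than a verified setup.

```sas
/* Placeholder paths: where the Hadoop client JARs and cluster     */
/* configuration files were copied on the SAS machine (assumed).   */
options set=SAS_HADOOP_JAR_PATH="/opt/sas/hadoop/jars";
options set=SAS_HADOOP_CONFIG_PATH="/opt/sas/hadoop/conf";

/* Assign a libref to a Hive schema through SAS/ACCESS to Hadoop.  */
/* Server, port, schema and credentials are hypothetical values.   */
libname hdp hadoop server="hive-host.example.com" port=10000
        schema=default user=myuser password="XXXXXXXX";

/* Copy a Hive table (hypothetical name) into a SAS data set in WORK. */
proc sql;
   create table work.segment as
   select *
   from hdp.my_segment;
quit;

/* Release the connection when done. */
libname hdp clear;
```

Once the libref is assigned, the Hive tables can be referenced like any other SAS library member, which is what makes the Hive metastore usable from SAS procedures.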

Data never sleeps
Occasional Contributor
Posts: 18

Re: Access data from Hadoop with SAS

LinusH,

Sorry, just one more question:

If I said:

If we want to store the data on our SAS machine, we can use SAS/ACCESS for Hadoop or just Base SAS. If we want to analyze and explore the data in a SAS application (like SAS Visual Analytics), we use SAS SPDS because it's included in its structure.

Is this thinking wrong?

Super User
Posts: 5,429

Re: Access data from Hadoop with SAS

Posted in reply to Rodgers_125

"Store the data into our SAS machine we can use SAS/ACCESS for Hadoop or just Base (SAS)."

If you want to benefit from the Hive metastore, you need SAS/ACCESS to Hadoop.

"Analyze and explore the data in a SAS Application (like SAS Visual Analytics) we use SAS SPDS"

SPDS is a great SAS data store, but it's not required for Visual Analytics. That said, loading data to the LASR server might go faster if you use SPDS (compared to Base SAS data sets), given the same physical conditions.

Data never sleeps
☑ This topic is solved.
