minimum requirement to run sas with hadoop

Super Contributor
Posts: 490

We have an existing SAS 9.4 installation up and running.

What is the minimum requirement to import and export data between SAS and Hadoop? Is that possible without purchasing any additional SAS components, or must I have SAS/ACCESS for Hadoop?

Do I need one of the Hadoop distributions such as Cloudera, or can I run plain Apache Hadoop on commodity hardware and connect to it from SAS?

This is not for a production environment; it is just for testing and for understanding how SAS and Hadoop work together. I know there are many considerations when planning for Hadoop, but I am asking about the simplest possible connection and interaction.

Also, is this enough to run the transformations in DIS?

I am not looking for documentation right now, just short comments and advice from experience; I will go into the details later.

Super User
Posts: 5,255

Re: minimum requirement to run sas with hadoop

Too lazy, are we? ;)

Short answer: the file interface to HDFS (the FILENAME statement) and the SPD Engine for HDFS require only Base SAS.

For database queries against Hive, you need a SAS/ACCESS license.

Data never sleeps
Super Contributor
Posts: 490

Re: minimum requirement to run sas with hadoop

Thanks a lot

What about the supported Hadoop distributions? Must it be one of the commercial ones (Cloudera, Hortonworks, MapR, IBM, Pivotal), or will plain Hadoop running on Linux (a single-node cluster) work too? Are there any other restrictions?

With that in place, will the Hadoop transformations in DIS work, except of course the Hive one?

SAS Employee
Posts: 75

Re: minimum requirement to run sas with hadoop

The prereqs for the Hadoop transformations in DI Studio are described in this topic:

SAS(R) Data Integration Studio 4.9: User's Guide

Super User
Posts: 5,255

Re: minimum requirement to run sas with hadoop

Please, you could check this in the system requirements yourself just as well as we could.

What functionality is covered is described quite clearly under each product (Base, DI Studio, SAS/ACCESS).

SAS Employee
Posts: 5

Re: minimum requirement to run sas with hadoop

With just Base SAS, you can use the FILENAME statement to access data in HDFS, PROC HADOOP to interact with Hadoop data by running Apache Hadoop code, and the SPD Engine to write data, retrieve data, perform administrative functions and even update data in HDFS.
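
As a rough illustration of those three Base SAS routes, here is a minimal sketch. The directory paths, the HDFS locations, the `sasdemo` user, and the `hdp` libref are all placeholders made up for this example, and it assumes the Hadoop client JAR and configuration files have already been copied to the SAS machine. Whether each piece works depends on your SAS 9.4 maintenance release and your Hadoop distribution, so treat it as a starting point rather than a tested recipe.

```sas
/* Assumption: these directories hold the Hadoop client JARs and the      */
/* cluster configuration files (core-site.xml, hdfs-site.xml, ...).       */
/* Both paths are placeholders.                                           */
options set=SAS_HADOOP_JAR_PATH="/opt/hadoop/jars";
options set=SAS_HADOOP_CONFIG_PATH="/opt/hadoop/conf";

/* 1. FILENAME statement, Hadoop access method: write a text file to HDFS */
filename out hadoop '/user/sasdemo/test.txt' user='sasdemo';

data _null_;
   file out;
   put 'written from Base SAS';
run;

/* 2. PROC HADOOP: submit HDFS commands from a SAS session */
proc hadoop username='sasdemo' verbose;
   hdfs mkdir='/user/sasdemo/staging';
   hdfs copyfromlocal='/tmp/local.csv'
        out='/user/sasdemo/staging/local.csv';
run;

/* 3. SPD Engine: store a SAS data set in an HDFS directory */
libname hdp spde '/user/sasdemo/spde' hdfshost=default;

data hdp.class;
   set sashelp.class;
run;
```

None of this touches Hive; querying Hive tables is where SAS/ACCESS comes in.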

 

The version of SAS 9.4 determines which Hadoop distributions are supported. This site lists the SAS 9.4 supported Hadoop distributions for several SAS products, offerings, and technologies, including the Base SAS FILENAME statement, PROC HADOOP, and the SPD Engine:

 

SAS 9.4 Supported Hadoop Distributions

 

If you would like overview information for SAS and Hadoop technologies, use this document. Each overview tells you what the product is, what's required, and where to go for more detailed documentation. 


SAS and Hadoop Technology: Overview
