
SAS Grid vs SAS with Hadoop

Frequent Contributor
Posts: 121

SAS Grid vs SAS with Hadoop

Hi,

 

I have to work with SAS in an environment with very large datasets, and we are considering different options to get good performance.

 

1) SAS Grid: we think it is a good option because we would not have to rewrite our SAS programs and the performance is good.

2) SAS and Hadoop: in this environment we can execute SAS programs in a distributed computing environment like Hadoop, but we are not sure if our SAS programs will need to be rewritten. Does it depend on the kind of data steps or procs?

 

Another question: can PROC DS2 and PROC FEDSQL be used in both environments?

 

Last question, about costs: we currently have SAS EG and Base SAS licences. For 1) we would have to buy SAS Grid, and for 2) we would have to buy SAS/ACCESS to Hadoop. In a first estimate, option 2) looks more economical. Are we right? Has anybody compared both?

 

Any advice will be greatly appreciated. 

 

Thanks in advance

 

Respected Advisor
Posts: 4,131

Re: SAS Grid vs SAS with Hadoop

@juanvg1972

You're asking a few big questions here. The answer as to what's right for you will depend very much on your current SAS usage, the problem you want to solve, and your strategy for future SAS usage.

 

SAS Grid and Hadoop are not mutually exclusive alternatives; they have different use cases and can very well both be part of the same architecture (I know of such topologies).

 

I'd suggest you contact your local SAS office and ask for their guidance/a proposal.

Frequent Contributor
Posts: 121

Re: SAS Grid vs SAS with Hadoop

Thanks Patrick,

 

My main doubt is whether my current SAS processes would need to be rewritten if I choose SAS/ACCESS to Hadoop. My processes are data management and analytical (data steps, common procedures and statistical procedures). My volume of data is going to grow, and it will probably keep growing in the future.

 

I also want to know whether SAS Grid can be as scalable as Hadoop.

 

Before contacting the SAS office I am gathering information. Has anybody had a similar experience?

 

If anybody can help me I will be grateful (sorry for my bad English).

Frequent Contributor
Posts: 121

Re: SAS Grid vs SAS with Hadoop


@Patrick

Patrick,

 

I would like to know the different use cases of the two architectures that you mentioned. Can you explain a little more?

 

Thanks 

Respected Advisor
Posts: 4,131

Re: SAS Grid vs SAS with Hadoop


@juanvg1972

How I see it:

A SAS Grid is about scalability, high availability and workload balancing; Hadoop is more about data storage/management and the data lake.
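To make the storage side concrete, here is a minimal sketch of how Base SAS typically reaches data held in Hadoop through a SAS/ACCESS Interface to Hadoop libref. The server name, schema, credentials and table names are made up for illustration, and it assumes the Hadoop client JARs and configuration files are already set up for the SAS session.

```sas
/* Hypothetical Hive server, schema, credentials and table names. */
libname hdp hadoop server="hive.example.com" port=10000 schema=default
        user=sasuser password="XXXXXXXX";

/* Write the SQL that SAS sends to Hive into the SAS log. */
options sastrace=',,,d' sastraceloc=saslog nostsuffix;

/* PROC SQL: SAS tries to pass the query down to Hive (implicit pass-through). */
proc sql;
  create table work.region_totals as
  select region, sum(amount) as total_amount
  from hdp.transactions
  group by region;
quit;

/* DATA step: the WHERE filter can be passed to Hive, but the rows that
   qualify are transferred to the SAS server and processed there. */
data work.big_orders;
  set hdp.transactions;
  where amount > 1000;
run;
```

Code like this can often point at the Hadoop libref without being rewritten, but how much of the work actually runs inside Hadoop depends on the step. The SASTRACE output in the log is the place to check what was passed down.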

 

If you're considering architectural changes to your current SAS platform that will also have license implications, then you should really start talking to your local SAS office so that you can make an informed decision and end up with something that's right for you.

Super User
Posts: 3,233

Re: SAS Grid vs SAS with Hadoop

I think it would be helpful to provide more details on the problems you are having processing large datasets in your current environment. How big are these and how long is it taking to process them? What improvement are you aiming for? Have you investigated other options for speeding up processing like dataset compression, SPDE, SPD Server, storage hardware and so on?
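As a rough illustration of two of those options, the sketch below writes a compressed copy of a dataset and then stores the same data in an SPD Engine (SPDE) library spread over several disks. All paths and dataset names are hypothetical, and whether either option helps depends on your data and hardware.

```sas
/* Hypothetical dataset names and file system paths. */

/* Option 1: dataset compression. The log reports how much the size changed. */
data work.sales_c (compress=binary);
  set work.sales;
run;

/* Option 2: SPD Engine library with data and index files on separate paths,
   which lets SAS read partitions of the table in parallel threads. */
libname spd spde '/sasdata/spde/meta'
        datapath=('/disk1/spde' '/disk2/spde')
        indexpath=('/disk3/spde_idx');

data spd.sales;
  set work.sales;
run;
```

Both need only Base SAS, so they can be tried on the current environment before any new license is bought.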

 

I think you may be limiting your options by just focusing on SAS Grid and Hadoop.

Super User
Posts: 5,382

Re: SAS Grid vs SAS with Hadoop

Just want to add that Grid is not necessarily good at handling large data sets. It can be considered if you have many queries and wish to scale up (and down?) easily. So if your use case is a few resource-heavy jobs, Grid might not be your fit.
Also, Grid is available with Hadoop by using YARN. Best of both worlds? I don't know, since I haven't seen any in-depth write-up of such a use case.
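A sketch of what that means in practice: on a grid, a program gains from parallelism mainly when it is split into independent pieces that can run in separate grid sessions. The example assumes SAS Grid Manager and SAS/CONNECT are licensed and configured; the application server name SASApp, the path and the dataset names are hypothetical.

```sas
/* Ask for SAS/CONNECT sessions to be started on the grid.
   SASApp is a hypothetical application server name. */
%let rc = %sysfunc(grdsvc_enable(_all_, server=SASApp));

options autosignon;   /* RSUBMIT signs on to a new session when needed */

/* Two independent steps submitted asynchronously to separate sessions. */
rsubmit task1 wait=no;
  libname perm '/sasdata/perm';          /* hypothetical path */
  proc means data=perm.sales_2016; run;
endrsubmit;

rsubmit task2 wait=no;
  libname perm '/sasdata/perm';
  proc means data=perm.sales_2017; run;
endrsubmit;

/* Wait for both tasks to finish, then close the sessions. */
waitfor _all_ task1 task2;
signoff _all_;
```

A single long DATA step or procedure submitted as one job still runs on one grid node, which is why a few very heavy jobs may see little benefit.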
Data never sleeps
PROC Star
Posts: 1,146

Re: SAS Grid vs SAS with Hadoop

This is a very big question, with major cost and performance implications.

 

Reaching out to find others who have gone down these roads is a very good step. However, you probably won't find anybody who can either explain the difference in a way that is relevant to your requirement, or who has implemented either of these alternatives in a way that will completely shed light on your situation.

 

As others have suggested, talking to your SAS office is a good idea, as they have the customer knowledge to find the most relevant comparables. But be careful that in any comparisons that come up, you're doing a true apples to apples comparison. This will be very difficult.

 

If SAS is in a position to make some of their resources available for benchmarking, using either your data or synthetic test data that simulates your environment, that may help towards making a decision.

 

Once you've reached the end of these steps, if you don't have a clear indication of which alternative is superior, you may have to undertake a proof of concept yourself. It will be difficult, but the consequences of selecting an option that won't meet your ongoing needs are worse!

 

Tom
