Zeroster
Calcite | Level 5

Hi all,

My company is thinking of implementing a data warehouse for BI report generation, with advanced analytics in mind.

We are thinking of using SAS as the data warehouse because of the analytics it provides and the support available (especially for security).

However, Hadoop seems to be the usual choice for a data warehouse/lake. I am worried about its security, though, considering it's still maturing software and we don't have a dedicated Hadoop/server developer yet.

So my question is: would SAS be okay to use as a data warehouse? We don't have a large data set, just a few hundred GB, and it is quite structured. But we are looking to bring in unstructured data in the future (social media).

From my rather limited knowledge, best practice seems to be to use Hadoop as the data lake and SAS to process the data. But I don't know if this is correct.

If you could share your knowledge and experience to help this lost one out, it would be much appreciated!

Accepted Solution
SASKiwi
PROC Star

Hadoop is really designed for managing large amounts of data: terabytes, not gigabytes. IMO, if your data requirements are modest, say in the gigabyte range, then Hadoop isn't really required for good performance. You would be adding extra complexity with Hadoop for small performance improvements.

In my company we manage a small datamart of 2-3 terabytes in total. We just use SAS, including VA, and performance is still great. If we grew to over 5 terabytes, then Hadoop might be worth considering.

Replies
Patrick
Opal | Level 21

@Zeroster

From the link below: "When you use SAS with Hadoop, you combine the power of analytics with the key strengths of Hadoop".

http://support.sas.com/documentation/cdl/en/hadoopov/68100/PDF/default/hadoopov.pdf

You've got the perfect question for contacting your local SAS office. I'm sure they'll be more than happy to support you in your decision process.

Zeroster
Calcite | Level 5

Many thanks, guys, for your replies.

My main stumbling block in making a decision was whether not using Hadoop would mean a relatively inefficient implementation. I had a feeling that, at the size of data my company is dealing with, we would not benefit much from Hadoop (especially considering ROI, as we would need to recruit the necessary IT people for it).

Thanks, SASKiwi, for sharing your experience. It helps put my mind at ease for when the questions come flying in about why we aren't using Hadoop.

@Patrick: Thank you for the link. I will be sure to read up on that as well! Although it seems to just explain the benefits of using Hadoop and SAS together...
