Zeroster
Calcite | Level 5

Hi all,

 

My company is thinking of implementing a data warehouse for BI report generation with advanced analytics in mind.

 

We are thinking of using SAS as the data warehouse because of the analytics it provides and the support available (especially for security).

However, Hadoop seems to be the usual choice for a data warehouse or data lake. I am worried about its security, though, considering it is still maturing software and we don't have a dedicated Hadoop/server developer yet.

 

So my question is: would SAS be okay to use as a data warehouse? We don't have a large amount of data, only a few hundred GB, and it is quite structured. But we are looking to bring in unstructured data (social media) in the future.

 

From my rather limited knowledge, best practice seems to be to use Hadoop as the data lake and SAS to process the data. But I don't know if this is correct.

 

If you could part with your knowledge and experience to help this lost one out, that will be much appreciated!

Accepted Solutions
SASKiwi
PROC Star

Hadoop is really designed for managing large amounts of data - terabytes not gigabytes. IMO if your data requirements are modest, say in the gigabyte range, then Hadoop isn't really required for good performance. You would be adding extra complexity with Hadoop for small performance improvements.

 

In my company we manage a small datamart of 2-3 terabytes in total. We just use SAS, including SAS Visual Analytics (VA), and performance is still great. If we grew to over 5 terabytes, then Hadoop might be worth considering.

3 REPLIES 3
Patrick
Opal | Level 21

@Zeroster

From the link below: "When you use SAS with Hadoop, you combine the power of analytics with the key strengths of Hadoop".

http://support.sas.com/documentation/cdl/en/hadoopov/68100/PDF/default/hadoopov.pdf

 

You've got the perfect question for contacting your local SAS office. I'm sure they'll be more than happy to support you in your decision process.

 

Zeroster
Calcite | Level 5

Many thanks, guys, for your replies.

 

My main stumbling block in making a decision was whether not using Hadoop would mean a relatively inefficient implementation. I had a feeling that the volume of data my company is dealing with would not benefit much from Hadoop (especially considering ROI, as we would need to recruit the necessary IT people for it).

 

Thanks, SASKiwi, for sharing your experience. It helps put my mind at ease for when questions come flying in about "why aren't you using Hadoop?".

 

@Patrick: Thank you for the link. I will be sure to read up on that as well! Although it seems to just explain the benefit of using Hadoop and SAS...
