12-10-2015 05:30 PM
I am currently researching how SAS can be installed on the AWS cloud. In theory, SAS (i.e. SAS Studio via localhost) could be installed on an AWS EMR instance, since EMR requires an associated EC2 Linux instance, plus EBS for any other configuration that is required. This would enable SAS to have the benefits of SAS/ACCESS to Hadoop. Is this possible? Has anybody successfully trialled this? Would SAS on the EMR instance be able to connect to SAS Grid and the SAS Metadata Server?
It would be great to have a SAS setup on EMR.
12-11-2015 10:33 AM
I know we are doing a bunch of testing here at SAS on connecting to AWS EMR as well as lots of other cloud scenarios and are having excellent results. I would also love to hear from others who are trying this - especially with what you want to do since that helps us prioritize what we work on. I'll see if I can find resources to post in the meantime.
For those of you not familiar with the basics of AWS and how SAS fits in, https://www.youtube.com/watch?v=XHaLA7JSQyU is a great overview.
12-14-2015 03:52 PM
While you may be able to run most SAS products on AWS EC2 instances, doing so on EMR instances is likely not the best way of doing it (assuming it is even possible).
First of all, the operating system that the EMR instances use is not necessarily one that is officially supported by SAS.
Secondly, it's usually better to run SAS on a server that is dedicated to SAS. In this way, you can configure it to have optimal performance for SAS processing. It's likely that the EMR instances are optimized for running Hadoop, not SAS.
Moreover, running SAS and Hadoop on the same servers means they may end up fighting for that server's resources (CPU, I/O, memory, etc.). Although you can allocate these resources through workload management, that is a more advanced setup, and the effort is hard to justify in the situation you describe.
Finally, running SAS on the same server as Hadoop does nothing to "enable SAS to have the benefits of SAS/ACCESS to Hadoop". The only criterion that matters is whether or not you have licensed and installed SAS/ACCESS to Hadoop on your SAS server. Whether SAS and Hadoop are on the same machine or on separate machines has no bearing here.
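To illustrate that last point: what a separate SAS server needs is network access to the Hadoop cluster, not co-location. Below is a minimal sketch of a reachability check you could run from the SAS server, assuming the connection goes to HiveServer2 on its default port 10000. The hostname is hypothetical, the `can_reach` helper is just for illustration, and your cluster may use a different port.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example call (hypothetical hostname; 10000 is HiveServer2's default port):
#   can_reach("emr-master.example.internal", 10000)
```

A True result only shows that the port is open from the SAS server. It says nothing about licensing, credentials, or whether the Hadoop distribution is supported.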
Another item of note is that EMR is currently not on the list of supported Hadoop distributions for SAS, so your results may vary when going against this particular distribution of Hadoop.
I hope this helps answer some of your questions.