Hello,
I was wondering whether anyone has already used Azure as a data source, in a Data Lake approach, to create VA reports or run DI jobs. How was the performance? Was it good enough?
Thanks.
Your question is very vague. Azure has dozens of services that could be used as a data source: HDInsight, Blob Storage, SQL DB, Data Lake, ...
I think you are referring to the HDFS data storage called Azure Data Lake (ADL). So the short answer for ADL: no, you cannot access the ADL HDFS with SAS or any other non-Microsoft product out of the box. The reason is that ADL requires a user login before you can use the HDFS.
But with some custom programming you can access the data via the Microsoft API or command-line tools. It just won't work with the standard SAS/ACCESS Interface to Hadoop, and you have to download the data locally before you can process it.
For the second part of your question, about performance: performance depends on dozens of factors, so it is impossible to give you a general answer. The service itself is performant, but whether you can access it in a performant way depends heavily on your company's infrastructure.
It also depends on the Azure region where your data is stored, and on whether your SAS is running on premises or in the cloud, maybe even in the same data center.
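To make the infrastructure point a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It only estimates how long it would take to pull a data set down to the machine where SAS runs. The function name, the 50 GB data size, the link speeds, and the 0.7 efficiency factor are all assumptions for illustration, not measurements of Azure or of any real environment.

```python
def estimate_transfer_seconds(size_gb: float,
                              bandwidth_mbit_s: float,
                              efficiency: float = 0.7) -> float:
    """Rough time in seconds to move size_gb over a link of bandwidth_mbit_s.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually achieved (protocol overhead, shared links, throttling).
    """
    if size_gb < 0:
        raise ValueError("size_gb must not be negative")
    if bandwidth_mbit_s <= 0:
        raise ValueError("bandwidth_mbit_s must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    size_megabits = size_gb * 1000 * 8  # decimal GB -> megabits
    return size_megabits / (bandwidth_mbit_s * efficiency)


# Hypothetical scenarios: label -> nominal link speed in Mbit/s.
scenarios = {
    "SAS on premises, 100 Mbit/s internet link": 100,
    "SAS on premises, 1 Gbit/s dedicated link": 1000,
    "SAS in the same cloud region, 10 Gbit/s": 10000,
}

for label, mbit_s in scenarios.items():
    minutes = estimate_transfer_seconds(50, mbit_s) / 60
    print(f"{label}: about {minutes:.1f} min for 50 GB")
```

This ignores latency, the number of files, and any limits on the storage side, so treat the result as a lower bound and measure a real transfer in your own environment before drawing conclusions.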
I hope this gives you some insight and points you in the right direction.
Thanks AndreasMenrath!
One more addition to the topic:
For big data scenarios, you should consider taking a look at the big data solution from SAS: SAS Viya. It runs in a public or private cloud, or on premises.
If you are in a Microsoft-focused company, you might want to evaluate Azure Data Factory to push your data from Azure into a location (on premises or cloud) where your SAS environment can read it.
Hi @fetcs74,
Are you interested in using SAS with Azure HDInsight? Can you tell me your company name so that I can keep a record of this request?
Best wishes,
Jeff