
How to Increase Speed to Analytical-Driven Decisions - SAS Viya with SingleStore

Started ‎04-11-2023 by
Modified ‎04-11-2023 by

Watch this Ask the Expert session to learn how to use the power of SAS Viya with SingleStore and SAS Cloud Analytic Services (CAS) to analyze data in SingleStore relational tables and reduce ETL data movements. 

 

Watch the webinar

 

You will learn:

  • How SAS Viya performs when accessing data directly in SingleStore.
  • How to optimize infrastructure to deliver faster performance at reduced cost.
  • How to create a true enterprise-level data fabric and associated model development and delivery pipeline.

The questions from the Q&A segment held at the end of the webinar are listed below and the slides from the webinar are attached.

 

Q&A

If you implement SAS Viya with SingleStore, are you limited to working with just SingleStore?

No, you are not. With SAS Viya with SingleStore, you're getting the strength of the embedded process in SingleStore and the ability to leverage SingleStoreDB across your solution set. If you are currently using SAS to talk with Snowflake or Synapse, you can still build your PROC SQL statements to execute SQL against those databases, or whatever other database you may currently be using. That whole data fabric that I showed on that slide is still available to you. What you're adding is the embedded processing capability with SingleStore, which delivers the direct interaction and the performance that we're talking about here today.

 

The other question I sometimes get is whether you are limited to only SingleStore tables. The answer to that is no. You don't have to use SingleStore. If you've got particular SAS datasets that work well and are fit for purpose, they will continue to work as they do today. You now have both options in VA: you can work directly with tables in the SingleStore database (CASLIB) or continue to work with SAS BDAT/HDAT files as you do today. The benefit here is that you can leverage the strengths of both. Don't forget, you do have a slight overhead from the SingleStore SQL database that you're talking to, so there are going to be use cases where CAS will be faster, and that's fine. That's the power of the platform: you can leverage whatever is the best fit for the particular use case that you're dealing with.

 

What makes Viya with SingleStore unique?

 

The embedded process that I'm talking about between Viya and SingleStore.

 

It is the only data architecture that we've built to be accessed directly from SAS Cloud Analytic Services (CAS). This ability to multithread work and stream data back and forth with the embedded process within the leaves of SingleStore means we can move significant work closer to the data. SingleStore has aggregators that coordinate the movement of data back and forth across the leaves. The embedded process lives in those leaves and allows us to push down analytics in real time, have them run on the leaf, and return the result sets.
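To make the push-down idea concrete, here is a minimal Python sketch. It is a toy simulation, not the SAS embedded process or any SingleStore API: the `Leaf` class, the function names, and the data are all hypothetical, and the "values moved" counts simply count numbers that would have to leave a leaf. It contrasts pulling every row to the client with asking each leaf for a small partial result that is merged centrally.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Leaf:
    """Hypothetical stand-in for one leaf node holding a partition of a table."""
    rows: List[float]

    def partial_stats(self) -> Tuple[int, float]:
        # Runs "on the leaf": only a row count and a subtotal leave the node.
        return len(self.rows), sum(self.rows)


def pull_then_compute(leaves: List[Leaf]) -> Tuple[float, int]:
    """Move every row off the leaves, then compute the mean centrally."""
    moved = [value for leaf in leaves for value in leaf.rows]
    return sum(moved) / len(moved), len(moved)


def push_down(leaves: List[Leaf]) -> Tuple[float, int]:
    """Ask each leaf for a partial result and merge the partials centrally."""
    partials = [leaf.partial_stats() for leaf in leaves]
    total_count = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    # Each partial is two numbers, so that is all that is moved per leaf.
    return total_sum / total_count, 2 * len(partials)


# Three leaves, each holding 1,000 rows of made-up data (0.0 through 2999.0).
leaves = [
    Leaf([float(i) for i in range(start, start + 1000)])
    for start in (0, 1000, 2000)
]

mean_pull, moved_pull = pull_then_compute(leaves)
mean_push, moved_push = push_down(leaves)
print("pull:", mean_pull, "values moved:", moved_pull)
print("push:", mean_push, "values moved:", moved_push)
```

Both paths produce the same mean, but the push-down path moves a handful of numbers instead of every row. Real push-down covers far richer analytics than a mean; the sketch only shows why running work on the leaves reduces data movement.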

 

We've got some other embedded processes that we run with other data architectures, but this is the only one where we have this tight coordination today, and there's more coming. We went live in August. We added some additional capability in our January release, and we have a monthly release cadence for SAS Viya with SingleStore. We're constantly updating and improving the performance and adding capabilities for pushing more analytical actions down to SingleStore.

 

This is going to continue to improve over the next 2-3 years as the environment gets faster. But it's the only database where we're doing this type of embedding today. One thing I'll add there too is that Bill's team has done a really good job of integrating some of the key features of SingleStore that we've discussed to promote both cost effectiveness and performance gains in the solutions. Specifically, the separation of compute and storage allows us to manage a much larger data environment in a much more efficient way. The compression rate allows us to minimize the overall size of that data as well as promote the usability of all data throughout the enterprise.

 

We talk a lot about how different solutions and different technologies connect into SingleStoreDB. That can be your open source tool sets, your BI tool sets, or applications sitting out at your consumer level that are direct to consumer data. It makes your analytics much more available, and that availability, together with the performance, is really the key differentiator for SAS with SingleStore. This really is something that's not in the industry today. The parallelization that we've done with the workloads allows us to promote really the most performant and cost-effective way to perform analytics.

 

Can SingleStore be used for law enforcement professionals? Such as use for surveillance purposes?

Yes, SAS with SingleStore can be used for any professional use case. When we need to support multiple different use cases, this is where SAS with SingleStore really shines: in these analytic and real-time analytic use cases. We've done projects where we've supported law enforcement around image processing and around quicker data processing.

 

 

To interact with SingleStore using CAS, can we use SAS statements (and others available in SAS) or are we limited to CAS statements?

 

In terms of the CAS actions and the CAS language, both are going to operate directly with SingleStore or with current HDAT datasets. As I alluded to before, SingleStore will also be available to SAS Studio PROC statements and PROC SQL statements. Those will go down to that same SingleStore database and run, but they won't leverage the embedded process. If you want to leverage the embedded process, it must be through the CAS actions that have been developed to work interactively with it. That starts to speak to the migration path into our modern analytical applications. We're going to support all historic uses of your data as well as promote how we start to migrate into the more performant CAS infrastructure along the way.

 

As we start to manage data across multiple systems through that migration process, we're creating that single store of data (a little bit on the nose there), which essentially allows you to create data once within SingleStore and then manage it across all your analytic solutions throughout the enterprise.

 

I think the other thing I'll mention is that we are looking at our current SAS Access capabilities with SingleStore and at how DATA steps are currently executed, to see how we might add capabilities for those DATA steps to run in the environment and leverage the CAS actions through some new developments. As I mentioned before, we're constantly looking at how to augment and enhance the performance of SAS Viya with SingleStore.

 

Are Snowflake and SingleStore direct competitors, or do you see scenarios where the two can co-exist?

I will answer this in two different ways. The first way is that if we look at SingleStore as a business, yes, there's a competitive nature to SingleStore and Snowflake. But today we're talking about SAS Viya with SingleStore and SAS Viya solutions. We are certainly promoting an augmentation of what Snowflake can do with what we can do within SingleStore. They will live together in most enterprises. We are not saying that you have to get rid of all of your Snowflake and only use SingleStore solutions. When we look at the analytic process and at how we manage data across the enterprise, we're promoting another tool in your tool belt: the most cost-effective and most performant way to perform all your SAS analytics and all your analytics across the enterprise. Your Snowflake environments will still sit there and still support the rest of your business. These solutions will live side by side, and the reason for leveraging the Viya with SingleStore solutions is everything that we've covered today.

 

It's also about promoting how we start to manage data much more efficiently and effectively in that analytics environment within Viya with SingleStore. Snowflake and SingleStore will live together at most of these enterprises, and you can certainly leverage both.

 

From a sales perspective, we're somewhat agnostic. We talk to all the different databases out there, whether it's Oracle, Snowflake, or Synapse, and we've got different types of embedded processes that we've done. We've got a whole team here at SAS that has developed embedded processes with Hadoop and with Teradata. Those are all still available and usable, but the tight interaction that we rolled out with SAS Viya with SingleStore only exists with SingleStore. You can leverage all those different platforms. Sometimes it depends on the use case and what you're trying to accomplish. SingleStore differentiates itself on that single table type across transactional and analytic workloads. When we look at any OLAP solution like Snowflake or any type of data warehouse, we start to see a hit on not only performance but also cost effectiveness, depending on how transactional the data is.

 

I did notice one thing though, Kyle, if I can jump in a little bit: the unlimited storage capability that you guys have. You guys were the first ones to have that. The unlimited storage, plus working with analytic and transactional workloads together in one single table type, is really a key differentiator there.

 

So how does one compare the two - Hadoop and SAS SingleStore?

I would assume the comparison is SAS plus Hadoop versus SAS plus SingleStore, and the key differentiator there is the embedded process and the push down that we're starting to promote. A Hadoop infrastructure is great for one thing: how do we throw the most data that we possibly can at a cheaper infrastructure? It's great for that. Getting the data out is the issue, and doing it efficiently in an analytic-ready interface is where we run into a lot of pain inside those Hadoop environments. When we look at the differentiation between Hadoop and SingleStore, specifically in a SAS environment, it comes down to the parallelization of workloads, the efficiency of query processing, the MPP nature of SingleStore, and the specific cost associated with how we start to deliver analytics. The storage itself might look cheap until you start to query the data. We can walk through all the cost and performance metrics to show you how the Hadoop infrastructure compares to the SingleStore infrastructure, especially when we start to put Viya on top of it as well as historical sets on top. We've actually got some internal use cases where customers are looking to move off of Hadoop infrastructure and are very interested in the SAS Viya with SingleStore solution. I can't share those yet because they're specific to those customers, but they show very significant cost savings from reducing their infrastructure footprint by leveraging SAS Viya with SingleStore. If we look at a Hadoop infrastructure all the way from the data ingest to the storage to the querying to the management of data silos needed to bring data out of Hadoop, that consolidation of technical debt into SingleStore is really what we're going to tie that to.

 

How do we persuade management to invest in SingleStore?

We've got a whole team here at SAS that is continuing to enhance SAS Viya with SingleStore and the capabilities of this new embedded process. We have a road map that's available for those of you who may be interested in learning more about where the technology is going and how it can impact your current business challenges.

 

The performance that we are seeing, particularly on very large data sets, is what makes me very excited about this new offer. In addition, we're constantly looking at how to push more of the CAS actions and analytics down to the data, thereby continuing to improve performance.

 

In terms of how you can promote this up to your management and make the investment in Viya with SingleStore, I think the key point is that we can prove it to you. Let's take a use case and show you, working with SingleStore and your specific SAS workloads, where there are specific metrics that we can enhance. When we can say we're going to do this for X percent of the cost and also show higher performance in those specific workloads, that's how we get management to buy in. We're not only going to create a cheaper infrastructure, but also support a more performant infrastructure.

 

Let's take those use cases to management and show how and why you would invest in that SingleStore platform. That all comes down to how we prove it out to you, and we're more than happy to do that. It makes a big impact. If your management is concerned about cost, particularly cloud data movement costs, and is trying to consolidate those costs, there's a very strong impact that we can have there. But the bigger impact is the time to analytical insight: the ability to enhance the capabilities of your data scientists, and even a lot of your business users, via the Viya platform. Viya presents a platform that really allows folks to get in and start working with data very quickly. It also allows you to reduce costs in the cloud and bring those solutions to light quickly to a much broader range of people within your organization, and that can really help the bottom line.

 

Other than unlimited space (which is a big one), what advantages can be expected compared to Snowflake?

Snowflake is a traditional OLAP infrastructure. It's a data warehouse; it's not built for transactional data. When we look at this idea of a single table type, and I'll keep coming back to this, this is a unique feature of SingleStore. We have a single table that's able to support not only the highest IO possible but also analytical query processing that is very performant compared to Snowflake. These are all performance metrics that we can start to promote. When we start to bring SAS into the picture, there's the ability to push down much more efficiently and minimize data movement. When an environment hooks up to, let's say, a Snowflake environment through a typical ODBC connection, that's a single funnel approach. We're moving more data through a single funnel, and we lose a lot of the parallel nature of how we work with data. The data engine itself is also more performant. We're looking at multiple levels: the separation of compute and storage, as well as the performance of the data and the querying itself in the SingleStore engine. As data is coming into a SingleStore database platform, you can build out your SAS analytics on top of it in real time. You need to know every minute what's changing out there in the marketplace, and that's going to drive an analytical model that drives the decisions you make. We now have the ability to fit cohesively with that environment, leveraging that fast turnaround time from a SingleStore perspective, but also leveraging the power of SAS to deliver the analytics on top of it, again in real time.

 

 

Recommended Resources

SAS Viya with SingleStore

SAS Viya with SingleStore Solution Brief

Blog: Consolidate Complex Data Workflows Into Fast, Impactful Business Insights

Please see additional resources in the attached slide deck.

 

Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow up Q&A, slides and recordings from other SAS Ask the Expert webinars.  

 

Version history
Last update:
‎04-11-2023 02:04 PM