This article assumes you are planning modernization alongside a migration from SAS 9.4 to SAS Viya. It examines modern options for data formats and cloud storage technologies, with the following considerations:
To address these challenges, we explore two modern data management technologies available in SAS Viya: Parquet file format and SAS SpeedyStore.
Article Structure:
Overview and comparison of Parquet and SAS SpeedyStore
Recommended steps for data modernization
Two technologies were chosen for consideration, reflecting customer interest and marketplace activity.
Apache Parquet is a free and open-source column-oriented file format offering efficient compression and encoding for improved performance. SAS Viya introduced Libname support for Parquet in 2021.2.6.
The importance of Parquet in SAS environments grew with the release of a SAS Viya Libname for DuckDB in 2025.07. DuckDB is an open-source, column-oriented OLAP database that runs in Viya, enabling it to better consume Parquet data and open table formats such as Iceberg and Delta Lake.
DuckDB can access this data from a file system or from low-cost cloud object storage.
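To give a feel for why a column-oriented layout tends to compress well, here is a minimal sketch in Python using only the standard library. It is not Parquet and does not use any SAS or DuckDB interface: the sample data, the `row_layout` and `column_layout` helpers, and the use of `zlib` are all assumptions made for illustration. It serializes the same records row by row and column by column, then compresses both.

```python
import zlib

# Hypothetical sample: 5,000 rows of (id, region, amount). Not real SAS data.
REGIONS = ["EMEA", "AMER", "APAC", "LATAM"]
rows = [(i, REGIONS[i % 4], 100.0 + (i % 7)) for i in range(5000)]


def row_layout(records):
    """Serialize record by record, as a row-oriented file would."""
    return "\n".join(",".join(str(v) for v in rec) for rec in records).encode()


def column_layout(records):
    """Serialize column by column, so similar values sit next to each other."""
    columns = zip(*records)  # transpose rows into columns
    return "\n".join(",".join(str(v) for v in col) for col in columns).encode()


row_bytes = zlib.compress(row_layout(rows))
col_bytes = zlib.compress(column_layout(rows))

print(f"row-oriented, compressed:    {len(row_bytes):>7} bytes")
print(f"column-oriented, compressed: {len(col_bytes):>7} bytes")
```

On this toy data the column-oriented bytes compress to a smaller size because repeated values of one column sit next to each other. Real Parquet files add column-specific encodings on top of this idea, so treat the sketch as an intuition aid rather than a benchmark.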
SAS SpeedyStore is a modern cloud-native SAS storage solution built on SingleStore technology. Its universal database design offers row, column and vector tables, with advanced compression, elastic scalability and virtually unlimited capacity using cloud object storage.
SpeedyStore unifies transactional and analytical workloads in a secure, high-performance environment. Pre-integrated with SAS Viya AI and analytics (including SAS Embedded Process), it enables real-time insights, cost efficiency and accelerated decision-making. SpeedyStore enhances SAS Visual Analytics workloads by offloading SQL processing and is the natural successor to SAS Scalable Performance Data (SPD) Server [SPD/S].
Here's a brief comparison of these technologies:
| Technology | Advantages | Considerations |
| Parquet file format | Open source and open format; improved compression and performance | Likely requires code changes to existing SAS programs; emerging adoption with limited enterprise-wide deployment |
| SAS SpeedyStore | Improved compression; SQL interface is a de facto open standard; high-performance RDBMS with close integration to SAS using the SAS Embedded Process and acceleration of SAS Visual Analytics | Likely requires code changes to existing SAS programs; licensed SAS product |
| .sas7bdat file format | No code changes | Proprietary, effectively unavailable to non-SAS tooling |
When developing a modernization strategy, it’s important to recognize that adopting a single data format or technology for all use cases is unlikely. Modernization efforts may not encompass the entire data landscape, and some legacy formats will likely remain post-transition (see adjacent diagram).
While most data is currently stored in SAS datasets, future environments will likely incorporate a mix of .sas7bdat, Parquet, and SAS SpeedyStore formats.
Recommended Steps for Modernization
Define objectives for data modernization.
Assess your current SAS data landscape.
Based on this assessment, determine strategies for managing existing and new data.
Modernization is driven by several key factors:
If you are modernizing alongside a transition from SAS 9.4 to SAS Viya, consider:
Before making decisions, assess your current SAS environment by reviewing:
Tools to help gather this information include:
The table below presents key methods for gathering insights using these two products.
| Insight | Tools |
| Hot/cold data | CA Inventory report or DataMart. See example below: Note that many IT environments disable timestamp maintenance to minimize file system overhead, making timestamps unreliable for hot/cold analysis. ESM event monitoring can detect active datasets even when filesystem or OS level timestamps are disabled. This article explains how to capture data and dataset access details, including all actions such as read and write. |
| Total datasets & size | Available from the CA Inventory report or DataMart. ESM captures detailed data access events, which should be integrated with output from the sas-data-mon utility to include the entire filesystem, including dormant non-SAS files. Overlaying ESM activity data on this comprehensive view enables precise hot-to-cold analysis. |
| SAS programs accessing individual datasets | ESM captures details of user sessions and associated events. Data-related events can be retrieved at the session level. For batch sessions, the .sas program name is recorded, enabling direct tracking of usage. For interactive sessions, identifying the code name requires additional steps: matching process IDs (PIDs) to log files and extracting the program or Enterprise Guide project name from the log. This process can be automated using code; an example is provided in the SAS Support Communities blog post "Accelerating SAS9 to Viya Migration with Log Intelligence". Linking usage back to specific code or Enterprise Guide projects falls outside the scope of Eco-Diagnostics. |
| Use of specific .sas7bdat capabilities | CA Profile Content DataMart: all_datasets. See excerpt below: |
At a minimum you need a clear understanding of the active-to-inactive data ratio, the total size of your data estate, and which SAS programs interact with each category.
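The active-to-inactive ratio and total size can be worked out from any inventory that lists each dataset with its size and last access date. The sketch below is a minimal Python illustration under stated assumptions: the `DatasetRecord` type, the sample paths and sizes, and the 365-day hot/cold threshold are all hypothetical, and in practice the last access date would come from monitored access events rather than file system timestamps, for the reason noted in the table above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DatasetRecord:
    path: str            # hypothetical dataset location
    size_bytes: int
    last_access: date    # ideally from monitored access events, not file timestamps


def summarize(records, as_of, hot_days=365):
    """Split an inventory into hot and cold data and total the sizes."""
    hot = [r for r in records if (as_of - r.last_access).days <= hot_days]
    total = sum(r.size_bytes for r in records)
    hot_size = sum(r.size_bytes for r in hot)
    return {
        "total_bytes": total,
        "hot_count": len(hot),
        "cold_count": len(records) - len(hot),
        "hot_bytes": hot_size,
        "cold_bytes": total - hot_size,
        "hot_share_of_bytes": hot_size / total if total else 0.0,
    }


# Hypothetical inventory; the threshold of 365 days is an assumption to tune.
inventory = [
    DatasetRecord("/data/sales/orders.sas7bdat", 4_000_000_000, date(2025, 6, 1)),
    DatasetRecord("/data/sales/orders_2015.sas7bdat", 3_000_000_000, date(2019, 3, 14)),
    DatasetRecord("/data/risk/scores.sas7bdat", 1_000_000_000, date(2025, 5, 20)),
    DatasetRecord("/data/tmp/extract_old.sas7bdat", 2_000_000_000, date(2021, 11, 2)),
]

summary = summarize(inventory, as_of=date(2025, 7, 1))
print(summary)
```

Measuring the hot share by bytes as well as by dataset count matters because a few large dormant datasets can dominate storage cost even when most datasets are active.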
With a clearer understanding of your current data landscape, you are better positioned to make informed modernization decisions. At a high level, two key considerations emerge:
1. Modernize Existing Data
2. Manage New Data
Converting data sets to new formats can require significant refactoring. For each candidate data set, identify all .sas programs that reference it to assess the scope of changes. Key considerations include:
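One way to scope that refactoring is a plain text scan of program sources for `libref.member` references. The following Python sketch assumes the program text is already loaded into a dictionary; the file names, the sample code, and the `programs_referencing` helper are hypothetical. A simple pattern match like this will miss names built by macro variables or assigned dynamically, so it gives a first estimate only.

```python
import re

# Hypothetical program sources keyed by file name; in practice, read from disk.
programs = {
    "load_orders.sas": "data work.stage; set SALES.Orders; run;",
    "report.sas": "proc sql; create table out as select * "
                  "from sales.orders a, risk.scores b; quit;",
    "cleanup.sas": "proc datasets lib=work; delete stage; run;",
}


def programs_referencing(dataset, sources):
    """Return sorted program names whose text mentions libref.member.

    Matching is case-insensitive and requires that the name is not part of
    a longer identifier.
    """
    libref, member = dataset.split(".")
    pattern = re.compile(
        rf"(?<![\w.]){re.escape(libref)}\.{re.escape(member)}(?![\w.])",
        re.IGNORECASE,
    )
    return sorted(name for name, text in sources.items() if pattern.search(text))


print(programs_referencing("sales.orders", programs))
print(programs_referencing("risk.scores", programs))
```

The count of matching programs per dataset gives a rough measure of how much code a format change would touch, which can then be cross-checked against monitored session data.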
When implementing a policy for adopting new technologies, address the following:
While new data formats and storage technologies are emerging, the .sas7bdat file remains the default for SAS and is still widely used. Despite being over 30 years old, it continues to provide flexibility and robust functionality.
To reduce storage costs for .sas7bdat datasets, consider the following strategies:
Modernizing your data environment can significantly transform your SAS landscape, especially when paired with a migration to SAS Viya, typically deployed in the cloud.
Over the past decade, data ecosystems have evolved rapidly—Hadoop once dominated, but has since given way to data lakes, new file formats, and open table technologies. SQL remains a critical standard, extending beyond traditional RDBMS to certain file-based systems. Change is inevitable; the challenge lies in adopting technologies and strategies that endure while avoiding short-lived trends.
With over 50 years of expertise, SAS recognizes the importance of consistent, accurate data. I hope this article has provided valuable insight into data modernization. For further guidance, connect with your SAS team for expert advice and practical experience.
Thanks to the following colleagues for their contributions and feedback on this article: James Ochiai-Brown, George Beevers, Neil Griffin and Mike Johansson.