BookmarkSubscribeRSS Feed

SAS + DuckDB Series: Using the DuckDB Engine

Started ‎02-25-2026 by
Modified ‎02-25-2026 by
Views 1,199

Now that we have set the landscape with the first two posts describing what DuckDB is and how DuckDB runs inside SAS Viya, it’s time to go deeper and look at how to actually use it from SAS code.

 

As mentioned previously, the DuckDB engine can be used in multiple ways:

 

  • Fast in-memory analytics within a session
  • Persistent file-based DuckDB databases stored as .duckdb files
  • Direct querying of external data files such as CSV, Parquet, JSON, Delta Lake, and Iceberg

 

01_nir_post_109_01_duckdb_library-1024x524.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

Although one of DuckDB’s main strengths is its ability to transparently query open file formats and table formats (Parquet, Delta Lake, Iceberg, etc.) without formally importing them—and we’ll cover that in a future post—it’s important to first understand that DuckDB is a SQL processing engine with its own native storage format and modern data types.

 

Understanding DuckDB’s native format is important because, from a SAS perspective, DuckDB native tables and external files accessed through DuckDB both appear as regular SAS tables. Knowing the difference helps avoid confusion and ensures you understand whether data is stored inside DuckDB itself or being accessed directly from external open file formats.

 

Like Parquet, DuckDB uses a columnar format designed for analytics. In most cases, only the columns referenced in a query are read, which significantly improves performance and reduces I/O.

 

DuckDB can operate entirely in memory, or it can persist data to disk using its native .duckdb database file format.

 

02_nir_post_109_02_duckdb_storage.png

 

 

Using DuckDB in Memory from SAS

 

From a SAS perspective, using DuckDB in memory is extremely simple. You start by assigning a library using the DuckDB engine:

 

libname myduck duckdb ;

 

This creates an in-memory DuckDB database associated with your SAS Compute session.

 

You can then load data into DuckDB:

 

data myduck.yellow_tripdata ;
   set yt.yellow_tripdata_2024_11 ;
run ;

 

At this point, the table exists inside DuckDB, not as a SAS data set. However, from a SAS user perspective, it behaves just like any other SAS library table.

 

With implicit SQL pass-through, SAS automatically pushes eligible operations down to DuckDB. This minimizes data movement and allows DuckDB to perform the heavy processing using its optimized columnar engine.

 

For example, to identify the most popular routes:

 

proc sql ;
   create table myduck.trip_routes as
   select PULocationID,
          DOLocationID,
          count(*) as trip_count
   from myduck.yellow_tripdata
   group by PULocationID, DOLocationID ;
quit ;

 

For maximum efficiency, you want the entire query to execute inside DuckDB.

 

You can confirm this using SAS SQL tracing:

 

options sastrace=",,,d" sastraceloc=saslog nostsuffix ;

 

Review the SAS log after running your query.

 

03_nir_post_109_03_sastrace-1024x164.png

 

If you see the CREATE TABLE statement along with the SELECT portion of the query executed in DuckDB syntax, it means the entire query was executed inside DuckDB, with no data transferred between DuckDB and SAS.

 

These tracing options are extremely useful for understanding what SQL SAS generates and pushes to DuckDB—especially when working with external file formats, which we will cover later in this series.

 

 

Working with DuckDB Tables in SAS Viya

 

DuckDB tables appear just like any other tables in SAS Studio:

 

  • They are visible in the Libraries pane

 

04_nir_post_109_04_duckdb_library-1024x674.png

 

  • They can be explored interactively
  • They can be used in SAS Studio Flows and Steps

 

05_nir_post_109_05_duckdb_flow-1024x243.png

 

  • They can be queried using DATA step, PROC SQL, or other SAS procedures

 

This seamless integration makes DuckDB feel like a native part of the SAS environment.

 

However, when using the in-memory mode, the database exists only for the duration of the SAS Compute session (just like a table in the WORK library). If you clear all DuckDB in-memory libraries, or if your SAS session ends, all in-memory DuckDB tables are lost.

 

 

Persisting DuckDB Data on Disk

 

To persist data across sessions, you can store DuckDB data in a database file:

 

libname myduck duckdb database="/data/duckdb/myduck.duckdb" ;

 

Instead of storing tables in memory, DuckDB stores them in the specified .duckdb file.

 

This is useful when:

 

  • Working with datasets larger than available memory
  • Reusing prepared data across sessions
  • Sharing prepared data with colleagues
  • Building reusable analytics pipelines

 

If multiple users need access to the same database file, you may want to open it in read-only (DuckDB option) mode to prevent conflicts or locking issues:

 

libname myduck duckdb database="/data/duckdb/myduck.duckdb" read_only=yes ;

 

 

Summary

 

DuckDB is a powerful, lightweight analytical SQL engine with a high-performance columnar storage format that integrates seamlessly into SAS Viya.

 

It can operate fully in memory for maximum speed or persist data on disk for reuse and sharing. Through implicit pass-through, SAS allows DuckDB to execute queries directly, minimizing data movement and maximizing performance.

 

Beyond its native format, DuckDB also enables direct querying of modern open file and table formats such as Parquet, Delta Lake, and Iceberg—capabilities we will explore in the next post.

 

 

Learn More

 

These capabilities are covered in depth in the course Working with DuckDB in SAS Viya which explores DuckDB usage, performance, open file formats, and advanced integration scenarios.

 

Thanks for reading.

 

 

Find more articles from SAS Global Enablement and Learning here.

Comments
Tom

Is there actually an underscore in the READ_ONLY option you mentioned?

Why?

The LIBNAME statement has had a READONLY option (no underscore) for a gazillion years.

You’re referring to the ACCESS=READONLY option, not a READONLY option on the LIBNAME statement itself.

The READ_ONLY= option in the DuckDB LIBNAME engine is different. It maps directly to DuckDB’s native read-only database mode and ensures the .duckdb file is opened internally by DuckDB in read-only mode, avoiding write locks and enforcing read-only behavior at the database engine level.

This is the appropriate option when working with DuckDB databases from SAS.

Tom

But does it also support the normal ACCESS=READONLY option (obviously under the hood translating it to the syntax required by DuckDB)?  Or are we forced to remember to use a different option for librefs defined using the DUCKDB engine than we are used to using with the normal SAS engine?

One question though, does duck db operate in a clustered environment?

@Tom 

No, ACCESS=READONLY won't work as expected in that situation. We recommend using READ_ONLY=YES.

@vfarmak 

Before I can answer accurately to this question, would you mind elaborating a little bit?

DuckDB is designed as an embedded, single-node OLAP database. It runs "in-process" in a SAS Compute Server session.

You can also check this article: https://communities.sas.com/t5/SAS-Communities-Library/SAS-DuckDB-Series-How-DuckDB-Runs-Inside-SAS-...

 

Thank you @NicolasRobert I had the same impression about duck db when I read the documentation from their site. 

I was wondering whether we could add data from SAS to duck db to a report (power bi for instance that would be available for mutliple users).

Power BI has an experimental driver that supports only single instance connection. Don't know how the report engine serves the data to the report and multiple users can access the report. 

To be honest I don't how the knowledge how this is supported but thank you for the article!

@vfarmak 

Why this convoluted way (SAS Viya -> DuckDB -> BI Report)?
Why Not directly (SAS Viya -> BI Report)?

If your data already in SAS Viya (CAS) distributed across multiple nodes, one would think, it would be as efficient to query SAS directly via ODBC/JDBC/OLE-DB and feed your BI Report without going through DuckDB!  

 

As @NicolasRobert had mentioned, DuckDB is intentionally built to be a in-process/embedded single instance database, just like SqLite. If you are looking for distributed/clustered DuckDB infrastructure, then you need to look into MotherDuck.

MotherDuck is a Cloud Warehouse platform built on top of DuckDB.

 

Hope this helps

@ahmedalattar 

You are right. In my case I would like to ingest data from SAS 9.4M8 (we don't have Viya) to Power BI Report, since it is fast for OLAP (SAS had web reporting studio but in 9.4 marked that as obsolete).

The question is whether when the data are on the duck db side, still the users will be able to run the report. 

Hi @vfarmak 

Which deployment of SAS 9.4 M8 you currently have?

  • SAS Foundation (BASE SAS+)
    • You can use SAS ODBC/JDBC Driver(s) to provide the contents of *.sas7bdat (SAS Data Sets) to third party software
  • SAS BI/EBI --- While the BI Portal & BI Dashboard have been discontinued,
    • Web Report Studio, and Stored Process are still viable reporting options
    • SAS Providers for OLE DB (About the SAS Providers for OLE DB - 9.3) is another option depending on your licensed products
    • SAS R&D have created an alternative/replacement Web Application to the deprecated SAS Portal. Get in touch with your Customer Success rep or SAS Tech Support for more information and support.

Hope this helps     

Contributors
Version history
Last update:
‎02-25-2026 04:29 PM
Updated by:

Catch up on SAS Innovate 2026

Dive into keynotes, announcements and breakthroughs on demand.

Explore Now →

SAS AI and Machine Learning Courses

The rapid growth of AI technologies is driving an AI skills gap and demand for AI talent. Ready to grow your AI literacy? SAS offers free ways to get started for beginners, business leaders, and analytics professionals of all skill levels. Your future self will thank you.

Get started