As modern analytics platforms increasingly move toward cloud‑native data lakes, secure and seamless access to object storage has become critical. DuckDB, used here as a SAS access engine, lets analysts and engineers query data directly in cloud object storage with simplicity, performance, and security.
Built on top of the DuckDB query engine, this integration provides a standards‑based approach for accessing cloud data using native authentication, fine-grained access control, and high‑performance I/O.
This blog starts with DuckDB’s authentication and extension architecture, then explains how core DuckDB extensions enable secure access to AWS, Azure, and Google Cloud object storage, as well as S3-compatible object storage on premises.
DuckDB Extension Architecture (Core Concepts)
DuckDB’s functionality is expanded through loadable extensions, which can be installed and activated at runtime. The complete list of supported extensions is available in the DuckDB Extensions Overview.
From a storage‑access perspective, three extensions are foundational: httpfs, aws, and azure.
These extensions work together to provide a clean separation between authentication and data access.
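As a minimal sketch of runtime installation and loading, the statements below are plain DuckDB SQL. How they are submitted through the SAS engine is not covered here and is left as an assumption.

```sql
-- Install (a one-time download) and load the storage-related extensions.
INSTALL httpfs;
LOAD httpfs;
INSTALL aws;
LOAD aws;
INSTALL azure;
LOAD azure;

-- Check which of them are installed and loaded in the current session.
SELECT extension_name, installed, loaded
FROM duckdb_extensions()
WHERE extension_name IN ('httpfs', 'aws', 'azure');
```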
DuckDB Authentication Model
Secrets‑Based Authentication
DuckDB supports authentication using secrets, which are the recommended and secure way to manage credentials. Older variable‑based authentication mechanisms are deprecated.
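The following sketch shows what a secret can look like in DuckDB SQL. The secret names, bucket, region, and key values are placeholders chosen for illustration, not values from this integration.

```sql
-- Temporary secret: exists only for the current session.
CREATE SECRET my_s3_secret (
    TYPE s3,
    KEY_ID 'my_key_id',          -- placeholder
    SECRET 'my_secret_key',      -- placeholder
    REGION 'us-east-1',          -- placeholder
    SCOPE 's3://my-bucket'       -- applies only to paths under this prefix
);

-- Persistent secret: stored on disk and available in later sessions.
CREATE PERSISTENT SECRET my_persistent_s3_secret (
    TYPE s3,
    KEY_ID 'my_key_id',
    SECRET 'my_secret_key',
    REGION 'us-east-1'
);

-- List the secrets known to this session.
FROM duckdb_secrets();

-- Show which secret would be selected for a given path.
FROM which_secret('s3://my-bucket/path/file.parquet', 's3');

-- Remove a secret that is no longer needed.
DROP SECRET my_s3_secret;
```

Scoping a secret to a bucket prefix is one way to keep different credentials apart when several buckets are queried in the same session.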
Secrets allow credentials to be:
Responsibility Split Between Extensions
DuckDB follows a clear separation of concerns:
This separation ensures consistent behavior and portability across different environments and providers. The table below summarizes this split.
DuckDB as the Foundation
Why DuckDB?
DuckDB is an open‑source analytical database optimized for OLAP workloads. This access engine builds on DuckDB to leverage:
This makes DuckDB well suited for querying large datasets directly from object storage without data duplication.
Core Extensions for Cloud Object Storage
httpfs – Remote File I/O
The httpfs extension is the backbone of DuckDB’s object storage integration.
Key capabilities:
All cloud object storage reads and writes ultimately pass through httpfs.
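A short sketch of remote reads and writes through httpfs follows. The URL, bucket, and column names are hypothetical, and the S3 statements assume a suitable secret has already been created.

```sql
INSTALL httpfs;
LOAD httpfs;

-- Read a Parquet file over HTTPS (hypothetical URL).
SELECT count(*)
FROM read_parquet('https://example.com/data/events.parquet');

-- Read many objects with a glob; selecting few columns lets DuckDB
-- fetch only the parts of each Parquet file it needs.
SELECT region, sum(amount) AS total_amount   -- hypothetical columns
FROM read_parquet('s3://my-bucket/path/*.parquet')
GROUP BY region;

-- Write a query result back to object storage.
COPY (
    SELECT region, sum(amount) AS total_amount
    FROM read_parquet('s3://my-bucket/path/*.parquet')
    GROUP BY region
) TO 's3://my-bucket/out/summary.parquet' (FORMAT parquet);
```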
AWS S3 Authentication (aws Extension)
DuckDB integrates with Amazon S3 using the aws extension, built on top of httpfs and the AWS SDK.
Supported Authentication Methods:
The aws extension manages identity resolution and request signing, while httpfs carries out data I/O. This design integrates cleanly with AWS IAM and existing enterprise RBAC models.
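One way this division of labor can look in practice is a secret that delegates credential lookup to the AWS SDK's credential chain. This is a sketch: the secret name, region, and chain order are assumptions, and the right settings depend on how IAM is configured in a given environment.

```sql
INSTALL aws;
LOAD aws;

-- Let the AWS SDK resolve credentials using its default chain.
CREATE SECRET aws_chain (
    TYPE s3,
    PROVIDER credential_chain
);

-- Alternatively, restrict the chain to specific sources, tried in order:
-- environment variables first, then the local AWS config files.
CREATE OR REPLACE SECRET aws_chain (
    TYPE s3,
    PROVIDER credential_chain,
    CHAIN 'env;config',
    REGION 'us-east-1'           -- placeholder region
);

-- httpfs performs the read; the secret supplies the signed identity.
SELECT count(*) FROM 's3://my-bucket/path/*.parquet';
```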
Google Cloud Storage Authentication (httpfs)
DuckDB does not currently provide a dedicated GCP extension. Access to Google Cloud Storage (GCS) is enabled through the httpfs extension using standard Google authentication mechanisms.
Supported Authentication Methods
Authentication is handled externally by Google’s SDKs and libraries, while httpfs performs the actual data access.
This aligns with Google Cloud IAM, enabling fine‑grained bucket‑ and object‑level permissions.
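One route that DuckDB documents for GCS is HMAC keys used through the Google Cloud Storage interoperability API. The sketch below assumes an HMAC key has already been created for a service account. The key values and bucket name are placeholders, and whether this path or another is used in a given deployment is an assumption.

```sql
INSTALL httpfs;
LOAD httpfs;

-- GCS secret based on an HMAC key pair (placeholder values).
CREATE SECRET gcs_hmac (
    TYPE gcs,
    KEY_ID 'my_hmac_access_id',
    SECRET 'my_hmac_secret'
);

-- Query objects using the gs:// prefix.
SELECT *
FROM read_parquet('gs://my-gcs-bucket/path/file.parquet')
LIMIT 10;
```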
Azure Authentication (azure Extension)
The azure extension provides seamless connectivity to Azure Blob Storage and ADLS Gen2.
Installing and Loading the Extension
Supported Authentication Methods
OAuth Access Token Example
Azure’s hierarchical access model, spanning storage account, container, and file-level ACLs, provides precise control and compliance alignment for enterprise use cases.
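As an illustration of OAuth-based access, the sketch below uses a service principal, one of the providers supported by the azure extension. The tenant, client, account, and container names are placeholders, and the principal is assumed to already hold the required role assignments or ACLs.

```sql
INSTALL azure;
LOAD azure;

-- Service principal secret (all values are placeholders).
CREATE SECRET azure_sp (
    TYPE azure,
    PROVIDER service_principal,
    TENANT_ID 'my_tenant_id',
    CLIENT_ID 'my_client_id',
    CLIENT_SECRET 'my_client_secret',
    ACCOUNT_NAME 'mystorageaccount'
);

-- Query ADLS Gen2 through the abfss:// prefix.
SELECT count(*) FROM 'abfss://my-container/path/*.parquet';
```

Where credentials should not appear in SQL at all, a secret with PROVIDER credential_chain and an ACCOUNT_NAME can defer to the Azure SDK's own credential lookup.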
OAuth‑based authentication enables:
Access Control in ADLS Gen2
Access can be enforced at multiple layers as shown below:
This layered model supports enterprise governance and compliance requirements. The diagram below illustrates the separation of permissions at the storage account level.
The figure below gives an overview of access control separation at the container level.
On-Premises S3-Compatible Object Storage
In addition to public cloud providers, DuckDB can securely access on-premises and private-cloud S3-compatible object storage platforms such as MinIO, Red Hat Ceph (RGW), and other S3-API–compatible systems.
These platforms expose an S3-compatible API, allowing DuckDB to integrate using the same aws + httpfs extension model used for Amazon S3.
Commonly Supported Platforms
Authentication Model
Authentication is handled using access key and secret key credentials, stored securely using DuckDB secrets.
Typical authentication characteristics:
Example secret definition:
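A minimal sketch of such a secret, assuming a MinIO-style endpoint reachable over TLS. The host, port, keys, and region are placeholders to be replaced with the values of the actual storage platform.

```sql
CREATE SECRET onprem_s3 (
    TYPE s3,
    KEY_ID 'my_access_key',                  -- placeholder
    SECRET 'my_secret_key',                  -- placeholder
    ENDPOINT 'minio.example.internal:9000',  -- hypothetical host and port
    URL_STYLE 'path',                        -- path-style URLs, common on premises
    USE_SSL true,
    REGION 'us-east-1'                       -- placeholder region
);
```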
Once configured, DuckDB can process data directly:
SELECT * FROM 's3://my-bucket/path/*.parquet';
Extension Responsibilities
Enterprise and Hybrid Cloud Benefits
Supporting on-premises S3-compatible storage enables:
DuckDB’s reliance on open standards and S3 compatibility makes it a strong fit for enterprise environments that combine public cloud, private cloud, and on-premises infrastructure.
Summary
DuckDB combines the strengths of open data formats, cloud‑native IAM, and a modular extension architecture to deliver a modern, secure, and scalable approach to object storage access. By leveraging:
Organizations can:
As cloud‑native analytics continues to evolve, DuckDB provides a strong foundation for secure and high‑performance access to enterprise data lakes.