With the growing adoption of cloud-native analytics, organizations are looking for solutions that combine performance, simplicity, and security. SAS recently released SAS Access to DuckDB, an access engine that connects to DuckDB, a high-performance, in-process analytics engine. DuckDB supports open file formats like Parquet, CSV, and JSON and can natively connect to object stores such as Amazon S3, Google Cloud Storage (GCS), and Azure Data Lake Storage (ADLS).
This makes it possible to run ad-hoc analytics on cloud-native datasets without standing up heavy database infrastructure. However, when deploying in production, especially on Amazon EKS (Elastic Kubernetes Service), security and governance become crucial.
In this post, we’ll walk through how to integrate SAS Viya on EKS with Amazon S3 using IAM roles for service accounts (IRSA), ensuring secure, fine-grained access control without hardcoding AWS credentials.
Why EKS + DuckDB + S3?
This combination enables secure, cloud-native analytics workflows in which data remains in S3 while compute scales dynamically in EKS. The integration is configured in two stages: setting up IAM integration between EKS and S3, and incorporating the necessary secrets into SAS code. Let’s walk through the steps in detail.
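In this stage, we configure the EKS cluster to securely access S3 using IAM roles for service accounts (IRSA). This eliminates the need for hard-coded credentials and ensures that access is scoped according to least-privilege principles. We will cover:
Enabling the OIDC provider for the EKS cluster
Creating an IAM policy for S3 access
Creating an IAM role for the SAS service account and attaching the policy to it
Annotating the Kubernetes service account with the IAM role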
These steps allow your DuckDB workloads running in EKS to seamlessly and securely read/write data in S3.
Amazon EKS integrates with IAM Roles for Service Accounts (IRSA) by relying on an OpenID Connect (OIDC) identity provider. The OIDC provider enables Kubernetes service accounts in your cluster to be linked with IAM roles. This is a critical step because it allows workloads (pods) to securely obtain temporary AWS credentials, instead of relying on long-lived static access keys.
Enabling the OIDC provider creates a trust relationship between the Kubernetes service accounts in your cluster and AWS IAM, so that pods can request tokens and assume IAM roles as needed.
If your EKS cluster doesn’t already have an OIDC provider configured, you can create and associate one with the following command:
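A minimal sketch of that command, wrapped in a small shell function so the cluster name is passed explicitly. The function name and the example cluster name are illustrative and not from the original post; the `eksctl utils associate-iam-oidc-provider` subcommand and its `--cluster` and `--approve` flags are standard eksctl.

```shell
# Associate an IAM OIDC provider with an EKS cluster.
# Usage: associate_oidc_provider <cluster-name>
# Example (placeholder name): associate_oidc_provider my-eks-cluster
# eksctl takes the region from your AWS configuration unless --region is added.
associate_oidc_provider() {
  eksctl utils associate-iam-oidc-provider \
    --cluster "${1:?cluster name required}" \
    --approve
}
```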
This command:
Queries your EKS cluster for its OIDC issuer URL.
Creates an IAM OIDC identity provider in your AWS account (if it does not exist).
Associates that OIDC provider with your cluster.
Uses the --approve flag to skip interactive approval and automatically confirm the association.
Once this step is completed, you can safely use IRSA to map Kubernetes service accounts to IAM roles, allowing fine-grained access control for different workloads in the cluster.
To confirm that the OIDC provider has been successfully associated with your cluster, run the following command:
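One way to check, sketched with the AWS CLI (the function names are illustrative). The first function prints the cluster's OIDC issuer URL; the second lists the IAM OIDC providers in the account. The ID at the end of the issuer URL should appear in one of the listed provider ARNs.

```shell
# Print the OIDC issuer URL of the cluster.
# Usage: show_oidc_issuer <cluster-name>
show_oidc_issuer() {
  aws eks describe-cluster \
    --name "${1:?cluster name required}" \
    --query "cluster.identity.oidc.issuer" \
    --output text
}

# List the IAM OIDC identity providers registered in the account.
list_oidc_providers() {
  aws iam list-open-id-connect-providers
}
```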
Create an IAM Policy for S3 Access
The next step in enabling secure access between your EKS workloads and Amazon S3 is to define a dedicated IAM policy. This policy ensures that your Kubernetes pods, via their service accounts, get only the minimum permissions required to interact with S3.
Why this step is important
In many analytics use cases, your workloads need to both read and write data in Amazon S3. For example, you may read existing datasets, write query results, or update files as part of your pipeline. Below is a sample policy for a bucket named my-duckdb-data-bucket that grants read, write, list, and update permissions.
Create a JSON file (s3-duckdb-policy.json) with the following content:
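The original policy body is not reproduced here, so the following is a reconstruction under assumptions. It grants list access on the bucket and read/write access on its objects, matching the permissions described above. The `Sid` values and the exact action list are assumptions; overwriting an existing object is covered by `s3:PutObject`.

```shell
# Write a sample policy file; the quoted heredoc keeps the JSON literal.
cat > s3-duckdb-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::my-duckdb-data-bucket"
    },
    {
      "Sid": "ReadWriteObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::my-duckdb-data-bucket/*"
    }
  ]
}
EOF
```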
Note: This IAM policy is only a sample. It should be customized based on your security model and compliance requirements. For example, you may restrict the resource ARNs to specific prefixes, or remove the write actions for read-only workloads.
Once you have defined your policy JSON file (s3-duckdb-policy.json), you can create the policy in your AWS account either using the AWS CLI or through the AWS Management Console.
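With the AWS CLI, the call might look like the sketch below. The policy name S3DuckDBAccess is the one used later in this post; the function name is illustrative.

```shell
# Create the managed policy from the JSON file.
# Usage: create_s3_policy [policy-file]   (defaults to s3-duckdb-policy.json)
# The output includes the policy ARN, which is needed when attaching it to a role.
create_s3_policy() {
  aws iam create-policy \
    --policy-name S3DuckDBAccess \
    --policy-document "file://${1:-s3-duckdb-policy.json}"
}
```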
SAS Viya uses the existing service account sas-programming-environment to start compute pods. To let these pods access S3 securely, we use IAM Roles for Service Accounts (IRSA), which allows Kubernetes pods to assume an AWS IAM role automatically through the EKS cluster’s OIDC provider. The resulting access is temporary and auditable, and no AWS credentials are embedded in the pods. The process consists of three main steps: creating the IAM role, attaching the S3 access policy to it, and annotating the service account.
Create an IAM role that can be assumed by the sas-programming-environment service account. This role’s trust policy specifies the OIDC provider of your EKS cluster and the namespace/service account combination. Replace placeholders with your values:
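A trust policy of this kind typically looks like the sketch below. The file name `trust-policy.json` and all default values (account ID, OIDC provider path, namespace) are placeholders chosen for illustration, not values from the original post. The OIDC provider path is the cluster's issuer URL without the `https://` prefix.

```shell
# Placeholder values; override them in the environment before running.
ACCOUNT_ID="${ACCOUNT_ID:-111122223333}"
OIDC_PROVIDER="${OIDC_PROVIDER:-oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE0123456789}"
NAMESPACE="${NAMESPACE:-sas-viya}"

# Unquoted heredoc so the shell substitutes the variables into the JSON.
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${NAMESPACE}:sas-programming-environment",
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF
```

The `sub` condition is what restricts the role to the one namespace and service account; without it, any service account in the cluster could assume the role.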
This can be done either using the AWS CLI or through the AWS Management Console:
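With the AWS CLI, creating the role might look like this sketch. The role name SASDuckDBAccessRole is the one used in this post; the function name and the default file name `trust-policy.json` are assumptions.

```shell
# Create the IAM role with the trust policy as its assume-role document.
# Usage: create_irsa_role [trust-policy-file]   (defaults to trust-policy.json)
create_irsa_role() {
  aws iam create-role \
    --role-name SASDuckDBAccessRole \
    --assume-role-policy-document "file://${1:-trust-policy.json}"
}
```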
Notes:
Attach the S3 access policy you created earlier (S3DuckDBAccess) to the IAM role.
This step ensures that the SAS pods running in your EKS cluster can use the role to interact with Amazon S3. The policy controls which specific operations (such as reading objects, writing new ones, listing buckets, or updating content) are permitted.
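A sketch of the attachment with the AWS CLI. The function name is illustrative, and the policy ARN is built from an account ID you supply, assuming the policy was created in that account without a custom path.

```shell
# Attach the S3DuckDBAccess policy to the SASDuckDBAccessRole role.
# Usage: attach_s3_policy <aws-account-id>
attach_s3_policy() {
  aws iam attach-role-policy \
    --role-name SASDuckDBAccessRole \
    --policy-arn "arn:aws:iam::${1:?AWS account ID required}:policy/S3DuckDBAccess"
}
```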
At this point, the SASDuckDBAccessRole is associated with the S3DuckDBAccess policy. This means the role now has the necessary S3 permissions defined in the policy. However, the permissions are not yet available to pods in EKS until the role is linked to a Kubernetes service account using IRSA (through service account annotation).
The final step in enabling IAM Roles for Service Accounts (IRSA) is to annotate your Kubernetes service account with the IAM role ARN. This annotation creates the link between the service account that your SAS pods use and the IAM role you created earlier. Once this mapping is in place, any pod running under that service account will automatically inherit the S3 permissions defined in the S3DuckDBAccess policy, without the need for static AWS credentials.
Use the following command:
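A sketch of the annotation and a follow-up check with kubectl. The function names are illustrative; the namespace and role ARN are supplied as arguments. `eks.amazonaws.com/role-arn` is the annotation key that EKS reads for IRSA.

```shell
# Annotate the service account with the IAM role ARN.
# Usage: annotate_service_account <namespace> <role-arn>
annotate_service_account() {
  kubectl annotate serviceaccount sas-programming-environment \
    --namespace "${1:?namespace required}" \
    "eks.amazonaws.com/role-arn=${2:?role ARN required}" \
    --overwrite
}

# Print the service account definition to verify the annotation.
# Usage: show_service_account <namespace>
show_service_account() {
  kubectl get serviceaccount sas-programming-environment \
    --namespace "${1:?namespace required}" \
    --output yaml
}
```

Only pods started after the annotation is applied receive the injected role settings, so compute sessions that are already running need to be restarted.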
You should see the IAM role ARN under the annotations section.
SAS compute pods using the sas-programming-environment service account can now securely access S3 via IAM, with no AWS access keys or secrets stored in code. This enables safe and auditable integration with DuckDB or other S3-based workloads.
Once the IAM Role for Service Account (IRSA) is configured, SAS compute pods can seamlessly authenticate to Amazon S3. The following example shows how to connect to Amazon S3 from SAS Studio and query Parquet data directly. Let’s break down the code step by step.
In this step, we enable DuckDB to interact with cloud storage by installing the necessary extensions.
httpfs: Adds support for accessing remote files over HTTP and Amazon S3.
aws: Provides native integration with AWS services for authentication and secure access.
We also configure the extension_directory, which defines where DuckDB stores the downloaded extension binaries inside the pod. This ensures the extensions are kept in a consistent location and don’t need to be re-installed each time.
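The original SAS code is not reproduced here. As a rough sketch, the DuckDB statements behind this step would look like the ones emitted by the shell function below; how they are passed to DuckDB from SAS is not shown. The function name and the default directory are assumptions.

```shell
# Emit the DuckDB statements that set the extension directory
# and install and load the two extensions.
# Usage: duckdb_extension_sql [extension-directory]
duckdb_extension_sql() {
  cat <<SQL
SET extension_directory = '${1:-/tmp/duckdb_extensions}';
INSTALL httpfs;
INSTALL aws;
LOAD httpfs;
LOAD aws;
SQL
}
```

For a quick check outside SAS, the output can be piped into the DuckDB command-line client, for example `duckdb_extension_sql | duckdb`.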
To securely connect DuckDB to Amazon S3, we define a secret. This secret tells DuckDB how to authenticate when accessing S3 resources.
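A sketch of such a secret in DuckDB SQL, emitted by a shell function. It assumes the `credential_chain` provider from the aws extension, which lets the AWS SDK discover credentials on its own, including the web identity token that IRSA injects into the pod. The secret name `s3_irsa`, the function name, and the default region are placeholders.

```shell
# Emit a CREATE SECRET statement that relies on the AWS credential chain.
# Usage: duckdb_secret_sql [aws-region]
duckdb_secret_sql() {
  cat <<SQL
CREATE OR REPLACE SECRET s3_irsa (
    TYPE s3,
    PROVIDER credential_chain,
    REGION '${1:-us-east-1}'
);
SQL
}
```

A secret created this way is temporary and lasts only for the DuckDB session, so it has to be created in the same session that runs the queries.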
Finally, we query a Parquet file directly from S3. Because DuckDB uses the IRSA-based IAM credentials of the SAS compute pod, it can fetch the file securely from S3 without any manual key management.
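A sketch of such a query in DuckDB SQL, again emitted by a shell function. The function name and the object path in the usage comment are hypothetical; `read_parquet` accepts an `s3://` URI once the httpfs extension is loaded.

```shell
# Emit a query that reads the first rows of a Parquet file in S3.
# Usage: duckdb_query_sql <s3-uri>
# Example (hypothetical path):
#   duckdb_query_sql s3://my-duckdb-data-bucket/path/to/data.parquet
duckdb_query_sql() {
  cat <<SQL
SELECT *
FROM read_parquet('${1:?S3 URI required}')
LIMIT 10;
SQL
}
```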
Conclusion
By integrating SAS Viya, DuckDB, and AWS S3 through IAM Roles for Service Accounts (IRSA), we’ve created a secure, scalable, and efficient way to access cloud data directly from SAS Studio. No hardcoded credentials are required; authentication is handled by AWS IAM.
DuckDB extensions provide fast access to structured data formats such as Parquet and CSV stored in S3. SAS Studio users can now seamlessly query and analyze cloud-hosted datasets with familiar SQL and SAS procedures. This approach not only strengthens security but also simplifies operations, enabling organizations to unlock the full potential of cloud-native analytics in SAS Viya.