
Accessing Airflow Logs in SAS Viya Kubernetes Deployments


Apache Airflow is a powerful orchestration tool that SAS Viya users can leverage to automate, schedule, and monitor SAS (and other) jobs. When Airflow runs in Kubernetes, accessing task execution logs from outside the Airflow UI can be useful for effective monitoring, but finding them can be confusing. This post provides some guidance on options for accessing Airflow logs externally.

 

 

Where do Airflow logs live?

 

This depends on your individual setup. In my lab environment, Airflow is deployed to its own namespace in the same Kubernetes cluster that hosts SAS Viya. Several components are deployed there:

root@server:~# kubectl get po -n airflow

 

NAME                                     READY   STATUS    RESTARTS       AGE
airflow-api-server-84c4446994-sp8kg      1/1     Running   45 (26h ago)   111d
airflow-dag-processor-6b55f97b96-6w64q   2/2     Running   25 (26h ago)   111d
airflow-postgresql-0                     1/1     Running   21 (26h ago)   111d
airflow-redis-0                          1/1     Running   4 (26h ago)    111d
airflow-scheduler-7d6c44b4d7-9f2mn       2/2     Running   42 (26h ago)   111d
airflow-statsd-78b94c6899-r6xlt          1/1     Running   4 (26h ago)    111d
airflow-triggerer-0                      2/2     Running   24 (26h ago)   111d
airflow-worker-0                         2/2     Running   14 (26h ago)   111d

 

In this setup, Airflow tasks are executed inside existing worker pods (airflow-worker-0), which are long-running pods in the airflow namespace. Each worker runs multiple tasks inside the same pod/container; Kubernetes does not create a new pod per task in my setup.

 

When flows are triggered, Airflow writes task logs inside the worker pods, typically under /opt/airflow/logs; these are the logs surfaced in the Airflow UI.
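A quick, non-interactive way to confirm this from outside the pod is a minimal sketch like the following; the container name worker matches the official Airflow Helm chart and may differ in your deployment:

kubectl exec -n airflow airflow-worker-0 -c worker -- ls /opt/airflow/logs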

 

01_AF_airflow-ui-logs.png


 

The first line shows the path of the generated log file inside the pod. To persist logs beyond the ephemeral pod lifecycle, this directory is mounted on a Kubernetes PersistentVolumeClaim (PVC), often backed by NFS storage that is set up manually as part of the Airflow deployment process. Accessing these logs externally means connecting to the exact NFS export path backing the PVC, as the logs are not typically stored on your local Kubernetes nodes. Inside a pod, you can run mount | grep logs to discover the actual NFS path backing the PVC. For example, the external storage location could look like:

nfs-server.gelcorp.com:/srv/nfs/kubedata/airflow-airflow-logs-pvc-xxxxxx
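If you are unsure which PVC and export are involved, the Kubernetes objects themselves tell you. A hedged sketch, assuming the logs PVC is named airflow-logs (the name in your deployment may differ):

# list the PVCs in the airflow namespace
kubectl get pvc -n airflow

# show the NFS server and export path of the PersistentVolume bound to the logs PVC
kubectl describe pv $(kubectl get pvc airflow-logs -n airflow -o jsonpath='{.spec.volumeName}')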

 

Mounting this exact path on an external server or your workstation via NFS client tools enables you to browse and tail Airflow logs without pod access. In my lab:

 

02_AF_airflow-nfs-logs.png
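The mount itself is a standard NFS client operation. A minimal sketch, reusing the example export above with a hypothetical mount point (you need the NFS client utilities installed and network access to the NFS server):

sudo mkdir -p /mnt/airflow-logs
sudo mount -t nfs nfs-server.gelcorp.com:/srv/nfs/kubedata/airflow-airflow-logs-pvc-xxxxxx /mnt/airflow-logs
ls /mnt/airflow-logs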

 

 

How Logs Are Structured for Easy Identification

 

Airflow organises logs hierarchically using key=value directories such as:

->dag_id=my_first_dag

---->run_id=manual__2025-11-05T10:31:59.912269+00:00

-------->task_id=bash

 

This naming scheme uniquely identifies logs belonging to specific DAG runs and tasks, even when multiple flows operate simultaneously. Shell utilities may display these directory names enclosed in quotes due to special characters (like =), but the names are standard and intentional.

 

With this approach, users and admins can view logs for each run directly from the command line, providing robust control and insight into Airflow-managed workflows. Log messages are stored in standard JSON for flexibility.
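For example, assuming the export is mounted at the hypothetical /mnt/airflow-logs used above, you could follow the task from the earlier screenshot directly:

# tail the first attempt of the bash task from the example run
tail -f '/mnt/airflow-logs/dag_id=my_first_dag/run_id=manual__2025-11-05T10:31:59.912269+00:00/task_id=bash/attempt=1.log'

# list task logs written in the last hour across all DAGs (GNU find)
find /mnt/airflow-logs -name 'attempt=*.log' -newermt '1 hour ago'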

 

 

Alternative Log Access Methods

 

The Airflow Web UI remains the primary and user-friendly method to view DAG and task logs in real-time. It aggregates logs and contextualises them with run metadata, filtering, and search.

 

You can also view pod logs with kubectl logs -n airflow airflow-worker-0 to see logs streamed from Airflow worker containers, which include some task execution output. This method can be helpful for quick debugging, but:

 

  • the messages are not the ones displayed in the Airflow UI and are less granular than the task log files
  • it requires appropriate RBAC permissions to be granted
  • it can be trickier to match messages with the tasks that generated them

 

2025-11-05 10:32:00.710584 [info     ] Task execute_workload[0d8c4993-55b7-4aae-a0fa-81af04d5cc63] received [celery.worker.strategy]
2025-11-05 10:32:00.720534 [info     ] [0d8c4993-55b7-4aae-a0fa-81af04d5cc63] Executing workload in Celery: token='eyJ***' ti=TaskInstance(id=UUID('019a5392-c8b7-7b00-9138-482da0a9f290'), task_id='bash', dag_id='my_first_dag', run_id='manual__2025-11-05T10:31:59.912269+00:00', try_number=1, map_index=-1, pool_slots=1, queue='default', priority_weight=2, executor_config=None, parent_context_carrier={}, context_carrier={}, queued_dttm=None) dag_rel_path=PurePosixPath('first_dag.py') bundle_info=BundleInfo(name='dags-folder', version=None) log_path='dag_id=my_first_dag/run_id=manual__2025-11-05T10:31:59.912269+00:00/task_id=bash/attempt=1.log' type='ExecuteTask' [airflow.providers.celery.executors.celery_executor_utils]
2025-11-05 10:32:00.771932 [info     ] Secrets backends loaded for worker [supervisor] backend_classes=['EnvironmentVariablesBackend'] count=1
2025-11-05 10:32:02.464392 [info     ] Task finished                  [supervisor] duration=1.7101506830003927 exit_code=0 final_state=success
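If you know which DAG or run you are interested in, filtering this output helps with the last bullet above. A hedged sketch; the container name worker follows the official Airflow Helm chart and may differ in your setup:

# only the worker container, last hour, lines mentioning the DAG of interest
kubectl logs -n airflow airflow-worker-0 -c worker --since=1h | grep my_first_dag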

 

Although more tedious, you can also exec into the worker pods with kubectl exec -it -n airflow airflow-worker-0 -- bash to directly access and browse the log directories inside the pods. This method also requires appropriate access to be granted by your Kubernetes administrator.
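You can also run individual commands through kubectl exec without opening an interactive shell, which is handy for scripted checks (again assuming the worker container name used by the official Helm chart):

kubectl exec -n airflow airflow-worker-0 -c worker -- ls '/opt/airflow/logs/dag_id=my_first_dag'

kubectl exec -n airflow airflow-worker-0 -c worker -- tail -n 20 '/opt/airflow/logs/dag_id=my_first_dag/run_id=manual__2025-11-05T10:31:59.912269+00:00/task_id=bash/attempt=1.log'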

 

 

Integration with observability tools

 

Another option is to collect log messages with Fluent Bit, Logstash, or similar tools and forward them to a visualisation platform. With additional configuration, Airflow's task and component logs can be directed through these logging pipelines into tools like OpenSearch Dashboards, providing centralised log search, alerting, and dashboarding.

 

Depending on your existing logging infrastructure, it may be possible to configure your EFK/ELK stack to pick up Airflow logs by including the relevant namespaces and pod logs in the collection inputs. This approach enables integration of Airflow logs alongside other Kubernetes application logs, improving observability and monitoring.
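As an illustration only (this is not the SAS Viya Monitoring for Kubernetes configuration), a Fluent Bit input and filter along these lines is one way to pick up container logs from the airflow namespace on a containerd-based cluster; paths, parser, and outputs would need adjusting to your own stack:

[INPUT]
    Name    tail
    Path    /var/log/containers/*_airflow_*.log
    Tag     airflow.*
    Parser  cri

[FILTER]
    Name                kubernetes
    Match               airflow.*
    Kube_Tag_Prefix     airflow.var.log.containers.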

 

If deployed, SAS Viya Monitoring for Kubernetes captures and indexes Airflow logs, but they are no more detailed than the output of a kubectl logs command, and not as granular as the messages collected for SAS Viya itself. The Fluent Bit configuration used by SAS Viya Monitoring for Kubernetes primarily targets the SAS Viya namespaces and does not include Airflow's namespace in its logging filters. Take care if modifying it: such changes are untested and may cause issues, so check the latest documentation and support guidance on extending the log collection scope before making changes.

 

 

Summary

 

External log access is important for Airflow observability in SAS Viya Kubernetes deployments. By identifying and mounting the NFS export path backing the airflow-logs PVC, you can browse logs from the command line without exec-ing into pods. The Web UI remains the best place to monitor interactive runs, and kubectl logs provides a complementary view, but external access lets users check task progress quickly from the command line and lets administrators perform maintenance, auditing, backups, and more on the aggregated logs. Integrating Airflow logs into your observability pipelines further enhances log management and monitoring.

 

 

Find more articles from SAS Global Enablement and Learning here.
