While the majority of a SAS administrator's interactions with the SAS Viya infrastructure happen at the Kubernetes level (with tools like kubectl or Lens) or through SAS proprietary tools (such as SAS Environment Manager or the sas-viya CLI), there are situations where access to the underlying node host is required: troubleshooting, maintenance, system log collection, etc.
As an example, SAS Technical Support and field consultants have already encountered scenarios where getting onto the Kubernetes nodes and querying the journalctl, kubelet, or kernel message logs was necessary (for example, to determine whether CAS pods were being killed by the OOM killer).
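For instance, once on a node, a couple of commands along these lines (a sketch; exact options may vary by distribution) can reveal OOM-killer activity and recent kubelet events:
# look for OOM-killer traces in the kernel journal
journalctl -k --no-pager | grep -i -E 'out of memory|oom-kill'
# review recent kubelet log entries
journalctl -u kubelet --no-pager | tail -n 50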
"How do I login to my Viya nodes ?" has become a frequently asked question, the purpose of this blog is to provide some guidance about the best way to do it.
In the remainder of this blog we will discuss various methods to log in to the Viya nodes. But first of all, let's make things clear:
SAS does not officially support the solutions presented below in any form. Logging onto your cluster nodes is strongly discouraged. Development and debugging of one's pods should be done via logging and other means (APIs, etc.).
The most obvious way that comes to mind is to connect via SSH to the cluster's underlying Linux nodes.
If your Kubernetes cluster is running in the cloud, another potential option is to leverage the cloud provider's CLI; for example, the Azure CLI has the az vm run-command invoke command to submit individual commands or run a script.
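As a hedged illustration, for AKS nodes (which live in VM scale sets, so the vmss variant of the command applies), it might look like the following; the resource group and scale set names are placeholders to adapt to your own deployment:
# run a one-off shell command on instance 0 of an AKS node pool scale set
az vmss run-command invoke \
  --resource-group MC_myResourceGroup_myAKSCluster_eastus \
  --name aks-cas-17945816-vmss \
  --instance-id 0 \
  --command-id RunShellScript \
  --scripts "journalctl -u kubelet --no-pager | tail -n 20"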
However, these techniques are generally not very efficient: specific network rules or policies could prevent this type of SSH access from the outside, and the cloud CLI option is cumbersome, not always possible, and differs from one cloud provider to another.
Cloud providers (Azure, AWS, GCP, etc.) give you a managed Kubernetes service (AKS, EKS, GKE) and usually do not want you to log directly into the underlying VMs, bypassing the natural way of interacting with the service. That is why there are often barriers preventing you from getting direct access to the underlying cloud VMs.
In Azure, if you are using the IaC tool (viya4-iac-azure) to provision your AKS infrastructure and have opted for the creation of a jump host VM, then you can use the private SSH key (required for the IaC build) to connect via SSH to the jump host VM in Azure and, from there, access the AKS worker nodes (the corresponding public key is distributed and "injected" into the AKS nodes as part of the IaC execution).
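A sketch of that connection might look like the following; the jumpuser and azureuser account names and the key path are assumptions based on viya4-iac-azure defaults, so verify them for your own deployment:
# first hop: the jump host, forwarding the SSH agent (-A) so the same key
# can be used for the second hop (load it first with ssh-add)
ssh -A -i ~/.ssh/viya4-iac-key jumpuser@<jump-host-public-ip>
# second hop, from the jump host: an AKS worker node on its private IP
ssh azureuser@<node-private-ip>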
But it remains a pretty cumbersome "two hops" process, and this solution does not necessarily work on other Kubernetes platforms (it does not work on AWS EKS, for example).
However, there is a method that should work for any type of Kubernetes platform and does not require any specific host account password or SSH private key.
This generic solution is what we call the "container-based" way, because the technique is to start a new pod and exec into its container to get access to the underlying host.
We will show and explain two methods: with an open-source utility called node-shell, or directly using the kubectl command. Whichever method you choose, the requirements are the same:
- The KUBECONFIG file being used to access the cluster must have admin rights to the cluster.
- You must be able to reach the cluster with kubectl from a network perspective, either via VPN, direct connection, or other means.
The first method relies on a small utility called node-shell that lets you start a root shell in the node's host OS. If you have been using the Lens application (a great UI administration interface for Kubernetes), you are actually already using it without knowing it 😊. Indeed, Lens has a feature in the Nodes view that lets you open an SSH-like connection to the Kubernetes nodes; behind the scenes, the node-shell utility is used.
The node-shell utility can be pulled and installed either in stand-alone mode, directly with three commands, or through krew.
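For the stand-alone route, the project's README suggests, at the time of writing, something like these three commands (verify against the current kubectl-node-shell repository):
# download the plugin script, make it executable, and put it on the PATH
curl -LO https://github.com/kvaps/kubectl-node-shell/raw/master/kubectl-node_shell
chmod +x ./kubectl-node_shell
sudo mv ./kubectl-node_shell /usr/local/bin/kubectl-node_shell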
Krew is a plugin manager for kubectl; it allows you to install various plugins that further extend the capabilities of the kubectl command.
After installing krew by following the official instructions, it is very easy to install the node-shell plugin (as well as other nice plugins):
kubectl krew install node-shell
From here, we simply list our nodes:
kubectl get nodes
For example, in Azure we would see something like:
NAME STATUS ROLES AGE VERSION
aks-cas-17945816-vmss000000 Ready agent 4m54s v1.23.12
aks-compute-27817762-vmss000000 Ready agent 4m30s v1.23.12
aks-stateful-16007966-vmss000000 Ready agent 4m55s v1.23.12
aks-stateless-37479317-vmss000000 Ready agent 4m58s v1.23.12
aks-system-48059844-vmss000000 Ready agent 11m v1.23.12
Then, from that output, we simply target a node. For this example, we'll say the node we want to connect to is aks-cas-17945816-vmss000000:
kubectl node-shell aks-cas-17945816-vmss000000
At this point you should see a prompt from that node, just as if you had SSHed into it:
spawning "nsenter-2njj8b" on "aks-cas-17945816-vmss000000"
If you don't see a command prompt, try pressing enter.
root@aks-cas-17945816-vmss000000:/# id
uid=0(root) gid=0(root) groups=0(root),1(daemon),2(bin),3(sys),4(adm),6(disk),10(uucp),11,20(dialout),26(tape),27(sudo)
root@aks-cas-17945816-vmss000000:/# ls
NOTICE.txt bin boot dev etc home initrd.img initrd.img.old lib lib64 lost+found media mnt opt proc root run sbin srv sys tmp usr var vmlinuz vmlinuz.old
root@aks-cas-17945816-vmss000000:/#
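For the curious, what node-shell spawns behind the scenes is essentially a privileged pod in the host PID namespace that uses nsenter to join the namespaces of PID 1 (the node's init process). A minimal hand-written equivalent could look like this sketch (not the exact manifest node-shell generates):
apiVersion: v1
kind: Pod
metadata:
  name: nsenter-demo
spec:
  nodeName: aks-cas-17945816-vmss000000  # pin the pod to the target node
  hostPID: true                          # share the host PID namespace
  restartPolicy: Never
  containers:
  - name: shell
    image: alpine
    # enter the mount/UTS/IPC/network/PID namespaces of PID 1 and open a shell
    command: ["nsenter", "--target", "1", "--mount", "--uts", "--ipc", "--net", "--pid", "--", "sh", "-l"]
    securityContext:
      privileged: true                   # required to enter host namespaces
    stdin: true
    tty: true
Apply it with kubectl apply -f nsenter-demo.yaml, attach with kubectl attach -it nsenter-demo, and delete the pod when done.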
If you don't want (or are not allowed) to install anything on your client/jump host machine in addition to kubectl, you can use a native kubectl command called kubectl debug.
This feature is provided out of the box and appears in the official Kubernetes documentation. Here is the syntax:
kubectl debug node/<node name> -it --image=<containerized OS>
All you need to provide is the name of the node you want to connect to and the image of the container used for debugging (the image must at least include a shell).
For example, in Azure, it would be something like:
[cloud-user@pdcesx03094 ~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
aks-cas-17945816-vmss000000 Ready agent 3m37s v1.23.12
aks-compute-27817762-vmss000000 Ready agent 3m13s v1.23.12
aks-stateful-16007966-vmss000000 Ready agent 3m38s v1.23.12
aks-stateless-37479317-vmss000000 Ready agent 3m41s v1.23.12
aks-system-48059844-vmss000000 Ready agent 10m v1.23.12
[cloud-user@pdcesx03094 ~]$ kubectl debug node/aks-cas-17945816-vmss000000 -it --image=busybox
Creating debugging pod node-debugger-aks-cas-17945816-vmss000000-m9xm8 with container debugger on node aks-cas-17945816-vmss000000.
If you don't see a command prompt, try pressing enter.
/#
Note that in this example we've been using busybox, which is a minimal Linux system that does not necessarily include the system debugging tools you might need to troubleshoot an infrastructure-level issue.
Instead, you could use a more complete image that includes additional system debugging tools, such as ubuntu, but it would pull a larger image and run a heavier container in the cluster.
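For instance, the same command with the stock ubuntu image gives you apt and a much richer toolbox, at the cost of a bigger pull:
kubectl debug node/aks-cas-17945816-vmss000000 -it --image=ubuntu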
This way of getting shell access to the host is the simplest and the most generic; I was able to test it successfully on every supported Kubernetes platform.
(Screenshots: examples in GCP, AWS, open-source Kubernetes, and Red Hat OpenShift.)
Finally, please note that there are a few important things to know about the kubectl debug command. Paraphrasing the Kubernetes documentation: the debugging pod runs in the host IPC, network, and PID namespaces; the node's root filesystem is mounted at /host; and the debugging pod is not deleted automatically when you exit, so you have to clean it up yourself.
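Two practical consequences of the above, using the pod name from the earlier example output:
# inside the debug container, chroot into the node's filesystem to get a
# session that behaves as if opened on the node itself
chroot /host /bin/bash
# back on the client, the debug pod lingers after you exit: delete it yourself
kubectl delete pod node-debugger-aks-cas-17945816-vmss000000-m9xm8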
As we've seen, the two techniques (node-shell and kubectl debug) work no matter what the cloud provider is, how the network has been secured, or whether host account public keys have been "cloud-init" loaded or not.
However, while required in some situations, this type of direct access should remain an exception and should be performed very carefully, only by the SAS or Kubernetes administrator (not every developer!), for ad hoc debugging or troubleshooting.
As rightfully noted by SAS R&D:
"This kind of SSH access could be abused to do additional node setup, but that is a bad practice since nodes are ephemeral and can come and go with autoscaling, etc. Node modifications should be performed via DaemonSets or similar. While abuse cannot be ruled out, there are still legitimate use cases for accessing nodes directly: to explore node configurations, to debug issues (e.g., to investigate why a DaemonSet does not work as intended), etc."
That's all, thanks for reading!