JPM1
Fluorite | Level 6

Hello.

 

We have a Viya 3.4 MMP installation (Web and Microservices; CAS Controller; 2x CAS Worker nodes).

 

When I start all services using ansible-playbook --ask-become-pass --ask-pass viya-ark-master/playbooks/viya-mmsu/viya-services-start.yml, sas-viya-cascontroller-default.service fails to start on the CAS Controller. It worked fine until we had to replace some faulty RAM on a few of the servers. I have tried all the solutions that I could find here and elsewhere on the web, but to no avail. I receive the following error:

 

TASK [Start SAS CAS servers] ************************************************************************************************************************************************************

fatal: [sas-con]: FAILED! => {"changed": true, "msg": "non-zero return code", "rc": 1, "stderr": "Shared connection to CASControllerHostName closed.\r\n", "stderr_lines": ["Shared connection to CASControllerHostName closed."], "stdout": "Job for sas-viya-cascontroller-default.service failed because the control process exited with error code. See \"systemctl status sas-viya-cascontroller-default.service\" and \"journalctl -xe\" for details.\r\nERROR: service start failed\r\n", "stdout_lines": ["Job for sas-viya-cascontroller-default.service failed because the control process exited with error code. See \"systemctl status sas-viya-cascontroller-default.service\" and \"journalctl -xe\" for details.", "ERROR: service start failed"]}

 

NO MORE HOSTS LEFT

**********************************************************************************

 

PLAY RECAP

**********************************************************************************

localhost : ok=1 changed=0 unreachable=0 failed=0

sas-con : ok=3 changed=1 unreachable=0 failed=1

sas-web : ok=13 changed=7 unreachable=0 failed=0

sas-wk1 : ok=3 changed=1 unreachable=0 failed=0

sas-wk2 : ok=3 changed=1 unreachable=0 failed=0

 

When I then run journalctl -xe on the CAS Controller, I get the following output:

 

--

-- A new session with the ID 27 has been created for the user root.

--

-- The leading process of the session is 19340.

Oct 09 14:47:13 CASControllerHostName systemd[1]: Started Session 27 of user root.

-- Subject: Unit session-27.scope has finished start-up

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

--

-- Unit session-27.scope has finished starting up.

--

-- The start-up result is done.

Oct 09 14:47:13 CASControllerHostName sshd[19340]: pam_unix(sshd:session): session opened for user root by (uid=0)

Oct 09 14:47:14 CASControllerHostName systemd[1]: Starting LSB: SAS CAS Controller...

-- Subject: Unit sas-viya-cascontroller-default.service has begun start-up

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

--

-- Unit sas-viya-cascontroller-default.service has begun starting up.

Oct 09 14:47:14 CASControllerHostName sas-viya-cascontroller-default[19396]: Running sas_setup_external_network

Oct 09 14:47:14 CASControllerHostName sas-viya-cascontroller-default[19396]: Checking status of Consul leader...

Oct 09 14:47:14 CASControllerHostName sas-viya-cascontroller-default[19396]: Consul is up

Oct 09 14:47:15 CASControllerHostName sas-viya-cascontroller-default[19396]: Checking for orphaned pids and pid files.

Oct 09 14:47:15 CASControllerHostName sas-viya-cascontroller-default[19396]: No orphaned pids or pid files found for viya-default instance on the controller node.

Oct 09 14:47:15 CASControllerHostName runuser[19593]: pam_unix(runuser:session): session opened for user root by (uid=0)

Oct 09 14:47:15 CASControllerHostName runuser[19593]: pam_unix(runuser:session): session closed for user root

Oct 09 14:47:15 CASControllerHostName sas-viya-cascontroller-default[19396]: [ERROR] Unable to start sas-viya-cascontroller-default due to issue in CAS setup

Oct 09 14:47:15 CASControllerHostName sas-viya-cascontroller-default[19396]: [WARN] sas-viya-cascontroller-default is dead

Oct 09 14:47:15 CASControllerHostName systemd[1]: sas-viya-cascontroller-default.service: control process exited, code=exited status=1

Oct 09 14:47:15 CASControllerHostName systemd[1]: Failed to start LSB: SAS CAS Controller.

-- Subject: Unit sas-viya-cascontroller-default.service has failed

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

--

-- Unit sas-viya-cascontroller-default.service has failed.

--

-- The result is failed.

Oct 09 14:47:15 CASControllerHostName systemd[1]: Unit sas-viya-cascontroller-default.service entered failed state.

Oct 09 14:47:15 CASControllerHostName systemd[1]: sas-viya-cascontroller-default.service failed.

Oct 09 14:48:15 CASControllerHostName sshd[19340]: Received disconnect from CASControllerIP port 55866:11: disconnected by user

Oct 09 14:48:15 CASControllerHostName sshd[19340]: Disconnected from CASControllerIP port 55866

Oct 09 14:48:15 CASControllerHostName sshd[19340]: pam_unix(sshd:session): session closed for user root

Oct 09 14:48:15 CASControllerHostName systemd-logind[3243]: Removed session 27.

-- Subject: Session 27 has been terminated

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Documentation: http://www.freedesktop.org/wiki/Software/systemd/multiseat

--

-- A session with the ID 27 has been terminated.

Oct 09 14:50:01 CASControllerHostName systemd[1]: Started Session 28 of user root.

-- Subject: Unit session-28.scope has finished start-up

-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

--

-- Unit session-28.scope has finished starting up.

--

-- The start-up result is done.

 

Any assistance will be greatly appreciated.

6 REPLIES
alexal
SAS Employee

Have you had a chance to check the CAS server logs in /opt/sas/viya/config/var/log/cas/default directory?
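
As a rough sketch of that check, the helper below prints the ERROR lines from the most recently modified file in the log directory. The function name show_cas_errors is made up for illustration, and nothing is assumed about how the log files are named; only the directory path comes from the post above.

```shell
# show_cas_errors [LOG_DIR]
# Hypothetical helper: print ERROR lines from the newest file in a CAS log directory.
# The default directory is the one mentioned in this thread.
show_cas_errors() {
    local log_dir="${1:-/opt/sas/viya/config/var/log/cas/default}"
    local newest
    # ls -t sorts by modification time, newest first
    newest=$(ls -t "$log_dir" 2>/dev/null | head -n 1)
    if [ -z "$newest" ]; then
        echo "no log files found in $log_dir" >&2
        return 1
    fi
    echo "== $log_dir/$newest"
    # grep returns non-zero when the file contains no ERROR lines
    grep 'ERROR' "$log_dir/$newest"
}
```

Run it on the CAS Controller as a user that can read the log directory, for example show_cas_errors with no arguments.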

JPM1
Fluorite | Level 6
Thank you. I know that this is obvious to SAS admins, but I am a lecturer (and Windows user) trying my best to be a SAS Viya admin as well.

The log file includes: ERROR: CAS_DISK_CACHE directory "/sascache" does not exist. Exiting with rc=1

Since I set this up a while ago using NFS (I know it does not make sense, but it's an interim solution), I should be able to fix it.

Thanks again!
alexal
SAS Employee
Not a problem. Just make sure CAS_DISK_CACHE is available on all of your CAS server nodes (controller and workers).
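
One way to check this on each node is sketched below. The function name check_cache_dir is hypothetical; /sascache is the path from the error message, and the host names in the usage comment are the inventory names from the play recap, assumed here to also work as SSH targets.

```shell
# check_cache_dir DIR
# Hypothetical helper: succeed only if DIR exists and is a directory.
# [ -e ] follows symbolic links, so a broken symlink is reported as missing.
check_cache_dir() {
    local dir="$1"
    if [ ! -e "$dir" ]; then
        echo "MISSING: $dir does not exist (or is a broken symlink)"
        return 1
    fi
    if [ ! -d "$dir" ]; then
        echo "NOT A DIRECTORY: $dir"
        return 1
    fi
    echo "OK: $dir"
}

# Usage on one node:
#   check_cache_dir /sascache
# Usage across nodes (assumes these names resolve for ssh):
#   for h in sas-con sas-wk1 sas-wk2; do
#       ssh "$h" 'test -d /sascache' && echo "$h ok" || echo "$h FAILED"
#   done
```

This only confirms that the directory is present; it does not check whether the account that runs CAS can write to it.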
JPM1
Fluorite | Level 6
Will do, thanks.

I see that I neglected to add the relevant NFS mount entries to /etc/fstab on the different servers along with the other ones, for some reason. Therefore, the shares were not mounted automatically after boot.
JPM1
Fluorite | Level 6
Everything is working again!

Actually, it was the entry for the main NFS share in /etc/fstab that wasn't correct, and therefore the symbolic links to its subdirectories (representing sascache on the different nodes) were broken. It worked previously because I had mounted the main NFS share manually (but I never tested the auto-mount after a reboot!).

It was ...

hostname:/nfs_share mount_point nfs defaults 0 0

(missing leading slash for the mount point), instead of ...

hostname:/nfs_share /mount_point nfs defaults 0 0
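
A mistake like this can be caught before a reboot. The sketch below scans an fstab file and reports any entry whose second field (the mount point) is not an absolute path. The function name check_fstab_mountpoints is made up for illustration, and treating "none" and "swap" as valid mount points for swap entries is an assumption about typical fstab files.

```shell
# check_fstab_mountpoints [FILE]
# Hypothetical helper: report fstab entries whose mount point is not an absolute path.
# Returns non-zero if any such entry is found.
check_fstab_mountpoints() {
    local fstab="${1:-/etc/fstab}"
    awk '
        /^[ \t]*#/ { next }     # skip comment lines
        NF == 0    { next }     # skip blank lines
        # swap entries commonly use "none" or "swap" as the mount point
        $2 !~ /^\// && $2 != "none" && $2 != "swap" {
            printf "line %d: mount point \"%s\" is not an absolute path\n", NR, $2
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$fstab"
}
```

After correcting an entry, running mount -a as root attempts to mount everything listed in /etc/fstab (except entries marked noauto), which exercises the entry without a reboot.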
alexal
SAS Employee

I'm glad the problem has been resolved.
