When trying to deploy the SAS Viya 3.4 I'm receiving
I would like to review /etc/sysconfig/sas/sas-viya-consul-default from each machine. Also, please show me the output of the command below:
ansible all -m shell -a "ifconfig"
/etc/sysconfig/sas/sas-viya-consul-default info from each machine.
363748lp42sg001.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL='eth0'
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
export CONSUL_BOOTSTRAP_EXPECT='3'
export CONSUL_DATACENTER_NAME='sas-viya-pd-03may2019'
export CONSUL_SERVER_FLAG='true'
export CONSUL_SERVER_LIST='10.190.103.251, 10.190.103.210, 10.190.103.238'
export DISABLE_CONSUL_HTTP_PORT='True'
export SECURE_CONSUL='True'
363748lp42sg002.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL='eth0'
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
export CONSUL_BOOTSTRAP_EXPECT='3'
export CONSUL_DATACENTER_NAME='sas-viya-pd-03may2019'
export CONSUL_SERVER_FLAG='true'
export CONSUL_SERVER_LIST='10.190.103.251, 10.190.103.210, 10.190.103.238'
export DISABLE_CONSUL_HTTP_PORT='True'
export SECURE_CONSUL='True'
363748lp42sg003.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL='eth0'
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
export CONSUL_BOOTSTRAP_EXPECT='3'
export CONSUL_DATACENTER_NAME='sas-viya-pd-03may2019'
export CONSUL_SERVER_FLAG='true'
export CONSUL_SERVER_LIST='10.190.103.251, 10.190.103.210, 10.190.103.238'
export DISABLE_CONSUL_HTTP_PORT='True'
export SECURE_CONSUL='True'
363748lp42sg004.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL=
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
363748lp42sg005.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL=
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
363748lp42sg006.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL=
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
363748lp42sg007.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL=
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
363748lp42sg008.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL=
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
363748vp42sg001.geicoddc.net
# Consul option: -advertise
# Specify the name of a network interface or IPv4 address.
# Defaults to the bind IP address if not specified.
#CONSUL_ADVERTISE_INTERNAL=
# Consul option: -bind
# Holds the desired name of a network interface or IPv4 address.
# Please do not edit this. Instead, set it via your Ansible run.
export CONSUL_BIND_EXTERNAL=
CONSUL_CONFIG_DIR="/opt/sas/viya/config/etc/consul.d"
CONSUL_OPTIONS="-client 0.0.0.0"
CONSUL_URL="http://localhost:8500"
/etc/sysconfig/sas/sas-viya-consul-default and ansible output.
In the inventory.ini file, you set consul_bind_adapter=eth0 for each of your hosts, but a network adapter named eth0 exists only on Ansible_Controller. You have to set the proper network adapter for each of your hosts in the inventory file.
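One way to find the right adapter name before editing inventory.ini is to check, on each host, whether the configured adapter actually exists. The following is only a minimal sketch: the helper name `adapter_exists` is hypothetical, and it assumes a Linux host where every interface has an entry under /sys/class/net.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named network interface exists here.
# Assumes Linux, where each interface appears as an entry under /sys/class/net.
adapter_exists() {
    [ -n "$1" ] && [ -e "/sys/class/net/$1" ]
}

# Check the adapter name used in this thread (eth0) on the current host.
if adapter_exists eth0; then
    echo "eth0 is present on $(uname -n)"
else
    echo "eth0 is missing on $(uname -n); interfaces found:"
    ls /sys/class/net
fi
```

To see the interface names on every host at once, the same listing can be run through Ansible, for example `ansible all -m shell -a "ls /sys/class/net"`.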
That error has been fixed, but the deployment now fails with a sasrabbitmq user error.
fatal: [Service_Layer_I]: FAILED! => {"changed": true, "cmd": ["/opt/sas/viya/home/bin/setup_rabbit_cluster", "--hostlist", "363748lp42sg001.geicoddc.net, 363748lp42sg002.geicoddc.net, 363748lp42sg003.geicoddc.net", "--home", "/opt/sas/viya/home", "--config", "/opt/sas/viya/config", "--service", "sas-viya-rabbitmq-server-default", "--logfile", "/tmp/sas_setup_rabbit_cluster.log"], "delta": "0:00:02.064688", "end": "2019-05-14 15:23:32.091912", "msg": "non-zero return code", "rc": 3, "start": "2019-05-14 15:23:30.027224", "stderr": "chown: invalid user: ‘sasrabbitmq’", "stderr_lines": ["chown: invalid user: ‘sasrabbitmq’"], "stdout": "Tue May 14 15:23:30 EDT 2019 setup_rabbit_cluster *************\nTue May 14 15:23:30 EDT 2019 setup_rabbit_cluster Beginning SAS RabbitMQ clustering setup\nTue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster SSL is true\nTue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster CHECK_PORT is 5671\nTue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster First host in list is 363748lp42sg001.geicoddc.net\nTue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster First host shortname is 363748lp42sg001\nTue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster Primary Host is 363748lp42sg001.geicoddc.net\nTue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster Primary Short Host is 363748lp42sg001\nTue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster My Host is 363748lp42sg001.geicoddc.net\nTue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster I am the primary host\nTue May 14 15:23:30 EDT 2019 setup_rabbit_cluster Copy generated Erlang shared secret to Rabbit.\nTue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Copying generated Erlang shared secret to Rabbit.\nTue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Could not change ownership of generated shared secret file /opt/sas/viya/config/var/lib/rabbitmq-server/sasrabbitmq/.erlang.cookie\nTue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Target owner is 
sasrabbitmq\nTue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Returned status was 1\nTue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Setup failed. Beware of independent Rabbit hosts running with different shared secrets.", "stdout_lines": ["Tue May 14 15:23:30 EDT 2019 setup_rabbit_cluster *************", "Tue May 14 15:23:30 EDT 2019 setup_rabbit_cluster Beginning SAS RabbitMQ clustering setup", "Tue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster SSL is true", "Tue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster CHECK_PORT is 5671", "Tue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster First host in list is 363748lp42sg001.geicoddc.net", "Tue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster First host shortname is 363748lp42sg001", "Tue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster Primary Host is 363748lp42sg001.geicoddc.net", "Tue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster Primary Short Host is 363748lp42sg001", "Tue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster My Host is 363748lp42sg001.geicoddc.net", "Tue May 14 15:23:30 EDT 2019 (debug) setup_rabbit_cluster I am the primary host", "Tue May 14 15:23:30 EDT 2019 setup_rabbit_cluster Copy generated Erlang shared secret to Rabbit.", "Tue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Copying generated Erlang shared secret to Rabbit.", "Tue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Could not change ownership of generated shared secret file /opt/sas/viya/config/var/lib/rabbitmq-server/sasrabbitmq/.erlang.cookie", "Tue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Target owner is sasrabbitmq", "Tue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Returned status was 1", "Tue May 14 15:23:32 EDT 2019 setup_rabbit_cluster Setup failed. Beware of independent Rabbit hosts running with different shared secrets."]
I created local account "sasrabbitmq" on rabbitmq servers and added it to be part of sas group.
Now the deployment fails with the error below.
2019-05-17 11:44:44,337 p=20286 u=root | fatal: [Service_Layer_III]: FAILED! => {"changed": true, "cmd": ["/opt/sas/viya/home/bin/setup_rabbit_cluster", "--hostlist", "363748lp42sg001.geicoddc.net, 363748lp42sg002.geicoddc.net, 363748lp42sg003.geicoddc.net", "--home", "/opt/sas/viya/home", "--config", "/opt/sas/viya/config", "--service", "sas-viya-rabbitmq-server-default", "--logfile", "/tmp/sas_setup_rabbit_cluster.log"], "delta": "0:00:26.460912", "end": "2019-05-17 11:44:44.285005", "msg": "non-zero return code", "rc": 8, "start": "2019-05-17 11:44:17.824093", "stderr": "Error: unable to perform an operation on node 'rabbit@363748lp42sg001.geicoddc.net'. Please see diagnostics information and suggestions below.\n\nMost common reasons for this are:\n\n * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)\n * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)\n * Target node is not running\n\nIn addition to the diagnostics info below:\n\n * See the CLI, clustering and networking guides on http://rabbitmq.com/documentation.html to learn more\n * Consult server logs on node rabbit@363748lp42sg001.geicoddc.net\n\nDIAGNOSTICS\n===========\n\nattempted to contact: ['rabbit@363748lp42sg001.geicoddc.net']\n\nrabbit@363748lp42sg001.geicoddc.net:\n * connected to epmd (port 4369) on 363748lp42sg001.geicoddc.net\n * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic \n * TCP connection succeeded but Erlang distribution failed \n\n * Authentication failed (rejected by the remote node), please check the Erlang cookie\n\n\nCurrent node details:\n * node name: 'rabbitmqcli17@363748lp42sg003.geicoddc.net'\n * effective user's home directory: /home/sasrabbitmq\n * Erlang cookie hash: DC2DocmZ/sdjb8gZkaqU8w==", "stderr_lines": ["Error: unable to perform an operation on node 'rabbit@363748lp42sg001.geicoddc.net'. 
Please see diagnostics information and suggestions below.", "", "Most common reasons for this are:", "", " * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)", " * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)", " * Target node is not running", "", "In addition to the diagnostics info below:", "", " * See the CLI, clustering and networking guides on http://rabbitmq.com/documentation.html to learn more", " * Consult server logs on node rabbit@363748lp42sg001.geicoddc.net", "", "DIAGNOSTICS", "===========", "", "attempted to contact: ['rabbit@363748lp42sg001.geicoddc.net']", "", "rabbit@363748lp42sg001.geicoddc.net:", " * connected to epmd (port 4369) on 363748lp42sg001.geicoddc.net", " * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic ", " * TCP connection succeeded but Erlang distribution failed ", "", " * Authentication failed (rejected by the remote node), please check the Erlang cookie", "", "", "Current node details:", " * node name: 'rabbitmqcli17@363748lp42sg003.geicoddc.net'", " * effective user's home directory: /home/sasrabbitmq", " * Erlang cookie hash: DC2DocmZ/sdjb8gZkaqU8w=="], "stdout": "Fri May 17 11:44:17 EDT 2019 setup_rabbit_cluster *************\nFri May 17 11:44:17 EDT 2019 setup_rabbit_cluster Beginning SAS RabbitMQ clustering setup\nFri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster SSL is true\nFri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster CHECK_PORT is 5671\nFri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster First host in list is 363748lp42sg001.geicoddc.net\nFri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster First host shortname is 363748lp42sg001\nFri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster Primary Host is 363748lp42sg001.geicoddc.net\nFri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster Primary Short Host is 363748lp42sg001\nFri May 17 
11:44:18 EDT 2019 (debug) setup_rabbit_cluster My Host is 363748lp42sg003.geicoddc.net\nFri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster I am a clustering host\nFri May 17 11:44:18 EDT 2019 setup_rabbit_cluster Copy generated Erlang shared secret to Rabbit.\nFri May 17 11:44:22 EDT 2019 setup_rabbit_cluster Copying generated Erlang shared secret to Rabbit.\nFri May 17 11:44:22 EDT 2019 setup_rabbit_cluster Starting sas-viya-rabbitmq-server-default\nFri May 17 11:44:37 EDT 2019 setup_rabbit_cluster Wait for a Rabbit listener instance on the primary host 363748lp42sg001.geicoddc.net\nFri May 17 11:44:37 EDT 2019 (debug) setup_rabbit_cluster wait_for_host 363748lp42sg001.geicoddc.net succeeded, listener detected.\nFri May 17 11:44:39 EDT 2019 (debug) setup_rabbit_cluster Issuing clustering commands\nFri May 17 11:44:39 EDT 2019 (debug) setup_rabbit_cluster Stop the Rabbit app (not the server though)\nFri May 17 11:44:41 EDT 2019 (debug) setup_rabbit_cluster USE_LONGNAME = true, MY_PRIMARY is rabbit@363748lp42sg001.geicoddc.net\nFri May 17 11:44:41 EDT 2019 (debug) setup_rabbit_cluster Cluster with the primary node\nFri May 17 11:44:44 EDT 2019 setup_rabbit_cluster Attempts to join_cluster with the node rabbit@363748lp42sg001.geicoddc.net failed. 
Clustering failed.\nFri May 17 11:44:44 EDT 2019 setup_rabbit_cluster Returned status was 69", "stdout_lines": ["Fri May 17 11:44:17 EDT 2019 setup_rabbit_cluster *************", "Fri May 17 11:44:17 EDT 2019 setup_rabbit_cluster Beginning SAS RabbitMQ clustering setup", "Fri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster SSL is true", "Fri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster CHECK_PORT is 5671", "Fri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster First host in list is 363748lp42sg001.geicoddc.net", "Fri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster First host shortname is 363748lp42sg001", "Fri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster Primary Host is 363748lp42sg001.geicoddc.net", "Fri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster Primary Short Host is 363748lp42sg001", "Fri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster My Host is 363748lp42sg003.geicoddc.net", "Fri May 17 11:44:18 EDT 2019 (debug) setup_rabbit_cluster I am a clustering host", "Fri May 17 11:44:18 EDT 2019 setup_rabbit_cluster Copy generated Erlang shared secret to Rabbit.", "Fri May 17 11:44:22 EDT 2019 setup_rabbit_cluster Copying generated Erlang shared secret to Rabbit.", "Fri May 17 11:44:22 EDT 2019 setup_rabbit_cluster Starting sas-viya-rabbitmq-server-default", "Fri May 17 11:44:37 EDT 2019 setup_rabbit_cluster Wait for a Rabbit listener instance on the primary host 363748lp42sg001.geicoddc.net", "Fri May 17 11:44:37 EDT 2019 (debug) setup_rabbit_cluster wait_for_host 363748lp42sg001.geicoddc.net succeeded, listener detected.", "Fri May 17 11:44:39 EDT 2019 (debug) setup_rabbit_cluster Issuing clustering commands", "Fri May 17 11:44:39 EDT 2019 (debug) setup_rabbit_cluster Stop the Rabbit app (not the server though)", "Fri May 17 11:44:41 EDT 2019 (debug) setup_rabbit_cluster USE_LONGNAME = true, MY_PRIMARY is rabbit@363748lp42sg001.geicoddc.net", "Fri May 17 11:44:41 EDT 2019 (debug) setup_rabbit_cluster Cluster with the 
primary node", "Fri May 17 11:44:44 EDT 2019 setup_rabbit_cluster Attempts to join_cluster with the node rabbit@363748lp42sg001.geicoddc.net failed. Clustering failed.", "Fri May 17 11:44:44 EDT 2019 setup_rabbit_cluster Returned status was 69"]}
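The "Authentication failed ... please check the Erlang cookie" diagnostic suggests that the joining node and the primary node are not using the same cookie value. One way to compare the cookie across hosts without printing the secret itself is to checksum the file on each machine. This is a minimal sketch, assuming the cookie sits at the path named in the earlier setup_rabbit_cluster log; the helper name `cookie_sum` is hypothetical. Since the diagnostics report /home/sasrabbitmq as the effective home directory, it may be worth checking for a cookie under that directory as well.

```shell
#!/bin/sh
# Cookie path taken from the setup_rabbit_cluster log earlier in this thread.
COOKIE=/opt/sas/viya/config/var/lib/rabbitmq-server/sasrabbitmq/.erlang.cookie

# Hypothetical helper: print a checksum and byte count for a file.
# Reading from stdin keeps the file name out of the output.
cookie_sum() {
    cksum < "$1"
}

# Report this host's checksum so it can be compared with the other nodes.
if [ -r "$COOKIE" ]; then
    echo "$(uname -n): $(cookie_sum "$COOKIE")"
else
    echo "$(uname -n): cannot read $COOKIE"
fi
```

Hosts that print different values hold different cookies.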
I created local account "sasrabbitmq" on rabbitmq servers and added it to be part of sas group.
That user should be created automatically during the deployment process. In any case, what home directory did you set for that user? It should be /opt/sas/viya/config/var/lib/rabbitmq-server/sasrabbitmq, and the user needs read/write access to it in order to write the Erlang cookie.
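To see which home directory the account currently has, something like the following could be run on each RabbitMQ host. It is a minimal sketch: the helper name `home_of` is hypothetical, and it relies only on the standard passwd format, in which the sixth colon-separated field is the home directory.

```shell
#!/bin/sh
# Expected home directory for sasrabbitmq, as stated in the reply above.
EXPECTED_HOME=/opt/sas/viya/config/var/lib/rabbitmq-server/sasrabbitmq

# Hypothetical helper: print the home directory (field 6) of a passwd-format line.
home_of() {
    printf '%s\n' "$1" | cut -d: -f6
}

# Look up the account; entry stays empty if the user does not exist.
entry=$(getent passwd sasrabbitmq 2>/dev/null)
if [ -z "$entry" ]; then
    echo "sasrabbitmq was not found on this host"
elif [ "$(home_of "$entry")" = "$EXPECTED_HOME" ]; then
    echo "home directory matches $EXPECTED_HOME"
else
    echo "home directory is $(home_of "$entry"), expected $EXPECTED_HOME"
fi
```

If the directory differs, one possible fix is to change it as root with `usermod -d` and then re-run the deployment.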
Deployment didn’t create the account. That is the reason it failed during initial deployment.
chown: invalid user: ‘sasrabbitmq’", "stderr_lines": ["chown: invalid user: ‘sasrabbitmq’"], "stdout": "Tue May 14 15:23:30 EDT 2019 setup_rabbit_cluster *************\nTue
I didn’t specify any home directory, should I update the home directory and re-deploy?
Thank you!
I didn’t specify any home directory, should I update the home directory and re-deploy?
Yes.
Thank you for the confirmation. I modified the home directory for "sasrabbitmq" last Friday and was able to deploy SAS Viya successfully without any issues.
I'll be restarting the services on all nodes to make sure the system is healthy. I'll keep you posted on any updates.
Thank you,
Sukesh Talasani
Thanks for the update. I'm glad that all the issues were resolved. Please do not forget to update the track and mark this thread as resolved.