Hello,
I am having problems restarting SAS services.
Could this be a Postgres/Consul issue, or is it a network issue?
Thank you in advance.
The log shows the same error every time:
failed 'f_operate_node_request PoolNode:All:Stop'. rC=1
ERROR: f_operate_node_request: The consul-template-operation-node service may be down or failed to invoke its template script. node: pgpool0, wait time=16s
2025-12-16 12:15:37.697 LOG: f_read_kv_to_var: /opt/sas/viya/home/bin/sas-bootstrap-config kv read config/postgres/admin/pgpool2/operation_status
2025-12-16 12:15:37.928 LOG: f_read_kv_to_var: config/postgres/admin/pgpool2/operation_status=Successful
2025-12-16 12:15:37.943 LOG: f_operate_node_request: f_release_lock CurrentLockNodeOperation
2025-12-16 12:15:37.958 LOG: f_release_lock: f_assert_env AnyTypeNode sas
2025-12-16 12:15:37.973 LOG: f_release_lock: Release lock: /opt/sas/viya/home/bin/sas-bootstrap-config kv lock release --key config/postgres/session/CurrentLockNodeOperation --session-id f08b016d-cbd0-9265-6757-9392118acad4
true
2025-12-16 12:15:38.237 LOG: f_release_lock: Released the Consul lock
2025-12-16 12:15:38.252 LOG: f_delete_lock_session: f_assert_env AnyTypeNode sas
2025-12-16 12:15:38.266 LOG: f_delete_lock_session: Delete session: /opt/sas/viya/home/bin/sas-bootstrap-config session delete --id f08b016d-cbd0-9265-6757-9392118acad4
2025-12-16 12:15:38.545 LOG: f_delete_lock_session: Consul lock session f08b016d-cbd0-9265-6757-9392118acad4 was deleted
2025-12-16 12:15:38.555 ERROR: f_operate_node_request: Failed PoolNode:All:Stop for postgres
2025-12-16 12:15:38.564 ERROR: f_stop_cluster: failed 'f_operate_node_request PoolNode:All:Stop'. rC=1
2025-12-16 12:15:38.574 ERROR: main: failed 'f_stop_cluster'. rC=1
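The kv read at 12:15:37 already returns Successful, so Consul itself looks reachable. The cluster state can still be double-checked with the same bootstrap CLI the script uses (binary path taken from the log above; depending on the deployment, the Consul client environment may need to be sourced first):
/opt/sas/viya/home/bin/sas-bootstrap-config status leader
/opt/sas/viya/home/bin/sas-bootstrap-config status peers
If the leader and peers respond, the problem is more likely in the template script or on the network than in Consul.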
Is the virtual IP address set somewhere in the pgpool config files? Can it be modified?
I tried to restart the pgpool and PostgreSQL services, but the same error appears.
The consul-template operation-node service is running:
./sas-viya-sasdatasvrc-postgres-pgpool0-consul-template-operation_node status
./sas-viya-sasdatasvrc-postgres-pgpool0-consul-template-operation_node sourcing sds_set_env_variable with /opt/sas/viya/config/etc/sasdatasvrc/postgres/pgpool0/sds_env_var.sh...
SAS_HOSTNAME=sas1.onrc.ro, SAS_SERVICE_ADDR=sas1.onrc.ro, SAS_BIND_ADDR=172.16.18.44
VAULT_ADDR: https://sas3.onrc.ro:8200
SASCONFIG=/opt/sas/viya/config, PGHOME=/opt/sas/viya/home/postgresql11, PGPOOLHOME=/opt/sas/viya/home/pgpool-II40
POSTGRES_FULL_VERSION=11.9, PGPOOL_FULL_VERSION=4.0.6
Sourced. HA_PGPOOL_COUNT=3, SASCONFIG=/opt/sas/viya/config, SAS_LOG_DIR=/opt/sas/viya/config/var/log
Service sas-viya-sasdatasvrc-postgres-pgpool0-consul-template-operation_node is running with PID 202254.
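Since the status script reports the service as running, its own log is the next place to look. Based on SAS_LOG_DIR above, something like the following should list the most recent logs (the exact directory layout is a guess and may differ per deployment):
ls -lt /opt/sas/viya/config/var/log/sasdatasvrc/postgres/ | head
Whatever the operation-node template script last logged may explain why PoolNode:All:Stop did not complete.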
This, I think, is where the virtual IP and other parameters are defined:
cat /opt/sas/viya/config/etc/sasdatasvrc/postgres/pgpool0/sds_env_var.sh
export CONSUL_PORT="8500"
export DEFAULT_POSTGRES_SERVICE="postgres"
export INSTALL_GROUP="sas"
export INSTALL_USER="sas"
export INVENTORY_HOSTNAME="casactive"
export SASCONFIG="/opt/sas/viya/config/etc/sasdatasvrc"
export SASHOME="/opt/sas/viya/home"
export SAS_LOG_DIR="/opt/sas/viya/config/var/log/sasdatasvrc"
export HA_PGPOOL_VIRTUAL_IP=172.16.18.70
export HA_PGPOOL_WATCHDOG_PORT=5433
export PCP_PORT=5430
export PERMS_OVERRIDE=false
export PGPOOL_PORT=5431
export POOL_NUMBER=0
export SANMOUNT=/opt/sas/viya/config/data/sasdatasvrc
export SERVICE_NAME=postgres
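For reference, in stock pgpool-II the watchdog virtual IP is the delegate_IP parameter in pgpool.conf, so HA_PGPOOL_VIRTUAL_IP above presumably feeds the template that generates it. Assuming the rendered pgpool.conf sits under the same pgpool0 directory (not verified), the effective values can be checked with:
grep -E 'delegate_IP|if_up_cmd|if_down_cmd|arping_cmd' /opt/sas/viya/config/etc/sasdatasvrc/postgres/pgpool0/pgpool.conf
Editing the VIP in the rendered file by hand would likely be overwritten by consul-template, so this env file (or the deployment playbook) is probably the right place to change it.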
In the same folder I have the watchdog output file:
cat pcp_watchdog_info.out
ERROR: connection to host "172.16.18.70" failed with error "No route to host"
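The watchdog can still be queried without going through the unreachable VIP by pointing pcp_watchdog_info at the node itself (PCP_PORT=5430 and PGPOOLHOME come from the output above; the PCP user name is deployment-specific, so the one below is a placeholder):
/opt/sas/viya/home/pgpool-II40/bin/pcp_watchdog_info -h localhost -p 5430 -U <pcp-user> -v
That should report each watchdog node's state and which one is currently the master/coordinator.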
I should mention that SAS is configured on a cluster, so I don't know whether the virtual IP should be assigned to this particular node.
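As far as I understand the watchdog, only the current master/coordinator pgpool node holds the delegate IP at any given time, so a quick check on each pgpool host is:
ip -4 addr show | grep -F 172.16.18.70
It should appear on exactly one node; appearing on none would match the "No route to host" above.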
All commands from the documentation give some kind of error:
[root@sas1 pgpool0]# nslookup 172.16.18.70
** server can't find 70.18.16.172.in-addr.arpa.: NXDOMAIN
[root@sas1 pgpool0]# arping -c 3 172.16.18.70
arping: Suitable device could not be determined. Please, use option -I.
Usage: arping [-fqbDUAV] [-c count] [-w timeout] [-I device] [-s source] destination
-f : quit on first reply
-q : be quiet
-b : keep broadcasting, don't go unicast
-D : duplicate address detection mode
-U : Unsolicited ARP mode, update your neighbours
-A : ARP answer mode, update your neighbours
-V : print version and exit
-c count : how many packets to send
-w timeout : how long to wait for a reply
-I device : which ethernet device to use
-s source : source ip address
destination : ask for what ip address
[root@sas1 pgpool0]# arping -I eth0 172.16.18.70
arping: Device eth0 not available.
[root@sas1 pgpool0]# arping -I bond0 172.16.18.70
ARPING 172.16.18.70 from 172.16.18.44 bond0
^CSent 189 probes (189 broadcast(s))
Received 0 response(s)
[root@sas1 pgpool0]# ipneigh grep 172.16.18.70
bash: ipneigh: command not found...
[root@sas1 pgpool0]# ip neigh grep 172.16.18.70
Command "grep" is unknown, try "ip neigh help".
[root@sas1 pgpool0]# ip neigh | grep 172.16.18.70
172.16.18.70 dev bond0 FAILED
[root@sas1 pgpool0]# arp -a | grep 172.16.18.70
? (172.16.18.70) at <incomplete> on bond0
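The FAILED and <incomplete> entries simply mean that nothing on the segment is answering ARP for 172.16.18.70, i.e. no node currently has the VIP up. Once a node does bring it up, a lingering FAILED entry can be cleared manually instead of waiting for it to age out (device name taken from the output above):
ip neigh del 172.16.18.70 dev bond0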
[root@sas1 pgpool0]# ifconfig | egrep 'RUNNING|broadcast|Ethernet'
bond0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 172.16.18.44 netmask 255.255.255.0 broadcast 172.16.18.255
ether 38:68:dd:1f:44:50 txqueuelen 1000 (Ethernet)
bond0:0: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 172.16.18.71 netmask 255.255.255.255 broadcast 0.0.0.0
ether 38:68:dd:1f:44:50 txqueuelen 1000 (Ethernet)
bond1: flags=5187<UP,BROADCAST,RUNNING,MASTER,MULTICAST> mtu 1500
inet 172.16.18.45 netmask 255.255.255.0 broadcast 172.16.18.255
ether b0:26:28:f7:97:25 txqueuelen 1000 (Ethernet)
eno1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether 38:68:dd:1f:44:50 txqueuelen 1000 (Ethernet)
eno2: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether 38:68:dd:1f:44:50 txqueuelen 1000 (Ethernet)
ens1f0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether b0:26:28:f7:97:24 txqueuelen 1000 (Ethernet)
ens1f1: flags=6211<UP,BROADCAST,RUNNING,SLAVE,MULTICAST> mtu 1500
ether b0:26:28:f7:97:25 txqueuelen 1000 (Ethernet)
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 192.168.122.1 netmask 255.255.255.0 broadcast 192.168.122.255
ether 52:54:00:30:ae:1e txqueuelen 1000 (Ethernet)
[root@sas1 pgpool0]# ip addr | egrep 'LOWER_UP|link/ether|inet '
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
inet 127.0.0.1/8 scope host lo
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
link/ether 38:68:dd:1f:44:50 brd ff:ff:ff:ff:ff:ff
3: ens1f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether b0:26:28:f7:97:24 brd ff:ff:ff:ff:ff:ff
4: ens1f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond1 state UP group default qlen 1000
link/ether b0:26:28:f7:97:25 brd ff:ff:ff:ff:ff:ff
5: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
link/ether 38:68:dd:1f:44:50 brd ff:ff:ff:ff:ff:ff
7: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 38:68:dd:1f:44:50 brd ff:ff:ff:ff:ff:ff
inet 172.16.18.44/24 brd 172.16.18.255 scope global bond0
inet 172.16.18.71/32 scope global bond0:0
8: bond1: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether b0:26:28:f7:97:25 brd ff:ff:ff:ff:ff:ff
inet 172.16.18.45/24 brd 172.16.18.255 scope global bond1
link/ether 52:54:00:30:ae:1e brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
link/ether 52:54:00:30:ae:1e brd ff:ff:ff:ff:ff:ff
link/ether 3a:68:dd:1f:44:57 brd ff:ff:ff:ff:ff:ff
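One thing this listing makes me wonder about: bond0:0 already carries 172.16.18.71/32, so a VIP mechanism on this node does work, and only .70 is missing. Note also that the stock pgpool-II watchdog commands are written for eth0, roughly:
if_up_cmd = 'ip addr add $_IP_$/24 dev eth0 label eth0:0'
if_down_cmd = 'ip addr del $_IP_$/24 dev eth0'
Since this box has no eth0 (the arping -I eth0 attempt above failed for exactly that reason) and the real interface is bond0, it seems worth confirming that the generated pgpool.conf uses bond0 in these commands.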
UPDATE:
After a while, ip neigh | grep 172.16.18.70 showed the entry as STALE and then as REACHABLE, which I take to mean the VIP had started answering ARP again on some node. After that I was able to restart SAS and it is now running.
Thank you very much for the support!
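For anyone hitting the same thing later, a quick way to confirm the VIP is healthy again after the restart (address taken from this thread):
ping -c 3 172.16.18.70
ip neigh show 172.16.18.70
The neighbour entry should be REACHABLE (or cycle through STALE back to REACHABLE on traffic), and ping should get answers from whichever pgpool node currently holds the VIP.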