- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
We have installed Viya 4 Kubernetes, and deployed OpenSearch Dashboards for logging . I see the error
Logsource: sas-run-tool
message : Tool executor failed starting service: exit status 1
Level : Fatal
Logsource: sas-run-tool
message : Service executor failed to execute successfully: exit status 1
Level : Fatal
Did it happen in your environment and you know what this is? I googled but did not find much for this error.
Thanks a lot,
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
kubectl -n namespace get jobs -l app.kubernetes.io/name=sas-backup-purge-job
Greg Wootton | Principal Systems Technical Support Engineer
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Greg Wootton | Principal Systems Technical Support Engineer
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @gwootton , sorry for delay, missed your reply.
kube.pod - sas-backup-purge-job
level -fatal
kube.container - sas-backup-purge-job
kube.annotations.sas_com/component-name | sas-backup-job |
Please let me know if any other info needed.
thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
kubectl -n namespace get jobs -l app.kubernetes.io/name=sas-backup-purge-job
Greg Wootton | Principal Systems Technical Support Engineer
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @gwootton , yes the purge failed multiple times. Sorry by mistake I accepted my reply as solution instead of your reply. I fixed this now.
thank you for your help!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Greg,
sometimes my backup gives failed in viya 24.03.
I cannot find the reason, because if I fire this job again, it succeeds.
Maybe you have a solution ?
Regards,
Dik
{"level":"info","version":1,"source":"sas-backupjob","messageKey":"Checking if job is allowed to run.","properties":{"logger":"backupjob","caller":"backupjob/main.go:314"},"timeStamp":"2024-12-13T02:02:04.280263+01:00","message":"Checking if job is allowed to run."}
{"level":"info","version":1,"source":"sas-backupjob","messageKey":"backupjob-log-icu.updating.k8s.job.labels.info.log","messageParameters":{"jobName":"sas-scheduled-backup-dpyxjx3i-28900860"},"properties":{"logger":"backupjob/kubeutils","caller":"kubeutils/kube_client.go:126"},"timeStamp":"2024-12-13T02:02:04.334724+01:00","message":"Updating the Kubernetes job sas-scheduled-backup-dpyxjx3i-28900860 with given label."}
{"level":"info","version":1,"source":"sas-backupjob","messageKey":"backupjob-log-icu.updated.k8s.job.info.log","messageParameters":{"jobName":"sas-scheduled-backup-dpyxjx3i-28900860"},"properties":{"logger":"backupjob/kubeutils","caller":"kubeutils/kube_client.go:152"},"timeStamp":"2024-12-13T02:02:04.345526+01:00","message":"Updated the Kubernetes job the sas-scheduled-backup-dpyxjx3i-28900860."}
{"level":"error","version":1,"source":"sas-backupjob","messageKey":"Job is not allowed to run.","properties":{"logger":"backupjob","caller":"backupjob/main.go:476"},"timeStamp":"2024-12-13T02:02:04.345565+01:00","message":"Job is not allowed to run."}
{"level":"error","version":1,"source":"sas-backupjob","messageKey":"backupjob-log-icu.remote.backup.agent.process.grep.pid.error.log","messageParameters":{"error":"exit status 1"},"properties":{"logger":"backupjob","caller":"backupjob/main.go:529"},"timeStamp":"2024-12-13T02:04:14.495204+01:00","message":"backupjob-log-icu.remote.backup.agent.process.grep.pid.error.log [error:exit status 1]"}
{"level":"error","version":1,"source":"sas-run-tool","messageKey":"Tool executor failed starting service: exit status 1","properties":{"logger":"tools/run","caller":"impl/executor.go:69"},"timeStamp":"2024-12-13T02:04:14.497860+01:00","message":"Tool executor failed starting service: exit status 1"}
{"level":"fatal","version":1,"source":"sas-run-tool","messageKey":"Service executor failed to execute successfully: exit status 1","properties":{"logger":"tools/run","caller":"impl/tooling.go:139"},"timeStamp":"2024-12-13T02:04:14.497901+01:00","message":"Service executor failed to execute successfully: exit status 1"}
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Greg Wootton | Principal Systems Technical Support Engineer
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You may want to turn on debug logging of the backup, this should provide more detail on the failure. To do that, you could set the environment variable SAS_LOG_LEVEL_BACKUPJOB to debug on the cronjob, for example:
kubectl -n namespace set env cronjob/sas-scheduled-backup-job SAS_LOG_LEVEL_BACKUPJOB=DEBUG
Greg Wootton | Principal Systems Technical Support Engineer
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Greg, thank you, now we have more output, we will investigate further..
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Greg, we found it.
the sas-backup-purge-job cannot run simultaneous with the sas-scheduled-backup-job.
So this explains why it went ok if we run it by hand in the morning.
We will reschedule the full backups.
kubectl -n $ns get cronjob | grep backup | grep False Adjusted to :
|
Tomorrow we know it for sure.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Greg,
the reason is that the purge job runs for hours .
The pg_dump must be dumping with -Fd , we get 469000 *.dat.gz files . The purge proces has a lot of work to do with so many files.
Is it possible to adjust this to -Fc just one bigfile, or some other solution ...
Regards
Dik
We are on 2403 lts , maybe this behavior will change in the next upgrade.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
As far as how long purge is taking to complete though, that seems high. My oldest backup's postgres dump contains 36,544 files and my last purge job took 2 minutes to complete. In the logs for the purge job do you see any indication of why it's taking so long?
Greg Wootton | Principal Systems Technical Support Engineer
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
We see entries in the table pg_catalog.pg_largeobject, for example job history and report transfer data. We export the reports with the sas-viya cli tool.
These entries are exported as blob<loid>.dat.gz
- How can I delete the job history in viya ?
- For the reports we are investigating.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Greg, more information :
- I made a sas studio job ;
cas; run ; |
- Logged into postgres pod : kn exec sas-crunchy-platform-postgres-00-b56g-0 -it -- sh
psql \c SharedServices SharedServices=# select loid from pg_catalog.pg_largeobject where data like '%ZZZZZ%' ; select loid,substr(encode(data, 'escape'),1100) as data from pg_catalog.pg_largeobject where data like '%ZZZZZ%' limit 1; loid | data
|
|
- Login the backup pod OR another pod which has the backup directory mounted and goto the directory where the pgdump file is and zcat the file blob_<loid>.dat.gz
/sasviyabackup/20250127-010206F/__default__/postgres/SharedServices_pg_dump.dmp $ zcat blob_6612731.dat.gz | tail -40 | fold -80 | tail -10
The user is <d.pater> and the active caslib is CASUSER(<d.pater>)
."},{"version":1,"type":"note","line":"NOTE: The SAS option SESSREF was updated
with the value CASAUTO."},{"version":1,"type":"note","line":"NOTE: The SAS macro
_SESSREF_ was updated with the value CASAUTO."},{"version":1,"type":"note","lin
e":"NOTE: The session is using 2 workers."},{"version":1,"type":"source","line":
"593 %put \"ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ\" ;"},{"version":
1,"type":"normal","line":"\"ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ\""
},{"version":1,"type":"source","line":"594 "},{"version":1,"type":"source","lin
e":"595 run ;"},{"version":1,"type":"source","line":"596 %JESEND"},{"version":
1,"type":"note","line":""}],"count":674,"limit":674,"links":[]}
Questions :
1.is this normal behavior ?
2. Must i register a case with SAS for this behavior ?
Regards
Dik