Hi all,
We are seeing the errors below in the JFD logs. How can we fix them?
Thanks,
SS
Have you tried stopping and restarting the Platform Process Manager services? If not, it would be worth trying.
This happens only after a restart.
We stopped everything:
<>/profile.lsf;
<> /profile.js;
jadmin stop
badmin hshutdown
lsadmin resshutdown
lsadmin limshutdown
and then restarted:
<>/profile.lsf;
<> /profile.js;
lsadmin limstartup;
lsadmin resstartup;
badmin hstartup;
jadmin start;
We are also getting this error in lim.log:
Oct 4 08:54:43 2021 21763 4 3.4.0 main: Received request <5> from non-EGO host 11.201.77.21:27553
Oct 4 08:54:43 2021 21763 4 3.4.0 main: IP of Host compute.eng.prod has changed, this IP now belongs to ip-11.201.77.21.eu-east-3.compute (11.201.77.21:33249)
Oct 4 08:54:43 2021 21763 4 3.4.0 main: Received request <5> from non-EGO host 11.201.77.21:33249
Oct 4 08:54:43 2021 21763 4 3.4.0 main: IP of Host compute.eng.prod has changed, this IP now belongs to ip-11.201.77.21.eu-east-3.compute (11.201.77.21:33249)
Oct 4 08:54:43 2021 21763 4 3.4.0 main: Received request <5> from non-EGO host 11.201.77.21:33249
Oct 4 08:54:44 2021 21763 4 3.4.0 main: IP of Host compute.eng.prod has changed, this IP now belongs to ip-11.201.77.21.eu-east-3.compute (11.201.77.21:15311)
Oct 4 08:54:44 2021 21763 4 3.4.0 main: Received request <5> from non-EGO host 11.201.77.21:15311
Oct 4 08:54:44 2021 21763 4 3.4.0 main: IP of Host compute.eng.prod has changed, this IP now belongs to ip-11.201.77.21.eu-east-3.compute (11.201.77.21:15311)
and these errors in the JFD log:
2021 Oct 04 09:07:00 21967 22060 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:07:35 21967 22061 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:07:35 21967 22065 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:07:35 21967 22063 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:07:35 21967 22058 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:07:35 21967 22062 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:07:35 21967 22060 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:07:35 21967 22059 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:09:03 21967 22009 3 JFJobExecutionAgent::checkReturnStatus: Failed to execute command <"/opt/home/lsf/10.1/linux2.6-glibc2.3-x86_64/bin//bsub" -J '260195:lsfuser:CLI_OUTCOME_MON:CLI_OUTCOME_MON' -o '/dev/null' -fid 260195 '/opt/config/Lev1/SASApp/BatchServer/sasbatch.sh -log ~/logs/CLI_OUTCOME_MON_CLI_OUTCOME_MON_#Y.#m.#d_#H.#M.#s.log -batch -noterminal -logparm "rollover=session" -sysin /sas/deployed_jobs/CLI_OUTCOME_MON.sas'>. Exited with <9>. .
2021 Oct 04 09:09:03 21967 22009 3 JFLSFExecutionAgent::_submitToLSF: The job submission script has been running for too long, and is killed by JFD; error code '118'.
2021 Oct 04 09:10:22 21967 22056 3 JFJobExecutionAgent::checkReturnStatus: Failed to execute command <bkill 517227>. Exited with <9>. .
2021 Oct 04 09:12:36 21967 22062 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:12:36 21967 22060 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:12:36 21967 22064 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:12:36 21967 22056 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:12:36 21967 22059 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:12:36 21967 22057 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:12:36 21967 22061 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 04 09:14:03 21967 22009 3 JFJobExecutionAgent::checkReturnStatus: Failed to execute command <"/opt/home/lsf/10.1/linux2.6-glibc2.3-x86_64/bin//bsub" -J '260196:lsfuser:CLI_FREEZE:CLI_FREEZE' -o '/dev/null' -fid 260196 '/opt/config/Lev1/SASApp/BatchServer/sasbatch.sh -log ~/logs/CLI_FREEZE_CLI_FREEZE_#Y.#m.#d_#H.#M.#s.log -batch -noterminal -logparm "rollover=session" -sysin /sas/deployed_jobs/CLI_FREEZE.sas'>. Exited with <9>. .
2021 Oct 04 09:14:03 21967 22009 3 JFLSFExecutionAgent::_submitToLSF: The job submission script has been running for too long, and is killed by JFD; error code '118'.
Something seems to have changed in the host references, but everything looks fine except Flow Manager.
Note: this happens only after the Linux box is restarted.
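The lim.log messages above repeat many times and differ only in the source port. A minimal sketch for collapsing them into one line per host/IP pair, assuming the message wording shown in this thread (the function name `lim_ip_changes` is made up for illustration and is not part of LSF):

```shell
# Hypothetical helper: list each distinct "old host -> new host IP" reported by lim.log.
# Assumes lines of the form:
#   ... IP of Host <old> has changed, this IP now belongs to <new> (<ip>:<port>)
lim_ip_changes() {
  awk '
    /IP of Host .* has changed, this IP now belongs to/ {
      from_host = ""; to_host = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Host") from_host = $(i + 1)   # name LIM had on record
        if ($i == "to")   to_host   = $(i + 1)   # name the IP resolves to now
      }
      ip = $NF
      gsub(/[()]/, "", ip)      # drop the surrounding parentheses
      sub(/:[0-9]+$/, "", ip)   # drop the port, which changes on every request
      key = from_host " -> " to_host " " ip
      if (!(key in seen)) { seen[key] = 1; print key }
    }
  ' "$@"
}
```

Usage: `lim_ip_changes lim.log.*` (or pipe `tail` output into it). If only one pair is printed, all of the messages concern the same address.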
You could identify where the request is coming from by setting the debug log for jfd to level 10. You can do this by running:
jreconfigdebug -l 10
Wait until the next failure (it looks like up to 5 minutes), then set the level back to 5:
jreconfigdebug -l 5
Then find the failure in the jfd log to get the thread ID (in this case 18093):
$ tail -500 jfd.log.* | grep 'eauth.*failed'
...
2021 Oct 05 09:13:32 17925 18093 3 JFEauthManager::verifyEauth: eauth len=9 failed; rc=0.
And finally grep for that thread ID and "uData" to identify the source IP:
$ tail -500 jfd.log.* | grep '18093.*uData'
2021 Oct 05 09:13:30 17925 18093 10 JFEauthManager::verifyEauth: uData = 2147483647 2147483647 lsfadmin 10.1.2.3 62693 9 NULL NULL NULL NULL
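The two greps above can be combined into a single pass. This is only a sketch, not an official tool. It assumes the jfd log layout shown in this thread: the thread ID is the sixth whitespace-separated field, and in the level-10 `uData` line the user name and source IP are the third and fourth values after `uData =`. It also assumes the `uData` line for a thread is logged before that thread's failure line, as in the sample above. The name `eauth_sources` is made up for illustration.

```shell
# Hypothetical helper: for each failed eauth check, print
#   <date> <time> <thread id> <user> <source ip>
eauth_sources() {
  awk '
    # Level-10 line: remember the most recent user and IP seen on this thread.
    $9 == "uData" && $10 == "=" { seen[$6] = $13 " " $14; next }
    # Failure line: report it together with what the same thread logged earlier.
    $9 == "eauth" && $11 == "failed;" {
      src = ($6 in seen) ? seen[$6] : "unknown unknown"
      print $1, $2, $3, $4, $6, src
    }
  ' "$@"
}
```

Usage: `tail -500 jfd.log.* | eauth_sources`. For the two sample lines above it prints `2021 Oct 05 09:13:32 18093 lsfadmin 10.1.2.3`. Failures logged while the debug level was still 5 have no `uData` line and show up as `unknown unknown`.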
Thanks for this.
I have run the debug steps and get the output below.
If I do
tail -500 jfd.log.* | grep eauth.*failed
2021 Oct 06 08:58:43 32000 32094 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 06 08:58:43 32000 32094 7 JFGenericDaemon::startup:Authentication failed for user [llm720]
2021 Oct 06 08:58:44 32000 32097 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 06 08:58:44 32000 32097 7 JFGenericDaemon::startup:Authentication failed for user [llm720]
2021 Oct 06 08:58:44 32000 32101 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 06 08:58:44 32000 32101 7 JFGenericDaemon::startup:Authentication failed for user [llm720]
2021 Oct 06 08:58:44 32000 32095 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 06 08:58:44 32000 32095 7 JFGenericDaemon::startup:Authentication failed for user [llm720]
2021 Oct 06 08:58:44 32000 32096 3 JFEauthManager::verifyEauth: eauth len=22 failed; rc=0.
2021 Oct 06 08:58:44 32000 32096 7 JFGenericDaemon::startup:Authentication failed for user [llm720]
If I do
tail -500 jfd.log.* | grep 32097.*uData
2021 Oct 06 08:58:44 32000 32097 10 JFEauthManager::verifyEauth: uData = 2147483647 2147483647 llm720 11.90.188.112 24265 22 NULL NULL NULL NULL
What does this mean?
Thanks.
Is there a way to fix these errors? Every user is getting the same error. The IP looks like their local PC IP (the machine they are connecting from), and they can connect to Flow Manager fine.
The users are in the batch user group (in the lsb.users file), so they can control their flows.
We configured PAM correctly, and users are able to log in to SAS or to Flow Manager. If we disable a user in AD (we disabled a few users in AD without tidying them up in SAS metadata or lsb.users), do we need to restart any PM services for that to take effect (i.e., so that we no longer see this error in the jfd log)?
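To check which users are actually affected, the `Authentication failed for user [...]` lines shown earlier can be tallied per user. A minimal sketch, assuming that message format; `auth_failures_by_user` is a made-up name, not a Process Manager command:

```shell
# Hypothetical helper: count "Authentication failed for user [name]" lines per user,
# most frequent first.
auth_failures_by_user() {
  awk '
    match($0, /Authentication failed for user \[[^]]*\]/) {
      s = substr($0, RSTART, RLENGTH)
      sub(/^.*\[/, "", s)   # strip everything up to the opening bracket
      sub(/\]$/, "", s)     # strip the closing bracket
      count[s]++
    }
    END { for (u in count) print count[u], u }
  ' "$@" | sort -k1,1nr -k2
}
```

Usage: `tail -500 jfd.log.* | auth_failures_by_user`. These messages are logged at level 7 in the output above, so they may only appear while the debug level is raised.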
Found the root cause.
A user forgot to close/log out of Flow Manager (the user is no longer a SAS user but is still with the business).
Thanks for your help.