It becomes hell when you have to work all of this out yourself, trying to get these processes aligned with the security policies that come from regulations and with the in-house IT policies. Those machines should be operable by ordinary support staff, not by Ronan, Jaap or any other platform admin personally. The big problem is that SAS violates those guidelines and does things differently. The SAS documentation can tell you a lot, but when it is wrong, it is wrong.

You didn't follow my advice to research how the status check is implemented in those scripts. I had problems getting a reliable weekly restart of the services with the approach SAS advises, so I debugged and analyzed those scripts. SAS TS was no help; they did not understand the issues, or were not willing to.

What I found:
- The metadata server is started as a detached background process. The object spawner is started after a fixed delay (a sleep), without really checking that the metadata server is available. That was problem 1: the object spawner could start but fail to retrieve the metadata it needs. The problem could only be recognized by noticing (comparing) missing content. It can be checked in a fully trustworthy way with a PROC METAOPERATE that does a real login. The start and status check as coded is only a check on the pid number. If you are convinced that pinging a server proves the server runs, then there is a long way to go in learning.
- The same issue was seen recently with the connect spawner, which has also been changed to retrieve information from metadata. It failed and customers complained.
- With Eminer an RMI service is needed that is hard to recognize. Customers complained they could not log in. By validating the test URL as documented, I then found the service had started in a corrupted state. Note that it is not in the servers.sas script.

While reviewing those scripts I then saw another coding problem. For that one you need some background in the C language.
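For the start-up ordering problem in the first point above (spawner started after a fixed sleep), here is a minimal sketch of the alternative: poll a real check until it succeeds or a timeout expires, and only then start the next service. Everything named here is an assumption for illustration and none of it comes from the SAS scripts: `wait_until`, `metadata_login_ok`, the `METADATA_OK_FLAG` marker file and `start_object_spawner` are hypothetical. The placeholder check should be replaced by something that does a real login, such as a small job running PROC METAOPERATE.

```shell
#!/bin/sh
# Hypothetical helper: retry a check command until it succeeds
# or the timeout expires.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    # "$@" runs the check command with its arguments kept intact.
    while ! "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1            # gave up: check never succeeded
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Placeholder check (ASSUMPTION): succeeds when a marker file exists.
# Replace this with a real login test against the metadata server,
# not a pid-number check and not a ping.
metadata_login_ok() {
    [ -f "${METADATA_OK_FLAG:-/tmp/metadata_login_ok}" ]
}

# Usage sketch, commented out because start_object_spawner is a
# hypothetical site-specific command:
#
# if wait_until 300 10 metadata_login_ok; then
#     start_object_spawner
# else
#     echo "metadata server not usable after 300s" >&2
#     exit 1
# fi
```

The point of the design is that the dependent service starts on evidence (a successful check), not on elapsed time, and that a failure is reported instead of silently producing a spawner without metadata.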
The coding issue is in how a script hands its arguments on to another script. Coded unquoted, as $* or as $1 $2 $3, the string is first evaluated again and then composed into an array of strings (argc/argv in C terms), with the effect that the original arguments get broken. Only a quoted "$@" hands them over unchanged. The difference does not look that big, but as soon as you have quoted arguments containing spaces you get into trouble. If you have a batch script (batch server) with logging configured, try to run SAS code with spaces in its name. It will fail with all kinds of messages, the first troublesome one being the wrong default directory setting (a SAS note can be given).

Some more issues?
- sasauth did not check the normal Unix root conventions (users locked after x attempts). That has been changed.
- Having tested that account locking (it works now), I found SAS added undocumented actions on top of it: a lockout of 5 minutes, while at the same moment you can still log in with PuTTY. There is no need to put something on top of that. The Unix policies are set and accepted under the business policies.
- The restart of the services (the complete system) was scheduled because every wrong login added 5 additional seconds for all users (somewhere internally). It could happen that the metadata server did not react for some 1 or 2 hours and then resumed processing.

Where on earth did you learn that SAS Institute delivers failure-free software and that you can trust them on everything? Please get critical, do the research and the validation of what is actually there, and then react accordingly. What I mentioned are experiences from solving issues; that goes beyond what is documented and expected.
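To illustrate the argument-passing issue described above, a small self-contained sketch in POSIX sh. The function names are made up for the demonstration and are not taken from the SAS scripts; `count_args` stands in for the called script and simply reports how many arguments arrived.

```shell
#!/bin/sh
# Stand-in for the called script: prints the number of arguments received.
count_args() {
    echo "$#"
}

# Unquoted $*: the shell splits the values again on whitespace.
forward_unquoted() {
    count_args $*
}

# Unquoted $1 $2 $3: same splitting, and anything beyond $3 is dropped.
forward_positional() {
    count_args $1 $2 $3
}

# Quoted "$@": each original argument is passed on intact.
forward_quoted() {
    count_args "$@"
}

forward_unquoted   "my program.sas" -log "/tmp/my logs"   # prints 5
forward_positional "my program.sas" -log "/tmp/my logs"   # prints 5
forward_quoted     "my program.sas" -log "/tmp/my logs"   # prints 3
```

Three arguments go in; only the quoted "$@" form delivers three. The other two forms deliver five fragments, which is how a program name with a space in it ends up as a wrong path or directory further down the chain.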