03-10-2018 06:12 AM
I'm just doing an upgrade of 9.4 M2 to 9.4 M5 on AIX.
In several steps, the SDW throws an error. Examining the log reveals a problem of the type
Caused by: java.io.FileNotFoundException: /sas/SASFoundation/9.4/sasexe/uwfcptwd (Cannot open or remove a file containing a running program.)
Hitting "Retry" makes the step run a little farther, ending up at the same error with a different file. Hitting Retry often enough eventually lets the step complete successfully.
This of course turns a simple upgrade that should be done in less than an hour into a multi-day, if not multi-week, project.
Has anybody else ever experienced similar problems?
A track has been opened with the German SAS TS.
03-10-2018 08:34 AM
Are there SAS processes still running that are using that library / those libraries, stopping them from being updated? Maybe the Metadata Server, or random leftover user sessions or scheduled jobs?
I'd start with lsof or fuser & take it from there.
03-10-2018 09:10 AM
I made sure that all SAS processes were stopped. lsof or fuser wouldn't help me, as the files the SDW is complaining about vanish anyway.
Right now I'm successfully through the installation, but I must have hit Retry a thousand times. While the installation was running, I did a
/usr/linux/bin/find . ! -writable
on the SASFoundation/sasexe directory, and found that the SDW seems to change permissions there quite often. Maybe it then stumbles across itself in certain situations, which then fix themselves.
I'm quite eager to see what SAS TS comes up with.
03-10-2018 09:34 AM
What filesystem are those binaries on? Is it GPFS?
For what it's worth, we've seen some weird behaviour with Java and file locks being held when they logically shouldn't be (this was ages ago, on AIX 6.1). We got round it by handling that exception you're seeing & adding a delay + retry to the operation. It's far from an elegant solution, and I'm not sure it helps you or anyone else, but it fixed the problem for us (and quickly). It might be that it's the same problem exacerbating a race condition already present in the SDW code.
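For illustration, the delay + retry idea could look roughly like the sketch below. This is neither the SDW's code nor our original fix. The class name RetryWithDelay, the method retry, and its parameters are all made up for the example, and it assumes the failing file operation can be wrapped in a Callable.

```java
import java.io.FileNotFoundException;
import java.util.concurrent.Callable;

// Hypothetical helper: retries an operation that fails with
// FileNotFoundException, pausing between attempts.
final class RetryWithDelay {

    private RetryWithDelay() {
        // static helper only
    }

    /**
     * Runs op. If it throws FileNotFoundException, waits delayMillis and
     * tries again, up to maxAttempts attempts in total. Any other
     * exception is passed straight through to the caller.
     */
    static <T> T retry(Callable<T> op, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        FileNotFoundException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (FileNotFoundException e) {
                last = e;
                // Only sleep if another attempt is still to come.
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        // All attempts failed: rethrow the most recent failure.
        throw last;
    }
}
```

A call would wrap the file operation, e.g. RetryWithDelay.retry(() -> new java.io.FileOutputStream(path), 5, 2000L), where path is whichever file the step keeps failing on. The attempt count and delay are guesses that would need tuning.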
Keep us posted on what TS say.
03-10-2018 09:40 AM
The filesystem is a very standard JFS2.
I would not be surprised at all if some Java peculiarity that was not properly taken care of in the SDW code is behind all that.