09-18-2013 05:53 AM
I have created a parallel process in SAS Data Integration Studio 4.5 with 4 parallel processes. In each process (SAS session) I have a PROC IML step that reads from and writes to its own SAS catalog with the LOAD and STORE statements.
The example of my SAS code in each process is the following:
reset storage=intres.sim_losses_cat_&IDParallelProcess.; /* set the storage catalog IDParallelProcess can be 1,2,3 or 4 */
/* load Solvency Indices */
When I run more than 4 of these processes (5, 6, 7, and so on), I get the following error:
ERROR: Unable to open storage library = INTRES.SIM_LOSSES_CAT_1.
ERROR: A lock is not available for INTRES.SIM_LOSSES_CAT_1.CATALOG.
ERROR: Lock held by process 24183222.
Could you help me to resolve this issue?
09-27-2013 09:37 AM
You need to close the storage library in the first process before you try to access it from a second process. What you are doing is not "thread safe": you can't have multiple processes writing to the same file (Sim_Losses) potentially at the same time.
A better approach would be to key the catalog name on the actual process number rather than IDParallelProcess. That said, the immediate cause of the error is that you left the storage library open.
Close the storage library after you read the Solvency matrices by using RESET STORAGE again. Each time you specify the STORAGE= option, the previously opened catalog is closed before the new one is opened, so you can restore the default storage library with RESET STORAGE=WORK.IMLSTOR;
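A minimal sketch of that open-read-release pattern (the catalog and matrix names are placeholders based on this thread; SolvencyIndices is a hypothetical matrix name):

proc iml;
   /* open the per-process catalog and read what is needed */
   reset storage=intres.sim_losses_cat_&IDParallelProcess.;
   load SolvencyIndices;

   /* switch back to the default storage catalog; specifying
      STORAGE= again closes the INTRES catalog and releases its lock */
   reset storage=work.imlstor;

   /* ... continue computing; INTRES is no longer locked ... */
quit;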
09-11-2014 12:09 PM
My question is closely related to this topic:
Is there a (safe) way to use stored modules from one storage catalog in multiple processes? It is very inconvenient to define all the needed modules in each new process, or to copy the storage catalog to a new file before starting a new process. From my understanding, a read-only (module) library should be globally available, so you would only need to update the library and the new version of a module would be available to all processes.
If there is no such way, will a future release provide something like a "global module library"?
09-11-2014 01:32 PM
Have you tried it? What error are you getting?
I've never done what you are attempting, but if you are running into problems, perhaps you can use the SLEEP function to avoid whatever error you are encountering. The i_th process could sleep 5*i seconds to allow the previous process time to read the module library.
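A sketch of that staggered start, assuming a macro variable &i holds the index (1, 2, 3, ...) of the current parallel process:

/* wait 5*i seconds before touching the shared module library;
   the second SLEEP argument (1) sets the unit to seconds */
data _null_;
   slept = sleep(5 * &i, 1);
run;

proc iml;
   reset storage=mylib.mystore;   /* now open the shared catalog */
   load module=_all_;
   /* ... */
quit;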
09-12-2014 04:24 AM
Yes, I have tried it, but unfortunately I no longer have the failing code. It is not easy to reproduce, because the different processes have to access the library at the same time. In my case, I have a macro that starts a number of processes, each executing IML code with many modules (at least 20). The error was something like "Cannot load library, there is a lock on mylib.mystore from task X".
As far as I understand it, IML loads a module the first time it is called. Since the modules are all in the same library, I do not know exactly when each task accesses the library. Furthermore, when one task finishes I start a new one, so it is also unknown when the new process will need the module library.
But I think it should work if I load all modules (LOAD MODULE=_ALL_) only once, right after starting a new process, and then wait a bit before starting any other process. Maybe I'll implement this when I update my parallel-processing macro (as I wrote, the current version simply redefines all modules from the "original" module source code in a local storage catalog). I could still get into trouble if there are many other (non-IML) steps before IML needs the modules, because then I cannot predict when the access happens. In that case, I would still need to copy the original library into the WORK library, for example. Maybe something like a read-only storage catalog that is accessible by multiple processes at the same time could be an improvement for IML (in case it doesn't already exist ;-) )
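The load-everything-up-front idea can be sketched like this (mylib.mystore is the shared catalog name used in this thread):

proc iml;
   reset storage=mylib.mystore;   /* open the shared module library  */
   load module=_all_;             /* pull every module into memory   */
   reset storage=work.imlstor;    /* switch to the default catalog,
                                     which releases mylib.mystore    */
   /* the modules stay defined in this session, and the shared
      catalog is now free for the next process to open */
quit;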
Thanks for your help :-)
09-11-2014 01:56 PM
Running parallel processes also requires thinking about locks and possible deadlocks.
This theory is well known from DBMS systems, where multiple users may update the same data, and from multi-user operating systems designed to support many users and processes. On a mainframe running z/OS, locking/enqueueing is part of the culture.
A catalog is a dedicated structure in SAS. Many users can read catalog entries at the same time, but when one process wants to update the catalog, it must be the only process accessing it; nobody else is allowed to read it at that moment.
This locking has become more standardized in recent releases of SAS on the Unix operating systems, and I believe on Windows as well; it was a structural shortcoming that it was not there before. See: SAS(R) 9.2 Companion for UNIX Environments; SAS(R) 9.2 Companion for Windows, Second Edition; SAS Note 43381 - File locking and access problems occur when saving an environment in a PROC RISK step.
Knowing this, you can try to find a way around it:
- Use unique names for the first level of the catalog name, that is, the library name part. The sas7bcat files will then be unique and cause no locking issues or possible corruption. You could use a PID number from one of the SYS automatic macro variables.
- Use SAS/SHARE, possibly with RLS (Remote Library Services), to offer a shared version of the library where the locking is handled by that service. You should run the SAS/SHARE service on exactly the same SAS version and OS type to get native file support; that way you can use RLS. The disadvantage is the communication overhead of SAS/SHARE, which is not really suited to big data.
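The first workaround (unique names per process) might look like this; the library path is a placeholder, and &SYSJOBID is the automatic macro variable holding the process ID of the current SAS session:

%let pid = &SYSJOBID;                        /* unique per SAS session  */
libname mystore "/shared/iml/storage";       /* hypothetical path       */

proc iml;
   /* each process gets its own catalog, so no lock contention */
   reset storage=mystore.sim_losses_&pid.;
   store SimLosses;                          /* hypothetical matrix     */
quit;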