Long delay in starting a grid launched workspace server

Super Contributor
Posts: 408

Hi fellow admins,

 

We are in the process of rolling out multiple grid environments for a large population of data scientists. We use LSF for grid management. One of the key components is the grid-launched workspace server, and I am struggling to bring down the time it takes to start one. It currently takes at least 20 seconds. This is a big stumbling block for user acceptance, and I understand why. In DI Studio, workspace servers are started all over the place. In EG, one experiences an agonizing half minute of hourglass watching.

 

I have already tweaked a few parameters in lsb.params according to a blog post from Edoardo Riva but I am now out of ideas. That's why I turn to you.

 

Many thanks in advance,

- Jan.

 

lsb.params:

Begin Parameters
MAX_JOB_NUM=10000
NEWJOB_REFRESH=Y
DEFAULT_QUEUE=normal
ABS_RUNLIMIT=Y
MIN_SWITCH_PERIOD=3600
JOB_SCHEDULING_INTERVAL=1
JOB_ACCEPT_INTERVAL=1
JOB_DEP_LAST_SUB=1
ENABLE_EVENT_STREAM=n
MAX_CONCURRENT_QUERY=100
ENABLE_HOST_INTERSECTION=Y
MBD_REFRESH_TIME=10
#MBD_SLEEP_TIME=10
MBD_SLEEP_TIME=1
#SBD_SLEEP_TIME=5
SBD_SLEEP_TIME=1
End Parameters
PROC Star
Posts: 392

Re: Long delay in starting a grid launched workspace server

Hi Jan,

 

Which versions of SAS & LSF are you using and on which platform? When you look through the logs can you see where most of the delay occurs?

 

Have you seen SAS Problem Note 57577, "You encounter delays when you start grid-launched workspace servers or when ..."? Does it apply to your situation?

 

Cheers

Paul

Super Contributor
Posts: 408

Re: Long delay in starting a grid launched workspace server

Hi Paul,

 

This is SAS 9.4M3 and LSF 9.1.3.

 

I have seen the note. It does not apply:

 

Job <1154>, Job Name <SAS Enterprise Guide_SASApp - Workspace Server node 01_F7
                     52E162-0AE0-0345-842F-EA85270DCC20>, User <klavj10>, Proje
                     ct <default>, Command </srv/SASConfig/Lev1/SASApp/Workspac
                     eServer/WorkspaceServer.sh -noterminal -netencryptalgorith
                     m AES -encryptfips -metaserver osasigmdl03.ont.belastingdi
                     enst.nl -metaport 8561 -metarepository Foundation -locale
                     en_US -objectserver -objectserverparms "delayconn sph=osas
                     igndl01.ont.belastingdienst.nl protocol=bridge spawned spp
                     =36720 cid=18 pb classfactory=440196D4-90F0-11D0-9F41-00A0
                     24BB830C server=OMSOBJ:SERVERCOMPONENT/A52BHKER.AY00000Q c
                     el=everything lb grid" -METAUSER '"klavj10@!*(generatedpas
                     sworddomain)*!"' -METAPASS 7720093ab3A185107f65931940859c7
                     1 >
Tue Apr 12 11:30:41: Submitted from host <osasigndl01.ont.belastingdienst.nl>,
                     to Queue <eguide>, CWD <$HOME>, Specified Hosts <osasigcll
                     01.ont.belastingdienst.nl>, <osasigndl01.ont.belastingdien
                     st.nl>;
Tue Apr 12 11:30:41: Dispatched 1 Task(s) on Host(s) <osasigndl01.ont.belasting
                     dienst.nl>, Allocated 1 Slot(s) on Host(s) <osasigndl01.on
                     t.belastingdienst.nl>, Effective RES_REQ <select[type == a
                     ny] order[r15s:pg] >;
Tue Apr 12 11:30:41: Starting (Pid 4890);
Tue Apr 12 11:30:42: Running with execution home </home/ONT/klavj10>, Execution
                      CWD </home/ONT/klavj10>, Execution Pid <4890>;

This shows what the note calls a "healthy grid", with a one-second delay. I will continue investigating the log files to see where the delay happens. We have an additional app server for SASEM that is not grid launched. There we see approx. 5 seconds. So that's what we're aiming at, minus of course some overhead.
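
For anyone who wants to measure the LSF part of the delay rather than read it off by eye, here is a minimal Python sketch that pulls the event timestamps out of job output like the above. The function name `event_times` and the `year` argument are my own assumptions for illustration: the output carries no year, so a placeholder is used, and only differences between timestamps are meaningful.

```python
import re
from datetime import datetime

# An event line starts with a weekday, then "Mon DD HH:MM:SS:", then the
# event text, e.g. "Tue Apr 12 11:30:41: Submitted from host ...".
# Wrapped continuation lines start with spaces and are skipped.
EVENT = re.compile(r"^\w{3} (\w{3}) +(\d+) (\d{2}:\d{2}:\d{2}): (\w+)")

def event_times(text, year=2000):
    """Map the first word of each event (Submitted, Running, ...) to a datetime.

    `year` is a placeholder (assumption): the job output does not include one.
    """
    events = {}
    for line in text.splitlines():
        m = EVENT.match(line)
        if m:
            month, day, clock, name = m.groups()
            stamp = datetime.strptime(f"{year} {month} {day} {clock}",
                                      "%Y %b %d %H:%M:%S")
            events.setdefault(name, stamp)  # keep the first occurrence
    return events

# Event lines copied from the job output above (continuation lines trimmed).
SAMPLE = """\
Tue Apr 12 11:30:41: Submitted from host <osasigndl01.ont.belastingdienst.nl>,
                     to Queue <eguide>, CWD <$HOME>, Specified Hosts <osasigcll
Tue Apr 12 11:30:41: Dispatched 1 Task(s) on Host(s) <osasigndl01.ont.belasting
Tue Apr 12 11:30:41: Starting (Pid 4890);
Tue Apr 12 11:30:42: Running with execution home </home/ONT/klavj10>, Execution
"""

times = event_times(SAMPLE)
delay = (times["Running"] - times["Submitted"]).total_seconds()
print(f"Submitted -> Running: {delay:.0f} s")
```

For the job above this reports one second between submission and running, so whatever takes the remaining time is not visible in these LSF events.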

 

Cheers Jan.
