Network "Hiccup" Prevents File Write
Super Contributor
Posts: 291

Network "Hiccup" Prevents File Write

Folks,

I have a number of scheduled programs (SAS 9.4, 64-bit, Win 7) that write to SharePoint areas over the company network. All too often I get a "Physical file does not exist ..." message, and then the program crashes. Waiting a little and/or reconnecting usually works, but that is manual work after a failure has already occurred.

Is there a way to test the network, collect the result, and then either proceed when it is ready or pause until it is, before attempting to write a file?

 

Thanks,
Bill

Grand Advisor
Posts: 10,251

Re: Network "Hiccup" Prevents File Write

Since your issue appears to be intermittent (it only fails sometimes), I'm not sure any "pretest" would help, as the test could succeed and the write could still fail later.
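For reference, a pretest of the kind being asked about could look roughly like the sketch below. It assumes a POSIX shell is available on the machine that launches the job (on Windows that would mean a Unix-like layer), and `can_write` is a made-up helper name, not anything SAS provides. It only shows that the target directory accepted a small file at the moment of the test, which is exactly the limitation described above.

```shell
#!/bin/sh
# can_write DIR
# Hypothetical helper: returns 0 if a small probe file can be created in DIR
# and removed again, 1 otherwise. A success says nothing about later writes.
can_write() {
    probe="$1/.write_probe.$$"
    # The subshell keeps a failed redirection from aborting the calling script.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}
```

A scheduler wrapper could call `can_write` on the target directory just before starting the job and record the result, but the real write still needs its own error handling.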

 

I would say this is more of an IT / network infrastructure issue and may need to be brought to them.

One very faint idea, though: is there any pattern, such as a backup or data transfer job that someone else runs at the times your jobs fail? If so, try scheduling around it.

 

Once upon a time I had the job that interfered with others and I worked with folks to reschedule to minimize conflicts.

Super Contributor
Posts: 291

Re: Network "Hiccup" Prevents File Write

Have to agree that a pretest might not do much for me. It might only be useful to let IT know how many hiccups there are. Just now, I'm hearing "they happen, code around it". Huh?

Grand Advisor
Posts: 17,461

Re: Network "Hiccup" Prevents File Write

You could write a loop within your process. If it executes successfully it's done, otherwise it keeps trying every 30 mins.
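A minimal sketch of such a loop, assuming the job is launched from a POSIX shell and signals failure through a non-zero exit status. The function name `retry` and its arguments are made up for illustration; the attempt limit keeps it from looping forever.

```shell
#!/bin/sh
# retry MAX_TRIES WAIT_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits with status 0 or until
# MAX_TRIES attempts have been made, sleeping WAIT_SECONDS between attempts.
# Returns 0 on success, otherwise the exit status of the last attempt.
retry() {
    max_tries=$1
    wait_seconds=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        else
            status=$?
        fi
        if [ "$attempt" -ge "$max_tries" ]; then
            return "$status"
        fi
        sleep "$wait_seconds"
        attempt=$((attempt + 1))
    done
}
```

For example, `retry 3 1800 ./run_job.sh` would try a wrapper script (here the hypothetical `run_job.sh`) up to three times, 30 minutes apart.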

Grand Advisor
Posts: 10,251

Re: Network "Hiccup" Prevents File Write

If you can identify some timestamps to provide to your IT folks they may be able to identify a bottleneck or conflict.
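One way to collect those timestamps, sketched under the assumption that a shell wrapper notices each failure. `log_hiccup`, the log file and the tab-separated layout are all made up for illustration.

```shell
#!/bin/sh
# log_hiccup LOGFILE JOBNAME MESSAGE
# Hypothetical helper: appends one tab-separated line (local timestamp,
# job name, message) to LOGFILE so failures can be handed to IT later.
log_hiccup() {
    printf '%s\t%s\t%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$2" "$3" >> "$1"
}
```

Called once per failed attempt, this builds a plain text file that can be sorted or counted by hour to look for a pattern.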

Esteemed Advisor
Posts: 6,706

Re: Network "Hiccup" Prevents File Write

I'm with @Reeza here. I'd write a shell script that runs the SAS program and checks the return code. If RC != 0, it checks the log for an indicator that it IS the network problem; if yes, it waits a certain time and retries, and if not, it exits with the return code. I'd also implement a counter so that only a limited number of retries is done.

We had a similar problem where a mainframe job would end OK, but the file created by that job was not yet ready for FTP access (some arcane peculiarities of the MF catalog). Since this led to a very distinctive FTP response, we could check for that and rerun the job.

It looked like this:

# SASEXE, PROGPATH, LOGPATH, JOBNAME and DATE are assumed to be set earlier in the script
ERRCOUNT=0
MAXREPEAT=2

while [[ $ERRCOUNT -lt $MAXREPEAT ]]
do

  $SASEXE -config ..... -autoexec ..... ${PROGPATH}/${JOBNAME}.sas -log ${LOGPATH}/${JOBNAME}.log.${DATE}

  export RC=$?

  if [[ $RC -eq 0 ]]
  then
    if grep ERROR < ${LOGPATH}/${JOBNAME}.log.${DATE}
    then
      # ERROR found in the log although SAS returned 0: fail without retrying
      export RC=5
      let "ERRCOUNT = MAXREPEAT"
    elif grep -E "aborted|<<< 45|<<< 55|<<< 53|<<< 42|Invalid Reply received" < ${LOGPATH}/${JOBNAME}.log.${DATE}
    then
      # FTP interrupted: count the attempt and leave a marker file
      let "ERRCOUNT = ERRCOUNT + 1"
      touch ${LOGPATH}/${JOBNAME}.err6.${DATE}.${ERRCOUNT}
      if [[ $ERRCOUNT -ge $MAXREPEAT ]]
      then
        # retries exhausted
        export RC=6
      else
        # wait 30 minutes before the next attempt
        sleep 1800
      fi
    else
      # everything OK: leave the loop
      let "ERRCOUNT = MAXREPEAT"
    fi
  else
    # SAS set RC ne 0: leave the loop and pass the return code on
    let "ERRCOUNT = MAXREPEAT"
  fi
done

exit $RC