misul
Fluorite | Level 6

Hello,

I am experimenting with the sas-airflow-provider (https://github.com/sassoftware/sas-airflow-provider). I am trying to use the SASJobExecutionOperator to launch SAS Job Definition objects from Airflow. Note: I am using SAS Viya 3.5, not 4.0!

It seems that any SAS job that runs for more than 10 minutes produces this error on the Airflow side (the SAS job itself runs as long as needed without any issues):

requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer')).

Most likely, it is related to a WAF timeout: the connection is terminated after 10 minutes of inactivity. Does the SASJobExecutionOperator or SAS Hook have a TCP keep-alive option, or can I pass this option somehow?

If I understand correctly, the SAS Hook under the hood uses Python's requests.Session, which implicitly supports keep-alive for HTTP connections, or is that not the case?

Mindaugas

4 REPLIES
gwootton
SAS Super FREQ
It looks like requests.Session supports HTTP keep-alive, meaning reuse of HTTP connections across requests. What you're looking for is different: sending keep-alive packets over the TCP session to prevent the 10-minute idle timeout.
Besides modifying the hook to make use of the socket keep-alive options in Python, you could try lowering the sysctl net.ipv4.tcp_keepalive_time on the host running the Python client from its default of 7200 seconds to something below 10 minutes (e.g. 540, which is 9 minutes). Note that this sysctl only affects sockets that have SO_KEEPALIVE enabled.
--
Greg Wootton | Principal Systems Technical Support Engineer
misul
Fluorite | Level 6

Thank you. I will try adding some TCP keep-alive logic.
Something similar was discussed here:
https://blog.panagiks.com/2019/05/python-tcp-keepalive-on-http-request.html
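
For illustration, here is a minimal sketch of what such keep-alive logic could look like with requests. The helper names tcp_keepalive_options and mount_keepalive are made up for this example, the 540/60/5 values are only a guess aimed at a 10-minute idle timeout, and how the SAS Hook exposes its requests.Session (if at all) is an assumption that would need checking against the provider source.

```python
import socket


def tcp_keepalive_options(idle=540, interval=60, count=5):
    """Build (level, option, value) tuples that enable TCP keep-alive probes.

    idle: seconds of silence before the first probe (kept below 10 minutes).
    interval: seconds between probes. count: failed probes before giving up.
    """
    opts = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
    # The TCP_KEEP* constants are platform dependent (present on Linux),
    # so only add the ones this Python build actually exposes.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        if hasattr(socket, name):
            opts.append((socket.IPPROTO_TCP, getattr(socket, name), value))
    return opts


def mount_keepalive(session, **kwargs):
    """Mount an adapter on a requests.Session so new connections send probes.

    Hypothetical helper: requests/urllib3 are imported lazily so the options
    builder above stays usable without them.
    """
    from requests.adapters import HTTPAdapter
    from urllib3.connection import HTTPConnection

    options = HTTPConnection.default_socket_options + tcp_keepalive_options(**kwargs)

    class KeepAliveAdapter(HTTPAdapter):
        def init_poolmanager(self, *args, **pool_kwargs):
            # urllib3 applies these options to every socket it opens.
            pool_kwargs["socket_options"] = options
            super().init_poolmanager(*args, **pool_kwargs)

    adapter = KeepAliveAdapter()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session
```

Usage would be something like mount_keepalive(requests.Session(), idle=540) before the long-running request is sent. Connections made through a proxy go through a different pool manager and are not covered by this sketch.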


gwootton
SAS Super FREQ
Consider creating an issue on GitHub (and a pull request if you successfully incorporate this new feature) as well.
--
Greg Wootton | Principal Systems Technical Support Engineer
misul
Fluorite | Level 6

It seems that TCP keep-alive logic may not be sufficient in production environments. Most likely, there will be a WAF and load balancers between the client and the server. TCP keep-alive only covers the client-to-WAF connection, and the load balancers behind it may still impose their own restrictions. A better approach would be to start the job in the background without waiting for a response, then separately check the status and retrieve the results, ideally within the same Airflow task.
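
A rough sketch of that submit-then-poll idea, written against plain callables rather than any real SAS or Airflow API. The function submit_and_poll, the submit/get_state callables, and the state names in TERMINAL_STATES are all hypothetical placeholders; the actual endpoints and state values would have to come from the SAS Viya documentation.

```python
import time

# Hypothetical terminal state names; the real service may use others.
TERMINAL_STATES = {"completed", "failed", "canceled"}


def submit_and_poll(submit, get_state, poll_interval=30.0, timeout=4 * 3600,
                    sleep=time.sleep, clock=time.monotonic):
    """Start a job, then poll its state with short, independent requests.

    submit: callable that starts the job and returns a job id immediately.
    get_state: callable taking the job id and returning a state string.
    No single HTTP request stays open for the duration of the job, so
    idle timeouts on a WAF or load balancer are never reached.
    """
    job_id = submit()
    deadline = clock() + timeout
    while True:
        state = get_state(job_id)
        if state in TERMINAL_STATES:
            return job_id, state
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still '{state}' after {timeout} s")
        sleep(poll_interval)
```

Inside an Airflow task, submit and get_state would wrap the two short HTTP calls, and the returned state would decide whether the task succeeds or raises.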
