Shakti_Sourav
Quartz | Level 8

Dear Team,

Some ETL jobs in SAS DI are failing with a communication link failure, according to the log. Other jobs run very slowly: for example, 1 lakh (100,000) records take 10 minutes or more to complete.

We have checked ping and telnet in the SAS environment, as well as the DB connection (SQL Server); all connect successfully. We also checked the I/O wait time, which was 2.1 and dropped to 0. We are still facing the same communication link failure.

Please suggest how to resolve this problem.

I have attached some screenshots for your reference.

 

Shakti_Sourav_1-1708503614829.png

 

Shakti_Sourav_2-1708503658368.png

 

 

Thanks & Regards

Shakti Sourav Mohapatra

 

 

 

 

 

5 REPLIES
mgallardo
Calcite | Level 5
I am not 100% sure, but it looks like the ODBC driver is not properly configured or is not working.
Patrick
Opal | Level 21

This has nothing to do with DIS, and likely not even with SAS, but with the communication between the ODBC driver and SQL Server.

Just use the exact error message to Google possible causes and remedies. Also ensure that the ODBC driver used meets both the SQL Server and SAS requirements.

Shakti_Sourav
Quartz | Level 8
Hello Patrick & Mgallardo,
We have checked the ODBC driver, restarted the DB and SAS servers, and checked the network connection. According to the admin, everything connects successfully. The problem is that the ETL sometimes executes successfully and sometimes shows the link failure.
Please suggest what to do.
Also, is there any code or tool to check whether the root cause is in SAS?

Thanks
Patrick
Opal | Level 21

I asked ChatGPT what the possible root causes of intermittent communication link failures could be; its answer is at the end of this post.

 

One of the usual questions: Did this ever work in the past without issues? And if yes: What has changed?

 

I'd be looking first into the network, concurrency/resource contention and timeouts (like: long running process on the DB).

Looking at the times when such errors occur, and at how many SAS and other processes are connected to the database at those times, could help narrow down the root cause.
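As a rough illustration of the timing check, here is a minimal Python sketch that counts failures per hour of day. The function name and the sample timestamps are made up for illustration; the real timestamps would have to be collected from your own job logs.

```python
from collections import Counter
from datetime import datetime


def failures_per_hour(timestamps):
    """Count failures per hour of day, busiest hour first.

    `timestamps` are strings such as "2024-02-20 02:15:00", copied by hand
    or by script from the logs of the failed jobs.
    """
    hours = Counter(
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in timestamps
    )
    # Sort by descending count, then by hour, so the output is stable.
    return sorted(hours.items(), key=lambda item: (-item[1], item[0]))


# Made-up example values, NOT taken from the logs in this thread.
sample = ["2024-02-20 02:15:00", "2024-02-20 02:40:10", "2024-02-21 14:05:00"]
for hour, count in failures_per_hour(sample):
    print(f"{hour:02d}:00  {count} failure(s)")
```

If the failures cluster in a particular window, that window is the one to compare against the DBA's view of database load and connection counts.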

 

I believe that if all available connections to the DB were consumed, the error would be different (connection refused), but it's still worth checking and potentially increasing the number of allowed connections. You certainly also need to talk to your DBA to find out whether a connection attempt reached the DB or whether the communication was interrupted earlier, and how busy the DB was at the time you ran into the error.

 

Also consider contacting SAS Tech Support directly to get their guidance.

 

Below is the ChatGPT answer:

 

If the communication link failure occurs intermittently, it can be more challenging to pinpoint the exact root cause since it suggests that the issue is not consistently reproducible. However, here are some possible root causes for intermittent communication link failures between a client and a database:

  1. Network Fluctuations: Variations in network conditions such as intermittent packet loss, congestion, or bandwidth fluctuations can cause occasional communication failures.

  2. Transient Hardware Issues: Temporary hardware malfunctions such as a loose cable, overheating components, or failing network equipment may intermittently disrupt communication.

  3. Software Glitches: Transient software issues such as memory leaks, race conditions, or intermittent bugs in the client application or the database management system can cause intermittent communication failures.

  4. Load Balancing or Failover Issues: If load balancing or failover mechanisms are in place, occasional misconfigurations or failures in these systems can lead to intermittent communication issues.

  5. Concurrency Problems: In multi-threaded or concurrent environments, occasional race conditions or synchronization issues may lead to intermittent communication failures.

  6. Resource Contentions: Intermittent contention for resources such as CPU, memory, or disk I/O between the client and the database server can cause intermittent communication failures, especially during periods of high load.

  7. Environmental Factors: External factors such as electromagnetic interference, power fluctuations, or environmental conditions (e.g., temperature, humidity) can intermittently disrupt communication.

  8. Intermittent Software Updates: If software updates or patches are applied periodically, intermittent compatibility issues between different versions of software components may cause occasional communication failures.

  9. Transient Security Issues: Occasionally, security mechanisms such as firewalls, intrusion detection systems, or access control lists may intermittently block or interfere with communication between the client and the database server.

  10. Temporary DNS Resolution Problems: Intermittent DNS resolution issues, such as DNS server outages or transient network latency, can occasionally prevent the client from resolving the database server's hostname.

  11. Intermittent Client-Side Issues: Issues specific to the client environment, such as sporadic software conflicts, background processes consuming resources, or intermittent network connectivity on the client side, can cause intermittent communication failures.

Identifying the root cause of intermittent communication link failures may require thorough monitoring, logging, and diagnostic efforts to capture the issue when it occurs and analyze relevant system and network parameters.
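One way to do that kind of monitoring is a small probe that repeatedly opens a TCP connection to the database host and logs the result with a timestamp. The Python sketch below is only an assumption of how this could look: the function names are hypothetical, the host is a placeholder, and 1433 is the SQL Server default port, which may not be the one used here. It only tests the TCP handshake, so it will not catch a session that drops in the middle of a long transfer.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Try one TCP connection; return (ok, seconds_taken, error_text)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:  # covers refused, timed out, and DNS failures
        return False, time.monotonic() - start, str(exc)


def watch(host, port, interval=30.0, attempts=10):
    """Probe repeatedly and print one timestamped line per attempt."""
    results = []
    for i in range(attempts):
        ok, elapsed, err = probe(host, port)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} ok={ok} connect_seconds={elapsed:.3f} {err}")
        results.append((stamp, ok, elapsed))
        if i < attempts - 1:
            time.sleep(interval)
    return results


# Example call (placeholder host, assumed default port):
# watch("your-sql-server-host", 1433, interval=30.0, attempts=120)
```

Running it from the SAS server while the ETL jobs execute gives a timeline that can be compared with the failure times in the SAS logs.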

 
Kurt_Bremser
Super User

Please ALWAYS post logs as text, using the appropriate window; it saves you the hassle of creating and uploading a screenshot, and makes the log much easier for us to read. On top of that, we can easily make annotations.

 

Now, less than 20 seconds of CPU time took 1026 seconds of real time (a factor of 50!), so there's most probably a network bottleneck involved. During that time, 819175 observations (close to 1 million, or a little over 8 "lakh") were processed, so fewer than 1000 per second. What is your observation size, and which variables do you have?

Do you have the same ratio of CPU time to real time in successful jobs?
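To compare jobs, the same arithmetic can be repeated for each log. This is a minimal Python sketch with a hypothetical helper name; the numbers are the ones quoted above, with CPU time taken as 20 seconds, which is an upper bound, so the real ratio is at least this large.

```python
def job_throughput(cpu_seconds, real_seconds, observations):
    """Summarize how much of a step's elapsed time was not CPU work."""
    if cpu_seconds <= 0 or real_seconds <= 0:
        raise ValueError("times must be positive")
    return {
        # How many times longer the step ran than the CPU was busy.
        "real_to_cpu_ratio": real_seconds / cpu_seconds,
        # Observations processed per second of elapsed time.
        "obs_per_second": observations / real_seconds,
        # Fraction of elapsed time spent waiting (I/O, network, DB).
        "waiting_share": 1 - cpu_seconds / real_seconds,
    }


stats = job_throughput(cpu_seconds=20, real_seconds=1026, observations=819175)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

A successful job with a similar ratio would suggest the slowness is a constant of the environment; a much lower ratio would point to something that differs between the runs.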

 

Also check with your DB admins if they can find something in their server logs.
