Hello,
We have discovered that when returning a large amount of data from our Hadoop cluster (on Linux) to SAS (9.4M5 on AIX), the query runs very slowly if it passes through the Hadoop load balancer (LB) rather than being directed to a specific node in the connection URI. We currently have only two nodes in our cluster. Node 01 returns the data in approximately 13 minutes. Node 02 returns the same data in approximately 18 minutes. The LB returns the same data in approximately 2 *hours*. The result set is 19,932,993 records with 40 columns. It's a basic SELECT statement, and it is the exact same query each time; the only change is the server in the URI, which points either directly to a node or to the LB. Results are similar regardless of the date or time of the run.
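To put those timings in perspective, here is a rough throughput calculation in Python. It is only a back-of-the-envelope sketch: it treats the approximate run times above as exact, and the function name `rows_per_second` is just an illustrative helper, not part of SAS or Hive.

```python
# Rough throughput comparison for the three connection paths.
# Assumption: the approximate run times quoted above are treated as exact.

ROWS = 19_932_993  # records returned by the SELECT

# Approximate elapsed time in minutes for each connection target
RUN_MINUTES = {
    "node01": 13,
    "node02": 18,
    "load_balancer": 120,  # roughly 2 hours
}


def rows_per_second(rows: int, minutes: float) -> float:
    """Average rows transferred per second over the whole run."""
    return rows / (minutes * 60)


if __name__ == "__main__":
    baseline = rows_per_second(ROWS, RUN_MINUTES["node01"])
    for target, minutes in RUN_MINUTES.items():
        rate = rows_per_second(ROWS, minutes)
        slowdown = baseline / rate  # relative to the direct node 01 run
        print(f"{target:>14}: {rate:>10,.0f} rows/s  ({slowdown:.1f}x slower than node01)")
```

That works out to roughly 25,500 rows per second direct to node 01 versus about 2,800 rows per second through the LB, so the LB path is around 9 times slower than the faster node.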
Our network administrator doesn't see anything unusual when he examines the traffic while the query runs, whether directly to either node or through the LB, and he says this LB is a pretty basic setup. We also tested running the query through the LB while he removed one node at a time from the LB pool, hoping this would show whether the LB was having an issue with one particular node. Both of those runs (effectively LB-->01, then LB-->02) had the same slow 2-hour run time.
His current thought/question: "The only thing I can think of now is when running through the LB the source IP is the LB backend IP and not the initiating server. Is there something within the application that may be looking at the source or the host name used to connect to Hive? On Windows servers some applications require SPNs when using alias cnames or other DNS names. Not sure if that is applicable here."
Both the Hadoop cluster and our SAS integration with it are new to my company as of earlier this year, so we're only just discovering this issue. As far as we know, no one accessing the cluster via a non-Hadoop application (mainly Tableau and SAP Analysis for Office) is experiencing this problem - but perhaps they are and they just don't know it yet due to the still-limited use of the nascent Hadoop environment.
Any suggestions as to what the problem might be and/or how to resolve it so we can utilize the load balancer?