Does It Matter Where the Various Components of Your SAS Infrastructure are Installed?
With today’s emphasis on keeping SAS applications available to end users around the clock, customers and IT administrators are asking SAS whether they can use different physical locations for SAS servers, SAS clients, and the data used by SAS applications. We have even been asked whether the servers in a SAS metadata cluster or the nodes in a SAS Grid can be in different physical locations. The answer to these questions is: it may technically function, but geographically distributing components will greatly impact performance. This impact is most commonly seen when the compute servers and data are in separate locations, especially if the SAS applications read sequentially through large volumes of data (hundreds of gigabytes or more).
Let’s review a SAS infrastructure whose performance issues my team was recently asked to resolve:
In the example SAS 9.4 infrastructure illustrated above, the SAS clients are running on a Citrix system in New England and they access SAS servers running in the South. The data used for their analytics resides in a relational database in their Texas data center.
Can this work? Technically, yes, but the distribution has a performance cost. When a SAS user in the Midwest uses a SAS client in New England to review a list of available SAS tables, or the columns in a SAS table, the request travels from the SAS client in New England to the SAS servers in the South, on to the data center in Texas, back to the SAS servers in the South, and back to the SAS client in New England, where the result is finally sent to and displayed on the user’s monitor in the Midwest. The overhead of requests and data traveling these long distances causes slow UI performance.
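To make the cost of that request path concrete, here is a minimal Python sketch that adds up the latency of each hop. The latency figures are purely illustrative assumptions, not measurements from the customer's environment, and the names `HOPS_DISTRIBUTED`, `HOPS_COLOCATED`, and `request_latency_ms` are hypothetical.

```python
# Hypothetical one-way network latencies in milliseconds.
# These are illustrative assumptions, not measured values.
HOPS_DISTRIBUTED = [
    ("user (Midwest) -> SAS client (New England)", 15.0),
    ("SAS client (New England) -> SAS servers (South)", 20.0),
    ("SAS servers (South) -> database (Texas)", 12.0),
]

# Same path with client, servers, and data in one data center.
HOPS_COLOCATED = [
    ("user (Midwest) -> SAS client", 15.0),
    ("SAS client -> SAS servers", 0.5),
    ("SAS servers -> database", 0.5),
]


def request_latency_ms(hops, round_trips=1):
    """Return total network latency for a request that crosses every hop
    out and back, repeated `round_trips` times."""
    one_way = sum(latency for _, latency in hops)
    return 2 * one_way * round_trips


if __name__ == "__main__":
    # A UI action such as listing tables may need many round trips;
    # 25 is an assumed figure chosen only for illustration.
    for name, hops in [("distributed", HOPS_DISTRIBUTED),
                       ("co-located", HOPS_COLOCATED)]:
        total = request_latency_ms(hops, round_trips=25)
        print(f"{name}: {total:.0f} ms of network latency")
```

The point of the sketch is that latency accumulates per hop and per round trip, so a chatty UI operation multiplies whatever distance separates the components. Actual numbers should come from measuring your own network.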
In addition, because all the source data for the SAS servers is in a relational database in Texas, the data must travel across a WAN to the SAS servers every time a SAS job needs to access it or write out permanent results/data. Again, data traveling over a long distance every time it is needed can severely impact performance.
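A rough way to gauge this effect is to estimate how long one sequential pass over the data takes at a given sustained throughput. The Python sketch below does that arithmetic; the function name `transfer_seconds`, the link speeds, and the `efficiency` factor are assumptions chosen for illustration, and the model deliberately ignores latency, protocol overhead, and database processing time.

```python
def transfer_seconds(size_gb, link_mbps, efficiency=1.0):
    """Estimate seconds needed to move `size_gb` gigabytes (decimal GB)
    over a link of `link_mbps` megabits per second.

    `efficiency` is the assumed fraction of the nominal link speed that
    the transfer actually sustains (1.0 means the full link speed).
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    bits = size_gb * 1e9 * 8                   # gigabytes -> bits
    bits_per_second = link_mbps * 1e6 * efficiency
    return bits / bits_per_second


if __name__ == "__main__":
    size_gb = 100  # lower end of the "hundreds of GB" case
    # Hypothetical links: a shared 1 Gbps WAN circuit sustaining half of
    # its nominal speed versus a 10 Gbps local network at full speed.
    wan = transfer_seconds(size_gb, link_mbps=1_000, efficiency=0.5)
    lan = transfer_seconds(size_gb, link_mbps=10_000)
    print(f"WAN: {wan / 60:.1f} min per pass, LAN: {lan / 60:.1f} min per pass")
```

Because a SAS job may read the source data and write permanent results on every run, this per-pass cost is paid repeatedly, which is why the gap between a WAN path and a local path matters so much.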
If the SAS customer wants their SAS applications to perform optimally, they need to have all of their SAS clients, SAS servers (compute, mid-tier and metadata), authentication services, and source data files as closely located as possible. The diagram below illustrates this.
The above placement of SAS infrastructure components applies to SAS 9.4, SAS 9.4 Grid Manager, and SAS Viya, whether on-premises or in a cloud offering (public or private). Each of the SAS infrastructures mentioned has many SAS servers.
Let’s drill down into what it means to keep all the SAS 9.4 infrastructure components close to one another. To optimize the communication and data paths between components, and thereby performance, we need to keep all the SAS 9.4 servers (compute, mid-tier and metadata) physically and geographically close.
Here are two specific examples:
Bottom line: To ensure your SAS applications run as optimally as possible, it is important to keep all SAS clients, SAS servers (compute, mid-tier and metadata), authentication services, and source data files as closely located as possible. If, for some reason, this cannot happen, you must understand that the performance of your SAS applications will be degraded compared to what it would be if all the components were physically and geographically close.