CPU flaws, OS patches, and SAS performance

SAS Employee
Posts: 3

If you have opened your web browser or listened to the news in the past 24 hours, you have probably heard that the sky is falling and it is all the computer's fault :-) (okay, so I exaggerate a little - just call me Chicken Little). If you have read this article in The Register, or others that build on its content, you will have seen fears of up to a 30% performance impact from the OS patches that mitigate the risk of hackers gaining access to sensitive data.  Rest assured that not all applications will see such an impact.

 

Based upon some information from our partners at Red Hat, my current expectation is that SAS jobs will see a 3%-7% performance degradation after applying fixes for this vulnerability. It is clear that some software has seen a larger performance impact.  I do not yet have any indication of the impact on other platforms such as Microsoft Windows.

 

To better understand the impact of these patches on SAS performance, my team, along with some of our peers in the SAS QA organization, will be running workloads and collecting performance metrics with and without the patches. The performance of SAS is very near and dear to my heart, and I imagine it is the same for many of you. If you are wondering, or if you get questions from your user community, about the impact of this bug or the patches, it is important to know:

  • SAS is aware of the patches and the rumored performance impact
  • SAS R&D is working with our partners to better understand the potential impact on SAS Software
  • The changes to the OS may have a modest impact on performance
  • This is an issue with the hardware and the OS; while SAS may be affected by the patches, the problems being mitigated have nothing to do with SAS specifically.

We want SAS to perform as well as possible, and we do not expect SAS to fall into the worst-case performance impact. For those of you who wonder why: SAS sessions tend to do input/output in large-block sequential operations.  These operations pay a smaller penalty with this specific patch than the small-block random I/O operations typical of a relational database, for example.  In this specific case we believe that the changes to the Linux kernel will have a smaller impact on SAS because the OS can consolidate multiple requests in a single context switch.  There may be little that we can do to offset the performance impact of this specific patch, but rest assured that the performance of SAS 9 and SAS Viya is extremely important to us!
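
To put rough numbers on the large-block versus small-block argument, here is a minimal Python sketch. It only counts how many read calls (and therefore trips into the kernel) it takes to move the same amount of data at two block sizes. The 1 MiB and 8 KiB block sizes and the 1 microsecond of added cost per call are made-up illustration values, not measurements from SAS or Red Hat, and the function names are hypothetical.

```python
def kernel_crossings(total_bytes: int, block_size: int) -> int:
    """Number of read calls needed to move total_bytes in block_size chunks."""
    if total_bytes < 0 or block_size <= 0:
        raise ValueError("total_bytes must be >= 0 and block_size must be > 0")
    # Integer ceiling division: a partial final block still costs one call.
    return -(-total_bytes // block_size)


def added_seconds(total_bytes: int, block_size: int, extra_cost_per_call_us: float) -> float:
    """Extra time if every call pays extra_cost_per_call_us microseconds.

    The per-call cost is an assumed input, not a measured value.
    """
    return kernel_crossings(total_bytes, block_size) * extra_cost_per_call_us / 1_000_000


GIB = 1024 ** 3

# Reading 1 GiB sequentially in 1 MiB blocks (assumed "large block" size).
print(kernel_crossings(GIB, 1024 * 1024))  # 1024

# Reading the same 1 GiB in 8 KiB blocks (assumed "small block" size).
print(kernel_crossings(GIB, 8 * 1024))     # 131072
```

With these assumed sizes the small-block reader enters the kernel 128 times as often for the same data, so any fixed cost added to each kernel entry is multiplied accordingly.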


Accepted Solutions
Solution
‎01-19-2018 01:04 PM
SAS Employee
Posts: 3

Re: CPU flaws, OS patches, and SAS performance

Some findings so far:

  • The performance impact is generally small. 
  • The performance impact is higher on older processors
  • The performance impact is greater where CPU utilization is already high
  • The overall performance impact of patching the kernel depends on when the kernel was last patched and on what other changes are included in the update.
  • The same workload running on two different hosts with similar performance capabilities but different drivers may see different performance impacts.

Some anecdotes:

  • Starting with the initial RHEL 6.7 kernel, which is the baseline for SAS Viya, and updating to the current kernel with the Spectre and Meltdown patches results in almost every measured transaction completing FASTER (yes, faster - the performance enhancements made since the release of 6.7 outweigh the performance impact of the Spectre and Meltdown patches).
  • On a resource-constrained "Sandy Bridge" based server, going from the RHEL 7.3 initial release kernel to a kernel with the Spectre / Meltdown patches, we saw a transaction throughput reduction of ~15% overall, but the change in the timing of any specific transaction was negligible.  (This was a very intensive workload for the server's capabilities.)
  • Running the same workload on a Skylake based server with the same starting and ending kernel versions, there was no measured throughput degradation; however, CPU utilization was higher with the updated kernel.

When measuring the impact of the Spectre and Meltdown patches alone (by this I mean updating to the latest kernel, then running the workload with all the patches enabled versus with all of the side-channel vulnerability mitigations disabled; one of them cannot be disabled in the RHEL kernel), the impact varies from about 3% to about 11% in the workloads that we have measured to date.
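
For anyone who wants to run the same kind of comparison on their own jobs, the arithmetic can be sketched in a few lines of Python. This is not the harness used for the numbers above; the function name and the sample timings are hypothetical, and using the median of repeated runs is simply one reasonable way to damp run-to-run noise.

```python
from statistics import median


def percent_impact(mitigations_off: list[float], mitigations_on: list[float]) -> float:
    """Percent change in median elapsed time, mitigations on vs off.

    Positive means the job ran slower with mitigations enabled.
    """
    if not mitigations_off or not mitigations_on:
        raise ValueError("need at least one timing for each configuration")
    base = median(mitigations_off)
    if base <= 0:
        raise ValueError("baseline timings must be positive")
    return (median(mitigations_on) - base) / base * 100.0


# Hypothetical elapsed times in seconds for the same job on the same kernel.
off_runs = [100.0, 102.0, 98.0]
on_runs = [105.0, 107.0, 103.0]
print(f"{percent_impact(off_runs, on_runs):.1f}%")  # 5.0%
```

Both sets of timings should come from the same host and the same kernel version; otherwise the result also includes unrelated kernel changes, as the RHEL 6.7 anecdote shows.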

 

Your mileage will assuredly vary as well.
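
Before comparing numbers it helps to know which mitigations are actually active on a given host. Kernels that carry these fixes generally report this through files under /sys/devices/system/cpu/vulnerabilities; whether that directory exists, and which files it holds, depends on your kernel version and distribution, so treat the path as an assumption to verify on your own system. A minimal Python sketch (the function name is hypothetical):

```python
from pathlib import Path

# Assumed location; present only on kernels that expose mitigation status.
DEFAULT_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(directory=DEFAULT_DIR) -> dict[str, str]:
    """Return {vulnerability name: kernel-reported status} for each file found.

    Returns an empty dict if the directory does not exist.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    status = {}
    for entry in sorted(directory.iterdir()):
        if entry.is_file():
            try:
                status[entry.name] = entry.read_text().strip()
            except OSError:
                status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    for name, state in mitigation_status().items():
        print(f"{name}: {state}")
```

Recording this output alongside each set of timings makes it much easier to explain later why two hosts behaved differently.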

 

It is after 6 on a Friday at the end of a very long week - forgive me for these personal ramblings on this topic:

New vulnerabilities get patched on a regular basis, many of them far easier to exploit than Spectre or Meltdown.  Yet rarely do these exploits make the national news.  Some of these patches have had performance impacts in the past, but by and large IT departments patch our servers and we go on about our business without ever noticing that our jobs are running slightly faster or slower.  Occasionally we may notice, but invariably the press of work that needs doing demands our attention more than chasing down the reason our jobs are running slightly slower - does anyone EVER ASK WHY MY JOBS ARE RUNNING FASTER???  If one of the "trade e-rags" had not sensationalized the performance impact of this particular fix, most of us would have applied the fixes and gotten on with life.  A few of us would have written a blog post about how performance had changed, but largely we would have seen the impact as "a fact of life" and gotten on with things.  At the end of the day there is little that can be done about whatever the impact ends up being for your specific situation.

This does not mean that I trivialize the situation - just that the work that needs doing is beginning to press on me.  The next release is calling.  Some of the assessments we have undertaken are still in progress.  If anything interesting turns up I will update this thread again; however, I anticipate that we will see more of the same.  I wish for all of us, 12 days into this new year, that the hyperbole of the year has already peaked, that we all have many occasions to be delightfully surprised that our jobs have been mysteriously running faster before getting on with the next thing, and that 12 days into 2019 we are all happier, healthier, and wiser than before :-)

Unfortunately, the thing that sticks with me the most is that exploitation of this vulnerability cannot be detected.  It occurs completely in the unobservable cycles within a device.  Had this vulnerability been discovered by those of nefarious intent and coupled with a remote execution exploit, what could they have accomplished without anyone ever knowing?  If that doesn't send a shiver down your spine, just wonder what else is out there awaiting discovery by Google Project Zero...

 

Have a great weekend everyone!


All Replies
Super User
Posts: 8,590

Re: CPU flaws, OS patches, and SAS performance

Posted in reply to KenGahagan

I'm very curious whether other CPUs that also employ speculative execution suffer from similar flaws, most prominently the POWER architecture, as our data warehouse runs on a pSeries.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
SAS Employee
Posts: 3

Re: CPU flaws, OS patches, and SAS performance

Posted in reply to KurtBremser

Kurt,

 

IBM has issued a statement covering Power and System Z here: https://www.ibm.com/blogs/psirt/potential-cpu-security-issue/.  https://www.ibm.com/blogs/psirt/pote... indicates that firmware and Linux patches for Power will be available on 09 January and AIX patches scheduled for 12 February.  

 

I hope these links and the information they provide help answer your questions.

Super User
Posts: 8,590

Re: CPU flaws, OS patches, and SAS performance

Posted in reply to KenGahagan

Wow. It seems these two vulnerabilities have really exposed a conceptual flaw in the speculative execution mechanism as such.

Super Contributor
Posts: 277

Re: CPU flaws, OS patches, and SAS performance

Posted in reply to KenGahagan

Hi @KenGahagan

 

Have there been any updates on this?

 

Thanks,

Super User
Posts: 8,590

Re: CPU flaws, OS patches, and SAS performance

Posted in reply to KenGahagan

Thank you all for your work, @KenGahagan!

So it seems that some places might consider doing their next server upgrade a little sooner than planned, but that's all.

Super Contributor
Posts: 277

Re: CPU flaws, OS patches, and SAS performance

Posted in reply to KenGahagan

I echo @KurtBremser's comment. Thank you for the work you've done in this field. Good job.
