If you have opened your web browser or listened to the news in the past 24 hours, you have probably heard that the sky is falling and it is all the computer's fault (okay, so I exaggerate a little - just call me Chicken Little). If you have read this article in The Register, or others that build on its content, you will have seen fears of up to a 30% performance impact from OS patches that mitigate the risk of hackers getting access to sensitive data. Rest assured that not all applications will see such an impact.
Based upon some information from our partners at Red Hat, my current expectation is that SAS jobs will see a 3%-7% performance degradation after applying fixes for this vulnerability. It is clear that some software has seen a larger performance impact. I do not yet have any indication of the impact on other platforms such as Microsoft Windows.
To better understand the impact of these patches on SAS performance, my team, along with some of our peers in the SAS QA organization, will be running workloads and collecting performance metrics with and without the patches. The performance of SAS is very near and dear to my heart. I imagine it is the same for many of you. If you are wondering, or if you get questions from your user community, about the impact of this bug or the patches, it is important to know:
We want SAS to perform as well as possible, and we do not expect SAS to fall into the worst-case performance impact. For those of you who wonder why: SAS sessions tend to do input/output in large-block sequential operations. These operations pay a smaller penalty with this specific patch than the small-block random I/O operations typical of a relational database, for example. In this specific case we believe that the changes to the Linux kernel will have a smaller impact on SAS because the OS can consolidate multiple requests in a single context switch. There may be little that we can do to offset the performance impact of this specific patch, but rest assured that the performance of SAS 9 and SAS Viya is extremely important to us!
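To make the large-block versus small-block point concrete, here is a minimal Python sketch, not anything SAS itself runs. It assumes the cost of the patch is paid roughly once per entry into the kernel, so it simply counts how many `os.read` calls it takes to read the same file with two block sizes. The function names, the 1 MiB file size, and the 4 KiB and 64 KiB block sizes are illustrative assumptions, not SAS buffer settings.

```python
import os
import tempfile


def count_read_calls(path, block_size):
    """Read a whole file with unbuffered os.read calls and count them.

    Each os.read is one system call, i.e. one trip into the kernel.
    """
    calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:  # empty result means end of file
                break
            calls += 1
    finally:
        os.close(fd)
    return calls


def demo(file_size=1 << 20, block_sizes=(4096, 65536)):
    """Create a scratch file and count read calls per block size.

    The file size and block sizes are hypothetical illustration values.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * file_size)
        path = tmp.name
    try:
        return {bs: count_read_calls(path, bs) for bs in block_sizes}
    finally:
        os.unlink(path)


if __name__ == "__main__":
    for block_size, calls in demo().items():
        print(f"block size {block_size:>6} bytes: {calls} read calls")
```

The larger block size moves the same data with far fewer kernel entries, which is the intuition behind expecting a smaller penalty for large-block sequential I/O. It does not measure the patch overhead itself.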
Some findings so far:
Some anecdotes:
When measuring the impact of the Spectre and Meltdown patches alone, the impact varies from about 3% to about 11% in the workloads that we have measured to date. By "alone" I mean updating to the latest kernel, then running the workload with all the patches enabled versus with all of the side-channel vulnerability mitigation techniques disabled (one of them cannot be disabled in the RHEL kernel).
Your mileage will assuredly vary as well.
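If you want to run the same kind of comparison on your own jobs, the arithmetic is simple. The following Python sketch assumes you have collected elapsed times for the same workload with the mitigations disabled (baseline) and enabled (patched). The function name and the sample timings are hypothetical, not our measurements. It compares medians so that a single noisy run does not skew the result.

```python
from statistics import median


def percent_slowdown(baseline_secs, patched_secs):
    """Return the percentage change in median elapsed time.

    baseline_secs: run times with mitigations disabled
    patched_secs:  run times with mitigations enabled
    A positive result means the patched runs were slower.
    """
    base = median(baseline_secs)
    patched = median(patched_secs)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    return (patched - base) / base * 100.0


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    baseline = [100.0, 102.0, 98.0]
    patched = [107.0, 105.0, 109.0]
    print(f"slowdown: {percent_slowdown(baseline, patched):.1f}%")
```

Run each configuration several times on an otherwise idle system before trusting the number, since a difference of a few percent is easily lost in normal run-to-run variation.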
It is after 6 on a Friday at the end of a very long week - forgive me for these personal ramblings on this topic:
New vulnerabilities get patched on a regular basis - many of them far easier to exploit than Spectre or Meltdown. Yet rarely do these exploits make the national news. Some of these patches have had performance impacts in the past, but by and large IT departments patch our servers and we go on about our business without ever noticing that our jobs are running slightly faster or slower. Occasionally we may notice, but invariably the press of work that needs doing demands our attention more than chasing down the reason our jobs are running slightly slower - does anyone EVER ASK WHY MY JOBS ARE RUNNING FASTER???
If one of the "trade e-rags" had not sensationalized the performance impact of this particular fix, most of us would have applied the fixes and gotten on with life. A few of us would have written a blog post about how performance had changed, but largely we would have seen the impact as "a fact of life" and gotten on with things. At the end of the day there is little that can be done about whatever the impact ends up being for your specific situation.
This does not mean that I trivialize the situation - just that the work that demands doing is beginning to press on me. The next release is calling. Some of the assessments we have undertaken are still in progress. If anything interesting turns up I will update this thread again; however, I anticipate that we will see more of the same.
I wish for all of us, 12 days into this new year, that the hyperbole of the year has already peaked, that we all have many occasions to be delightfully surprised that our jobs have been mysteriously running faster before getting on with the next thing, and that 12 days into 2019 we are all happier, healthier, and wiser than before 🙂
Unfortunately the thing that sticks with me the most is that exploitation of this vulnerability cannot be detected. It occurs completely in the unobservable cycles within a device.
Had this vulnerability been discovered by those of nefarious intent and coupled with a remote execution exploit what could they have accomplished without anyone ever knowing? If that doesn't send a shiver down your spine - just wonder what else is out there awaiting discovery by Google Project Zero...
Have a great weekend everyone!
I'm very curious whether other CPUs that also employ speculative execution suffer from similar flaws, most prominently the POWER architecture, as our data warehouse runs on a pSeries.
Kurt,
IBM has issued a statement covering Power and System Z here: https://www.ibm.com/blogs/psirt/potential-cpu-security-issue/. https://www.ibm.com/blogs/psirt/pote... indicates that firmware and Linux patches for Power will be available on 09 January, with AIX patches scheduled for 12 February.
I hope these links and the info they provide prove helpful to your questions.
Wow. It seems these two vulnerabilities have really laid open a conceptual flaw in the speculative execution mechanism as such.
Thank you all for your work, @KenGahagan!
So it seems that some places might contemplate doing the next server upgrade a little sooner than planned, but that's all.
I echo @Kurt_Bremser's comment. Thank you for the work you've done in this field. Good job.