— Ray’s condition (@_rayjenkins) January 7, 2018
The SolarWinds AppOptics blog has a great collection of the effects they observed from the recent Spectre/Meltdown patches to AWS. My favorite is the slight bump in cache failures embedded above. While SolarWinds eventually saw an overall reduction in the CPU impact, it was still easily observable. As subsequent patches take a more refined approach to the two bugs, the performance difference will probably become less dramatic.
Still, as predicted, although CPU usage increased by up to 25% after the patch, the bigger impact appears to be on overall latency, which rose sharply across the board, by 45-100% depending on the instance and service. Throwing more capacity at the problem is possible, but the added latency can cause a long tail of issues for many services. Public cloud providers absorb a lot of the headache in the wake of the Spectre/Meltdown patches, but that doesn’t mean their customers are entirely unaffected by them.
In this post, I’ll highlight the impacts we at SolarWinds Cloud® noticed across our AWS infrastructure during the last several weeks due to Meltdown. We, along with many SaaS companies, were impacted by these changes and suffered partial downtime due to AWS efforts to mitigate Meltdown. We’ll be posting a full postmortem for our incident in the coming days. Although this is the impact we saw in our environment, we realize this may not be the same for other environments.