Last week on the Gestalt IT Rundown, Tom Hollingsworth and I discussed Nvidia’s new Turing architecture. At the time we recorded, we had only heard details about the workstation Quadro cards based on Turing. But the news about the architecture was significant. Nvidia had added dedicated tensor and ray tracing processors to its GPUs, sharing die space with the traditional shaders.
Unsurprisingly, Nvidia launched its consumer-focused derivatives of Turing with the same dedicated processors. On the traditional GPU front, performance gains look to be substantial, with the flagship RTX 2080 Ti capable of 13.4 TFLOPs of single-precision floating-point performance. That’s a 50% increase over the outgoing-generation GTX 1080.
Added to that are ray tracing and tensor performance that largely mirror what’s available on the workstation cards. Of course, all this has some very physical implications for the cards. Power consumption increases almost 40% on the new cards, despite the move to a smaller 12nm process. Transistor count also skyrockets to roughly 250% of the previous generation’s, with Nvidia now jamming a cool 18.6 billion transistors onto the RTX 2080 Ti.
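As a quick sanity check on those percentages, here is a minimal Python sketch that backs out the previous-generation figures implied by the numbers quoted above. It uses only the article's own figures (13.4 TFLOPs, a 50% increase, 18.6 billion transistors, 250%). The helper names are hypothetical, and because a bare "250%" can be read either as a ratio (2.5x) or as an increase (3.5x), the sketch computes both rather than assuming one.

```python
# Figures quoted in the article for the RTX 2080 Ti.
RTX_2080_TI_TFLOPS = 13.4
RTX_2080_TI_TRANSISTORS_BN = 18.6  # billions


def baseline_from_increase(new_value: float, percent_increase: float) -> float:
    """Previous value if new_value is `percent_increase` percent HIGHER than it."""
    return new_value / (1 + percent_increase / 100)


def baseline_from_ratio(new_value: float, percent_of_old: float) -> float:
    """Previous value if new_value is `percent_of_old` percent OF it."""
    return new_value / (percent_of_old / 100)


# "A 50% increase" is unambiguous: the new figure is 1.5x the old one.
implied_prev_tflops = baseline_from_increase(RTX_2080_TI_TFLOPS, 50)

# "250%" is ambiguous, so compute the implied baseline under both readings.
implied_prev_transistors_ratio = baseline_from_ratio(RTX_2080_TI_TRANSISTORS_BN, 250)
implied_prev_transistors_increase = baseline_from_increase(RTX_2080_TI_TRANSISTORS_BN, 250)

print(f"Implied previous-gen TFLOPs: {implied_prev_tflops:.2f}")
print(f"Implied previous-gen transistors (250% of): {implied_prev_transistors_ratio:.2f} bn")
print(f"Implied previous-gen transistors (250% more): {implied_prev_transistors_increase:.2f} bn")
```

The two transistor baselines differ by about 2 billion, which is why it matters whether a generational jump is quoted as "X% of" or "X% more than" the old part.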
As Tom and I discussed on the Rundown, I have to wonder if this strategy will pay off. Obviously Nvidia realizes that its place as the ML/AI acceleration darling isn’t set in stone. GPUs work the best right now for those workloads, but aren’t purpose-made for the task. Adding in dedicated tensor cores addresses that, and probably helps stave off competition from more dedicated chips. We’ve already seen Google and Tesla developing dedicated ML chips. Those companies have resources beyond the reach of most organizations, but all it takes is for them to start selling those chips to a larger market for Nvidia’s position to change.
It’s obvious why Nvidia added dedicated processors to the Turing architecture. But let’s take a second to recognize how significant this new generation is. Ever since Nvidia’s Tesla architecture in 2006, the company has embraced a Unified Shader Model for its GPUs. This was specifically a reaction against fixed-pipeline architectures, which used a series of specialized processors for rendering pixels, vertices, and textures.
With Turing, Nvidia is still using a Unified Shader Model, but the architecture marks a shift away from the trend of the GPU as a general-purpose chip. By adding in specialized processors, Nvidia is making a frank admission that a general approach is inadequate for near-term workloads. Of course in ten or so years, we’ll surely be seeing these all pulled into general-purpose processors again, as hardware becomes powerful enough that specialization is no longer worth the expense.
But the big question for today is whether there is any downside to this approach for Nvidia. Ray tracing has been the Holy Grail of graphics for a while, so I can see a lot of applications taking advantage of that very quickly in both consumer and workstation markets. I think the tensor processors are a harder sell. If you’re making chips more power-hungry, hotter, and more complex to add them, you have to make the benefits clear to your customers. On the workstation front, that seems easier. For consumers, I wonder if Nvidia will bin a GPU without these processors to serve that market.