Microsoft and Intel Push One Million iSCSI IOPS

iSCSI is as fast as your hardware can handle. How fast? Try 1,030,000 IOPS over a single 10 Gb Ethernet link!

In March, Microsoft and Intel demonstrated that the combination of Windows Server 2008 R2 and the Xeon 5500 could saturate a 10 GbE link, pushing data throughput to wire speed. Today, they showed that this same combination can deliver an astonishing million I/O operations per second, too.

Microsoft and Intel demand “One Million IOPS!”

We’ve heard of science experiments dishing out millions of IOPS before. Texas Memory Systems even offers a 5 million IOPS monster! All of these are impressive, to be sure, but they are proof-of-concept designs for back-end storage, not SAN performance. Actually pulling a million IOPS across a network has required the use of multiple clients and Fibre Channel links.

How fast can iSCSI get?

The Microsoft/Intel demo demanded some creativity to be sure, but they pushed a million IOPS over a single 10 Gigabit Ethernet link using a software initiator. That’s right, this was a single client with a single 10 GbE NIC pulling a million I/O operations from a SAN. Amusingly, it took many storage targets working together to actually service this kind of I/O demand!

Another thing to note is that this was done with the software iSCSI stack in Windows Server 2008 R2, not some crazy iSCSI HBA hardware. In fact, an iSCSI HBA would have trouble servicing this kind of I/O load, but an Intel Xeon 5500 has plenty of CPU horsepower to handle the task.

This bests the already-impressive 919,268 IOPS put up by an Emulex FCoE HBA earlier this week. It’s the equivalent of over 5,000 high-end enterprise disk drives and five times greater than the total database I/O operations of eBay.
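The comparisons above can be sanity-checked with simple arithmetic. This is only a sketch: the figure of 200 IOPS per drive is an assumed ballpark for a 15K RPM enterprise drive, not a number from the Microsoft/Intel release, and the function names are illustrative.

```python
import math

ASSUMED_IOPS_PER_DRIVE = 200  # assumption: ballpark for one 15K RPM enterprise drive


def drives_needed(target_iops: int, iops_per_drive: int = ASSUMED_IOPS_PER_DRIVE) -> int:
    """Number of drives needed to reach target_iops at the assumed per-drive rate."""
    return math.ceil(target_iops / iops_per_drive)


def percent_faster(new_iops: int, old_iops: int) -> float:
    """How much higher new_iops is than old_iops, as a percentage."""
    return (new_iops - old_iops) / old_iops * 100


if __name__ == "__main__":
    # 1,030,000 is the iSCSI result; 919,268 is the Emulex FCoE figure quoted above.
    print(f"Drives needed: {drives_needed(1_030_000):,}")
    print(f"Margin over FCoE result: {percent_faster(1_030_000, 919_268):.1f}%")
```

Under that assumption the result works out to a little over 5,000 drives, in line with the claim, and the margin over the FCoE number is roughly 12 percent.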

So what’s the takeaway message? There are a few:

  1. Performance is not an issue for iSCSI – Sure, not every iSCSI stack can handle a million IOPS, but the protocol is not the problem. iSCSI can saturate a 10 GbE link and deliver all the IOPS you might need.
  2. Performance is not an issue for software – Today’s CPUs are crazy fast, and optimized software like the Windows Server 2008 R2 TCP/IP and iSCSI stacks can match or exceed the performance of specialized offload hardware.
  3. Storage vendors need to step up their game – Whose storage array can service a million iSCSI IOPS? Raise your hands, please! I can’t hear you! Hello? Anyone there?
  4. Fibre Channel and FCoE don’t rule performance – I don’t know of a Fibre Channel SAN that can push this kind of throughput or IOPS through a single link. Even FCoE over the same 10 GbE cable can’t quite do it. If they are to stay relevant, they had better come up with a compelling advantage over iSCSI!

Under The Hood

Microsoft and Intel passed a million IOPS with small blocks, but larger sizes still delivered solid performance

Looking at the release a bit closer, we see that this high IOPS limit was reached with 512-byte blocks, with 1 KB and larger blocks delivering much better throughput but fewer IOPS. Typical FC HBAs can only reach 160,000 IOPS or so, a rate the iSCSI initiator could handle with large blocks and low CPU utilization.
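To see why small blocks favor IOPS while larger blocks favor throughput, it helps to convert between the two. The sketch below is a back-of-envelope calculation, not something taken from the release: it ignores Ethernet, TCP/IP, and iSCSI header overhead, and the function names are illustrative.

```python
LINK_BITS_PER_SEC = 10 * 10**9  # nominal 10 GbE line rate, ignoring protocol overhead


def throughput_gbps(iops: float, block_bytes: int) -> float:
    """Payload throughput in Gb/s for a given IOPS rate and block size."""
    return iops * block_bytes * 8 / 1e9


def max_iops_at_line_rate(block_bytes: int) -> float:
    """Upper bound on IOPS if payload alone filled the 10 GbE link."""
    return LINK_BITS_PER_SEC / (block_bytes * 8)


if __name__ == "__main__":
    # 1,030,000 IOPS at 512 bytes is the headline figure from the demo.
    print(f"512 B at 1,030,000 IOPS: {throughput_gbps(1_030_000, 512):.2f} Gb/s")
    for size in (512, 1024, 4096, 8192, 65536):
        ceiling = max_iops_at_line_rate(size)
        print(f"{size:>6} B blocks: at most {ceiling:,.0f} IOPS at line rate")
```

By this estimate the million-IOPS run moved only about 4.2 Gb/s of payload, well under wire speed, while an 8 KB workload would hit the link's ceiling at roughly 150,000 IOPS. That is why IOPS fall as block size grows even though throughput rises.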

A Xeon server fanned out through a Cisco Nexus switch to 10 iSCSI targets in the lab

The client side of the test was a quad-core 3.2 GHz Xeon 5580 server running Windows Server 2008 R2. It was connected with an Intel X520-2 10 Gb Ethernet Server Adapter using the 82599EB controller. A Cisco Nexus 5020 switch fanned this out to 10 servers running iSCSI target software. The IOPS measurement was done with the industry-standard IOMeter software.

Hyper-V performance matched the native number at higher block sizes

The team also benchmarked this configuration running Microsoft’s Hyper-V server virtualization hypervisor. Since Hyper-V leverages the Windows Server codebase, performance remained remarkably similar. Intel’s VMDq and Microsoft’s VMQ allowed guest operating systems to reach these performance levels, routing ten iSCSI targets to ten guest instances over virtual network links.

Update: Check out this Intel whitepaper on the test!

Dr. Evil/Ballmer mashup image from “The Big Deal

  • http://twitter.com/sokratisg Sokratis Galiatsis

    It would be nice if Intel could do the same test with vSphere's initiator and compare the measurements…

    PS: The photoshop mashup just rocks! LoL

  • http://twitter.com/vcdxnz001 Michael Webster

    It’s a pity that the 1 million IOPS wasn’t at 8 KB like the comparative vSphere 1 million IOPS test that VMware published. But I guess that shows the difference between the two systems. A single VM on vSphere can push 300K IOPS at 8 KB. The entire Windows 2008 system can only do that. vSphere would push 1 million IOPS at 8 KB, not 512 bytes as in this example.
