
FCoE isn’t a replacement for Infiniband, it’s a cheaper copy that customers will buy

Infiniband is something of a forgotten protocol these days, but many of the marketing features of FCoE are derived directly from Infiniband concepts and architecture.

Paradigms of Data Center Virtualization

For example, this slide is taken from the “Paradigms of Data Center Virtualization” presentation given by Cisco at Networkers 2005. If you are following Cisco UCS then it should look very familiar. Perhaps surprisingly, the Unified Fabric idea is not new; at the time, either Infiniband or 10 Gigabit Ethernet was going to be the networking fabric.

cisco-unified-fabric-1.jpg

If you are reasonably new to Data Centres then you may not remember that Cisco had a moderately successful Infiniband product and that, in 2005, it was the centrepiece of Cisco’s marketing strategy for a Unified Fabric for connectivity in the Data Centre. Aggregating connections from the servers down onto an Infiniband backbone, this Unified Fabric was going to unite the Data Centre for Grid Computing (the precursor of what is now virtualization).

cisco-unified-fabric-2.jpg

Unified Fabric Server Clusters

And you can see a case study of a pure Infiniband backbone used for a large-scale server cluster in this slide (from the same presentation).

cisco-unified-fabric-3.jpg

So what changed?

So why didn’t Cisco continue with Infiniband? I think it’s worth looking back at how we got FCoE instead and looking for lessons to be learned. I can only speculate on what happened.

Recap on Key Infiniband Features

  1. InfiniBand is a high-speed, low-latency technology used to interconnect servers, storage and networks within the data centre.
  2. Standards based – governed by the InfiniBand Trade Association (http://www.infinibandta.org), which has been working successfully for more than ten years.
  3. Scalable interconnect speeds in multiples of 2.5Gb/s: products are currently shipping at 40Gb/s, and 120Gb/s products have been announced (see the sketch after this list for how those rates are built up).
  4. Low-latency networking, with end-to-end delays of around 20 microseconds – one thousand times less than the 20 milliseconds of a data centre Ethernet network.
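
As a rough check on the numbers in that list, here is a small C sketch. It is purely illustrative (it is not taken from the Cisco material or from any specification text): it builds the usual Infiniband signalling rates from the 2.5Gb/s base lane rate across the SDR/DDR/QDR generations and the 1x/4x/12x link widths, and compares the two latency figures quoted above. The 40Gb/s and 120Gb/s figures in point 3 fall out as the QDR 4x and QDR 12x rates.

    /* Illustrative sketch: InfiniBand signalling rates as multiples of the
     * 2.5 Gb/s base lane rate, plus the latency comparison from the list above. */
    #include <stdio.h>

    int main(void) {
        const double base_lane_gbps = 2.5;                    /* SDR rate per lane    */
        const char  *gen_name[]     = {"SDR", "DDR", "QDR"};
        const int    gen_multiple[] = {1, 2, 4};              /* per-lane multiplier  */
        const int    lane_widths[]  = {1, 4, 12};             /* 1x, 4x and 12x links */

        for (int g = 0; g < 3; g++)
            for (int w = 0; w < 3; w++)
                printf("%s %2dx link: %6.1f Gb/s signalling rate\n",
                       gen_name[g], lane_widths[w],
                       base_lane_gbps * gen_multiple[g] * lane_widths[w]);

        /* Latency figures quoted in point 4: 20 microseconds vs 20 milliseconds. */
        const double ib_latency_s  = 20e-6;
        const double eth_latency_s = 20e-3;
        printf("Latency ratio: %.0f times\n", eth_latency_s / ib_latency_s);
        return 0;
    }

Note that these are raw signalling rates; the usable data rate is lower once the 8b/10b line encoding is taken into account.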

I also find it interesting that the silicon from QLogic and Mellanox has rapidly been repurposed for FCoE HBAs. There must be strong similarities between these products for that to happen.

Ethernet is easy to sell

People will buy and use what they know. In that sense, faster or ‘newer’ Ethernet is an easier decision. Offering customers what they perceive as a simple upgrade path is a much easier sale than one that requires teaching them new technologies. People tend towards laziness, and that is a probable cause.

Cheap always wins

Ethernet is cheap and not very good at a lot of things, but there is a lot of it about and plenty of existing technology built around it. And don’t forget skills: lots of people understand Ethernet, and that has a price tag.

And history shows that cheap always wins.

Maybe Cisco wasn’t dominant

Cisco has a policy of being number one or two in any market. If they can’t do that, they will walk away or do something to kill that market. For example, in the early days of IPSec VPN, Cisco didn’t have a good story. They bought two or three companies before converging the code into the Cisco PIX (and later the ASA). They grew to number one by charging no license fee for their solution, thus blocking all their competitors from making a profit. The same approach appears to be happening today with SSL VPN, which now costs most customers only a small license fee.

If Cisco couldn’t dominate the Infiniband market, then perhaps Cisco spun out Nuova Systems (the startup that built the FCoE and Nexus family silicon) to give itself a shot at collapsing that market.

FCoE cheaper but not better?

If you spend some time with Infiniband, you realise that FCoE isn’t exactly a world-beating technology innovation. FCoE is the VHS to Infiniband’s Betamax. You can see that many of the ideas the FCoE Unified Fabric promotes are the same ones promoted in the past. In that sense, FCoE isn’t new, just a rehash of old ideas mashed onto existing technologies.

Now, I’m sure the FCoE and Cisco UCS strategy is working just fine for customers and it’s going to do well in the market. Given that Cisco has spent anything up to US$1 billion buying, manufacturing and marketing the product, it’s a guaranteed success. But it’s worth looking back at older technologies with some regret, and viewing these new developments cynically, to determine whether they are really the best solutions for our networks.

So far, I’m not so sure. We could have had better, but FCoE will probably work just fine for a while, until we need to scale and go faster than Ethernet ever can. Maybe Ethernet will work for us in the future, but Infiniband will still be there waiting for us to reuse it. Just like all the other technologies that we keep reusing.

About the author

Greg Ferro

Greg Ferro is the co-host of Packet Pushers. After surviving 25 years in Enterprise IT with only minor damage, he uses his networking expertise for good in the service of others by deep diving on technology and industry. His unique role as an inspirational cynicist brings a sense of fun, practicality and sheer talent to the world of data networking and its place in a world of clouds.

He blogs regularly at http://etherealmind.com and the podcasts are at http://packetpushers.net.

2 Comments

  • Have you done any actual price-checking? Last I checked, on an equivalent-bandwidth basis, InfiniBand switches and adapters were about 4x cheaper, in $$ per Gigabit, than plain-vanilla 10G Ethernet gear — and FCoE gear is more expensive than plain-vanilla 10G. If you want to compare a 1 Gigabit Ethernet port to a 40 Gigabit InfiniBand port, and say the Ethernet version is cheaper, you'd be right.

    Reality is, for equivalent bandwidth, Ethernet is *more* expensive than InfiniBand — and FCoE is more expensive yet.

  • Hi Greg,

    Interesting post, but not very accurate from a technical perspective. The original motivation for Infiniband was as a converged data centre fabric. The term System Area Network was coined, but it was envisioned to do a LOT more than you mention above. The original design target was for it to be a replacement for PCI AND to support RDMA, as well as layering networking and storage protocols on top of it. Remember, Intel was going to use IB as the native chipset interconnect, before HyperTransport threw a monkey wrench into the mix.

    The real differentiator between IB and FCoE (or really CEE) is this RDMA capability AND the fact that it is a fabric topology. By fabric, I mean that all devices are automatically discovered within the fabric and traffic paths are defined by the subnet manager to avoid loops; and then, of course, there is the latency.

    More important than FCoE is the fact that the IBTA is now in the process of standardizing RDMA over CEE (RoCEE, pronounced “Rocky”). The key here is that NOW CEE will have a significant piece of the capability that IB has and can actually replace IB in large-scale clusters. So why is RDMA important?

    1. HPC is the fastest-growing market for servers, so all the major players sell there, and IB is the market leader for obvious reasons.

    2. Cloud computing using hypervisors is now becoming a large cluster architecture. Imagine having 10,000 hypervisors, all of which need to talk to each other: storage sync, VM migration, fault tolerance, back-up and so on. The hypervisor-to-hypervisor and hypervisor-to-storage-and-network traffic really behaves like HPC. Lower latency, higher bandwidth and shared infrastructure mean lower cost, better performance and a more scalable cloud.

    3. IB drivers are ALL OPEN SOURCE. IB today has all of its drivers open source, supporting storage transport, network transport, RDMA, MPI, iSCSI and so on. When you get RDMA (IB) on an Ethernet transport, all of these open source drivers can be leveraged on top of Ethernet (see the sketch after the comments). REMEMBER, FCoE still uses FC, so you still need a CNA from QLogic, Emulex, etc. and pay a premium to get it.

    4. SSD Storage – SSD storage is based on NAND flash memory technology. Today, that NAND flash has a block-like interface that lends itself well to plugging in behind HDD-type controllers, but remember this is MEMORY technology, which begs the bigger question: why do I need SCSI again? If I have high-speed flash memory and I have RDMA capabilities, it is significantly more efficient to just RDMA my data to and from my solid state flash memory: better throughput, better transactional performance, lower cost, better scalability, AND no storage qualification (although there are some tricks to qualifying flash memory).

    FCoE ONLY addresses the storage and networking converged fabric; it doesn’t touch the rest of the capabilities that IB has. So the more significant fact is that RoCEE is on its way!
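
Since the commenter mentions that the IB driver stack is open source and carries over to Ethernet transports, here is a minimal C sketch using the open source libibverbs library. It is an illustration only, not tied to any particular vendor or to the RoCEE standardisation work: it simply enumerates the RDMA-capable devices that the verbs layer can see and prints a couple of their attributes. The same program works whether the device underneath is an Infiniband HCA or a RoCE-capable Ethernet NIC, which is the point being made about reusing the existing drivers.

    /* List RDMA-capable devices visible through the verbs API (libibverbs).
     * Build with: gcc -o ibv_list ibv_list.c -libverbs                      */
    #include <stdio.h>
    #include <infiniband/verbs.h>

    int main(void) {
        int num_devices = 0;
        struct ibv_device **devices = ibv_get_device_list(&num_devices);
        if (devices == NULL || num_devices == 0) {
            fprintf(stderr, "No RDMA devices found\n");
            return 1;
        }
        for (int i = 0; i < num_devices; i++) {
            struct ibv_context *ctx = ibv_open_device(devices[i]);
            if (ctx == NULL)
                continue;
            struct ibv_device_attr attr;
            if (ibv_query_device(ctx, &attr) == 0)
                printf("%s: %d physical port(s), up to %d queue pairs\n",
                       ibv_get_device_name(devices[i]),
                       attr.phys_port_cnt, attr.max_qp);
            ibv_close_device(ctx);
        }
        ibv_free_device_list(devices);
        return 0;
    }

Actually moving data over RDMA takes considerably more setup (protection domains, memory registration and queue pair exchange), but the enumeration above is enough to show that the verbs interface itself is transport-agnostic.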

