InfiniBand


The panel of an InfiniBand switch

InfiniBand is a switched fabric communications link used in high-performance computing and enterprise data centers. Its features include high throughput, low latency, quality of service and failover, and it is designed to be scalable. The InfiniBand architecture specification defines a connection between processor nodes and high-performance I/O nodes such as storage devices.

InfiniBand forms a superset of the Virtual Interface Architecture.

Description

Effective theoretical throughput (actual data rate, not signaling rate)

       SDR          DDR          QDR          FDR           EDR           HDR            NDR
1X     2 Gbit/s     4 Gbit/s     8 Gbit/s     14 Gbit/s     25 Gbit/s     125 Gbit/s     750 Gbit/s
4X     8 Gbit/s     16 Gbit/s    32 Gbit/s    56 Gbit/s     100 Gbit/s    500 Gbit/s     3000 Gbit/s
12X    24 Gbit/s    48 Gbit/s    96 Gbit/s    168 Gbit/s    300 Gbit/s    1500 Gbit/s    9000 Gbit/s

Like Fibre Channel, PCI Express, Serial ATA, and many other modern interconnects, InfiniBand offers point-to-point bidirectional serial links intended for the connection of processors with high-speed peripherals such as disks. In addition to its point-to-point capabilities, InfiniBand offers multicast operations. It supports several signaling rates and, as with PCI Express, links can be bonded together for additional throughput.

Signaling rate

The SDR serial connection's signaling rate is 2.5 gigabits per second (Gbit/s) in each direction per connection. DDR is 5 Gbit/s and QDR is 10 Gbit/s. FDR is 14.0625 Gbit/s and EDR is 25.78125 Gbit/s per lane.

For SDR, DDR and QDR, links use 8B/10B encoding: every 10 bits sent carry 8 bits of data, making the effective data transmission rate four-fifths of the raw rate. Thus single, double, and quad data rates carry 2, 4, or 8 Gbit/s of useful data, respectively. For FDR and EDR, links use 64B/66B encoding: every 66 bits sent carry 64 bits of data. (Neither of these calculations takes into account the additional physical layer overhead for comma characters or protocol requirements such as StartOfFrame and EndOfFrame.)

Implementers can aggregate links in units of 4 or 12, called 4X or 12X. A 12X QDR link therefore carries 120 Gbit/s raw, or 96 Gbit/s of useful data. As of 2009 most systems use a 4X aggregate, implying a 10 Gbit/s (SDR), 20 Gbit/s (DDR) or 40 Gbit/s (QDR) connection. Larger systems with 12X links are typically used for cluster and supercomputer interconnects and for inter-switch connections.
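The arithmetic above can be reproduced with a short sketch. It uses only the per-lane signaling rates and encoding ratios quoted in this section; the names `SIGNALING_RATE`, `ENCODING_EFFICIENCY` and `link_rates` are hypothetical, and, like the text, the calculation ignores framing and other physical-layer overhead.

```python
from fractions import Fraction

# Per-lane signaling rates in Gbit/s, as quoted in the text.
SIGNALING_RATE = {
    "SDR": Fraction("2.5"),
    "DDR": Fraction(5),
    "QDR": Fraction(10),
    "FDR": Fraction("14.0625"),
    "EDR": Fraction("25.78125"),
}

# Line-code efficiency: payload bits per transmitted bit.
ENCODING_EFFICIENCY = {
    "SDR": Fraction(8, 10),   # 8B/10B
    "DDR": Fraction(8, 10),   # 8B/10B
    "QDR": Fraction(8, 10),   # 8B/10B
    "FDR": Fraction(64, 66),  # 64B/66B
    "EDR": Fraction(64, 66),  # 64B/66B
}


def link_rates(rate, lanes):
    """Return (raw, effective) throughput in Gbit/s, per direction.

    Illustrative helper only: it multiplies the per-lane signaling rate
    by the lane count and then by the encoding efficiency.
    """
    if lanes not in (1, 4, 12):
        raise ValueError("link width must be 1X, 4X or 12X")
    raw = SIGNALING_RATE[rate] * lanes
    return raw, raw * ENCODING_EFFICIENCY[rate]


if __name__ == "__main__":
    for rate in SIGNALING_RATE:
        for lanes in (1, 4, 12):
            raw, effective = link_rates(rate, lanes)
            print(f"{lanes:>2}X {rate}: raw {float(raw):8.3f} Gbit/s, "
                  f"effective {float(effective):8.3f} Gbit/s")
```

Under these assumptions a 12X QDR link comes out at 120 Gbit/s raw and 96 Gbit/s effective, matching the figures above. The exact FDR result is about 13.64 Gbit/s per lane, which is usually rounded to 14 Gbit/s.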

The InfiniBand roadmap also includes "HDR" (High Data Rate), due in 2014, and "NDR" (Next Data Rate), due "some time later", but as of June 2010 these data rates were not yet tied to specific speeds.[1]

Latency

Single data rate switch chips have a latency of 200 nanoseconds, DDR switch chips 140 nanoseconds, and QDR switch chips 100 nanoseconds. End-to-end latency ranges from 1.07 microseconds MPI latency (Mellanox ConnectX QDR HCAs), through 1.29 microseconds MPI latency (Qlogic InfiniPath HCAs), to 2.6 microseconds (Mellanox InfiniHost DDR III HCAs).[citation needed] As of 2009 various InfiniBand host channel adapters (HCAs) exist in the market, each with different latency and bandwidth characteristics. InfiniBand also provides RDMA capabilities for low CPU overhead. The latency for RDMA operations is less than 1 microsecond (Mellanox ConnectX HCAs).
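As a rough illustration of how the per-chip switch latencies quoted above accumulate along a path, the following sketch multiplies the chip latency by the number of switch chips traversed. The linear model and the function name `switch_path_latency_ns` are assumptions made for illustration: the sketch ignores cable propagation, HCA and software overhead, so it is not a prediction of end-to-end latency.

```python
# Per-chip switch latencies in nanoseconds, as quoted in the text.
SWITCH_CHIP_LATENCY_NS = {"SDR": 200, "DDR": 140, "QDR": 100}


def switch_path_latency_ns(rate, switch_hops):
    """Sum switch-chip latency over a path (hypothetical linear model).

    Assumes every chip on the path is of the same generation and that
    per-chip latencies simply add up.
    """
    if switch_hops < 0:
        raise ValueError("switch_hops must be non-negative")
    return SWITCH_CHIP_LATENCY_NS[rate] * switch_hops


if __name__ == "__main__":
    for rate in SWITCH_CHIP_LATENCY_NS:
        total = switch_path_latency_ns(rate, 3)
        print(f"{rate}: {total} ns across 3 switch chips")
```

For example, under this model a path crossing three QDR switch chips would spend 300 nanoseconds in the switches.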

Topology

InfiniBand uses a switched fabric topology, as opposed to a hierarchical switched network like traditional Ethernet architectures, although emerging Ethernet fabric architectures offer many benefits that could see Ethernet replace InfiniBand.[2] Most InfiniBand network topologies are fat-tree (Clos), mesh or 3D torus. Recent papers (ISCA'10) demonstrated butterfly topologies as well.[citation needed]

As in the channel model used in most mainframe computers, all transmissions begin or end at a channel adapter. Each processor contains a host channel adapter (HCA) and each peripheral has a target channel adapter (TCA). These adapters can also exchange information for security or quality of service.

Messages

InfiniBand transmits data in packets of up to 4 KB that are taken together to form a message. A message can be:

Programming

InfiniBand has no standard programming API within the specification. The standard only lists a set of "verbs", functions that must exist; the syntax of these functions is left to the vendors. The de facto standard to date has been the syntax developed by the OpenFabrics Alliance, which was adopted by most InfiniBand vendors for Linux, FreeBSD, and Windows. The InfiniBand software stack developed by the OpenFabrics Alliance is released as the "OpenFabrics Enterprise Distribution (OFED)" under a choice of two licenses, GPL2 or BSD, for Linux and FreeBSD, and as "WinOF" under the BSD license for Windows.

History

InfiniBand originated from the 1999 merger of two competing designs:

  1. Future I/O, developed by Compaq, IBM, and Hewlett-Packard
  2. Next Generation I/O (ngio), developed by Intel, Microsoft, and Sun

From the Compaq side, the roots of the technology derived from Tandem's ServerNet. For a short time before the group came up with a new name, InfiniBand was called System I/O.[3]

InfiniBand was originally envisioned[by whom?] as a comprehensive "system area network" that would connect CPUs and provide all high-speed I/O for "back-office" applications. In this role it would potentially replace just about every datacenter I/O standard, including PCI, Fibre Channel, and various networks like Ethernet. Instead, all of the CPUs and peripherals would be connected into a single pan-datacenter switched InfiniBand fabric. This vision offered a number of advantages in addition to greater speed, not the least of which was that the I/O workload would be largely lifted from computers and storage. In theory, this should make the construction of clusters much easier, and potentially less expensive, because more devices could be shared and they could be easily moved around as workloads shifted. Proponents of a less comprehensive vision saw InfiniBand as a pervasive, low-latency, high-bandwidth, low-overhead interconnect for commercial datacenters, albeit one that might only connect servers and storage to each other, while leaving more local connections to other protocols and standards such as PCI.[citation needed]

As of 2009 InfiniBand has become a popular interconnect for high-performance computing, and its adoption in the TOP500 supercomputers list is growing faster than that of Ethernet.[4] In recent years InfiniBand has been increasingly adopted in enterprise datacenters (for example, Oracle Exadata and Exalogic machines), the financial sector, cloud computing (an InfiniBand-based system won Best of VMworld for Cloud Computing) and elsewhere. InfiniBand has mostly been used for high-performance computer cluster applications. A number of the TOP500 supercomputers have used InfiniBand, including the former[5] reigning fastest supercomputer, the IBM Roadrunner. In another example of InfiniBand use within high-performance computing, the Cray XD1 uses built-in Mellanox InfiniBand switches to create a fabric between HyperTransport-connected Opteron-based compute nodes.[citation needed]

SGI, LSI, DDN, Oracle, and Rorke Data, among others, have also released storage utilizing InfiniBand "target adapters". These products essentially compete with architectures such as Fibre Channel, SCSI, and other more traditional connectivity methods. Such target adapter-based disks can become part of the fabric of a given network, in a fashion similar to DEC VMS clustering. The advantage of this configuration is lower latency and higher availability to nodes on the network (because of the fabric nature of the network). In 2009, the Oak Ridge National Laboratory Spider storage system used this type of InfiniBand-attached storage to deliver over 240 gigabytes per second of bandwidth.

InfiniBand uses copper CX4 cable for SDR and DDR rates; the same cable is commonly used to connect SAS (Serial Attached SCSI) HBAs to external SAS disk arrays. With SAS, this is known as an SFF-8470 connector, and is referred to as an "Infiniband style" connector.[6] The latest connectors used with QDR-capable solutions are QSFP (Quad SFP).

In 2008 Oracle Corporation released its HP Oracle Database Machine, built as a RAC (Real Application Clusters) database with storage provided by its Exadata Storage Server, which uses InfiniBand as the backend interconnect for all I/O and interconnect traffic. Updated versions of the Exadata Storage system, now using Sun computing hardware, continue to use InfiniBand infrastructure.

In 2009, IBM announced a December 2009 release date for its DB2 pureScale offering, a shared-disk clustering scheme (inspired by Parallel Sysplex for DB2 z/OS) that uses a cluster of IBM System p servers (POWER6/7) communicating with each other over an InfiniBand interconnect.

In 2010, scale-out network storage manufacturers increasingly adopted InfiniBand as the primary cluster interconnect for modern NAS designs, such as Isilon IQ or IBM SONAS. Since scale-out systems run distributed metadata operations without a "master node", low-latency internal communication is a critical success factor for scalability and performance (see TOP500 cluster architectures).

In 2010, Oracle released Exadata and Exalogic machines that implement InfiniBand QDR at 40 Gbit/s (32 Gbit/s effective) using Sun switches (Sun Network QDR InfiniBand Gateway Switch). The InfiniBand fabric connects the compute nodes to each other and to the storage, and is also used to connect several Exadata and Exalogic machines together.

In June 2011, FDR switches and adapters were announced at the International Supercomputing Conference.[7]

Wikimedia Foundation. 2010.
