FPGA-PCI104 enables cluster computing distributed switch
By Graham Campbell, PhD, and Daniel T. O’Leary

Ether2 Corp. had acquired the rights to a new communications technology and wanted to demonstrate it with a prototype that interconnected off-the-shelf PCs so that they could operate as a distributed cluster of computers. Prototypes of digital communications systems are typically built using FPGAs, but no PCI FPGA development boards were available with the right mix of features. The solution was to use the FPGA-PCI104, a PCI-104 form factor board offered by Tri-M Systems. This article describes the design considerations that led to the selection of the FPGA-PCI104 and the steps taken to use it while implementing the Distributed Queue Switch Architecture (DQSA) Network Interface Card (NIC) in an off-the-shelf PC.

The task

Ether2 is a start-up company based in Los Angeles, CA, and Vancouver, BC. The company had obtained an exclusive license from the Illinois Institute of Technology for a new, patented communications switching technology. Ether2 had limited resources yet wanted to build a prototype that would demonstrate the technology in a cluster computing environment. The switching technology, DQSA, requires only a DQSA NIC in each of the clustered computers; all switching is managed in the NICs, so no central router or switch is required except for a simple hub at the center of the network.

High-performance computing was at one time the realm of very expensive, often custom-designed supercomputers such as the Cray-1 and IBM’s Deep Blue. However, this type of computing is increasingly carried out using clusters of standard PCs or Macs that are interconnected by a switch. A given problem is solved by partitioning it and sending each segment to a separate processor, which solves its part of the problem in parallel with all the other processors. In many such problems, the processors must periodically communicate with each other and stop processing until they receive a response.
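The cost of these stalls can be illustrated with a rough back-of-the-envelope model, offered here as a sketch and not as anything taken from the DQSA design. It assumes that each processor alternates between a fixed compute interval and one blocking request/response exchange costing two one-way network latencies. The function name `parallel_efficiency` and all numeric values are hypothetical.

```python
def parallel_efficiency(compute_us: float, one_way_latency_us: float) -> float:
    """Fraction of wall-clock time a processor spends computing, not waiting.

    Assumption (hypothetical model): every compute interval is followed by
    one blocking request/response exchange costing two one-way latencies.
    """
    if compute_us < 0 or one_way_latency_us < 0:
        raise ValueError("times must be non-negative")
    stall_us = 2.0 * one_way_latency_us
    total_us = compute_us + stall_us
    return compute_us / total_us if total_us > 0 else 0.0


if __name__ == "__main__":
    compute_us = 50.0  # hypothetical compute time between exchanges
    for latency_us in (0.25, 3.0, 100.0):  # illustrative one-way latencies
        eff = parallel_efficiency(compute_us, latency_us)
        print(f"one-way latency {latency_us:6.2f} us -> efficiency {eff:.1%}")
```

Under this simple model, the shorter the compute interval between exchanges, the more the interconnect latency dominates total run time, which is why fine-grained parallel programs are the most sensitive to it.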
The computers are interconnected using standard network technologies, such as Ethernet, with CAT5 cables and switches. Thus, the speed of the switch – that is, the latency or time required to establish a circuit through the switch – can have a significant impact on the performance of the supercomputer. The choices for the switches range from very low-cost, off-the-shelf “consumer” Ethernet switches to custom non-blocking switches such as the Myricom Myrinet switch. In general, the more expensive the switch, the lower the latency. Typical switch latencies range from 3 µs to more than 100 µs.

Ether2’s DQSA technology eliminates the need for a switch and, instead, places all computers on a common bus. A common bus could be regarded as a bottleneck, but if there is sufficient bandwidth, a simple common bus will provide better performance than a non-blocking switch. Transmission speeds of 40 Gbps are now available in the SONET hierarchy (OC-768), and equivalent Ethernet components are under development. For a large class of programs, a DQSA distributed switch could perform as well as, or better than, a much more costly, low-latency custom switch. A further advantage is that the latency of a DQSA distributed switch decreases as transmission speed increases, with latency of less than 0.25 µs possible at 40 Gbps.

The role of the prototype

Although DQSA is well-documented and several proof-of-concept systems had been built, Ether2 required a new prototype system to measure performance and to demonstrate the technology to potential investors. The best approach seemed to be to build the new prototype cluster system using off-the-shelf PCs, with each PC hosting a custom NIC that implemented the DQSA functions but used a conventional 10/100BASE-TX Ethernet physical layer interface driving CAT5 cables. Prototypes of new digital communications systems, such as the custom DQSA NIC, are typically built using FPGAs.
But there were no “PC style” PCI-FPGA development boards available with a mix of I/O features suitable for a DQSA NIC. The solution was to use the Tri-M FPGA-PCI104. The FPGA-PCI104 contains an Altera Cyclone FPGA, a PCI-compliant interface, and an innovative, highly flexible daughtercard capability. It was used for the custom NICs in a four-node, PC-based cluster system running DQSA at 100 Mbps. The daughtercard capability was the key feature that made it possible to design and build a small number of custom NICs.

DQSA is suitable for use at any speed and over other physical media such as optical fiber, coaxial cable, and wireless. However, since 100 Mbps CAT5 wiring and technology is relatively simple and well-understood, Ether2 chose to build the prototype system at that speed.

The hardware

The Tri-M FPGA-PCI104 (see Figure 1) contains an Altera Cyclone FPGA, available with 4,000, 12,000, or 20,000 Logic Elements. There is also a full PCI-104 target interface, plus various support circuits such as a 60 MHz oscillator (with an SMB connector for an external oscillator), a real-time counter with SuperCap backup, and an SMBus interface. Furthermore, and of key interest to Ether2, there is a set of connectors that provides a highly flexible daughtercard system: four identical connectors, each with 24 pins wired directly to its own unique set of pins on the Cyclone FPGA, plus 10 more FPGA signals shared among all the connectors.

This daughtercard capability let Ether2 design a relatively simple, easy-to-build, low-cost daughtercard for the prototype, containing just the parts needed to support the 10/100BASE-TX Ethernet physical layer for use with CAT5 cables.

Off-the-shelf versus custom
In today’s electronics design world, there is a constant push to put components in ever-smaller, ever-denser packaging. The Cyclone FPGA on the FPGA-PCI104 board is in a 324-contact, 19 mm square Ball Grid Array package with 1 mm contact spacing (and some other parts on the board have lead spacing as small as 0.6 mm). A big problem with such small, dense packaging is that it is often all but impossible to build “just a few prototypes” of a custom circuit that uses recent state-of-the-art components such as FPGAs. Instead, an off-the-shelf board that has some or all of the desired small, densely packaged components is usually the only practical answer.

But there is a drawback to most off-the-shelf boards: Seldom do such boards have exactly the right mix of required I/O components for any particular application – in this case, an Ethernet 10/100BASE-TX PHY chip with associated CAT5 circuits and connector wired directly to a suitable FPGA. On the Tri-M FPGA boards, however, one-half of the Cyclone FPGA’s I/O pins (106 to be exact) connect directly and only to the four daughtercard connectors. The other FPGA pins connect to the PCI bus interface, the onboard auxiliary circuits, and power. Thus, the daughtercards can implement exactly, and only, the I/O circuitry required for a particular application. Moreover, if an application does not need all four of the daughtercard connectors, the unused ones are available for other purposes. Tri-M has published a standardized set of daughtercard outlines and PCB layout template files.

For Ether2’s DQSA prototype system, it was only necessary to design and build a daughtercard carrying a through-hole CAT5 connector/transformer, a surface-mount 10/100BASE-TX PHY chip, and some resistors, capacitors, and so on. Lead spacing was large enough to accommodate hand-soldering. The PCB had four layers – not suitable for “basement lab” fabrication, perhaps – but was much easier to design and much cheaper to make than the 10-layer PCB of the FPGA-PCI104.

Because the Ether2 NIC used only two of the daughtercard sites, two of Tri-M’s standard daughtercards were also used during development and system testing. A triple RS-232 serial driver/receiver board was used to dump debugging information from the DQSA functions in the FPGA out to a monitor, and a multi-LED card was used for status/event indications.
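The pin figures quoted for the daughtercard system can be cross-checked with a little arithmetic. The sketch below uses only numbers stated in this article (four connectors, 24 dedicated pins each, 10 shared signals, two sites used by the Ether2 NIC); the constant and function names are illustrative and are not taken from Tri-M documentation.

```python
CONNECTORS = 4                     # identical daughtercard connectors
DEDICATED_PINS_PER_CONNECTOR = 24  # FPGA pins wired to one connector only
SHARED_SIGNALS = 10                # FPGA signals common to all connectors


def total_daughtercard_fpga_pins(connectors: int = CONNECTORS) -> int:
    """FPGA pins routed to the daughtercard system as a whole."""
    return connectors * DEDICATED_PINS_PER_CONNECTOR + SHARED_SIGNALS


def pins_available_to_card(sites_used: int) -> int:
    """FPGA signals reachable by a card occupying `sites_used` connectors.

    The shared signals are counted once, since they are the same FPGA pins
    on every connector.
    """
    if not 1 <= sites_used <= CONNECTORS:
        raise ValueError("sites_used must be between 1 and 4")
    return sites_used * DEDICATED_PINS_PER_CONNECTOR + SHARED_SIGNALS


if __name__ == "__main__":
    print(total_daughtercard_fpga_pins())  # 4 * 24 + 10
    print(pins_available_to_card(2))       # a two-site card such as the NIC
```

The total agrees with the 106 daughtercard pins cited in the text. A card spanning two sites can reach at most 58 FPGA signals, shared ones included; how many of those the Ether2 daughtercard actually used is not stated in the article.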
Another unique feature of the FPGA-PCI104 proved of value to Ether2. The circuitry did not require the entire space available on a PCI-104 board, so the board was “notched” at the sides, as shown in Figure 2, to allow greater height for the daughtercards. This configuration accommodates quite tall parts on daughtercards (for example, a CAT5 RJ-45 connector) without increasing overall space requirements in PCI-104 systems.

Only one final problem required a solution: The PCI-104 interface of the FPGA-PCI104 had to be adapted to a PC-style “desktop” PCI bus physical connector. The PCI-104 bus fully complies with the 32-bit PCI standard, so adapting was simple: mount the FPGA-PCI104 on a PCI-104 to PCI bus adapter (WinSystems’ PCI-PPM). See Figure 3 for a photo of the final DQSA NIC board.
As a result, Ether2’s requirement was satisfied: a NIC that could support DQSA and that was both possible and affordable to build in quantities of “just a few.”

Project complete

Ten Ether2 NIC boards and two hubs were completed on time, adequately supporting two four-node DQSA systems plus spares. The systems are now undergoing initial testing and will soon be serving their intended purpose – demonstrating the DQSA technology and generating performance statistics.

Distributed Queue Switch Architecture
The DQSA, developed at the Illinois Institute of Technology, addresses a major problem in communications: The world’s circuit-switched telecom infrastructure is ill-suited for the packet traffic that today dominates communications, while the router-based, packet-switched infrastructure that has been installed to support this packet traffic and the Internet is ill-suited to providing quality of service. Even the combination of these two distinct communications technologies still does not provide a level of service that supports the long sought-after goal of voice, video, and data convergence.

DQSA allows the basic circuit-switched communications infrastructure to operate simultaneously as a circuit-switched network and as a packet-switched network, thereby eliminating the need for separate networks dedicated only to packet switching and the routers they utilize (refer to Sidebar Figure 1). DQSA provides this simultaneous support of packet traffic and circuit-oriented traffic in wireless networks, local, metropolitan, and wide area networks, satellite networks, and in virtually all other types of communications networks regardless of the medium, distance covered, or transmission speed. DQSA provides efficient switching of short messages – a particular requirement of cluster computing.

Papers, simulations, and other material on DQSA are available at: www.iit.edu/~dqrap

Graham Campbell is currently the chief science officer of Ether2. He was a professor of computer science and engineering at the Illinois Institute of Technology until his retirement, and thereafter conducted research in the field of communications.

Dan O’Leary is currently a lead design engineer with Ether2, responsible for the development of hardware, firmware, and FPGA IP for DQSA products. He has more than 30 years of experience designing embedded computer products and systems, microprocessor/DSP/mixed-signal digital/analog circuits, and related software.
Reprinted from PC/104 Embedded Solutions, Spring 2006. Copyright 2006.
For more information, please contact Tri-M Systems' Sales Department: sales@tri-m.com, 1.800.665.5600