What Cable Do I Need? The Complete Guide to GPU Cluster Interconnects in 2026
Building a GPU cluster is mostly a hardware procurement problem — until you hit the cabling. Then it becomes a compatibility research project that can stall a build for days.
This guide is the single reference you should have had from the start. We've organized it by the most common questions buyers actually search for.
The Quick-Reference Table: GPU System → Cable
| System / Use Case | Cable Type | Form Factor | Speed | Max Distance | Example Part # |
|---|---|---|---|---|---|
| DGX Spark ↔ DGX Spark | Passive DAC | QSFP56 | 200G | 0.5m–3m | MCP1650-V00AE30 |
| H100 SXM5 intra-node | NVLink (on-board) | N/A | 900 GB/s | On-board | N/A |
| H100 / A100 cluster (InfiniBand) | Active DAC or AOC | QSFP56 (HDR) / OSFP (NDR) | 200G–400G | DAC: ≤5m / AOC: ≤100m | MFS1S50-H005E |
| 400G spine-leaf fabric | Passive DAC or AOC | QSFP-DD | 400G | DAC: ≤3m / AOC: ≤100m | MCP1660-W002E30 |
| 400G to 2×200G breakout | Passive Breakout DAC | QSFP-DD to 2×QSFP56 | 400G↔2×200G | ≤2m | MCP7H60-W002 |
| 100G server-to-switch | Passive DAC or AOC | QSFP28 | 100G | DAC: ≤5m / AOC: ≤100m | MFA1A00-C010 |
| 800G next-gen fabric | Passive Breakout DAC | OSFP to 2×OSFP | 800G↔2×400G | ≤2m | MCP7Y00-N002 |
DAC vs. AOC vs. Fiber Transceiver: When Do You Use Each?
This is the most fundamental decision in cluster cabling, and it comes down to distance and cost.
Direct Attach Copper (DAC): A fixed-length copper cable with a connector module permanently attached to each end, so no separate optics are needed. Passive DAC contains no active electronics and draws essentially no power, which makes it the cheapest and most reliable option for short runs. The trade-off is distance: passive DAC is reliable up to about 3–5 meters (the higher the speed, the shorter the reach), and active DAC, which adds signal-conditioning circuitry, extends that to about 7 meters. If your switches and servers are in the same rack or adjacent racks, DAC is almost always the right answer.
Active Optical Cable (AOC): An AOC uses fiber internally but presents as a fixed cable with transceivers on both ends, so there are no separate optics to source. AOC reaches up to 100 meters over multimode fiber and draws a modest amount of power at each end (more than passive DAC, which draws essentially none). It's the right choice when you need to cross a row, reach a switch from a distant server, or span a patch panel. The trade-off vs. DAC is cost: AOC typically costs 3–5x more for the same form factor and speed.
Discrete Fiber Transceiver + Fiber Patch Cable: This is the most flexible option. You install a transceiver (QSFP28, QSFP56, QSFP-DD) in the switch or NIC port and run a separate fiber patch cable. This lets you swap transceivers independently, run much longer distances (up to about 100m on SR4 over multimode fiber, 500m on DR4 over single-mode fiber, and many kilometers on LR variants), and mix and match optics and fiber types. It's the right architecture for multi-row or multi-room deployments. The trade-off is more components to source and manage.
The simple rule: Same rack → DAC. Adjacent racks or across a room → AOC. Multiple rooms or buildings → discrete transceiver + fiber.
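The rule of thumb above can be sketched as a small helper. This is a minimal illustration, not a sizing tool: the function name `pick_cable` and the thresholds (3 m for passive DAC, 100 m for AOC) are assumptions drawn from the figures in this guide, and real limits depend on link speed and vendor.

```python
def pick_cable(distance_m: float) -> str:
    """Return a cable class for a link of the given length in meters.

    Thresholds are illustrative assumptions taken from this guide:
    passive DAC up to ~3 m, AOC up to ~100 m, discrete optics beyond that.
    """
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    if distance_m <= 3:
        return "passive DAC"
    if distance_m <= 100:
        return "AOC"
    return "transceiver + fiber"


if __name__ == "__main__":
    # A few example link lengths in meters.
    for d in (0.5, 2, 15, 250):
        print(f"{d} m -> {pick_cable(d)}")
```

The 3 m cutoff is deliberately conservative; for 100G links you could reasonably raise it toward 5 m.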
What cable connects two NVIDIA DGX Spark systems?
The DGX Spark uses a ConnectX-7 QSFP port on the rear. To connect two DGX Sparks directly for dual-system workloads, you need a 0.5-meter 200G QSFP56 passive DAC cable. Note that this link runs over the ConnectX-7 network port, not NVLink.
The 0.5m length is intentional — DGX Sparks are compact desktop units designed to sit side by side, and shorter cable means less signal loss and a cleaner installation.
Compatible part: MCP1650-V00AE30 (NVIDIA/Mellanox compatible 200G QSFP56 Passive DAC, 0.5m). Resilient Tec stocks this specifically for DGX Spark dual-system deployments.
What interconnect does an H100 cluster need?
This depends on whether your H100s are NVL (PCIe) or SXM5.
For H100 SXM5 systems (the highest-performance H100 variant, used in DGX H100 and HGX H100), intra-node GPU-to-GPU communication uses NVLink, a proprietary on-board interconnect switched through NVSwitch chips. You don't cable NVLink; it's part of the baseboard.
For inter-node communication (connecting multiple H100 servers into a training cluster), NVIDIA's reference architecture uses InfiniBand, specifically NDR InfiniBand at 400 Gb/s per port. NDR uses OSFP connectors on Quantum-2 switches and OSFP or QSFP112 on ConnectX-7 adapters; QSFP-DD is the 400G Ethernet form factor. The cables are either passive DAC for short rack-to-rack runs or AOC for longer reaches.
For H100 PCIe (NVL) systems, the same inter-node InfiniBand architecture applies, though some builds use RoCE v2 over Ethernet as a more cost-effective alternative with only modest performance trade-offs for many workloads.
What's the difference between InfiniBand, RoCE v2, and NVLink for AI training?
These three protocols solve different parts of the communication problem.
NVLink is NVIDIA's proprietary GPU-to-GPU interconnect within a single server or NVSwitch fabric. It runs at extremely high bandwidth (900 GB/s per GPU on H100 SXM5) and very low latency. You can't replicate this with network cabling; it is part of the server's hardware design.
InfiniBand is a purpose-built low-latency RDMA (Remote Direct Memory Access) network fabric used for inter-node communication between servers. It's the gold standard for distributed AI training because of its extremely low latency, high bandwidth, and native support for RDMA operations that bypass the CPU. NDR InfiniBand (400 Gb/s per port) is the current generation used in H100 deployments. HDR (200 Gb/s) is still common in A100 clusters.
RoCE v2 (RDMA over Converged Ethernet) provides the same RDMA semantics as InfiniBand but runs over standard Ethernet hardware. The appeal is cost: 400G Ethernet switches are more commoditized than InfiniBand switches, and the cable/transceiver ecosystem is larger. The trade-off is latency: RoCE v2 typically has somewhat higher latency than InfiniBand, which matters most for tightly coupled, synchronous training workloads. For inference, fine-tuning, or less synchronization-intensive training, RoCE v2 is a legitimate and increasingly popular choice.
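To put the bandwidth figures above on the same scale, note that NVLink is quoted in gigabytes per second while network ports are quoted in gigabits per second. The sketch below does the unit conversion. It is a rough comparison only: it ignores protocol overhead and the fact that vendors quote these figures differently (aggregate vs. per direction), and the helper name `gbit_to_gbyte` is made up for this example.

```python
def gbit_to_gbyte(gbit_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_per_s / 8


NVLINK_GBYTE_S = 900   # GB/s, NVLink figure quoted above
NDR_PORT_GBIT_S = 400  # Gb/s, one NDR InfiniBand port
HDR_PORT_GBIT_S = 200  # Gb/s, one HDR InfiniBand port

ndr_gbyte_s = gbit_to_gbyte(NDR_PORT_GBIT_S)  # 50.0 GB/s
hdr_gbyte_s = gbit_to_gbyte(HDR_PORT_GBIT_S)  # 25.0 GB/s

print(f"NDR port: {ndr_gbyte_s} GB/s, NVLink is {NVLINK_GBYTE_S / ndr_gbyte_s:.0f}x")
print(f"HDR port: {hdr_gbyte_s} GB/s, NVLink is {NVLINK_GBYTE_S / hdr_gbyte_s:.0f}x")
```

Under these assumptions a single 400 Gb/s port carries about 50 GB/s, roughly 1/18 of the NVLink figure, which is why inter-node fabrics complement NVLink rather than replace it.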
Form factor guide: QSFP28 vs. QSFP56 vs. QSFP-DD vs. OSFP
| Form Factor | Speed | Common Use Case |
|---|---|---|
| QSFP28 | 100G | 100GbE servers, older AI clusters |
| QSFP56 | 200G | HDR InfiniBand, H100/A100 inter-node |
| QSFP-DD | 400G | 400GbE spine-leaf, RoCE v2 fabrics |
| OSFP | 400G–800G | NDR InfiniBand, next-gen AI fabric, 800G deployments |
QSFP-DD ports are backward compatible with QSFP56 and QSFP28 modules and cables: the older plug fits the same cage without an adapter and simply leaves the second row of contacts unused. This matters for phased upgrades: you can deploy QSFP-DD switches today and run QSFP56 cables to your current H100 servers, then upgrade the cables when you move to next-gen hardware.
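The backward-compatibility point can be expressed as a small lookup. This is a sketch under stated assumptions: the `ACCEPTS` table encodes only mechanical fit within the QSFP family, the function name `link_speed` is hypothetical, and whether a given switch actually negotiates the lower speed depends on its firmware and port configuration. OSFP is left out because it uses a different cage.

```python
from typing import Optional

# Nominal per-port speed in Gb/s for each form factor, from the table above.
SPEED_G = {"QSFP28": 100, "QSFP56": 200, "QSFP-DD": 400}

# Which module types fit which port cage (assumption: the QSFP family is
# backward compatible, so a newer port accepts older modules, not vice versa).
ACCEPTS = {
    "QSFP28": {"QSFP28"},
    "QSFP56": {"QSFP28", "QSFP56"},
    "QSFP-DD": {"QSFP28", "QSFP56", "QSFP-DD"},
}


def link_speed(port: str, module: str) -> Optional[int]:
    """Return the nominal link speed in Gb/s, or None if the module does not fit."""
    if module not in ACCEPTS.get(port, set()):
        return None
    # The link runs at the slower of the two ends.
    return min(SPEED_G[port], SPEED_G[module])


print(link_speed("QSFP-DD", "QSFP56"))  # QSFP56 cable in a QSFP-DD port
print(link_speed("QSFP28", "QSFP-DD"))  # newer module in an older port
```

The first call models the phased-upgrade case described above: a QSFP56 cable in a QSFP-DD port links at the cable's 200G rate.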
What's the maximum distance for each cable type?
| Cable Type | Max Reliable Distance |
|---|---|
| Passive DAC (100G–400G) | 3–5 meters |
| Active DAC (100G–400G) | 7 meters |
| AOC (100G–400G) | 30–100 meters (OM3/OM4) |
| SR transceiver + MMF | 70–100 meters (OM3/OM4) |
| DR/FR transceiver + SMF | 500m–2km |
| LR transceiver + SMF | 10km |
For AI cluster deployments, the vast majority of links use passive DAC (same-rack or adjacent-rack) or AOC (across rows). Long-reach fiber is reserved for multi-building campus connectivity or storage replication.
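One way to use the distance table is to check a cabling plan before ordering. The sketch below is illustrative only: the link names and lengths in `plan` are hypothetical, the function name `over_reach` is made up, and where the table gives a range the upper bound is used, which is an optimistic assumption.

```python
# Maximum reach in meters per cable type, taken from the table above.
# Where the table gives a range, the upper bound is used (an assumption).
MAX_REACH_M = {
    "passive DAC": 5,
    "active DAC": 7,
    "AOC": 100,
    "SR + MMF": 100,
    "DR/FR + SMF": 2000,
    "LR + SMF": 10_000,
}


def over_reach(links):
    """Return the links whose length exceeds the reach of their cable type.

    `links` is an iterable of (name, cable_type, length_m) tuples.
    """
    return [
        (name, cable, length_m)
        for name, cable, length_m in links
        if length_m > MAX_REACH_M[cable]
    ]


# Hypothetical cabling plan for illustration only.
plan = [
    ("gpu01 -> leaf1", "passive DAC", 2),
    ("gpu09 -> leaf1", "passive DAC", 8),
    ("leaf1 -> spine1", "AOC", 30),
]

for name, cable, length_m in over_reach(plan):
    print(f"{name}: {length_m} m is too long for {cable}")
```

In this plan only the 8 m passive DAC run is flagged; swapping it for an AOC would bring it within reach.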
Should I use breakout cables? When do they make sense?
A breakout cable takes a single high-speed port and splits it into multiple lower-speed connections. The most common example in AI infrastructure is 400G QSFP-DD to 2×200G QSFP56: a single port on a 400G spine switch fans out to connect two servers with 200G NICs.
Breakout cables make sense when your spine switches run a higher speed than your server NICs. They let you maximize port density on expensive spine hardware without buying separate lower-speed switches. The trade-off is flexibility: every leg of the breakout is a fixed length, so the switch port and both servers must stay in predictable, fixed locations, and breakout configurations are harder to repatch.
For DGX-scale clusters, breakout DAC and AOC cables are standard practice for connecting 400G switch ports to 200G server NICs.
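The port-density argument comes down to simple arithmetic. A minimal sketch, assuming each breakout cable splits one switch port into `fanout` server-facing legs; the function name and the 32-server example are hypothetical.

```python
import math


def switch_ports_needed(servers: int, nics_per_server: int = 1, fanout: int = 2) -> int:
    """Return how many switch ports are needed to connect every server NIC.

    `fanout` is the number of legs per breakout cable (1 means no breakout).
    """
    if servers < 0 or nics_per_server < 1 or fanout < 1:
        raise ValueError("servers must be >= 0; nics_per_server and fanout >= 1")
    total_nic_ports = servers * nics_per_server
    # Round up: a partly used breakout still occupies a whole switch port.
    return math.ceil(total_nic_ports / fanout)


# Hypothetical example: 32 servers with one 200G NIC each.
print(switch_ports_needed(32, fanout=1))  # straight cables, one port per NIC
print(switch_ports_needed(32, fanout=2))  # 400G -> 2x200G breakout
```

With a 2-way breakout the same 32 NICs occupy 16 switch ports instead of 32, which is the saving the section describes.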
The bottom line
For most AI cluster builds in 2026, the answer is simpler than it looks: passive DAC for in-rack and adjacent-rack links, AOC when you need to cross a room, and InfiniBand or RoCE v2 as your fabric depending on your latency tolerance and budget. Start with the quick-reference table at the top, match your form factors, and get the right part number before your hardware arrives — cable delays are one of the most avoidable bottlenecks in a cluster deployment.
Not sure which cable fits your specific build? Call us at 888-442-3849 or request a quote — we'll spec it out for you.