Liquid cooling for AI data centres has moved from "nice to have" to "technically required" for dense GPU clusters. A single NVIDIA DGX H100 system dissipates up to 10.2 kW, so a rack holding several of them far exceeds what air cooling can remove from a standard data centre rack. Direct Liquid Cooling (DLC), with cold plates on CPUs and GPUs, is now the standard cooling approach for dense H100 and H200 deployments, while full immersion cooling is emerging for maximum density. This guide covers DLC architecture, vendor options, deployment requirements, and the operational changes liquid cooling demands from enterprise data centre teams.
Why Air Cooling Fails for AI GPU Clusters
DLC Implementation Approaches
| Approach | What Gets Liquid-Cooled | Residual Air Required | PUE Achievable | Retrofit Difficulty |
|---|---|---|---|---|
| Cold Plate DLC (CPU/GPU only) | CPU and GPU dies (the highest heat generators) | Yes, for VRMs, memory, and other components | 1.15–1.25 | Medium: requires coolant distribution units (CDUs) |
| Full Server DLC | CPU, GPU, VRM, memory, NVMe via liquid rails | Minimal: small residual air for storage | 1.05–1.15 | High: custom server design required |
| Rear Door Heat Exchanger | All server heat captured at rack exhaust | Yes: servers still air-cooled internally | 1.20–1.35 | Low: attaches to existing rack rear |
| Single-Phase Immersion | Entire server submerged in dielectric fluid | None | 1.03–1.05 | Very high: bespoke tank infrastructure |
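
To make the PUE column concrete, the sketch below converts each range into the non-IT power it implies, using the definition PUE = total facility power / IT power. The PUE ranges are copied from the table; the 1,000 kW IT load and the function names are illustrative assumptions, not figures from any specific deployment.

```python
# PUE ranges copied from the table above: (low, high) per approach.
PUE_RANGES = {
    "Cold Plate DLC (CPU/GPU only)": (1.15, 1.25),
    "Full Server DLC": (1.05, 1.15),
    "Rear Door Heat Exchanger": (1.20, 1.35),
    "Single-Phase Immersion": (1.03, 1.05),
}


def overhead_kw(it_load_kw: float, pue: float) -> float:
    """Non-IT power (cooling, power distribution, and so on) implied by a PUE."""
    # PUE = total / IT, so total = IT * PUE and overhead = IT * (PUE - 1).
    return it_load_kw * (pue - 1.0)


def overhead_table(it_load_kw: float) -> dict:
    """Map each approach to its (best case, worst case) overhead in kW."""
    return {
        name: (overhead_kw(it_load_kw, low), overhead_kw(it_load_kw, high))
        for name, (low, high) in PUE_RANGES.items()
    }


if __name__ == "__main__":
    IT_LOAD_KW = 1000.0  # hypothetical IT load, chosen only for illustration
    for name, (best, worst) in overhead_table(IT_LOAD_KW).items():
        print(f"{name:32s} {best:5.0f} to {worst:5.0f} kW overhead")
```

Treat the output as a rough comparison only: achieved PUE depends on climate, load profile, and plant design as much as on the cooling approach.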
The coolant distribution unit (CDU) is the heart of a DLC system: it pumps coolant to the server cold plates, controls flow rate and pressure, and monitors supply and return temperatures and flow. Size the CDU for the rack's maximum thermal load plus 20% headroom. For DGX H100 deployments, plan on a 60 kW CDU minimum per rack: four DGX systems at 10.2 kW each come to 40.8 kW, or roughly 49 kW with headroom, and 60 kW leaves margin above that. Water quality management (deionised water or a glycol mixture) is critical, because poor water quality promotes galvanic corrosion in cold plates. Engage your facilities team and the CDU vendor on water chemistry specifications before procurement.
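
The sizing rule above is simple enough to sketch in a few lines. The example below applies the 20% headroom to a rack of DGX H100 systems, picks the smallest unit from a catalogue, and estimates the coolant flow needed from Q = ṁ · c_p · ΔT. The catalogue sizes, the 10 K supply/return temperature difference, and the pure-water properties are assumptions for illustration; use your CDU vendor's figures for real designs.

```python
DGX_H100_KW = 10.2             # per-system figure quoted in this guide
HEADROOM = 0.20                # 20% headroom, as recommended above
WATER_CP_J_PER_KG_K = 4186.0   # specific heat of water (glycol mixes are lower)
WATER_KG_PER_L = 1.0           # approximate density of water


def required_cdu_kw(systems_per_rack: int,
                    system_kw: float = DGX_H100_KW,
                    headroom: float = HEADROOM) -> float:
    """Maximum rack thermal load plus headroom, in kW."""
    return systems_per_rack * system_kw * (1.0 + headroom)


def pick_cdu(required_kw: float, catalogue_kw: list) -> float:
    """Smallest catalogue size that meets the requirement."""
    candidates = [size for size in sorted(catalogue_kw) if size >= required_kw]
    if not candidates:
        raise ValueError("no CDU in the catalogue is large enough")
    return candidates[0]


def coolant_flow_l_per_min(heat_kw: float, delta_t_k: float) -> float:
    """Water flow needed to carry heat_kw at a given supply/return delta."""
    kg_per_s = heat_kw * 1000.0 / (WATER_CP_J_PER_KG_K * delta_t_k)
    return kg_per_s / WATER_KG_PER_L * 60.0


if __name__ == "__main__":
    HYPOTHETICAL_CATALOGUE_KW = [40.0, 60.0, 80.0]  # not real vendor sizes
    ASSUMED_DELTA_T_K = 10.0                        # assumed supply/return delta

    need = required_cdu_kw(systems_per_rack=4)
    print(f"Required capacity: {need:.1f} kW")
    print(f"Selected CDU: {pick_cdu(need, HYPOTHETICAL_CATALOGUE_KW):.0f} kW")
    print(f"Flow at {ASSUMED_DELTA_T_K:.0f} K delta: "
          f"{coolant_flow_l_per_min(need, ASSUMED_DELTA_T_K):.0f} L/min")
```

With four systems per rack the requirement lands just under 49 kW, which is why a 60 kW unit is the first size in this hypothetical catalogue that fits.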
CDUs connect to the facility chilled water loop or a dry cooler loop. Design for free-cooling operation: if your facility water loop can operate at a 40–50°C supply temperature, no chiller is required for DLC, which is a major energy saving. Work with your mechanical engineer on secondary loop isolation (to prevent data centre water from entering building HVAC water), leak detection under the raised floor or at every CDU, and emergency drain-down procedures. Facility water loop modifications typically require 3–6 months of lead time, so plan them ahead of GPU procurement.
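
The free-cooling argument comes down to one comparison: a dry cooler can deliver water no colder than the outdoor air temperature plus its approach temperature, so a warm supply setpoint leaves far more hours in which no chiller is needed. The sketch below is a deliberately simplified model of that check. The 8 K approach, the sample temperatures, and the function names are assumptions; it ignores the CDU heat exchanger's own approach and any humidity effects.

```python
def free_cooling_possible(ambient_c: float,
                          required_supply_c: float,
                          approach_k: float = 8.0) -> bool:
    """True if a dry cooler alone can reach the required supply temperature.

    approach_k is an assumed dry cooler approach (water out minus air in).
    """
    return ambient_c + approach_k <= required_supply_c


def free_cooling_fraction(ambient_temps_c,
                          required_supply_c: float,
                          approach_k: float = 8.0) -> float:
    """Fraction of the sampled temperatures at which no chiller is needed."""
    temps = list(ambient_temps_c)
    if not temps:
        raise ValueError("need at least one ambient temperature sample")
    ok = sum(
        1 for t in temps
        if free_cooling_possible(t, required_supply_c, approach_k)
    )
    return ok / len(temps)


if __name__ == "__main__":
    # Hypothetical ambient samples in degrees C, not real weather data.
    samples = [5.0, 18.0, 27.0, 35.0, 41.0]
    for supply_c in (18.0, 40.0, 50.0):
        share = free_cooling_fraction(samples, supply_c)
        print(f"Supply {supply_c:.0f} C: chiller-free for {share:.0%} of samples")
```

Running it against a year of hourly weather data for your site, rather than five sample points, gives a first estimate of how often the chiller plant could stay off.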