Energy layer
Cluster Buildout Sizer
Go from a power envelope (MW) or a GPU count to the reciprocal, plus cooling tonnage and 5-yr TCO — one chained buildout view.
The engineer question
If I have N MW (or want N GPUs), how big a cluster, how much cooling, and what 5-yr TCO?
Result
- GPUs supportedat nameplate; subtract ~20% for idle/between-job
- 32,051 × B200
- Power envelope (input)
- 50 MW
- IT load (accelerators + servers)+30% server overhead on 32.1 MW of accelerators
- 41.7 MW
- Cooling loadheat ≈ IT load; 1 RT = 3.517 kW
- 11,847 RT
- Recommended cooling
- Direct-to-chip liquid (cold plate)
- Acquisition capex32,051 × ~$40,000 (rough street price)
- $1.28 B
- Power cost (5-yr)@ 70% util, $0.08/kWh
- $94.34 M
- 5-yr TCO (capex + power)excludes networking, storage, facility build — see assumptions
- $1.38 B
- $ / effective PFLOP-yrdense FP16, idealized — for relative comparison only
- $5.5 k
Recommendation
50 MW fits a single hyperscale building; substation-upgrade lead-time 12–24 months in top markets (VA Loudoun, TX Dallas, OR Hillsboro, IA Des Moines). Cooling: direct-to-chip liquid (cold plate) — 1000W/accelerator forces liquid. The binding constraint at this scale is usually power-interconnect timing and networking (NVLink/InfiniBand) cost — neither of which this sizer prices.
Assumptions
- · B200 (Blackwell): 1000 W TDP, ~$40,000 street price, 2250 TFLOPS dense FP16. TDP + overhead reconcile with the Datacenter Power Calculator; price + throughput with the GPU TCO Calculator.
- · Power chain: facility W per accelerator = TDP × 1.3 server overhead × PUE 1.20. GPU↔MW conversion is reversible and assumes nameplate (not utilization-derated) power for facility sizing — the grid contract must cover peak.
- · Cooling: heat to remove = IT load (≈100% of IT power becomes heat), tonnage at 1 RT = 3.517 kW (ASHRAE). The cooling regime is the per-accelerator-density default for the generation, not a per-rack CFD result.
- · TCO: 5-yr energy = TDP × 70% utilization × PUE 1.20 × 8760 h/yr × 5 yr × $0.08/kWh. "Effective PFLOP-yr" = dense FP16 × utilization × count × lifetime — idealized, no MFU applied, so absolute $/PFLOP is optimistic (use for relative comparison).
- · Street prices are ROUGH public estimates (NVIDIA publishes no list price); ±20–40% by volume, SKU, and quarter. Not an audited quote.
- · EXCLUDED: networking fabric (NVLink/NVSwitch/InfiniBand — often 15–30% of a Blackwell rack), storage, host CPU/memory, datacenter build/lease, cooling capex (CDUs, plant), staff/ops, software, financing, resale, water (WUE), and grid-interconnect risk. Single-generation cluster assumed.
Worked example (default inputs)
Result
- GPUs supportedat nameplate; subtract ~20% for idle/between-job
- 32,051 × B200
- Power envelope (input)
- 50 MW
- IT load (accelerators + servers)+30% server overhead on 32.1 MW of accelerators
- 41.7 MW
- Cooling loadheat ≈ IT load; 1 RT = 3.517 kW
- 11,847 RT
- Recommended cooling
- Direct-to-chip liquid (cold plate)
- Acquisition capex32,051 × ~$40,000 (rough street price)
- $1.28 B
- Power cost (5-yr)@ 70% util, $0.08/kWh
- $94.34 M
- 5-yr TCO (capex + power)excludes networking, storage, facility build — see assumptions
- $1.38 B
- $ / effective PFLOP-yrdense FP16, idealized — for relative comparison only
- $5.5 k
Recommendation
50 MW fits a single hyperscale building; substation-upgrade lead-time 12–24 months in top markets (VA Loudoun, TX Dallas, OR Hillsboro, IA Des Moines). Cooling: direct-to-chip liquid (cold plate) — 1000W/accelerator forces liquid. The binding constraint at this scale is usually power-interconnect timing and networking (NVLink/InfiniBand) cost — neither of which this sizer prices.
Assumptions
- · B200 (Blackwell): 1000 W TDP, ~$40,000 street price, 2250 TFLOPS dense FP16. TDP + overhead reconcile with the Datacenter Power Calculator; price + throughput with the GPU TCO Calculator.
- · Power chain: facility W per accelerator = TDP × 1.3 server overhead × PUE 1.20. GPU↔MW conversion is reversible and assumes nameplate (not utilization-derated) power for facility sizing — the grid contract must cover peak.
- · Cooling: heat to remove = IT load (≈100% of IT power becomes heat), tonnage at 1 RT = 3.517 kW (ASHRAE). The cooling regime is the per-accelerator-density default for the generation, not a per-rack CFD result.
- · TCO: 5-yr energy = TDP × 70% utilization × PUE 1.20 × 8760 h/yr × 5 yr × $0.08/kWh. "Effective PFLOP-yr" = dense FP16 × utilization × count × lifetime — idealized, no MFU applied, so absolute $/PFLOP is optimistic (use for relative comparison).
- · Street prices are ROUGH public estimates (NVIDIA publishes no list price); ±20–40% by volume, SKU, and quarter. Not an audited quote.
- · EXCLUDED: networking fabric (NVLink/NVSwitch/InfiniBand — often 15–30% of a Blackwell rack), storage, host CPU/memory, datacenter build/lease, cooling capex (CDUs, plant), staff/ops, software, financing, resale, water (WUE), and grid-interconnect risk. Single-generation cluster assumed.
Related tools in the Energy layer
Datacenter Power Calculator
Estimate MW capacity required for an AI training or inference cluster of a given size.
PUE Comparison
Compare PUE benchmarks across hyperscale operators, regions, and cooling approaches.
US Electricity Cost by Region
Per-state industrial electricity rate snapshot + 12-month trend, indexed to datacenter footprint.
Cooling Load Estimator
Estimate the cooling tonnage / liquid-cooling capacity needed for a given rack density.