Chips & Compute layer
Inference Cost Calculator
Per-million-token cost for self-hosted inference across H100 / H200 / B200 / MI300.
The engineer question
What does it cost to self-host a 70B model at 100k QPS?
Status · Coming soon
Where the data lives today
The current-quarter snapshot is generated by the V17 Phase D Product Landscape module and lives on the parent topic page. The interactive comparator below will ship once a structured spec data source is wired up.
View Chips & Compute product landscape →
Inputs
- Model size + variant
- Throughput target (QPS)
- Hardware mix
Outputs
- $ / 1M tokens
- GPU hours / day
- Recommended cluster shape
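
Until the comparator ships, the inputs and outputs listed above can be related with back-of-the-envelope arithmetic. The following is a minimal sketch, not the calculator's actual method. The names `Workload`, `GpuProfile`, and `estimate` are hypothetical, and every number in the usage example is a placeholder rather than a measured throughput or a quoted price. It assumes a single GPU type, steady load, perfect utilization, and an average number of generated tokens per request. It ignores batching effects, prefill versus decode cost, and networking.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    qps: float                 # throughput target, requests per second
    tokens_per_request: float  # ASSUMPTION: average generated tokens per request


@dataclass
class GpuProfile:
    name: str
    hourly_cost_usd: float  # ASSUMPTION: all-in cost per GPU-hour (placeholder)
    tokens_per_sec: float   # ASSUMPTION: sustained tokens/s per GPU for the chosen model


def estimate(workload: Workload, gpu: GpuProfile, gpus_per_node: int = 8) -> dict:
    """Rough cost estimate; gpus_per_node=8 is an assumed node size."""
    tokens_per_sec_needed = workload.qps * workload.tokens_per_request
    # Round up: a fractional GPU cannot be provisioned.
    gpus = math.ceil(tokens_per_sec_needed / gpu.tokens_per_sec)
    gpu_hours_per_day = gpus * 24
    cost_per_day = gpu_hours_per_day * gpu.hourly_cost_usd
    tokens_per_day = tokens_per_sec_needed * 86_400
    return {
        "usd_per_million_tokens": cost_per_day / tokens_per_day * 1_000_000,
        "gpu_hours_per_day": gpu_hours_per_day,
        "gpus": gpus,
        "nodes": math.ceil(gpus / gpus_per_node),
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    demo = estimate(
        Workload(qps=10, tokens_per_request=100),
        GpuProfile(name="example-gpu", hourly_cost_usd=2.0, tokens_per_sec=250),
    )
    print(demo)
```

Because the GPU count is rounded up, the cost per million tokens in this sketch rises whenever the provisioned cluster is larger than the load strictly requires. A real comparison would replace the placeholder throughput and price with benchmarked figures for each accelerator.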