Chips & Compute layer

Inference Cost Calculator

Cost per million tokens for self-hosted inference across H100, H200, B200, and MI300.

The engineer question
What does it cost to self-host a 70B model at 100k QPS?

Status · Coming soon

Where the data lives today

The current-quarter snapshot is generated by the V17 Phase D Product Landscape module and lives on the parent topic page. The interactive comparator below will ship once a structured spec data source is wired up.

View Chips & Compute product landscape →

Inputs

  • Model size + variant
  • Throughput target (QPS)
  • Hardware mix

Outputs

  • $ / 1M tokens
  • GPU hours / day
  • Recommended cluster shape
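
The relationship between the inputs and outputs listed above can be sketched with simple arithmetic. This is a minimal illustration, not the calculator's actual method: the `Scenario` fields, the `estimate` function, and every number passed to them are hypothetical placeholders. It assumes per-GPU throughput is already measured for the chosen model and hardware (including any tensor-parallel overhead), and it reports only a GPU count rather than a full cluster shape.

```python
import math
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical inputs; none of these are measured or quoted figures."""
    target_qps: float              # requests per second the cluster must serve
    tokens_per_request: float      # average tokens generated per request
    tokens_per_sec_per_gpu: float  # assumed sustained throughput of one GPU
    gpu_hourly_cost_usd: float     # assumed all-in hourly cost of one GPU


def estimate(s: Scenario) -> dict:
    """Return GPU count, GPU hours per day, and USD per 1M tokens."""
    # Token throughput the cluster must sustain to meet the QPS target.
    required_tokens_per_sec = s.target_qps * s.tokens_per_request

    # Round up: a fractional GPU cannot be provisioned.
    gpus = math.ceil(required_tokens_per_sec / s.tokens_per_sec_per_gpu)

    # Assumes the cluster runs around the clock.
    gpu_hours_per_day = gpus * 24
    daily_cost_usd = gpu_hours_per_day * s.gpu_hourly_cost_usd

    # Tokens actually served per day at the target load.
    tokens_per_day = required_tokens_per_sec * 86_400
    usd_per_million_tokens = daily_cost_usd / tokens_per_day * 1_000_000

    return {
        "gpus": gpus,
        "gpu_hours_per_day": gpu_hours_per_day,
        "usd_per_million_tokens": usd_per_million_tokens,
    }
```

Calling `estimate(Scenario(...))` with your own measured throughput and pricing gives a first-order figure. Because the GPU count is rounded up, the cost per million tokens rises whenever the target load does not fill the last GPU.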

