Inference-as-a-Service — Timeline

45 milestones, source-traced.

  1. Jul 7, 2024

    Funding: Fireworks AI raised a $52M Series B led by Sequoia Capital with participation from NVIDIA, AMD, and MongoDB Ventures at a $552M post-money valuation.

    funding-research
  2. Aug 5, 2024

    Funding: Groq raised $640 million in Series D at a $2.8 billion valuation, led by BlackRock Private Equity Partners with participation from Neuberger Berman, Type One Ventures, Cisco Investments, KDDI Open Innovation Fund III, and Samsung Catalyst Fund.

    funding-research
  3. Aug 5, 2024

    Funding: Groq raised a $640 million Series D led by BlackRock Private Equity Partners at a $2.8 billion post-money valuation, with participation from Neuberger Berman, Type One Ventures, Cisco Investments, KDDI Open Innovation Fund III, and Samsung Catalyst Fund.

    funding-research
  4. Sep 27, 2024

    Funding: Cerebras raised $85 million in Series F-1 at a $2.87 billion post-money valuation.

    funding-research
  5. Nov 1, 2024

    Partnership: Together AI announced a partnership with Hypertec to co-build a cluster of 36,000 NVIDIA GB200 NVL72 GPUs.

    funding-research
  6. Nov 1, 2024

    Partnership: Together AI partnered with Hypertec to co-build a cluster of 36,000 NVIDIA GB200 NVL72 GPUs.

    funding-research
  7. Dec 1, 2024

    Funding: Baseten closed a $150M Series D at a $2.15B valuation led by BOND, with participation from Conviction, CapitalG, Premji Invest, 01A, IVP, Spark, Greylock, Scribble Ventures, BoxGroup, and Kevin & Elizabeth Weil.

    funding-research
  8. Feb 1, 2025

    Product Launch: Together AI launched Together GPU Clusters powered by NVIDIA Blackwell GPUs.

    funding-research
  9. Feb 1, 2025

    Partnership: Groq secured a $1.5 billion commitment from Saudi Arabia to expand AI chip distribution in the country, with a projected $500 million in 2025 revenue.

    funding-research
  10. Feb 1, 2025

    Funding: Baseten raised a $75M Series C at an $825M valuation, led by IVP and Spark Capital.

    funding-research
  11. Feb 10, 2025

    Partnership: Groq signed a $1.5 billion commitment with Saudi Arabia to expand LPU-based AI inference infrastructure, including a new GroqCloud data center in Dammam.

    funding-research
  12. Feb 20, 2025

    Funding: Together AI closed a $305 million Series B round led by General Catalyst with co-lead Prosperity7 at a $3.3 billion valuation.

    funding-research
  13. Feb 20, 2025

    Funding: Together AI closed a $305 million Series B led by General Catalyst and co-led by Prosperity7 at a $3.3 billion valuation.

    funding-research
  14. Sep 1, 2025

    Product Launch: Together AI launched Together Instant Clusters, enabling automated GPU cluster provisioning from a single node up to hundreds of GPUs.

    funding-research
  15. Sep 1, 2025

    Funding: Baseten raised a $150M Series D at a $2.15B valuation, led by BOND with Conviction and CapitalG participating.

    funding-research
  16. Sep 1, 2025

    Product Launch: Together AI launched Together Instant Clusters, offering automated GPU cluster provisioning from 8 GPUs to hundreds.

    funding-research
  17. Sep 17, 2025

    Funding: Groq closed a $750 million Series E round at a $6.9 billion post-money valuation, led by Disruptive with participation from BlackRock, Neuberger Berman, and DTCP.

    funding-research
  18. Sep 17, 2025

    Funding: Groq raised $750 million led by Disruptive at a $6.9 billion post-money valuation, with participation from BlackRock, Neuberger Berman, DTCP, Samsung, Cisco, D1, Altimeter, 1789 Capital, and Infinitum.

    funding-research
  19. Sep 30, 2025

    Funding: Cerebras Systems raised $1.1 billion in Series G led by Fidelity Management & Research Company and Atreides Management at a post-money valuation of $8.1 billion.

    funding-research
  20. Sep 30, 2025

    Funding: Cerebras raised $1.1 billion in Series G led by Fidelity and Atreides Management at an $8.1 billion post-money valuation.

    funding-research
  21. Oct 1, 2025

    Funding: Fireworks AI raised a $250M Series C co-led by Lightspeed Venture Partners, Index Ventures, and Evantic at a $4B post-money valuation.

    funding-research
  22. Oct 1, 2025

    Customer Win: Fireworks AI's customer base grew to over 10,000 companies by October 2025, up from ~1,000 at Series B.

    funding-research
  23. Oct 1, 2025

    Funding: Fireworks AI raised a $250M Series C co-led by Lightspeed Venture Partners, Index Ventures, and Evantic with continued support from Sequoia Capital at a $4B post-money valuation.

    funding-research
  24. Dec 1, 2025

    Groq builds custom LPU inference silicon; NVIDIA struck a ~$20B non-exclusive LPU license and hired Groq's founder (Dec 2025).

    Knowledge base
  25. Dec 1, 2025

    Partnership: Groq signed a $20 billion non-exclusive licensing agreement with NVIDIA covering Groq's AI inference technology, with founder Jonathan Ross and president Sunny Madra joining NVIDIA.

    funding-research
  26. Jan 1, 2026

    Funding: Baseten raised a $300M Series E at a $5B valuation, led by IVP and CapitalG, with NVIDIA investing $150M.

    funding-research
  27. Feb 1, 2026

    Cerebras, a wafer-scale inference chipmaker, completed its IPO in 2026 (~$66B day-one market cap).

    Knowledge base
  28. Feb 1, 2026

    Customer Win: Cerebras signed a commercial agreement with OpenAI valued at over $10 billion to deliver 750 megawatts of compute capacity by 2028.

    funding-research
  29. Feb 3, 2026

    Funding: Cerebras Systems raised $1 billion in Series H led by Tiger Global at a post-money valuation of approximately $23 billion.

    funding-research
  30. Feb 3, 2026

    Funding: Cerebras raised $1.0 billion in Series H led by Tiger Global at an approximate $23 billion post-money valuation.

    funding-research
  31. Mar 1, 2026

    Funding: Baseten raised a $300M Series E at a $5B valuation led by IVP and CapitalG, with NVIDIA contributing approximately $150M alongside 01A, Altimeter, Battery Ventures, BOND, BoxGroup, Blackbird Ventures, Conviction, and Greylock.

    funding-research
  32. Apr 15, 2026

    Funding: Cerebras Systems secured an $850 million revolving credit facility arranged by Morgan Stanley, Citi, Barclays, UBS and others.

    funding-research
  33. Apr 15, 2026

    Funding: Cerebras secured $850 million in debt financing via a credit facility with Morgan Stanley, Citi, Barclays, and UBS.

    funding-research
  34. May 1, 2026

    Baseten, an inference-serving platform, reached a ~$13B valuation after a ~$1.5B raise.

    Knowledge base
  35. May 1, 2026

    Fireworks AI, a fast open-model inference platform, was reportedly raising at a valuation of around $15B.

    Knowledge base
  36. May 1, 2026

    Together AI reached roughly $1B in annual recurring revenue serving open models via API.

    Knowledge base
  37. May 1, 2026

    Nebius's Token Factory offers managed inference; Nebius's Q1 2026 revenue grew ~684% year-over-year.

    Knowledge base
  38. Jun 1, 2026

    The inference-service cohort (Baseten, Fireworks, Together, Nebius) is among the best-funded categories in AI infrastructure.

    Knowledge base
  39. Jun 1, 2026

    Open-weight models (GLM, Qwen, DeepSeek, Llama) at roughly 1/6 the cost of frontier models drive inference-service economics.

    Knowledge base
  40. Jun 1, 2026

    Inference now accounts for ~2/3 of AI accelerator demand in 2026, up from ~1/2 in 2025 and ~1/3 in 2023 (Deloitte).

    Knowledge base
  41. Jun 1, 2026

    Inference-as-a-service decouples model serving from raw GPU rental: buyers pay per token, not per GPU-hour.

    Knowledge base
  42. Jun 1, 2026

    Customer Win: Cerebras announced a $10 billion compute deal with OpenAI to deliver 750 megawatts of AI compute capacity by 2028.

    funding-research
  43. Jun 1, 2026

    Funding: Baseten closed a $1.5B Series F (split-priced) at a $13B post-money valuation, co-led by Spark Capital, Sands Capital, Altimeter Capital, and Wellington Management.

    funding-research
  44. Jun 1, 2026

    Partnership: Ahead of Cerebras' planned 2026 IPO, OpenAI negotiated a deal granting it a $6 billion post-IPO stake in Cerebras.

    funding-research
  45. Jun 22, 2026

    Funding: Groq announced a $650 million growth capital pro-rata round led by Disruptive and Infinitum to fund its pivot to an AI inference cloud business and scale toward 200 MW by 2027.

    funding-research

Milestones merged from 0 curated events, 10 verified facts (with observed dates), and 35 business signals from the last 24 months. Deduped by date + label; curated entries take precedence.
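
The merge rule described above could look roughly like the following Python sketch. It is illustrative only, not the site's actual pipeline: the `Milestone` class, the `merge_milestones` function, and the source names are hypothetical, and "label" is assumed to mean an entry's headline text. Only the precedence of curated entries comes from the note above; keeping the first entry seen when neither is curated is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Milestone:
    date: str    # ISO date string, e.g. "2025-09-17"
    label: str   # assumed here to be the entry's headline text
    source: str  # hypothetical names: "curated", "verified_fact", "business_signal"


def merge_milestones(*groups):
    """Merge groups of milestones, keeping one entry per (date, label)."""
    kept = {}
    for group in groups:
        for item in group:
            key = (item.date, item.label)
            existing = kept.get(key)
            # Keep the first entry seen, unless a curated entry
            # displaces a non-curated one with the same key.
            if existing is None or (
                item.source == "curated" and existing.source != "curated"
            ):
                kept[key] = item
    # ISO date strings sort chronologically as plain strings.
    return sorted(kept.values(), key=lambda m: (m.date, m.label))


# Made-up entries, for illustration only.
facts = [Milestone("2026-06-01", "Example headline A", "verified_fact")]
signals = [
    Milestone("2026-06-01", "Example headline A", "business_signal"),
    Milestone("2026-06-01", "Example headline B", "business_signal"),
]
curated = [Milestone("2026-06-01", "Example headline B", "curated")]

merged = merge_milestones(facts, signals, curated)
```

Because the key is the exact date plus the exact label, two entries that describe the same event in slightly different words would both survive such a merge, which is consistent with the near-duplicate milestones in the list above.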