GPU Cloud & Neocloud Providers: The AI Compute Landscape

A neutral research guide to GPU cloud and neocloud providers — the anchor-tenant business model, GPU-collateralized financing, inference specialists, and why power siting decides the roadmap.

A new class of cloud provider has grown up around AI compute. Often called "neoclouds," these are GPU-specialized clouds — CoreWeave, Lambda, Crusoe, Nebius, Vultr, Applied Digital and others — that rent accelerator capacity rather than offering the full general-purpose stack of a hyperscaler. They emerged in large part because hyperscaler GPU capacity could not keep up with demand.

This article maps the GPU cloud landscape in neutral, infrastructure terms — not as an investment thesis. It covers what a neocloud is, the anchor-tenant contracts and GPU-collateralized debt that finance them, the inference-serving specialists layered on top, and why power and siting end up deciding the roadmap. For the demand-side context, it pairs with AI Data Center Power Demand.

What Is A Neocloud

A neocloud is a cloud built GPU-first: dense accelerator clusters, high-speed interconnect, and a thinner services layer than AWS, Azure, or Google Cloud. The category grew as demand for NVIDIA capacity outran what hyperscalers could provision. One illustrative pattern: CoreWeave was reported to have signed a multi-year arrangement, sized in the billions, to backfill Azure OpenAI capacity shortfalls starting in the mid-2020s.

The research distinction that matters is structural. A hyperscaler sells hundreds of services to millions of customers; a neocloud sells GPU compute, often to a small number of very large customers, with economics that look more like a capital-intensive infrastructure project than a software business.

The Anchor-Tenant Model

Neoclouds are typically financed against large, multi-year anchor contracts. CoreWeave's filings disclosed heavy revenue concentration — one large customer accounted for roughly 62% of revenue per its S-1 — alongside a reported ~$11.9B multi-year agreement with OpenAI that also involved an equity stake. Nebius disclosed a roughly $17.4B, five-year agreement to supply dedicated GPU capacity to a hyperscaler from a New Jersey data center.

The anchor model is both the growth engine and the risk. A signed multi-billion-dollar contract underwrites the capital required to buy GPUs and build sites, but it concentrates exposure on a handful of counterparties. The clean research questions are contract duration, customer concentration, and what happens to utilization if a single anchor shifts capacity in-house.
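The concentration and utilization questions above reduce to simple arithmetic. The following is a minimal sketch, not a method taken from any filing: the customer names and revenue figures are hypothetical, and it assumes an anchor's share of rented capacity equals its share of revenue, which real contracts need not satisfy.

```python
def concentration_metrics(revenue_by_customer):
    """Return the largest customer's revenue share and the
    Herfindahl-Hirschman index on a 0..1 scale (1.0 = single customer)."""
    total = sum(revenue_by_customer.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    shares = {name: rev / total for name, rev in revenue_by_customer.items()}
    top_customer = max(shares, key=shares.get)
    hhi = sum(share * share for share in shares.values())
    return {"top_customer": top_customer,
            "top_share": shares[top_customer],
            "hhi": hhi}


def utilization_after_exit(utilization, anchor_share, backfill_fraction=0.0):
    """Fleet utilization after one anchor leaves.

    anchor_share: fraction of currently rented capacity used by the anchor
                  (assumed here to equal its revenue share).
    backfill_fraction: fraction of the freed capacity re-rented to others.
    """
    freed = utilization * anchor_share
    return utilization - freed * (1.0 - backfill_fraction)


# Hypothetical revenue mix, in arbitrary units.
revenue = {"anchor_a": 60.0, "anchor_b": 25.0, "customer_c": 15.0}
metrics = concentration_metrics(revenue)

# If anchor_a leaves a fleet running at 90% utilization and half of the
# freed capacity is re-rented, utilization falls to about 63%.
remaining = utilization_after_exit(0.90, metrics["top_share"], backfill_fraction=0.5)
```

The same two functions can be rerun with figures taken from a provider's own disclosures once those have been verified.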

GPU-Collateralized Financing

Buying GPUs at this scale requires debt, and a distinctive financing pattern emerged: borrowing collateralized by the GPUs themselves. CoreWeave's S-1 disclosed a debt stack of several billion dollars primarily collateralized by GPU inventory; Lambda secured a GPU-collateralized credit facility from a major bank; and Crusoe arranged multi-billion-dollar project debt tied to a large campus buildout.

This is worth tracking carefully because GPU-collateralized debt couples a provider's balance sheet to the depreciation curve of its accelerators. If a GPU generation depreciates faster than the financing assumes, the collateral backing the loan loses value, and the rental income that services the debt may come under pressure at the same time. It is a new pattern that deserves primary-source verification rather than assumption.
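A rough way to see the coupling is to compare collateral value with the outstanding loan balance year by year. The sketch below is illustrative only: it assumes straight-line depreciation to zero and linear principal repayment with interest ignored, and every number in it is hypothetical rather than drawn from any provider's financing terms.

```python
def gpu_value(purchase_price, age_years, useful_life_years):
    """Straight-line value of the GPU fleet, floored at zero."""
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    elapsed = min(max(age_years, 0.0), useful_life_years)
    return purchase_price * (1.0 - elapsed / useful_life_years)


def loan_balance(principal, age_years, term_years):
    """Outstanding principal under linear repayment (interest ignored)."""
    if term_years <= 0:
        raise ValueError("term_years must be positive")
    elapsed = min(max(age_years, 0.0), term_years)
    return principal * (1.0 - elapsed / term_years)


def coverage_ratio(purchase_price, principal, term_years,
                   useful_life_years, age_years):
    """Collateral value divided by loan balance; below 1.0 means the
    remaining loan exceeds the assumed value of the GPUs."""
    balance = loan_balance(principal, age_years, term_years)
    if balance == 0:
        return float("inf")
    return gpu_value(purchase_price, age_years, useful_life_years) / balance


# Hypothetical: 100 units of GPUs financed with an 80-unit, 5-year loan.
# Compare a 6-year useful life (financing assumption) with a 4-year life.
for year in range(0, 5):
    assumed = coverage_ratio(100.0, 80.0, 5, 6, year)
    faster = coverage_ratio(100.0, 80.0, 5, 4, year)
    print(f"year {year}: 6-year life {assumed:.2f}, 4-year life {faster:.2f}")
```

Under these made-up inputs the loan stays covered throughout with a six-year life but falls below 1.0 by year three with a four-year life, which is the sensitivity the depreciation assumption controls.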

Inference-Serving Specialists

A second tier of providers serves inference rather than renting raw GPUs. Together AI, Fireworks AI, Replicate, and Anyscale (the commercial entity behind the open-source Ray framework) sell managed inference, fine-tuning, and serving infrastructure, often priced per token or per second rather than per GPU-hour. Cloudflare runs inference at the edge through its Workers AI product across a large global network.

The research lens for this tier is different. Instead of capital intensity and anchor contracts, the questions are serving performance, model coverage, pricing per token, and developer adoption. A neocloud and an inference specialist can both be "AI compute" companies while operating very different businesses.
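To compare the two pricing units, a per-GPU-hour rate can be converted into an implied cost per million tokens. This is a minimal sketch under simplifying assumptions: a single GPU, a steady aggregate token throughput, and a utilization factor for idle time. The price, throughput, and utilization values are hypothetical placeholders, not quotes from any provider.

```python
def cost_per_million_tokens(gpu_hour_price, tokens_per_second, utilization=1.0):
    """Implied serving cost per one million tokens on one GPU.

    gpu_hour_price: rental price per GPU-hour (any currency).
    tokens_per_second: aggregate throughput while the GPU is busy.
    utilization: fraction of each hour the GPU is actually serving (0..1].
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hour_price / tokens_per_hour * 1_000_000


# Hypothetical: 2.00 per GPU-hour, 1,000 tokens/s when busy, busy half the time.
implied_cost = cost_per_million_tokens(2.00, 1000, utilization=0.5)
print(f"{implied_cost:.2f} per million tokens")  # about 1.11
```

The gap between this implied cost and a specialist's listed per-token price is one rough indicator of how much of the price reflects serving software, batching efficiency, and margin rather than raw GPU time.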

Power And Siting Decide The Roadmap

GPU clouds are, increasingly, power projects. Crusoe's Texas campus, the anchor of a larger buildout, has been described as targeting on the order of 1.2 GW of total capacity; Nebius's New Jersey facility targets several hundred megawatts; and Applied Digital's North Dakota campus targets a few hundred megawatts of IT capacity for HPC tenants. Several of these providers trace their origins to power-advantaged siting, including stranded or flared natural gas.

This is why GPU cloud research cannot be separated from the grid. The binding constraint on a neocloud's roadmap is often megawatts and grid interconnection timelines, not chips. See AI Data Center Power Demand for the demand-side view and the supply chain map for the broader dependency picture.
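A back-of-the-envelope conversion shows why megawatts bound the GPU count. The sketch below assumes a facility-level power figure, a power usage effectiveness (PUE) ratio to separate IT load from cooling and overhead, and an all-in wattage per GPU that includes its share of CPUs, memory, and networking. All three inputs are hypothetical and should be replaced with disclosed figures where available.

```python
import math


def supportable_gpus(facility_mw, pue, watts_per_gpu_all_in):
    """Estimate how many GPUs a site's power budget can support.

    facility_mw: total facility power in megawatts.
    pue: power usage effectiveness (facility power / IT power), >= 1.0.
    watts_per_gpu_all_in: IT watts per GPU including server and network share.
    """
    if pue < 1.0:
        raise ValueError("pue must be at least 1.0")
    if watts_per_gpu_all_in <= 0:
        raise ValueError("watts_per_gpu_all_in must be positive")
    it_watts = facility_mw / pue * 1_000_000
    return math.floor(it_watts / watts_per_gpu_all_in)


# Hypothetical: 100 MW facility, PUE of 1.25, 1,500 W all-in per GPU.
gpu_count = supportable_gpus(100, 1.25, 1500)
print(gpu_count)  # 53333
```

Read in reverse, the same arithmetic turns a planned GPU count into a power requirement, and that requirement is what has to clear the grid interconnection process before the hardware can be energized.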

What To Track In GPU Cloud Research

Useful markers include anchor-contract size and duration, customer concentration, GPU-collateralized debt and its depreciation assumptions, power and siting announcements, utilization, and the split between raw-GPU rental and inference serving. Where a provider cites a specific contract value, capacity, or valuation, verify it in filings and primary disclosures before treating it as established.

Use the AI Infrastructure Stack Map to place GPU clouds in the broader stack, and the related power pages for the constraint that most often gates their growth. The objective is to understand the GPU cloud business as infrastructure, not to make investment recommendations.

Summary

GPU cloud and neocloud providers form a capital-intensive layer of AI infrastructure: GPU-first clouds financed by large anchor contracts and GPU-collateralized debt, with a second tier of inference-serving specialists layered on top. Power and siting frequently decide how fast any of them can grow.

The right research language is specific and neutral: anchor contracts, customer concentration, GPU-collateralized debt, utilization, megawatts, and grid interconnection timelines. Track those signals and verify numeric claims from primary sources rather than turning the GPU cloud story into financial advice.