Industry — AI Cloud Infrastructure
1. Industry in One Page
The AI cloud infrastructure industry rents out GPU-accelerated compute, networking, storage, and the orchestration software that makes them usable for training and running AI models. Customers pay either by the hour (on-demand) or, increasingly, through multi-year reserved-capacity contracts that pre-pay billions of dollars of compute, letting the operator finance the data centers and GPUs needed to serve those commitments. Profits today exist because demand for GPUs outstrips installed supply: NVIDIA H100 rental prices rose roughly 40% in six months (SemiAnalysis, April 2026), and NBIS CEO Arkady Volozh told Reuters on May 13, 2026 that "several customers are competing for every GPU we bring online." The cycle pivots on three physical constraints that bottleneck at once — GPUs, grid-connected megawatts of power, and data-center construction lead times — and is amplified by the model-training arms race among hyperscalers and frontier labs. This is not a software business; it is a capital-intensive, utility-style buildout where long-term winners need access to power and balance-sheet capacity, not just code.
[KPI cards: Q1'26 enterprise cloud spend (Synergy Research, $B); Q1'26 YoY cloud growth; H100 rental price move over 6 months (SemiAnalysis); projected inference share of 2026 compute demand]
2. How This Industry Makes Money
The unit of sale is compute time on a GPU, priced in GPU-hours (on-demand IaaS), reserved-capacity dollars per year over multi-year terms, or, at the higher-margin layer, tokens processed (inference). Cost of revenue is dominated by depreciation of GPUs and data-center equipment, plus electricity, colocation rent, and bandwidth. The economic shape resembles a regulated utility crossed with a leasing business: high upfront capex, multi-year asset life (GPUs typically depreciated over 4–6 years), and revenue that ideally covers depreciation plus a spread that compounds at high utilization.
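The utility-crossed-with-leasing economics above reduce to a per-GPU-hour breakeven. A minimal sketch, using purely hypothetical inputs (hardware cost, power draw, electricity price, and utilization are illustrative assumptions, not figures from this report):

```python
# Illustrative per-GPU-hour breakeven for a rented accelerator.
# All inputs are hypothetical assumptions, not sourced figures.

HOURS_PER_YEAR = 8760

def breakeven_rate(server_cost_per_gpu: float,
                   depreciation_years: float,
                   power_kw: float,
                   power_cost_per_kwh: float,
                   other_opex_per_hour: float,
                   utilization: float) -> float:
    """Hourly rental rate at which revenue just covers depreciation plus opex."""
    depreciation_per_hour = server_cost_per_gpu / (depreciation_years * HOURS_PER_YEAR)
    power_per_hour = power_kw * power_cost_per_kwh
    # Costs accrue every hour; revenue accrues only on utilized hours,
    # which is why low utilization inflates the required rate.
    return (depreciation_per_hour + power_per_hour + other_opex_per_hour) / utilization

# Hypothetical inputs: $30k all-in per GPU, 5-year depreciation (midpoint of
# the 4-6 year range above), 1.2 kW/GPU incl. cooling, $0.07/kWh, 70% utilization.
rate = breakeven_rate(30_000, 5, 1.2, 0.07, 0.10, 0.70)
print(f"breakeven: ${rate:.2f}/GPU-hour")  # breakeven: $1.24/GPU-hour
```

The point of the sketch is the denominator: the same asset at 50% utilization needs a rate 40% higher than at 70%, which is why utilization sits alongside price in every operator model.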
Key terms in plain English:
- Neocloud: A specialized cloud whose product is GPU compute, in contrast to general-purpose hyperscalers (AWS/Azure/GCP) whose AI offering is one product among many. CoreWeave, Nebius, Crusoe, and Lambda are the canonical neoclouds.
- ARR (Annualized Run-Rate Revenue): Last-month revenue × 12. Used because GPU revenue ramps faster than trailing quarterly GAAP revenue can reflect.
- Contracted backlog: Dollar value of signed multi-year capacity commitments not yet delivered.
- Contracted power: Megawatts of grid-connected power secured by signed land/utility contracts. Power has overtaken GPUs as the binding constraint on growth.
- Hyperscaler vs. neocloud: Hyperscalers sell broad enterprise software and increasingly buy GPU capacity from neoclouds (e.g., Microsoft buying $17B of capacity from Nebius). Neoclouds rent specialized AI infrastructure back to hyperscalers, model labs, and AI-native startups.
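The ARR definition above is worth seeing numerically, because it explains why neoclouds quote ARR instead of GAAP revenue during a ramp. A toy example with a hypothetical linear monthly ramp (all figures illustrative):

```python
# Why ARR diverges from reported revenue during a ramp.
# Hypothetical monthly revenue ($M) ramping linearly from 10 to 43 over a year.
monthly = [10 + 3 * m for m in range(12)]  # months 1..12: 10, 13, ..., 43

gaap_year = sum(monthly)   # full-year reported revenue
arr = monthly[-1] * 12     # last month annualized (the ARR definition above)

print(gaap_year, arr)  # prints: 318 516
```

Here ARR ($516M) is over 60% above the year's reported revenue ($318M) purely because the business is growing month over month; the gap closes only when growth flattens.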
Margins are highest at the silicon layer and lowest in raw GPU rental once supply equalizes. Today the neocloud layer earns scarcity rents — gross margins comparable to a software business — but those margins compress when GPUs become abundant. Whether each neocloud can layer enough software (managed services, inference, fine-tuning, orchestration) on top of raw compute to keep blended margins above commodity IaaS once the shortage breaks is the strategic question.
Capital intensity is the single most distinguishing fact of the industry. Nebius spent $2.47B of capex against $399M of revenue in Q1 2026 — a capex-to-revenue ratio above 6x. CoreWeave has guided $30–35B of 2026 capex against a $5.1B 2025 revenue base. These businesses look like utilities mid-buildout, not asset-light SaaS.
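The capex-intensity claims above check out arithmetically. A quick verification of the two ratios, using only the figures cited in this section:

```python
# Capex-to-revenue ratios from the figures cited above (in $M).
nebius_q1_ratio = 2_470 / 399    # Q1 2026 capex / Q1 2026 revenue
# Note the mixed periods: low end of guided 2026 capex over the 2025 revenue base.
coreweave_ratio = 30_000 / 5_100

print(f"Nebius: {nebius_q1_ratio:.1f}x, CoreWeave: {coreweave_ratio:.1f}x")
# -> Nebius: 6.2x, CoreWeave: 5.9x
```

For scale, mature hyperscaler cloud segments run closer to 0.3–0.5x capex-to-revenue; ratios near 6x are what "utility mid-buildout" means in practice.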
3. Demand, Supply, and the Cycle
Demand has three engines that pulse on different rhythms. Frontier model training (the largest LLM and multimodal runs) is concentrated in a handful of buyers — Microsoft/OpenAI, Meta, Google, Anthropic, xAI — and tends to come in lumpy multi-billion-dollar commitments tied to model generations. Enterprise AI adoption is broader but slower, sitting in pilots and early production through 2025–2026. Inference — running, not training, models — is the fastest-growing pool and is projected by analysts to account for roughly two-thirds of compute demand by 2026 as deployed AI apps scale.
Supply is constrained by three things in order: power (multi-year grid interconnect queues, particularly in the US), GPUs (allocation from NVIDIA, which prioritizes large committed buyers), and data-center construction (18–36 month builds with HVAC, water, cooling, and dense fiber). When even one of the three is short, the others sit idle. Today all three are tight simultaneously, which is what produces the pricing power neoclouds are currently earning.
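The "when even one of the three is short, the others sit idle" dynamic is a min-of-constraints model: sellable capacity is gated by the scarcest input, not the sum. A minimal sketch with hypothetical megawatt figures:

```python
# Deliverable capacity is gated by the scarcest of the three inputs
# (power, GPUs, built data-center shell). All figures are hypothetical.
def deliverable_mw(contracted_power_mw: float,
                   gpu_capacity_mw: float,
                   built_shell_mw: float) -> float:
    """Capacity an operator can actually sell: the minimum of its constraints."""
    return min(contracted_power_mw, gpu_capacity_mw, built_shell_mw)

# 500 MW of power and 400 MW worth of GPUs are stranded behind a 300 MW shell:
print(deliverable_mw(500, 400, 300))  # 300
```

This is why a dollar spent relieving the binding constraint (today, usually power) is worth more than a dollar spent on the other two, and why all three being tight at once maximizes pricing power.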
The industry has not experienced a full downturn since AI compute became a category in 2023. The most likely template is the telecom-fiber bust (2001–2002) and the crypto-mining unwind (2022–2023): a glut of capacity, GPU rental prices that fall faster than depreciation, and operators carrying long-dated debt against assets that depreciate four times faster than data-center shells. The first signal of a turn would show up in spot GPU pricing and in shorter contract tenors at renewal.
The industry's defining macro risk is a supply-demand inversion. Morningstar's Javier Correonero (March 2026): "Once the supply/demand situation of artificial intelligence normalizes … neocloud companies that simply purchase and rent GPUs without any real differentiation will have important struggles, as their current business model looks commoditized."
4. Competitive Structure
The market is a barbell. At one end sit three hyperscalers — AWS (28% share of Q1 2026 enterprise cloud spend per Synergy Research), Microsoft Azure (21%), and Google Cloud (14%) — which combined hold roughly 63% of total cloud spend and increasingly buy AI capacity from neoclouds rather than only sell their own. At the other end sits a fast-growing tier of specialized AI cloud builders — CoreWeave, OpenAI's own infrastructure, Oracle (a hyperscaler-class entrant via OCI), Crusoe, Nebius, Anthropic's stack, ByteDance — that Synergy explicitly singled out as Q1 2026's fastest-growing tier-2 providers. The middle of the market is thin.
Three structural observations matter. First, the market is not winner-take-all at the IaaS layer — frontier customers actively want to diversify away from a single hyperscaler. Microsoft buying $17B of compute from Nebius and Meta buying $27B from Nebius are explicit second-sourcing decisions. Second, customer concentration is the single biggest risk for every neocloud: Nebius's two hyperscaler contracts alone are reported at roughly $46B of contracted backlog, and CoreWeave's revenue is similarly concentrated. Third, the bitcoin-miner-to-AI pivot peers (IREN, APLD, HUT) are real participants but trail neoclouds on the software stack, so they tend to sell power + halls + hosting rather than fully managed AI cloud.
5. Regulation, Technology, and Rules of the Game
The industry sits at the intersection of three regulatory frameworks (AI conduct, export control, data/energy) plus a fast-moving technology refresh cycle. Each can move unit economics by 10–30 percentage points if it cuts the wrong way.
The two regulations most likely to move investor judgment are US export controls (which both constrain NBIS's geographic expansion and concentrate global supply in the operators that can comply) and grid-interconnect cost allocation (the new bottleneck on greenfield data-center economics — the FY2025 20-F flags it directly). The technology refresh that matters most this year is NVIDIA's Vera Rubin NVL72 platform arriving in H2 2026, which will drive the next training-capex cycle.
6. The Metrics Professionals Watch
Most general-cloud KPIs (DAUs, ARPU, ARPC) are useless here. The metrics below are what AI infrastructure analysts actually model.
[Chart: industry KPI intensity by operator type, scored 1 (low) to 5 (high), illustrative]
7. Where Nebius Group N.V. Fits
Nebius is a specialized AI cloud (neocloud) with an integrated full-stack model: it designs servers and racks in-house, owns most of its data-center capacity (>75% of contracted power is owned, not leased), holds NVIDIA Reference Platform Cloud Partner status, and is layering an inference / token-priced platform (Token Factory, accelerated by the May 2026 $643M Eigen AI acquisition) on top of raw GPU rental. Within the cohort, Nebius sits between CoreWeave (the largest pure-play neocloud) and the BTC-pivot peers (IREN/APLD/HUT) — closer to CoreWeave in product depth, but distinctively European-domiciled with most owned capacity and a multi-business holding structure (Nebius core + Avride autonomous vehicles + TripleTen edtech + equity stakes in ClickHouse and Toloka).
[KPI cards: Q1'26 revenue ($M); Q1'26 YoY growth; ARR at YE 2025 ($M); reported backlog ($M)]
Nebius is best understood as a power-anchored, customer-financed AI infrastructure utility-in-the-making, with an in-house hardware and software stack and an emerging inference platform. The economics that matter for the rest of this report are: cost of power, GPU allocation, contract tenor, financing cost, and the speed at which the software / inference layer scales relative to the raw IaaS layer.
8. What to Watch First
One-line industry read: AI cloud is structurally short, financially intense, and politically watched. The cycle is currently in the early-build phase — capital-friendly, scarcity-priced — and the operators that have power, GPU allocation, and financing access (Nebius among them) compound fastest. The same conditions that make this attractive now would make it brutal in a supply glut. Watch power, GPU spot pricing, and hyperscaler capex first; everything else follows.