GPU Spot Prices Surge 114% in Six Weeks (2 minute read)

NVIDIA's B200 GPU rental prices surged 114% in six weeks to $4.95/hour, driven by demand from frontier AI models that require newer chip architectures.

What: The Ornn Compute Price Index shows NVIDIA's B200 GPU spot rental prices jumped from $2.31/hour in early March 2026 to $4.95/hour in late April. The price premium over previous-generation H200 chips also widened, from $0.28/hour to $1.80/hour.

Why it matters: This signals that inference costs are rising faster than algorithmic and hardware efficiency gains can offset them. The pattern shows that newer foundation models increasingly require cutting-edge hardware features like expanded memory, creating a depreciation cycle for older GPU generations and persistent upward pricing pressure.

Takeaway: Developers running AI workloads should note that spot market prices lead contract pricing by approximately 90 days, meaning B200 rates will likely settle above $5/hour through summer 2026.

Deep dive

Major AI model launches since September 2025 line up with B200 GPU price spikes, though not perfectly, suggesting demand shocks drive pricing more than gradual growth
GPT-5.5's expanded context window requires memory capabilities only available on Blackwell architecture, forcing users to pay premium rates for newer chips
Price spread across different cloud providers has more than doubled since September 2025, indicating an opaque market with information asymmetry about supply deliveries and capacity resales
B200 launched at a premium over H200 in September 2025, then collapsed to near-parity ($0.28 gap) by November as supply flooded the market
Since GPT-5.3-Codex launched in February 2026, the pricing gap has re-widened to $1.80, approaching launch levels and signaling accelerated depreciation for H200 chips
The widening premium represents both scarcity value for B200 and a depreciation signal for H200 as new models demand newer architectures
Cloud providers are regaining pricing power after six months of margin compression in late 2025
The market remains opaque with uncertainty about hyperscaler delivery schedules and which AI startups are offloading excess capacity at discounts
Inference at the frontier is becoming more expensive as inflationary demand from new models outpaces deflationary improvements from better algorithms and chips

Decoder

B200 (Blackwell): NVIDIA's latest generation GPU with expanded memory and inference capabilities, launched September 2025
H200 (Hopper): NVIDIA's previous generation GPU, now being priced out by models requiring newer architecture features
Spot market: On-demand GPU rental pricing that fluctuates based on real-time supply and demand, as opposed to fixed long-term contracts
Ornn Compute Price Index: Market index tracking GPU rental prices across cloud providers
Inference density: How many AI model inferences a GPU can handle simultaneously, a key performance metric for serving models

Original article

NVIDIA's latest GPU rental prices on the Ornn Compute Price Index hit $4.95 per hour this week, up from $2.31 in early March: a 114% surge in six weeks. [1]

The price spread over prior-generation chips widened from $0.28 to $1.80 per hour. The new chip is NVIDIA's B200 (Blackwell); the prior generation is the H200 (Hopper).
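As a quick check on the arithmetic, the minimal Python sketch below recomputes the headline figures from the numbers quoted above. The function and variable names are illustrative and not part of the Ornn index, and the $0.28 gap is taken to be the November low described later in the article.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100.0


# Figures quoted in the article, in USD per GPU-hour.
b200_early_march = 2.31
b200_late_april = 4.95
gap_november_low = 0.28  # B200 premium over H200 (assumed to be the November low)
gap_current = 1.80       # B200 premium over H200 today

surge = pct_change(b200_early_march, b200_late_april)
gap_multiple = gap_current / gap_november_low

print(f"B200 spot change: {surge:.0f}%")                 # B200 spot change: 114%
print(f"B200 premium vs. low: {gap_multiple:.1f}x")      # B200 premium vs. low: 6.4x
```

The first line reproduces the 114% headline; the second expresses the current premium as a multiple of its low point.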

The GPU market is coming into focus, even if the fog hasn't fully lifted.

1. Frontier model releases correlate with demand shocks

The price spikes line up with major model launches. Every major model release since September 2025 preceded or coincided with jumps in B200 pricing.

GPT-5.5's expanded context window requires the memory headroom that only Blackwell provides. [2]

The correlation isn't perfect. Supply shocks matter too. But the pattern is clear: newer models need newer chips.

2. The gap between cheapest & most expensive providers is blowing out

In September 2025, B200 prices across providers clustered tightly. Today the spread has more than doubled. Some providers still offer B200 at near-H200 prices. Others command scarcity premiums.

This bears the hallmarks of an opaque market with big supply/demand shocks. When is a hyperscaler receiving a new delivery? Which AI startup overbought capacity & is now selling at a discount? Opaque everywhere you look.

3. The B200-over-H200 price gap collapsed, then recovered

When B200 came to market in September 2025, it cost more per hour than H200. Buyers paid up for the extra memory & inference density.

By November, that gap collapsed to $0.28 as supply flooded the market. For a brief window, B200 & H200 reached near price parity.

Since February, when GPT-5.3-Codex launched, the spread has re-widened. The current $1.80 gap is back near launch levels.

The widening gap is also a depreciation signal: older chips lose value when new models demand new architectures.

For cloud providers, pricing power is returning. After six months of margin compression, the sellers' market is back.

For AI startups, the spot market leads contract pricing by ~90 days. B200 likely settles above $5.00 for the summer.
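To make the ~90-day lead concrete, here is a minimal sketch that projects when a spot move would show up in contract quotes under a fixed lag. The function name, the exact observation date (the article says only "late April"), and the assumption of a constant lag are all illustrative; the article gives only the approximate 90-day figure.

```python
from datetime import date, timedelta


def contract_repricing_date(spot_date: date, lag_days: int = 90) -> date:
    """Date by which a spot move would reach contract pricing, assuming a fixed lag."""
    return spot_date + timedelta(days=lag_days)


# Assumed observation date for the $4.95/hour spot print ("late April" in the article).
spot_observation = date(2026, 4, 24)

repricing = contract_repricing_date(spot_observation)
print(repricing.isoformat())  # 2026-07-23
```

Under these assumptions, a late-April spot level would feed into contract quotes around late July, which is consistent with the article's expectation of elevated B200 pricing through the summer.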

For model builders, inference at the frontier is getting more expensive.

Inflationary demand is outpacing deflationary algorithmic & chip improvements, and the fog over the GPU market persists.

[1] Ornn Compute Price Index, daily index values for B200 & H200 GPUs, Sep 2025 – Apr 2026.
[2] OpenAI, "Introducing GPT-5.5," Apr 23, 2026.