Think all GPUs are created equal? Think again. Research shows that chips of the same model can deliver wildly different performance levels, turning GPU rentals from cloud providers into a high-stakes gamble.

This phenomenon, known as the silicon lottery, has been documented since at least 2022, when researchers at the University of Wisconsin linked it to inconsistent performance in GPU-dependent supercomputers. The effect is even more pronounced for AI cloud customers, according to experts.

Study Reveals Performance Gaps in Cloud GPUs

Researchers from the College of William & Mary, Jefferson Lab, and Silicon Data conducted a study to quantify these disparities. They ran 6,800 benchmark tests on 3,500 randomly selected GPUs across 11 cloud providers. The GPUs included 11 Nvidia models, with the most advanced being the Nvidia H200 SXM.

To assess performance, the team used SiliconMark, a benchmark designed to evaluate a GPU’s ability to run large language models (LLMs). The test measured two key metrics:

  • 16-bit floating-point computing performance, expressed in trillions of operations per second (TOPS)
  • Internal memory bandwidth, measured in gigabytes per second (GB/s)
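
As a rough illustration of how figures in these units are derived from raw timings, the sketch below converts a timed matrix multiplication and a timed memory copy into trillions of operations per second and GB/s. It is not SiliconMark's method, which is not detailed here. The function names and sample timings are hypothetical, and the sketch assumes the common conventions of counting 2·n³ operations for an n×n matrix multiply and counting both the read and the write in a copy.

```python
def tera_ops_per_second(n: int, seconds: float) -> float:
    """Throughput of one n x n by n x n matrix multiply, in trillions of ops/s.

    Assumes the usual count of 2 * n**3 operations (one multiply and one add
    per inner-product term).
    """
    return 2 * n**3 / seconds / 1e12


def gigabytes_per_second(bytes_copied: int, seconds: float) -> float:
    """Memory bandwidth of an on-device copy, in GB/s.

    Assumes each copied byte is read once and written once, so the memory
    traffic is twice the buffer size.
    """
    return 2 * bytes_copied / seconds / 1e9


# Hypothetical timings, for illustration only (not measurements).
compute = tera_ops_per_second(n=8192, seconds=0.002)
bandwidth = gigabytes_per_second(bytes_copied=4 * 1024**3, seconds=0.004)
print(f"{compute:.1f} TOPS, {bandwidth:.0f} GB/s")
```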

Key Findings: How Much Performance Can Vary

The results highlighted significant variability:

  • For 259 H100 PCIe GPUs, computing performance varied by up to 34.5%.
  • For 253 H200 SXM GPUs, memory bandwidth varied by up to 38%.
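
The figures above do not come with a formula for "varied by up to." One plausible reading, assumed here, is the gap between the best and worst unit expressed as a percentage of the best. The scores in this sketch are made-up numbers, not data from the study.

```python
def spread_percent(scores: list[float]) -> float:
    """Gap between the best and worst score, as a percentage of the best."""
    best, worst = max(scores), min(scores)
    return (best - worst) / best * 100


# Hypothetical compute scores for five GPUs of the same model.
scores = [700.0, 655.0, 610.0, 540.0, 490.0]
print(f"spread: {spread_percent(scores):.1f}%")
```

Measuring the gap relative to the worst unit or to the median instead would give a different percentage for the same data, so the definition matters when comparing reported figures.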

While factors like cooling, cloud provider configurations, and chip usage history can influence performance, the study found that manufacturing inconsistencies were the primary cause of these disparities.

Why This Matters for AI Workloads

The unpredictability of GPU performance has real financial implications. Users may pay a premium to rent a high-end model, only to receive performance comparable to that of an older, cheaper chip. That makes it difficult to budget for AI projects or to ensure consistent results.
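
To make the financial point concrete, the following sketch divides an hourly rental price by a measured benchmark score, so that a slow unit of an expensive model can be compared with a cheaper chip on equal terms. All prices, scores, and labels are hypothetical placeholders, not figures from the study.

```python
def cost_per_unit(hourly_price_usd: float, measured_score: float) -> float:
    """Dollars per hour for each unit of measured benchmark score."""
    return hourly_price_usd / measured_score


# Hypothetical rentals: (label, hourly price in USD, measured score).
rentals = [
    ("newer model, slow unit", 3.00, 500.0),
    ("newer model, fast unit", 3.00, 700.0),
    ("older model, typical unit", 2.00, 450.0),
]

for label, price, score in rentals:
    per_100 = cost_per_unit(price, score) * 100
    print(f"{label}: ${per_100:.2f} per hour per 100 score units")
```

With these made-up figures, the slow unit of the newer model costs more per unit of work than the older chip, even though both newer units carry the same price.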

“It’s called the silicon lottery.”

Carmen Li, Founder and CEO of Silicon Data

What GPU Renters Can Do to Mitigate Risk

Experts recommend taking a proactive approach to ensure you’re getting the performance you pay for:

“The most practical approach is to benchmark the actual rental they receive. Running a benchmark tool [such as SiliconMark] allows them to compare their specific instance’s performance against a broader corpus of data.”

Jason Cornick, Head of Infrastructure at Silicon Data
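
A minimal sketch of that comparison, assuming you already have a benchmark score for your instance and a list of reference scores for the same GPU model. The reference numbers, the 25th-percentile threshold, and the function names are hypothetical; how SiliconMark itself reports the comparison is not covered here.

```python
from bisect import bisect_left


def percentile_rank(score: float, reference: list[float]) -> float:
    """Percentage of reference scores strictly below the given score."""
    ordered = sorted(reference)
    return bisect_left(ordered, score) / len(ordered) * 100


def is_underperforming(score: float, reference: list[float],
                       threshold: float = 25.0) -> bool:
    """Flag an instance that ranks below the chosen percentile threshold."""
    return percentile_rank(score, reference) < threshold


# Hypothetical reference scores for one GPU model, plus one rented instance.
reference = [490.0, 540.0, 580.0, 610.0, 630.0, 655.0, 670.0, 700.0]
my_score = 535.0

print(f"percentile rank: {percentile_rank(my_score, reference):.1f}")
if is_underperforming(my_score, reference):
    print("instance ranks low; consider requesting a different one")
else:
    print("instance is within the expected range")
```

The threshold is a policy choice: a stricter cutoff rejects more instances but raises the odds of landing a fast unit.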

By testing GPU performance before committing to long-term rentals, users can avoid the pitfalls of the silicon lottery and make more informed decisions.