Research Report

HBM Supply Chain: Memory as a Structural Bottleneck

By Ahijah Ireland · February 20, 2025 · 4 min read

Why Memory Became the Constraint

The conventional understanding of AI hardware focuses on GPU compute — the number of CUDA cores, the floating-point operations per second, the die area allocated to tensor processing. Practitioners who have built and operated large model training runs know that the real bottleneck lies elsewhere: the connection between the compute die and the memory that feeds it data.

Modern large language models require moving enormous volumes of parameters and activations through the processor at speeds that standard DRAM cannot sustain. A high-end GPU without sufficient memory bandwidth is like a factory floor with too few loading docks — the machinery sits idle waiting for material. This is the problem High Bandwidth Memory was designed to solve.
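
To make the loading-dock analogy concrete, a rough roofline-style check is useful: a workload is memory-bound when its arithmetic intensity (FLOPs performed per byte moved) falls below the machine balance (peak FLOPs divided by memory bandwidth). A minimal sketch in Python, with illustrative figures that are our assumptions rather than any specific part's datasheet numbers:

```python
# Back-of-envelope roofline check: is a matrix-vector multiply (the core
# operation of single-token transformer decoding) compute-bound or
# memory-bound? All figures below are illustrative assumptions.

peak_flops = 990e12      # assumed dense FP16 peak, FLOP/s
mem_bandwidth = 3.35e12  # assumed HBM bandwidth, bytes/s

# Machine balance: FLOPs the chip can execute per byte it can fetch.
machine_balance = peak_flops / mem_bandwidth  # ~296 FLOPs/byte

# A matrix-vector multiply does ~2 FLOPs per weight, but must stream
# every 2-byte FP16 weight in from memory, so its intensity is ~1.
arithmetic_intensity = 2 / 2

print(f"machine balance: {machine_balance:.0f} FLOPs/byte")
print(f"GEMV intensity:  {arithmetic_intensity:.0f} FLOP/byte")
print("memory-bound" if arithmetic_intensity < machine_balance else "compute-bound")
```

With an intensity near 1 FLOP per byte against a machine balance near 300, decode-style workloads sit far below the compute roofline; adding bandwidth, not FLOPs, is what raises their throughput.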

HBM stacks multiple DRAM dies vertically using a technology called through-silicon vias (TSVs), then connects the stack to the processor through an extremely wide silicon interposer. The result is memory bandwidth that is 10 to 15 times higher than standard LPDDR5, delivered in a package that sits physically adjacent to the compute die. NVIDIA's H100 GPU ships with 80GB of HBM3 providing 3.35 terabytes per second of bandwidth. Without HBM, that compute capability would be largely unusable for transformer model workloads.
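
Those figures imply a hard ceiling on generation throughput that is worth working out. A minimal sketch, assuming a 70B-parameter model held in FP16 whose full weight set must be streamed once per generated token; the model size and precision are our illustrative assumptions, while the bandwidth figure comes from the H100 spec cited above:

```python
# Bandwidth-bound decode ceiling: each generated token requires reading
# the model's full weight set from HBM at least once (batch size 1).
# Model size and precision are illustrative assumptions.

params = 70e9            # assumed 70B-parameter model
bytes_per_param = 2      # FP16
hbm_bandwidth = 3.35e12  # H100-class HBM3, bytes/s

bytes_per_token = params * bytes_per_param          # ~140 GB per token
max_tokens_per_s = hbm_bandwidth / bytes_per_token  # theoretical upper bound

print(f"weights streamed per token: {bytes_per_token / 1e9:.0f} GB")
print(f"decode ceiling:             {max_tokens_per_s:.0f} tokens/s")
```

Before a single FLOP is counted, bandwidth caps this configuration at roughly 24 tokens per second; halve the bandwidth and the ceiling halves with it.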

Supply Chain Concentration and Dynamics

HBM production is concentrated in a way that creates both structural constraint and investment opportunity. Three companies manufacture HBM today: SK Hynix, Samsung, and Micron Technology. Of these three, SK Hynix holds roughly 50% of total HBM capacity and was the first to qualify HBM3e with NVIDIA's Blackwell platform. Samsung holds approximately 40%, though it has faced qualification challenges with its HBM3e product. Micron is the third entrant, with a smaller but growing share.

The manufacturing process for HBM is substantially more complex than standard DRAM. The TSV drilling and stacking process requires additional equipment, longer cycle times, and higher defect rates — all of which constrain output. When NVIDIA ramps a new GPU generation, it typically requires a corresponding ramp in HBM production. Because HBM capacity cannot be expanded quickly (wafer starts take 6 to 8 weeks, and capacity additions require 12 to 18 months lead time for equipment), supply tightness is structurally persistent.
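
As a rough illustration of how those lead times compound, consider the lag between a capacity decision and the first new wafers out. The decision date below is hypothetical, and the durations are midpoints of the ranges cited above:

```python
# Hypothetical timeline: time from an HBM capacity go-decision to first
# output. Durations are midpoints of the ranges cited in the text; the
# start date is an arbitrary illustration.

from datetime import date, timedelta

decision = date(2025, 1, 1)               # hypothetical go-decision
equipment_lead = timedelta(days=30 * 15)  # 12-18 month equipment lead, midpoint
wafer_cycle = timedelta(weeks=7)          # 6-8 week wafer starts, midpoint

first_output = decision + equipment_lead + wafer_cycle
print(f"go-decision:      {decision}")
print(f"first new wafers: {first_output}")  # roughly 16 months later
```

Demand signaled today cannot be answered with new supply for well over a year, which is why tightness persists across an entire GPU generation.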

This means HBM is priced at a substantial premium to commodity DRAM — current pricing is 5 to 8 times standard DDR5 on a per-bit basis — and is sold under long-term supply agreements rather than in the spot market. Hyperscalers securing AI accelerator supply are simultaneously securing HBM supply through commitments to the memory manufacturers.

The Packaging Bottleneck Within the Bottleneck

HBM requires advanced packaging to integrate with the compute die. The dominant approach is a 2.5D integration where both the GPU die and HBM stacks sit on a large silicon interposer, manufactured through a process called CoWoS (chip-on-wafer-on-substrate) developed by TSMC. CoWoS packaging capacity has emerged as a constraint within the HBM supply chain — there is sufficient memory capacity but insufficient capacity to package it with the compute dies.

TSMC has been aggressively expanding CoWoS capacity, but the process requires specialized equipment and cleanroom space that cannot be added quickly. This packaging bottleneck briefly made CoWoS capacity the binding constraint on H100 supply in 2023, and is a dynamic worth tracking as Blackwell ramps.

The investment implication of the packaging bottleneck is that TSMC has pricing power not just for its leading-edge process nodes but for its advanced packaging capacity as well. Companies supplying packaging equipment — particularly for advanced integration technologies — also benefit from this dynamic.

Memory as an Investment Category

Memory semiconductors have historically been viewed as commodity investments — cyclical, low-margin, and driven by DRAM spot pricing. HBM changes this framing for the segment of memory supply serving AI infrastructure.

SK Hynix's HBM revenue has grown to represent a substantial portion of total revenue, and HBM ASPs are structurally higher than commodity DRAM. The company's HBM manufacturing capacity is nearly fully committed through 2025, limiting spot market exposure. This is a fundamentally different business dynamic than the boom-bust DRAM cycles investors have experienced historically.

Micron's HBM ramp represents a potential share-gain story. As the third player qualifying into major AI accelerator programs, Micron is earlier on the HBM curve than SK Hynix but has the manufacturing scale and process capability to be a credible competitor. Its qualification timeline with key customers will be a critical catalyst to monitor.

The framework we apply to memory companies in this context is not the traditional DRAM spot price model. We focus on HBM capacity allocation, qualification status with key accelerator OEMs, and the spread between HBM ASPs and DRAM commodity prices as a measure of the structural premium being sustained by the supply constraint.
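
A minimal sketch of that spread metric, using invented per-bit ASP series purely for illustration (the 5-to-8-times range cited above is the only anchor from the text):

```python
# Structural-premium tracker: HBM ASP over commodity DRAM ASP on a
# per-bit basis. Both quarterly series are hypothetical placeholders.

hbm_asp  = [0.90, 0.92, 0.95, 0.97]  # $/Gb, hypothetical
ddr5_asp = [0.14, 0.13, 0.13, 0.13]  # $/Gb, hypothetical

premiums = [h / d for h, d in zip(hbm_asp, ddr5_asp)]
for quarter, premium in enumerate(premiums, start=1):
    print(f"Q{quarter}: HBM premium = {premium:.1f}x commodity DRAM")

# A stable or widening ratio suggests the supply constraint is holding;
# compression toward commodity multiples would signal normalization.
```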

Topics: Research Report · Semiconductors · Memory · Technology