Beyond the HBM Headline
High-bandwidth memory has received the most attention among AI infrastructure memory components — appropriately, given that HBM is the binding constraint on GPU performance in large model training workloads. GZC has covered this constraint in prior research.
But HBM is not the entirety of the memory and storage bottleneck in AI infrastructure. There is a broader architecture of memory and storage that must scale with the AI compute buildout, and several layers of this architecture have supply dynamics that equity markets have not fully priced.
The Memory Hierarchy in an AI Cluster
An AI training or inference cluster involves multiple tiers of memory, each serving a different function in the compute pipeline:
High-Bandwidth Memory (HBM): On-package memory stacked beside the GPU die and connected to it within the same package. This is the most performance-critical memory layer and the bottleneck for model training speed. Production is dominated by three manufacturers globally, with lead times and pricing reflecting concentrated supply.
Server DRAM (DDR5): General-purpose DRAM used in the CPU and memory systems that surround GPU clusters. AI servers require substantially more DRAM per server than standard compute — partly because the GPU requires more host-side memory to manage the data pipeline, and partly because inference architectures often require keeping large model weights in host memory. DDR5, the current generation standard, is in its early production ramp.
Enterprise Flash Storage: Large AI training runs require enormous datasets to be accessible at high throughput. NVMe-based flash storage — the enterprise standard for high-performance storage in data center applications — is a critical component of the data pipeline that feeds GPU clusters. The bandwidth requirements of AI training workloads have driven demand for flash storage at scales and specifications that exceed what traditional data center architectures required.
Persistent Memory and Storage Tiers: For inference at scale, where model weights must be loaded, cached, and served to millions of requests, the storage architecture below the GPU and DRAM layers becomes a meaningful performance constraint. This drives demand for specialized storage architectures optimized for AI inference workload patterns.
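To make the tiering concrete, the sketch below estimates how much memory a model's weights occupy at a given numeric precision and reports the fastest tier that can hold them. The bytes-per-parameter values are standard format sizes. The tier capacities, the 70-billion-parameter model, and the function names are illustrative assumptions, not figures from GZC research or any vendor specification.

```python
from typing import List, Optional, Tuple

# Bytes per parameter for common numeric formats (standard sizes).
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_footprint_gb(num_params: int, dtype: str) -> float:
    """Memory needed for the model weights alone, in GB (10^9 bytes).

    Ignores activations, KV cache, and optimizer state, which add to the total.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def first_tier_that_fits(
    footprint_gb: float, tiers: List[Tuple[str, float]]
) -> Optional[str]:
    """Return the first (fastest) tier whose capacity covers the footprint."""
    for name, capacity_gb in tiers:
        if footprint_gb <= capacity_gb:
            return name
    return None


# Hypothetical per-server capacities, ordered fastest to slowest (assumptions).
EXAMPLE_TIERS = [
    ("HBM", 80.0),
    ("host DRAM", 1024.0),
    ("NVMe flash", 30000.0),
]

if __name__ == "__main__":
    params = 70_000_000_000  # hypothetical 70B-parameter model
    for dtype in ("fp32", "fp16", "int8"):
        size = weight_footprint_gb(params, dtype)
        tier = first_tier_that_fits(size, EXAMPLE_TIERS)
        print(f"{dtype}: {size:.0f} GB of weights, fastest tier that fits: {tier}")
```

Under these assumed capacities, the same model lands in a different tier depending on precision, which is one way to see why AI servers pull demand through every layer of the hierarchy rather than through HBM alone.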
The DDR5 Transition
DDR5 is the current memory standard for high-performance server platforms. The move from DDR4 to DDR5 is not merely a performance upgrade; it carries supply chain implications for anyone deploying AI infrastructure at scale.
DDR5 production capacity has been ramping, but the pace of ramp has not always matched the pace of server platform transitions. The three major DRAM manufacturers — Samsung, SK Hynix, and Micron — are each executing DDR5 capacity transitions while simultaneously managing HBM capacity additions, creating a supply allocation challenge that periodically tightens availability in one segment as investment flows to another.
From a BTT perspective: DDR5 demand is non-discretionary for anyone deploying current-generation AI server platforms. Supply is concentrated in three manufacturers. Capacity allocation between HBM and DDR5 creates periodic tightness that manifests as either pricing power or allocation constraints.
Enterprise Storage: The Flash Layer
The enterprise flash storage layer — served primarily by NVMe SSDs in data center configurations — has undergone significant structural change driven by AI infrastructure demand. The read and write pattern requirements of AI training and inference workloads differ materially from traditional database and file server workloads, driving the development of specialized storage architectures optimized for AI applications.
Companies that have built enterprise storage platforms specifically designed for AI workload patterns — high sequential read throughput, large capacity per rack unit, low latency for metadata operations — are in a favorable competitive position. The specification requirements for AI storage create product differentiation that is not easily replicated by commodity flash storage providers.
Applying the BTT Framework
The memory and storage layer of AI infrastructure has several characteristics that fit the BTT framework well:
Non-discretionary demand: AI infrastructure cannot function without adequate memory and storage at each layer of the hierarchy. There is no substitute for the memory tier closest to the GPU, because bandwidth falls and latency rises with every step further from the processor.
Concentrated supply: The DRAM and HBM manufacturing industries are effectively oligopolies. Three companies produce the majority of global DRAM supply. Enterprise flash manufacturing is more fragmented but still concentrated among a handful of companies with the capital and expertise to serve the performance requirements of AI workloads.
Multi-year procurement visibility: Memory and storage procurement for hyperscale AI infrastructure is planned and contracted in multi-year cycles, aligned with data center buildout timelines. This creates the revenue predictability and margin stability that BTT analysis targets.
Position Implications
GZC's memory and storage coverage focuses on companies positioned at the intersection of constrained supply, non-discretionary demand, and AI-specific product requirements. We look for companies where the AI workload requirement creates a genuine performance specification that commodity alternatives cannot meet — because this specification requirement is the source of durable pricing power.
The memory cycle is long. The transition to DDR5, the continued growth of HBM demand, and the development of AI-optimized storage architectures represent a three-to-five-year investment horizon over which the supply-demand dynamics we identify today will continue to generate forced spend in companies that own the relevant supply chain positions.


