How Nvidia is creating a $1.4T data center market in a decade of AI
We are witnessing the rise of a completely new computing era. Within the next decade, a trillion-dollar-plus data center business is poised for transformation, powered by what we refer to as extreme parallel computing, or EPC — or as some prefer to call it, accelerated computing. Though artificial intelligence is the primary accelerant, the effects ripple across the entire technology stack.
Nvidia Corp. sits in the vanguard of this shift, forging an end-to-end platform that integrates hardware, software, systems engineering and a massive ecosystem. Our view is that Nvidia has a 10- to 20-year runway to drive this transformation, but the market forces at play are much larger than any single player. This new paradigm is about reimagining compute from the ground up: from the chip level to data center equipment, to distributed computing at scale, to data and application stacks, and to emerging robotics at the edge.
In this Breaking Analysis, we explore how extreme parallel computing is reshaping the tech landscape, the performance of the major semiconductor players, the competition Nvidia faces, the depth of its moat, and how its software stack cements its leadership. We will also address a recent development from CES — the arrival of so-called “AI PCs” — with data from Enterprise Technology Research. We’ll then look at how the data center market could reach $1.7 trillion by 2035. Finally, we will discuss both the upside potential and the risks that threaten this positive scenario.
Optimizing the technology stack for extreme parallel computing
Our research indicates that every layer of the technology stack — from compute to storage to networking to the software layers — will be re-architected for AI-driven workloads and extreme parallelism. We believe the transition from general-purpose x86 central processing units toward distributed clusters of graphics processing units and specialized accelerators is happening even faster than many anticipated. What follows is our brief assessment of several layers of the data center tech stack and the implications of EPC.
Compute
For more than three decades, x86 architectures dominated computing. Today, general-purpose processing is giving way to specialized accelerators. GPUs are the heart of this change. AI workloads such as large language models, natural language processing, advanced analytics and real-time inference demand massive concurrency.
- Extreme parallelism: Traditional multicore CPU scaling has hit diminishing returns. By contrast, a single GPU can contain thousands of cores. Even if a GPU is more expensive at the packaged level, on a per-unit-of-compute basis it can be far cheaper, given its massively parallel design (a rough cost-per-teraflop comparison is sketched after this list).
- AI at scale: Highly parallel processors require advanced system design. Large GPU clusters rely on high-bandwidth memory, or HBM, and require fast interconnects (such as InfiniBand or ultra-fast Ethernet). This synergy among GPUs, high-speed networking and specialized software is enabling new classes of workloads.
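To make the per-unit-of-compute point concrete, here is a minimal sketch in Python. All prices and sustained-throughput figures are hypothetical placeholders chosen only to illustrate the arithmetic, not vendor specifications.

```python
# Illustrative only: all prices and throughput figures below are hypothetical
# placeholders, not vendor specifications.

def cost_per_tflop(unit_price_usd: float, sustained_tflops: float) -> float:
    """Return dollars per sustained teraflop for a given device."""
    return unit_price_usd / sustained_tflops

# Hypothetical general-purpose CPU: modest parallelism, lower unit price.
cpu_cost = cost_per_tflop(unit_price_usd=4_000, sustained_tflops=4)

# Hypothetical data center GPU: far higher unit price, but thousands of cores
# delivering orders of magnitude more parallel throughput.
gpu_cost = cost_per_tflop(unit_price_usd=30_000, sustained_tflops=1_000)

print(f"CPU: ${cpu_cost:,.0f} per sustained TFLOP")   # ~$1,000
print(f"GPU: ${gpu_cost:,.0f} per sustained TFLOP")   # ~$30
```

Under these assumed numbers the pricier device is still far cheaper per unit of delivered compute, which is the economic logic behind the shift to accelerators.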
Storage
While storage is sometimes overlooked in AI conversations, data is the fuel that drives neural networks. We believe AI demands advanced, high-performance storage solutions:
- Anticipatory data staging: Next-generation data systems anticipate which data will be requested by a model, ensuring that data resides near the processors ahead of time to reduce latency and address physical limits as much as possible.
- Distributed file and object stores: Petabyte-scale capacity will be the norm, with metadata-driven intelligence orchestrating data placement across nodes.
- Performance layers: NVMe SSDs, all-flash arrays and high-throughput data fabrics play a significant role in keeping GPUs and accelerators saturated with data (a minimal prefetching sketch follows this list).
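Below is a minimal sketch of anticipatory data staging in Python: a background thread stages upcoming batches into a small in-memory buffer while the current batch is processed, so the compute device is not left waiting on storage. The batch sizes, delays and loader/processor functions are invented stand-ins.

```python
# A minimal sketch of anticipatory data staging: a background thread stages the
# next batches into memory while the accelerator works on the current one.
# Batch sizes, delays and the load/process functions are placeholders.

import queue
import threading
import time

def load_batch(i: int) -> list[int]:
    time.sleep(0.05)          # stand-in for slow storage I/O
    return list(range(i * 4, i * 4 + 4))

def process_batch(batch: list[int]) -> int:
    time.sleep(0.01)          # stand-in for fast accelerator compute
    return sum(batch)

staged: queue.Queue = queue.Queue(maxsize=2)   # small "near-compute" buffer

def prefetcher(num_batches: int) -> None:
    for i in range(num_batches):
        staged.put(load_batch(i))              # stage data ahead of demand
    staged.put(None)                           # sentinel: no more data

threading.Thread(target=prefetcher, args=(8,), daemon=True).start()

while (batch := staged.get()) is not None:
    print(process_batch(batch))
```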
Networking
With mobile and cloud in the last decade, we saw network traffic shift from a north-south trajectory (user to data center) toward an east-west bias (server to server). AI-driven workloads generate massive east-west and north-south traffic within the data center and across networks. In the world of high-performance computing (HPC), InfiniBand emerged as the go-to technology for ultra-low-latency interconnects. We now see that trend permeating hyperscale data centers, with high-performance Ethernet as a dominant standard that, in our view, will ultimately prove to be the prevailing open network of choice:
- Hyper-scale networks: Ultra-high-bandwidth and ultra-low-latency fabrics will facilitate the parallel operations needed by AI clusters.
- Multi-directional traffic: Data center traffic was once dominated by north-south flows and more recently by east-west flows; advanced AI workloads now generate traffic in every direction (the toy all-reduce simulation below illustrates why east-west exchange explodes).
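As an illustration, here is a toy, single-process Python simulation of a ring all-reduce, the collective commonly used to synchronize gradients across training nodes. It is not a networking implementation; it simply shows that every step of the algorithm has every node sending data to a neighbor, which is exactly the sustained server-to-server traffic these fabrics must carry. The node count and payload values are arbitrary.

```python
# Toy simulation of ring all-reduce across N "nodes". Each node starts with N
# chunks of data; after a reduce-scatter phase and an all-gather phase, every
# node holds the same fully reduced values. Every step involves neighbor-to-
# neighbor (east-west) transfers.

N = 4                                              # number of simulated nodes
chunks = [[float(n + 1)] * N for n in range(N)]    # node n holds N chunks of value n+1

# Reduce-scatter phase: pass partial sums around the ring.
for step in range(N - 1):
    for node in range(N):
        send_idx = (node - step) % N
        neighbor = (node + 1) % N
        chunks[neighbor][send_idx] += chunks[node][send_idx]

# All-gather phase: circulate the fully reduced chunks around the ring.
for step in range(N - 1):
    for node in range(N):
        send_idx = (node + 1 - step) % N
        neighbor = (node + 1) % N
        chunks[neighbor][send_idx] = chunks[node][send_idx]

print(chunks)   # every node ends up with the same reduced values (all 10.0 here)
```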
Software stack and tooling
OS and system-level software
Accelerated computing imposes huge demands on operating systems, middleware, libraries, compilers and application frameworks, all of which must be tuned to exploit GPU resources. As developers create more advanced applications, some bridging real-time analytics and historical data, system-level software must manage concurrency at unprecedented levels. The OS, middleware, tools, libraries and compilers are rapidly evolving to support ultra-parallel workloads and to become GPU-aware.
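A minimal sketch of the kind of resource-aware concurrency this implies, in plain Python: many tasks arrive, but the scheduler admits only as many as there are accelerator slots. The slot count, task bodies and timings are invented placeholders rather than a real driver or OS interface.

```python
# Many concurrent tasks, but only as many run at once as there are accelerator
# slots. Everything here is an illustrative placeholder, not a real OS or
# driver interface.

import threading
import time
from concurrent.futures import ThreadPoolExecutor

ACCELERATOR_SLOTS = 2                      # pretend the node has two GPUs
slots = threading.Semaphore(ACCELERATOR_SLOTS)

def run_on_accelerator(task_id: int) -> str:
    with slots:                            # block until a device is free
        time.sleep(0.1)                    # stand-in for a kernel launch
        return f"task {task_id} done"

with ThreadPoolExecutor(max_workers=8) as pool:
    for result in pool.map(run_on_accelerator, range(8)):
        print(result)
```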
Data layer
Data is the fuel for AI, and the data stack is rapidly becoming infused with intelligence. We see the data layer shifting from a historical system of analytics to a real-time engine that supports the creation of real-time digital representations of an organization, comprising people, places and things as well as processes. To support this vision, data harmonization via knowledge graphs, unified metadata repositories, agent control frameworks, unified governance and connectors to operational and analytic systems will emerge.
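To ground the knowledge-graph and unified-metadata idea, here is a toy, dependency-free Python sketch that links entities (people, places, things, processes) with typed relations and records which source systems contributed each fact. The entities, relations and system names are invented for illustration.

```python
# Toy data harmonization via a knowledge graph: entities are linked by typed
# relations, and a unified metadata record notes which source systems the
# facts came from. All names and systems are invented.

from collections import defaultdict

graph = defaultdict(list)          # entity -> [(relation, entity)]
metadata = {}                      # entity -> set of contributing source systems

def add_fact(subject: str, relation: str, obj: str, source: str) -> None:
    graph[subject].append((relation, obj))
    metadata.setdefault(subject, set()).add(source)
    metadata.setdefault(obj, set()).add(source)

add_fact("Order-1042", "placed_by", "Customer-7", source="CRM")
add_fact("Order-1042", "fulfilled_from", "Warehouse-Berlin", source="ERP")
add_fact("Warehouse-Berlin", "runs_process", "Pick-Pack-Ship", source="MES")

# A harmonized view of one entity across operational systems.
print(graph["Order-1042"])
print(metadata["Order-1042"])      # {'CRM', 'ERP'}
```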
The application layer
Intelligent applications are emerging that unify and harmonize data. These apps increasingly have real-time access to business logic as well as process knowledge. Single-agent systems are evolving into multi-agent architectures with the ability to learn from the reasoning traces of humans (a toy sketch follows below). Applications increasingly understand human language, inject intelligence everywhere and automate workflows, creating new ways to deliver business outcomes. They are also becoming extensions of the physical world, with opportunities in virtually all industries to create digital twins that represent a business in real time.
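Here is a toy Python sketch of the single-agent-to-multi-agent shift: a planner decomposes a goal, delegates steps to specialist agents and keeps the resulting reasoning trace so it can later be reviewed or learned from. The agents, goal and steps are invented placeholders; no model calls are made.

```python
# Toy multi-agent setup: a planner delegates steps to specialist agents and
# records the reasoning trace. Agents, goal and steps are placeholders.

from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    skill: str
    def act(self, step: str) -> str:
        return f"{self.name} handled '{step}' using {self.skill}"

@dataclass
class Planner:
    team: list[Agent]
    trace: list[str] = field(default_factory=list)

    def run(self, goal: str, steps: list[str]) -> list[str]:
        self.trace.append(f"goal: {goal}")
        results = []
        for step, agent in zip(steps, self.team):
            outcome = agent.act(step)
            self.trace.append(outcome)     # keep the trace for later learning
            results.append(outcome)
        return results

team = [Agent("retriever", "document search"), Agent("analyst", "forecasting")]
planner = Planner(team)
planner.run("prepare demand forecast", ["gather sales history", "project next quarter"])
print("\n".join(planner.trace))
```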