Giga scale: The AI infrastructure gold rush

As AI rewrites the data center playbook, how can emerging players capture the prize?

The AI revolution has triggered an unprecedented surge in data center demand, redefining what digital infrastructure must deliver. This Viewpoint explores how AI is straining existing systems, why giga-scale data centers are emerging as the new backbone of compute, and what is required to lead in this new era. As new players emerge, speed matters, but expert judgment is vital to separate real opportunities from distractions.

AI CAUSING INDUSTRY-WIDE DISRUPTION

Data center demand is growing faster than supply, leading to historic lows in vacancy rates (see Figure 1). The global surge in AI adoption is intensifying this imbalance, increasing pressure on both capacity and infrastructure. In major markets like Northern Virginia, USA, and Frankfurt, Germany, vacancy rates dropped below 3% in 2024. This underscores the structural strain in supply pipelines and highlights the urgency of expansion. Arthur D. Little (ADL) forecasts that total data center capacity demand in the US will increase 2.5-3.5x between 2024 and 2030, driven by traditional factors (cloudification and traditional workload intensity) and by AI, which is expected to become the single largest contributor by the end of the decade.
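As a rough sanity check on the forecast above, a 2.5-3.5x increase between 2024 and 2030 can be converted into an implied compound annual growth rate. The sketch below assumes smooth year-on-year compounding over six periods, which the forecast itself does not claim; the function name is illustrative.

```python
def implied_cagr(multiple: float, years: int) -> float:
    """Annual growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1.0 / years) - 1.0


YEARS = 2030 - 2024  # six compounding periods (assumption: smooth growth)

# Low and high ends of the forecast range quoted in the text.
for multiple in (2.5, 3.5):
    rate = implied_cagr(multiple, YEARS)
    print(f"{multiple}x over {YEARS} years is about {rate:.1%} per year")
```

Under these assumptions, the forecast range corresponds to roughly 16% to 23% growth per year.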

AI isn’t new, of course, but breakthroughs in generative AI have led to a huge rise in infrastructure demand. Large language models pushed this shift, and predictive and machine learning systems are adding demand. Training these models requires immense datasets (e.g., Llama 3 was trained on ~15 trillion tokens) and remains a significant and persistent source of demand, especially as more specialized models and agents emerge.

Inference is also scaling fast, driven by adoption, real-world deployment, and monetization efforts. The rise of AI agents (autonomous systems that perceive, reason, and act), agentic systems (architectures that enable AI agents to plan, decide, and act autonomously, sometimes involving collaboration among multiple agents), and interconnected AI ecosystems (networks of models, agents, and tools that interact and share capabilities) is intensifying demand through continuous, coordinated inference calls. In the medium term, inference will overtake training as the main source of AI infrastructure demand.

Some countervailing dynamics may influence future demand. Certain players, particularly in China, are emphasizing more nimble AI architectures, such as small language models and alternative frameworks like artificial general decision-making, that could reduce compute intensity. Materials innovations are also emerging (e.g., in-memory CRAM [computational random access memory], photonic chips, 3D stacking), alongside new architecture approaches (e.g., neuromorphic computing, analog in-memory computing), increasingly efficient GPUs (graphics processing units)/TPUs (tensor processing units), and advanced cooling, all promising higher performance per watt.

Figure 1. Data center vacancy rates evolution in Tier I and II global markets (2020–2024)

A debate is surfacing around the sustainability of AI’s trajectory as it relates to data center capacity, with some parallels to the “dark fiber moment” of the early 2000s. There is no doubt about the trajectory of AI adoption, but the timing and speed of the AI revolution are not at all certain. AI infrastructure has a long, useful life and a wide range of applications. As Google CEO Sundar Pichai replied to investors worried about overinvestment in AI infrastructure, “When we go through a curve like this, the risk of underinvesting is dramatically greater than the risk of overinvesting for us.”

Technical challenges

The rapid growth of AI workloads requires more than added capacity — it demands a fundamental redesign of the infrastructure stack. Training and inference rely on highly parallelized environments with thousands of GPUs and specialized chips that traditional data centers were not built to handle. This shift strains every layer of the stack, with four infrastructure pain points standing out:

Power demand is rising sharply. GPU power use has nearly doubled in recent years (from about 400 W in 2018 to almost 750 W today) and could exceed 1,200 W by 2035 (see Figure 2). A rack hosting dozens of GPUs already surpasses 130 kW, with megawatt (MW)-scale prototypes emerging. Larger and more complex chips, specialized AI hardware (e.g., tensor cores designed for AI calculations), and packaging that combines multiple chips into one powerful unit are all driving this increase. Even though GPUs become about 40% more energy-efficient each year, their ability to handle larger and more complex models fuels heavier usage, pushing total consumption ever higher.
Cooling becomes critical. Higher power densities mean greater heat concentration and fluctuating thermal loads. Conventional air-cooling systems are no longer sufficient, pushing operators toward advanced liquid cooling and real-time thermal management solutions.
Design and power-delivery limits are being tested. Operators must channel enormous energy loads into compact footprints, forcing innovation in power distribution, redundancy, and even structural design. Many are turning to onsite generation, such as gas-fired plants or small modular reactors (SMRs), to bridge gaps to grid access and enhance resilience. Increasingly, projects are codeveloped with utilities or independent power producers to ensure sufficient capacity.
Networking is another bottleneck. Frontier models require ultra-low latency and huge bandwidth between thousands of GPUs. Technologies like Nvidia InfiniBand and NVLink-Switch, which outperform Ethernet in speed and latency, are becoming essential to large-scale AI clusters.
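The tension in the power pain point, where efficiency improves yet total consumption still climbs, can be illustrated with a short calculation. This is a minimal sketch that reads "40% more energy-efficient each year" as 40% more compute per watt; the compute-demand growth rates are hypothetical values chosen for illustration, not figures from this Viewpoint.

```python
def net_power_growth(compute_growth: float, efficiency_gain: float) -> float:
    """Yearly change in total power draw when compute demand grows by
    `compute_growth` and compute per watt improves by `efficiency_gain`."""
    return (1.0 + compute_growth) / (1.0 + efficiency_gain) - 1.0


EFFICIENCY_GAIN = 0.40  # from the text: about 40% more efficient per year

# HYPOTHETICAL compute-demand growth rates, for illustration only.
for compute_growth in (0.40, 1.00, 2.00):
    change = net_power_growth(compute_growth, EFFICIENCY_GAIN)
    print(f"compute +{compute_growth:.0%}/yr -> total power {change:+.0%}/yr")
```

In this simplified model, total power draw rises whenever compute demand grows faster than the efficiency gain, which is the pattern described above.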

The challenge is no longer about building bigger data centers; it’s about orchestrating complex, resource-intensive infrastructure. Operators must secure unprecedented amounts of power, land, and (increasingly) water access far in advance. They must navigate intricate permitting paths and meet heightened sovereign and societal expectations tied to strategic infrastructure. This is no longer a matter of scaling traditional hyperscale playbooks — it is a coordinated industrial effort that requires new approaches to development, procurement, and execution.

Figure 2. Evolution of power consumption by GPUs

GIGA-SCALE DATA CENTERS ARE KEY

A giga-scale data center is a next-generation facility delivering around a gigawatt (GW) of IT power, purpose-built for AI training and inference. Typically located on a large rural or semi-urban campus near high-voltage infrastructure, these sites scale in several-hundred-MW increments, reaching multiple gigawatts (in some cases above 5 GW). They combine advanced cooling, resilient long-term power supply, and ultra-high-bandwidth connectivity to reliably support frontier AI workloads. The most prominent recent example is Hyperion, recently announced by Meta, for which ADL served as the sole commercial due diligence adviser. The project is record-breaking in multiple ways:

The US $30 billion initial investment represents the largest-ever funding for a single-site data center, the largest-ever project financing, and the largest-ever private credit financing.
The data center has 2.6 GW initial capacity, expandable to 5+ GW.
It consists of 4+ million square feet (370,000 square meters) across nine interconnected buildings on 2,250 acres (910 hectares), roughly 15% of Manhattan's land area.
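The unit conversions in the figures above can be checked with a few lines of arithmetic. The sketch below uses the standard definitions of the international foot and acre; the helper names are illustrative, and "4+ million" is treated as a lower bound of exactly 4 million square feet.

```python
SQFT_TO_M2 = 0.09290304      # exact: (0.3048 m) squared
ACRE_TO_HA = 0.40468564224   # exact: one international acre in hectares


def sqft_to_m2(sqft: float) -> float:
    """Convert square feet to square meters."""
    return sqft * SQFT_TO_M2


def acres_to_ha(acres: float) -> float:
    """Convert acres to hectares."""
    return acres * ACRE_TO_HA


building_m2 = sqft_to_m2(4_000_000)  # lower bound of "4+ million" sq ft
site_ha = acres_to_ha(2_250)

print(f"Building area: about {building_m2:,.0f} square meters")
print(f"Site area: about {site_ha:,.1f} hectares")
```

Both results round to the metric values quoted in the list.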

Unlike legacy data centers, giga-scale sites are engineered specifically for large-scale AI compute. They introduce new levels of integration, performance, and efficiency — marking a shift from infrastructure that supports AI to one that powers its evolution. To stay ahead, major players are pivoting to giga-scale deployments, which offer three key advantages.

1. Purpose-built infrastructure

Unlike legacy facilities that struggle with performance limits and fragmentation, giga-scale campuses are designed to support large-scale training and inference with tightly integrated power, compute, and network layers. They deliver the performance, reliability, and low latency required for frontier models. These data centers can host tens of thousands of interconnected GPUs with high-speed interconnects, advanced cooling, and dense power delivery, and they can scale as AI workloads grow. Selecting sites with long-term expansion potential ensures continued relevance as compute intensity rises.

By colocating compute, storage, and networking, these environments enhance data gravity, streamline throughput, and reduce synchronization bottlenecks. They also offer robust physical and network security, making them ideal for sensitive workloads and model development.

2. First-mover advantage to capture momentum

Early entrants are securing high-capacity infrastructure ahead of grid congestion, shortages, and bottlenecks. This lets them set clear delivery timelines and meet the accelerating needs of AI model development. This momentum is reinforced by favorable market conditions in the form of abundant capital from large funds, robust regulatory and political backing across regions (aimed at strengthening domestic players in the global AI race), and reliable access to key commodities. As such, securing land and long-term power contracts is becoming a strategic differentiator, one that cannot be easily replicated once demand intensifies and grid strain increases. Early movers can also shape permitting frameworks and build trust with local authorities, ensuring smoother execution. This early visibility and demonstrated commitment enhance credibility with public stakeholders, an increasingly important asset amid growing scrutiny around land use, energy consumption, and community impact. By locking in critical infrastructure ahead of market saturation, early entrants not only de-risk execution, they also gain leverage with tenants, investors, and regulators, anchoring a durable first-mover advantage.

3. Economies of scale, energy efficiency & operational performance

Operating at a massive scale could deliver a meaningful decrease in cost per MW across power procurement and site management. Larger sites benefit from:

CAPEX and build-out efficiency. A 1 GW giga-scale data center could cut CAPEX per MW by 15%-25% compared to deploying the same capacity across smaller sites, through:
- Shared infrastructure. Leverages economies of scale by sharing equipment across the facility, reducing per-unit costs.
- Bulk procurement. Scale enables better sourcing contracts for high-cost components like switchgear, liquid cooling, and prefabricated modules.
- Design repetition. Building at scale allows iterative refinement of modular site designs, which improves build speed, cost predictability, and construction quality.
- CAPEX savings from onsite power. Onsite generation can reduce up-front infrastructure costs by limiting the need for extensive backup systems (e.g., diesel generators or large-scale battery storage), especially when primary generation is designed for high availability.
Operational and energy efficiency:
- Consolidated operations. Centralized security, maintenance, and monitoring reduce operational costs and duplication across regions. This means fewer staff per MW due to standardized layouts, lowering OPEX per unit of deployed capacity.
- Operational leverage. Operators can justify investing in centralized spare parts inventory, predictive maintenance platforms, and dedicated reliability engineering; this improves uptime and reduces unplanned outages.
- OPEX savings from synergies. OPEX can be reduced by avoiding grid transmission fees in certain markets and leveraging synergies between onsite generation and data center load profiles. This improves energy utilization and lowers the effective cost per MWh over time through co-optimization of generation and consumption.
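To give a sense of scale for the 15%-25% CAPEX-per-MW reduction cited above, the sketch below applies that range to a 1 GW build. The baseline cost per MW is a hypothetical placeholder, not a figure from this Viewpoint, and should be replaced with a project-specific estimate.

```python
def capex_savings(capacity_mw: float, baseline_capex_per_mw: float,
                  reduction: float) -> float:
    """Total CAPEX avoided when cost per MW falls by `reduction`."""
    return capacity_mw * baseline_capex_per_mw * reduction


CAPACITY_MW = 1_000                  # 1 GW, as in the text
BASELINE_CAPEX_PER_MW = 10_000_000   # HYPOTHETICAL placeholder, in USD

# Low and high ends of the reduction range quoted in the text.
for reduction in (0.15, 0.25):
    saved = capex_savings(CAPACITY_MW, BASELINE_CAPEX_PER_MW, reduction)
    print(f"{reduction:.0%} reduction -> about ${saved / 1e9:.1f} billion saved")
```

Because savings scale linearly with the baseline, the placeholder only sets the order of magnitude; the percentage range is what comes from the text.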

Beyond efficiency, giga-scale campuses offer versatility, such as combining buildings with varied specs to handle diverse workloads. As AI shifts from compute-intensive training to inference-heavy demand, their modular design allows easy adaptation and retrofitting — ensuring they stay optimized for future workloads across hyperscalers and enterprise inference needs.

THE RACE IS ON

Giga-scale data centers define who can deploy, scale, and lead in a market driven by access to compute. For hyperscalers, it’s about staying ahead of demand. For governments, it’s about securing digital sovereignty. For investors, it’s a matter of early positioning for AI competitiveness. What was once dominated by Amazon, Microsoft, and Google is now a crowded field (see Figure 3).

Challengers like CoreWeave, Lambda, Crusoe, Meta, Oracle, OpenAI, and Nvidia are scaling rapidly. This surge is reshaping the infrastructure value chain, turning compute capacity into a strategic asset and making early access to giga-scale developments a prerequisite for long-term relevance. In the AI era, access to compute determines how fast companies can train models, scale products, and capture value. True access involves controlling land, power, and purpose-built data centers capable of dense, AI-optimized workloads. The companies that command this infrastructure will set the pace for innovation and competitiveness across the AI value chain.

Figure 3. Giga-scale data center announcements

The global race to build giga-scale data centers has begun. Investment is accelerating, permitting frameworks are evolving, and infrastructure is being treated as a strategic asset. What follows is not a regional arms race; it’s a reordering of who controls the physical backbone of AI deployment. Currently, the US is in the lead, but sovereign-backed efforts are gaining momentum across Europe, the Middle East, and Asia. The risk of a widening global AI divide is real: only 32 nations, mostly in the Northern Hemisphere, have AI-specialized data centers.

In the US, AWS has committed more than $150 billion to new developments, and Microsoft and Google are expanding rapidly (see Figure 4). Meta is building multi-GW AI campuses, including Hyperion, and Oracle is scaling through its Stargate initiative, growing its Abilene, Texas, USA, site to nearly 2 GW, targeting 4.5 GW with OpenAI and Nvidia. These private sector efforts are reinforced by a national policy agenda treating data centers, grid access, and semiconductors as strategic enablers of AI dominance, backed by permitting reform, land allocation, and workforce acceleration.

Europe and the Middle East are following suit. The EU’s €20 billion AI-Continent Plan is driving giga-scale sites across France, Portugal, Spain, and the UK under the banner of digital sovereignty. The United Arab Emirates’ 5 GW AI campus, built via Stargate and G42, cements its hub status; Saudi Arabia’s HUMAIN is targeting 1.9 GW by 2030 and 6.6 GW by 2034. In Asia, countries such as India, Malaysia, and Japan are scaling quickly, driven by domestic cloud demand, data localization, and renewable energy integration. India is targeting 3 GW of capacity, with major efforts underway by Reliance and Nxtra.

Figure 4. Investment by major players