Before agencies can begin leveraging high-performance computing (HPC) and artificial intelligence, they must ensure their IT infrastructure can handle the transformation in how data is processed.
Vertiv specializes in critical infrastructure solutions and is helping government agencies build the power, cooling and service environments needed to support the massive computational workloads that AI requires.
Federal demand for HPC infrastructure continues to grow in step with agencies’ appetite for enhanced operations and decision-making.
“High-performance computing represents the upper ranges of compute and energy density,” says Tony Evans, vice president of federal strategy and sales at Vertiv.
AI and GPUs Require More Power and Generate More Heat
The shift to AI, particularly in conjunction with the use of graphics processing units (GPUs), has caused a dramatic increase in power consumption and heat generation — pushing traditional server cooling methods to their limits, Evans says.
GPUs are key to AI in an HPC environment, and the challenge lies in their energy and heat management.
“GPUs draw significantly more energy, and they generate so much heat that traditional air-based server cooling isn’t reliable in many cases,” Evans says.
As a result, liquid cooling solutions are on the rise because they’re more efficient at managing the heat generated by GPUs and entire servers.
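Nearly all of the electricity a server draws is ultimately rejected as heat that the cooling system must remove, which is why GPU-dense racks overwhelm air cooling. A minimal sketch of that arithmetic — the rack sizes below are hypothetical examples, while the kilowatt-to-BTU conversion is a standard constant:

```python
BTU_PER_KW_HR = 3412.14  # 1 kW of electrical load dissipates ~3,412 BTU of heat per hour

def heat_load_btu_per_hr(it_load_kw: float) -> float:
    """Estimate hourly heat load, assuming nearly all IT power is rejected as heat."""
    return it_load_kw * BTU_PER_KW_HR

# Hypothetical comparison: a traditional ~10 kW rack vs. a GPU-dense ~70 kW rack
print(round(heat_load_btu_per_hr(10)))  # roughly 34,000 BTU/hr
print(round(heat_load_btu_per_hr(70)))  # roughly 239,000 BTU/hr
```

The sevenfold jump in heat per rack, not just the power draw itself, is what pushes operators toward liquid cooling.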
The challenges go beyond cooling to power distribution across the IT environment, which must also evolve.
“The resulting infrastructure-related challenges range from adequate power supply and redundancy at the chip level to the electrical distribution and UPS, all the way to the building itself,” Evans says.
In some cases, even utility companies may struggle to provide the necessary power for AI-driven HPC environments.
Vertiv’s Solutions to Power and Cooling Challenges
To meet these challenges, Vertiv has developed solutions tailored to HPC environments, including advanced uninterruptible power supply (UPS) systems, power distribution, and both air and liquid cooling systems. These solutions improve performance while reducing energy consumption.
“Liquid cooling is much more efficient than air, allowing for higher working fluid temperatures — sometimes up to 30 to 40 degrees Fahrenheit higher,” Evans says.
This allows HPC environments to minimize energy use, he says. Maintaining the purity and quality of the coolant used in chip-level and immersion cooling systems is another key consideration.
Vertiv designed its uninterruptible power supply systems to manage the unique power draw characteristics of GPUs without being oversized.
Scalability and Modularity of HPC Infrastructure Are Key
Proper preparation is crucial for agencies beginning their AI journeys; they need a clear, budget-conscious path to achieving their goals.
“We have starter kits in 30- to 50-kilowatt increments for air-cooled servers and, for more energy-intensive servers, liquid cooling solutions in 70-kilowatt increments,” Evans says.
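Building in fixed increments like these reduces sizing to simple arithmetic: divide the target IT load by the increment size and round up. A rough illustration — the helper and the 400 kW target are hypothetical; the 70 kW increment comes from the quote above:

```python
import math

def increments_needed(target_kw: float, increment_kw: float) -> int:
    """Number of fixed-size power/cooling increments needed to cover a target IT load."""
    return math.ceil(target_kw / increment_kw)

# Hypothetical sizing: a 400 kW liquid-cooled deployment built from 70 kW increments
print(increments_needed(400, 70))  # → 6
```

Rounding up rather than sizing to the exact load leaves headroom in the final increment for growth.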
Scalable solutions that let agencies build out HPC capacity incrementally minimize the impact on existing infrastructure.
Agencies will eventually need modular cooling and power designs that can expand over time, saving both capital and operational expenditures, Evans says.
Vertiv’s service approach supports agencies through the entire data center lifecycle, from installation to ongoing maintenance, ensuring continuous support as agencies scale their HPC infrastructure to accommodate AI advancements. The company has been “deeply involved” in HPC projects across defense, intelligence and civilian agencies, Evans says.
“We’re proud to have helped with some pioneering implementations among the government’s research arms, and we’ve maintained an active dialog with most of the agencies that are doing this type of thing,” Evans says. “We are ready to assist with what’s next.”