
May 03 2024

Q&A: How AMD EPYC and Instinct Processors Meet Agency Compute Needs

Wider implementation of artificial intelligence and high-performance computing in federal data centers requires specialized server hardware and open-source software options.

The Office of Management and Budget signaled a tipping point in the government’s embrace of artificial intelligence when it recently issued guidance to agencies on use of the emerging technology.

As agencies move into this new era of AI-driven innovation — and original equipment manufacturers position themselves to meet growing demand — technology procurement decisions are taking on greater significance.

Developing expertise in the underlying technologies that drive AI, high-performance computing and other data-intensive compute workloads will be a key factor in the success of agency use cases.

FedTech recently spoke with Patrick Pinchera, senior solution architect for the federal government and public sector at AMD, about how agencies can best position themselves for the changing data center landscape as the use of AI grows.

DISCOVER: How AMD can help advance data center performance for your agency.

FEDTECH: Why is memory hierarchy such an important consideration for server processors running generative AI and HPC workloads?

PINCHERA: Generative AI and HPC are both memory-intensive processes. They require quick access to memory. That’s a feature we’ve worked on with our Zen 4 CPU cores, improving the memory hierarchy.

With the EPYC server processors, we went from eight channels of DDR4 memory in Zen 3 to 12 channels of DDR5 memory in Zen 4. As we rolled out successive Zen server CPUs, we saw many workloads become memory-starved as core counts increased. Zen 4 increases both the speed of the memory and the bandwidth.
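The channel and memory-generation jump can be put in rough numbers. A minimal sketch, assuming DDR4-3200 for the eight-channel Zen 3 platform and DDR5-4800 for the 12-channel Zen 4 platform (typical platform speeds, not figures quoted in the interview), with 64-bit (8-byte) channels:

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s: channels x transfer rate x channel width."""
    return channels * mega_transfers_per_s * bus_bytes / 1000

# Assumed DIMM speeds for illustration: DDR4-3200 and DDR5-4800.
zen3 = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)
zen4 = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=4800)

print(f"Zen 3 (8ch DDR4-3200):  {zen3:.1f} GB/s")
print(f"Zen 4 (12ch DDR5-4800): {zen4:.1f} GB/s")
print(f"Uplift: {zen4 / zen3:.2f}x")
```

Under those assumptions, more channels and faster DIMMs together roughly double theoretical peak bandwidth, which is what relieves memory-starved, high-core-count workloads.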

We also increased the caches. When you put the most frequently accessed instructions and data closest to the CPU, you don't even have to go out to main memory. In Zen 4, we doubled the Level 2 cache to 1 megabyte and made other improvements to keep the data local to the cores.

We are seeing a 14 percent increase in the number of instructions executed per clock cycle, on average. That increase allows you to get more work done without having to increase the clock speed.
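As a back-of-the-envelope model, sustained throughput scales with cores x clock x instructions per clock, so a 14 percent IPC gain lifts throughput by the same factor at an unchanged clock. The core count and clock speed below are placeholders for illustration only:

```python
def relative_throughput(cores: int, ghz: float, ipc: float) -> float:
    """Simple proportional model: work per second ~ cores x clock x IPC."""
    return cores * ghz * ipc

# Hypothetical 64-core part at 2.5 GHz; only the IPC factor changes.
baseline = relative_throughput(cores=64, ghz=2.5, ipc=1.0)
improved = relative_throughput(cores=64, ghz=2.5, ipc=1.14)
print(f"Throughput uplift: {improved / baseline:.2f}x")
```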


FEDTECH: What other processor innovations are meeting today’s generative AI and HPC workload needs at agencies?

PINCHERA: Security is an important one — maintaining the confidentiality, integrity and availability of agency data. Data-intensive operations such as generative AI and HPC make for attractive targets because of the high-value data involved. Security has been an ongoing focus for our Zen series, protecting against side-channel attacks.

With Infinity Guard, we can encrypt all the memory without the customer needing to do anything on their end. We also encrypt the data in use, with a negligible 3 percent performance overhead. And with Zen 4, we introduced the capability for guest operating systems in virtualized environments to run exclusively on one core.
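On Linux hosts, the kernel exposes AMD's memory-encryption capabilities as CPU flags. The helper below is an illustrative sketch, not an AMD tool: it checks a /proc/cpuinfo dump for the flag names the kernel actually uses (sme, sev, sev_es, sev_snp):

```python
FEATURES = ("sme", "sev", "sev_es", "sev_snp")

def encryption_features(cpuinfo_text: str) -> dict:
    """Report which memory-encryption flags appear in a /proc/cpuinfo dump."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags.update(line.split(":", 1)[1].split())
    return {f: f in flags for f in FEATURES}

# Synthetic example text; on a real host, read the contents of /proc/cpuinfo.
sample = "processor : 0\nflags\t\t: fpu sme sev sev_es\n"
print(encryption_features(sample))
```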

Another one is power management. On today's system-on-chip designs — with the large core counts, multiple memory hierarchies and wide data paths — if you direct the power needed to move, store and crunch the data to where and when it is needed, you can maintain very good base clock speeds and still have the ability to boost the clock when the workload needs it.

These power management settings are accessed in the server’s basic input/output system and let IT staff tune the system for their particular workload. Our server engineering teams have published many tuning guides to help customers navigate the BIOS settings and get the best performance from their servers.

The specialized power management delivers great energy efficiencies, which is helpful in data centers that can’t be redesigned for more power and cooling and are out of floor space.


FEDTECH: As agencies look to implement new AI technology, how can they determine the right processor for the workload?

PINCHERA: There are many valuable use cases for generative AI across agencies. As in the private sector, agencies are training large language models on their own intellectual property and looking to keep it secure. They are engaged in both the training of LLMs and inferencing, which means executing the trained model. Training and inferencing each have separate computing needs.

Training is a compute-intensive operation best accomplished on a graphics processing unit, which can do parallel processing. If your application can benefit from breaking the workload into many computations that run in parallel, a GPU is the right choice.
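The parallel-decomposition idea can be sketched in a few lines. Here, CPU threads stand in for GPU lanes purely for illustration: a dot product splits into independent chunks whose partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor

def partial_dot(xs, ys):
    """One independent chunk of work: a partial dot product."""
    return sum(x * y for x, y in zip(xs, ys))

def parallel_dot(xs, ys, workers=4):
    """Split the vectors into chunks and compute the pieces concurrently."""
    size = max(1, len(xs) // workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(partial_dot, xs[i:i + size], ys[i:i + size])
                   for i in range(0, len(xs), size)]
        return sum(f.result() for f in futures)

xs = list(range(1000))
print(parallel_dot(xs, xs))  # same answer as the serial dot product
```

A workload qualifies for this treatment when the chunks do not depend on each other's results; that independence is what a GPU's thousands of lanes exploit.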

The Instinct MI300X and MI300A represent our third generation of compute GPUs. They feature a 3.5D chiplet design, the CDNA 3 compute architecture, fourth-gen EPYC CPU cores in the MI300A and high-bandwidth memory. Even before the generative AI boom, our MI300 series accelerators were designed for the needs of HPC workloads and large data sets.

AI inferencing has lower compute demands than training. We recommend using our fourth-gen AMD EPYC processors with off-the-shelf servers. With up to 96 cores, high memory bandwidth and support for AI-specific data types, our EPYC Zen 4 series is a good fit for inference computing. We’re able to provide the GPU and the CPU you need, wherever the agency is in its use of AI.
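One reason inference fits CPUs well is that trained models are often served in reduced precision. Below is a toy symmetric int8 quantization round trip, a generic textbook scheme rather than any specific AMD or framework API:

```python
def quantize(weights, bits=8):
    """Symmetric quantization: map floats onto signed integers via one scale."""
    qmax = 2 ** (bits - 1) - 1              # 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]

weights = [0.51, -1.27, 0.02, 0.98]
q, scale = quantize(weights)
restored = dequantize(q, scale)
print(q)                                # integer weights, 1 byte each
print([round(w, 3) for w in restored])  # close to the originals
```

Shrinking each weight from 4 bytes to 1 cuts both memory footprint and memory traffic, which is exactly where a high-bandwidth, many-core CPU does its inference work.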

DISCOVER: DOE’s Kestrel supercomputer has entered its second phase.

FEDTECH: Where should agencies look for additional AI adoption opportunities?

PINCHERA: We know the most recent Government Accountability Office figures show that 20 out of 23 reporting agencies have about 1,200 current or planned AI use cases, so AI has taken root across many agencies. Having said that, I often recommend my federal customers look for “greenfield” opportunities: solutions that already run on software that doesn’t require a rewrite of existing code.

Some AI workload examples include PyTorch, TensorFlow, Open Neural Network Exchange and Google JAX. If the customer has legacy CUDA code, our ROCm software ecosystem has tools that will translate the CUDA calls into portable calls that can be compiled for both AMD and NVIDIA GPUs.
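To give a flavor of what that translation involves, the sketch below maps a few real CUDA runtime calls to their HIP equivalents. The API name pairs are genuine, but the toy translator itself is a simplification for illustration and stands in for the actual ROCm tooling:

```python
import re

# Real CUDA-runtime-to-HIP name pairs; the table is deliberately tiny.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

def hipify_line(line: str) -> str:
    """Rewrite known CUDA runtime calls on one source line to HIP calls."""
    for cuda, hip in CUDA_TO_HIP.items():
        line = re.sub(rf"\b{cuda}\b", hip, line)
    return line

print(hipify_line("cudaMalloc(&ptr, n); cudaDeviceSynchronize();"))
```

Because the HIP API mirrors the CUDA runtime closely, most of the real tool's work is this kind of mechanical renaming, which is why translated code can then compile for either vendor's GPUs.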

FEDTECH: What steps can agencies take to upgrade their data centers to meet growing generative AI and HPC demand?

PINCHERA: As we move into the future, agencies are looking to replace their older infrastructure. They are not typically able to redesign their data centers; they have to work with the space, design, power loads and cooling that are already in place. Agencies are very conscious of power and space efficiency.

They can replace their two-socket servers with one-socket EPYC servers without losing performance, while also reducing total cost of ownership. Both EPYC and Instinct have the cores and compute units to address extremely large data sets and build language models in a reasonable amount of time.
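The consolidation math is straightforward. A sketch with hypothetical numbers (1,536 total cores, dual-socket 32-core legacy servers, single-socket 96-core replacements; the figures are illustrative, not from the interview):

```python
def servers_needed(total_cores: int, cores_per_server: int) -> int:
    """Ceiling division: how many servers cover the required core count."""
    return -(-total_cores // cores_per_server)

TOTAL_CORES = 1536
legacy = servers_needed(TOTAL_CORES, cores_per_server=2 * 32)   # dual-socket, 32-core
replacement = servers_needed(TOTAL_CORES, cores_per_server=96)  # single-socket, 96-core
print(f"{legacy} legacy servers -> {replacement} single-socket servers")
```

Fewer boxes for the same core count is what makes the swap attractive in data centers that cannot add power, cooling or floor space.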

LEARN MORE: Don’t implement an AI use case without proper governance.

Another consideration has to be the software. Code portability is a growing need. Agencies are trying to avoid the nonrecurring engineering costs involved in building customized code for a single hardware platform.

Open-source is becoming more attractive to agencies. Using open-source software lets them move between hardware vendors as needed.

AMD has partnerships with independent software vendors. We consult with many of them on their code. The goal is to get their code to run optimally on our hardware, without having to do any custom building.

EPYC CPUs and Instinct GPUs are purpose-built to support open and portable software. And our ROCm GPU software stack allows customers to build open-source generative AI and HPC code so it can run on the best hardware the agency can procure. In the long term, that’s a win for agencies.


Photo provided by AMD