FEDTECH: As agencies look to implement new AI technology, how can they determine the right processor for the workload?
PINCHERA: There are many valuable use cases for generative AI across agencies. As in the private sector, agencies are training large language models on their own intellectual property and looking to keep that data secure. They are engaged in both training LLMs and inferencing, that is, executing the trained model. Training and inferencing have distinct computing needs.
Training is a compute intensive operation best accomplished on a graphics processing unit, which can do parallel processing. If your application can benefit from breaking up the workload into many computations in parallel, a GPU is the right choice.
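To make that concrete, here is a minimal sketch (our illustration, not an AMD example) of a workload decomposed into independent per-element computations, written with standard C++ parallel algorithms. The same decomposition is what lets a GPU fan the work out across thousands of stream processors; the weight-update rule and learning rate here are hypothetical.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

int main() {
    // Illustrative only: each element's update is independent of every other,
    // which is exactly the shape of work a GPU accelerates. On a CPU this
    // parallelizes across cores; on a GPU the same decomposition fans out
    // across thousands of stream processors.
    std::vector<float> weights(1'000'000, 0.5f);
    std::vector<float> grads(1'000'000, 0.01f);
    const float lr = 0.1f;  // hypothetical learning rate

    std::transform(std::execution::par_unseq,
                   weights.begin(), weights.end(), grads.begin(),
                   weights.begin(),
                   [lr](float w, float g) { return w - lr * g; });
    return 0;
}
```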
The Instinct MI300X and MI300A represent our third generation of compute GPUs. They feature a 3.5D chiplet design; our third-gen compute architecture, CDNA 3; fourth-gen EPYC CPU cores in the MI300A; and high-bandwidth memory. Even before the generative AI boom, the MI300 series accelerators were designed for the needs of HPC workloads and large data sets.
AI inferencing has lower compute demands than training. For it, we recommend our fourth-gen AMD EPYC processors in off-the-shelf servers. With up to 96 cores, high memory bandwidth and support for AI-specific data types, the EPYC Zen 4 series is a good fit for inference computing. We’re able to provide the GPU and the CPU you need, wherever the agency is in its use of AI.
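As a hedged illustration of CPU-side inferencing, the sketch below loads a hypothetical ONNX model with the ONNX Runtime C++ API and runs it entirely on the CPU. The model path, input shape and 96-thread setting are assumptions for the example, not an AMD-supplied configuration.

```cpp
#include <onnxruntime_cxx_api.h>
#include <vector>

int main() {
    // Minimal sketch: assumes a model at "model.onnx" (placeholder) with a
    // single float input of shape [1, 3, 224, 224].
    Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "cpu-inference");
    Ort::SessionOptions opts;
    opts.SetIntraOpNumThreads(96);  // assumption: match a 96-core EPYC part
    opts.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
    Ort::Session session(env, "model.onnx", opts);

    Ort::AllocatorWithDefaultOptions alloc;
    auto in_name  = session.GetInputNameAllocated(0, alloc);
    auto out_name = session.GetOutputNameAllocated(0, alloc);

    std::vector<int64_t> shape{1, 3, 224, 224};
    std::vector<float> input(1 * 3 * 224 * 224, 0.0f);  // placeholder data
    Ort::MemoryInfo mem =
        Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
    Ort::Value tensor = Ort::Value::CreateTensor<float>(
        mem, input.data(), input.size(), shape.data(), shape.size());

    const char* in_names[]  = {in_name.get()};
    const char* out_names[] = {out_name.get()};
    auto outputs = session.Run(Ort::RunOptions{nullptr},
                               in_names, &tensor, 1, out_names, 1);
    return 0;
}
```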
FEDTECH: Where should agencies look for additional AI adoption opportunities?
PINCHERA: The most recent Government Accountability Office figures show that 20 of 23 reporting agencies have about 1,200 current or planned AI use cases, so AI has taken root across much of government. That said, I often recommend that my federal customers look for “greenfield” opportunities: new deployments built on software that runs as is, without a rewrite of existing code.
Frameworks well suited to those workloads include PyTorch, TensorFlow, Open Neural Network Exchange (ONNX) and Google JAX. If the customer has legacy CUDA code, our ROCm software ecosystem has tools that translate the CUDA calls into portable calls that can be compiled for both AMD and NVIDIA GPUs.
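For example, ROCm’s HIPIFY tools (hipify-perl and hipify-clang) rewrite CUDA API calls into their HIP equivalents, and the resulting source compiles for AMD GPUs with hipcc or for NVIDIA GPUs against the CUDA toolkit. The sketch below shows what translated code looks like; the saxpy kernel and buffer names are illustrative, not from the interview.

```cpp
#include <hip/hip_runtime.h>
#include <vector>

// After translation, device kernels look exactly as they did in CUDA;
// only the host-side API names change (cudaMalloc -> hipMalloc, and so on).
__global__ void saxpy(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;
    std::vector<float> hx(n, 1.0f), hy(n, 2.0f);

    float *dx, *dy;
    hipMalloc(&dx, n * sizeof(float));  // was cudaMalloc
    hipMalloc(&dy, n * sizeof(float));
    hipMemcpy(dx, hx.data(), n * sizeof(float), hipMemcpyHostToDevice);  // was cudaMemcpy
    hipMemcpy(dy, hy.data(), n * sizeof(float), hipMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, dx, dy);  // same launch syntax as CUDA
    hipDeviceSynchronize();                            // was cudaDeviceSynchronize

    hipMemcpy(hy.data(), dy, n * sizeof(float), hipMemcpyDeviceToHost);
    hipFree(dx);
    hipFree(dy);
    return 0;
}
```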
FEDTECH: What steps can agencies take to upgrade their data centers to meet growing generative AI and HPC demand?
PINCHERA: Agencies are looking to replace their older infrastructure, but they typically are not able to redesign their data centers; they have to work with the space, design, power loads and cooling that are already in place. Agencies are very conscious of power and space efficiency.
They can replace their two-socket servers with single-socket EPYC servers without losing performance, while also reducing total cost of ownership. Both EPYC and Instinct have the cores and compute units to address extremely large data sets and build language models in a reasonable amount of time.
Another consideration has to be the software. Code portability is a growing need. Agencies are trying to avoid the nonrecurring engineering costs involved in building customized code for a single hardware platform.
Open-source software is becoming more attractive to agencies. Using it lets them move between hardware vendors as needed.
AMD has partnerships with independent software vendors. We consult with many of them on their code. The goal is to get their code to run optimally on our hardware, without having to do any custom building.
EPYC CPUs and Instinct accelerators are purpose-built to support open and portable software. And our ROCm GPU software stack allows customers to build open-source generative AI and HPC code that can run on the best hardware the agency can procure. In the long term, that’s a win for agencies.