Feb 02 2024
Software

NVIDIA Supports Federal Data Scientists and Developers with AI Training Models

Moving forward with AI software means meeting the requirements of the applications being developed and the infrastructure they run on.

The Department of Defense and other agencies want to incorporate graphics processing units as they add capacity to their IT infrastructure to support artificial intelligence applications.

Agencies have turned to accelerated computing to enable virtualization and production-grade AI software.

Furthermore, the government is no stranger to high-performance computing environments. For decades, they’ve been a mainstay at the Department of Energy’s National Renewable Energy Laboratory, Oak Ridge National Laboratory and Lawrence Livermore National Laboratory, supporting large-scale research efforts such as mapping climate change and developing COVID-19 vaccines.

“The strategy has evolved from releasing software to enable GPU virtualization to building enterprise platforms for AI workloads backed by reference architectures with our partners,” says Konstantin Cvetanov, senior solution architect for enterprise AI and machine learning software and services at NVIDIA. “Our software strategy continues to evolve as adoption of accelerated computing grows exponentially year after year.”

Meeting the Requirements of Data Scientists and AI Developers

For NVIDIA, a key component of supporting agencies amid this strategic shift is satisfying the requirements of data scientists on one hand and machine learning operations (MLOps) and DevSecOps teams on the other. Their needs may differ, but it’s important that they can coexist.

“IT departments need to provide the platform for end users to do their work,” Cvetanov says.

A common example is building a new AI application, whether for process automation or data analysis.

“The IT team has mature resource management tools, but part of the challenge is they’ve never had to manage an AI cluster,” Cvetanov says. “AI workloads have a very specific set of requirements for performance and latency that are oftentimes very different from traditional enterprise applications.”

In these scenarios, the ideal approach is to consider the workload in the context of the agency’s existing tools and processes.

“We want to make the learning curve less steep. We don’t want our customers to have to onboard a lot of new tools if they can get away with using the ones they are already trained on,” Cvetanov says. “For instance, if an AI workflow is tested and proven to work well in a virtual environment without significant performance degradation, then they can use their existing tooling to manage the GPU environment, as we have full integration with platforms like VMware.”
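
For teams weighing that approach, a quick way to validate a virtualized GPU environment is to confirm the device is visible to the guest and time a representative workload against a bare-metal baseline. The sketch below is a minimal illustration using PyTorch (an assumption; it is not an NVIDIA or VMware utility), and the matrix size is arbitrary.

```python
# Minimal sanity check for a virtualized GPU environment (illustrative sketch,
# not an official NVIDIA or VMware tool): confirm the vGPU is exposed to the
# guest and time a representative workload for comparison with bare metal.
import time

import torch


def check_gpu(matrix_size: int = 4096, iterations: int = 20) -> None:
    if not torch.cuda.is_available():
        raise RuntimeError("No CUDA device visible; check the vGPU and driver setup")

    device = torch.device("cuda:0")
    print(f"Visible GPU: {torch.cuda.get_device_name(device)}")

    # Warm up, then time repeated matrix multiplications as a rough throughput proxy.
    a = torch.randn(matrix_size, matrix_size, device=device)
    b = torch.randn(matrix_size, matrix_size, device=device)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(iterations):
        a @ b
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    print(f"{iterations} matmuls of size {matrix_size}: {elapsed:.2f} seconds")


if __name__ == "__main__":
    check_gpu()
```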

Prebuilt AI Training Models Are the Way to Go

NVIDIA launched NVIDIA AI Enterprise in 2021 with the goal of making the NVIDIA AI stack more accessible to public and private sector entities. Functioning as the software layer of the NVIDIA AI platform, it comes preloaded with frameworks for developing, validating and deploying ML models.

Additionally, there are prebuilt workflows for tasks such as audio transcription, next-best action recommendation and cybersecurity threat detection.

A particular benefit is the ability of NVIDIA AI Enterprise to work with pretrained AI models. This includes models developed by NVIDIA as well as third-party models approved for use within the federal ecosystem.

DISCOVER: As AI evolves, so does data poisoning.

NVIDIA AI Enterprise also comes with toolsets to help fine-tune a third-party model to run within its environment.
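
The article does not detail those toolsets, but the general fine-tuning pattern is straightforward: start from a pretrained backbone and retrain only a small task-specific head on the agency’s own data. The sketch below uses PyTorch and a torchvision ResNet purely for illustration; it is not NVIDIA AI Enterprise’s specific tooling, and the number of classes is a placeholder.

```python
# Illustrative fine-tuning pattern (not NVIDIA AI Enterprise's own tooling):
# reuse a pretrained backbone, freeze it, and train only a new classification
# head on the agency's labeled data.
import torch
from torch import nn
from torchvision import models

NUM_CLASSES = 5  # hypothetical number of categories in the agency's dataset

# Load a backbone pretrained on ImageNet instead of training from scratch.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze the pretrained weights; only the new head will be updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer with one sized for the new task.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()


def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """Run one fine-tuning step on a batch from the agency's data loader."""
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because only the small head is trained, the data and compute requirements drop sharply compared with training the full network, which is the point Cvetanov makes next.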

“Training a model from scratch is one of the most capital- and labor-intensive processes in the entire AI development cycle,” Cvetanov says.

The process requires large-scale compute infrastructure and high-quality datasets, coupled with the expertise required to train, optimize and deploy the model in production.

“Having a prebuilt model takes you right to being able to run an AI model in production,” Cvetanov says. “If we can get you 50 percent of the way into the AI project, then it’s more likely to succeed.”

Closing the AI Talent Gap with Cutting-Edge Tools

Prebuilt AI models offer more than technical and operational advantages.

“If federal agencies are tasked with being competitive, with doing things like training large language models, then they need top-notch talent,” Cvetanov says. “We’re helping them do that by giving them tools that are being widely adopted but haven’t yet trickled down to the government sector.”

For example, NVIDIA DeepStream is an AI toolkit for analyzing streaming video data. Retailers may use video data to look for patterns in self-checkout use, or manufacturers may watch for defective products or malfunctioning machines.
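
DeepStream builds optimized streaming pipelines, but the underlying pattern it accelerates, pulling frames from a video source and running an inference model on each one, can be sketched with generic tools. The example below uses OpenCV and a pretrained torchvision detector rather than the DeepStream SDK itself; the video source and confidence threshold are placeholders.

```python
# Generic frame-by-frame video analytics sketch using OpenCV and torchvision.
# This is not the DeepStream SDK; it only illustrates the kind of streaming
# inference workload DeepStream is designed to run at scale on GPUs.
import cv2
import torch
from torchvision import models
from torchvision.transforms.functional import to_tensor

VIDEO_SOURCE = "checkout_camera.mp4"  # hypothetical file path or RTSP URL
SCORE_THRESHOLD = 0.8

model = models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

capture = cv2.VideoCapture(VIDEO_SOURCE)
while capture.isOpened():
    ok, frame = capture.read()
    if not ok:
        break

    # OpenCV yields BGR frames; the detector expects RGB tensors scaled to [0, 1].
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    with torch.no_grad():
        detections = model([to_tensor(rgb)])[0]

    # Keep only confident detections; downstream logic would look for patterns
    # such as self-checkout anomalies or defective parts on a production line.
    confident = detections["scores"] > SCORE_THRESHOLD
    print(f"{int(confident.sum())} objects detected in this frame")

capture.release()
```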

MORE FROM FEDTECH: Agencies shouldn’t neglect AI governance.

Video is critical to the work of many agencies, from the National Park Service’s bear cam to the National Oceanic and Atmospheric Administration’s weather-tracking drones.

As data scientists get their AI tools, it’s important to make sure IT teams aren’t left behind. This is especially true of agencies that adopt containers for the portability and flexibility they bring to AI applications, then soon face the challenge of large-scale automation and container orchestration.
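
Once containerized AI workloads outgrow a single host, an orchestrator such as Kubernetes typically takes over scheduling, with GPUs requested as an extended resource. The sketch below uses the official kubernetes Python client to launch a hypothetical GPU pod; the image name and namespace are placeholders, and it assumes the NVIDIA device plugin is already installed on the cluster.

```python
# Sketch: scheduling a containerized GPU workload with the Kubernetes Python
# client. Assumes the cluster runs the NVIDIA device plugin so that
# "nvidia.com/gpu" is advertised as a schedulable resource.
from kubernetes import client, config


def launch_gpu_pod(image: str = "registry.example.gov/agency/ai-app:latest") -> None:
    # The image name above is hypothetical; point this at the agency's registry.
    config.load_kube_config()  # or load_incluster_config() when running in-cluster

    container = client.V1Container(
        name="ai-workload",
        image=image,
        resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
    )
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="ai-workload"),
        spec=client.V1PodSpec(containers=[container], restart_policy="Never"),
    )
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)


if __name__ == "__main__":
    launch_gpu_pod()
```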

“It’s not just about software that has enterprise-grade capabilities,” Cvetanov says. “Enterprise-level technical support with guaranteed service-level agreements is equally as important to minimize downtime and accommodate bug fixes, a benefit that should not be overlooked.”

Here, NVIDIA aims for support that helps agencies manage clusters in the context of AI app development.

“We try to cover the end-user stakeholder and the IT stakeholder,” he says. “We want it to feel like they have an extension of their own workforce that they can lean on.”
