The AI Use That’s Producing Results for Federal Agencies
The Office of Management and Budget’s 2025 AI Use Case Inventory documented 3,611 use cases across 56 agencies — more than doubling the prior year’s total. Meanwhile, a Government Accountability Office report found that generative AI use across 11 agencies increased ninefold between 2023 and 2024.
GAO found 126 active AI use cases at the IRS as of last summer, including voice bots answering routine taxpayer questions, machine learning models prioritizing which returns to audit and AI tools translating code written in a 60-year-old programming language to accelerate IT modernization. But that same GAO report warned that staff reductions cost the agency 63 employees who had been working on AI, undermining its capacity to build and deploy the very tools it needs most.
At the Centers for Medicare & Medicaid Services, the math is stark. CMS processes 4 million to 5 million claims per day with roughly 500 fraud investigators. The agency now runs about 250 AI models daily to flag suspicious billing patterns.
“My team uses AI to comb through those claims and figure out where the risk is the greatest,” Jeneen Iwugo, acting director of CMS’ Center for Program Integrity, said at the FedScoop UiPath Public Sector Summit early this month. The payoff: $14 returned for every $1 spent on fraud prevention in FY2024, and nearly $2 billion saved in the fraud war room’s first year. “The longer leash I get, the more I’m able to push the needle with using AI,” Iwugo said.
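To put the CMS workload in perspective, the reported figures can be turned into a rough per-investigator load. The sketch below is illustrative only: it assumes claims would be split evenly across investigators with no automated triage, which is not how CMS actually assigns work, and the function and variable names are hypothetical.

```python
# Figures as reported in the article. The even split across investigators
# is an illustrative assumption, not a description of CMS practice.
CLAIMS_PER_DAY_LOW = 4_000_000
CLAIMS_PER_DAY_HIGH = 5_000_000
INVESTIGATORS = 500  # the article says "roughly 500"


def claims_per_investigator(claims_per_day: int, investigators: int) -> float:
    """Daily claims each investigator would face without automated triage."""
    return claims_per_day / investigators


low = claims_per_investigator(CLAIMS_PER_DAY_LOW, INVESTIGATORS)
high = claims_per_investigator(CLAIMS_PER_DAY_HIGH, INVESTIGATORS)
print(f"{low:,.0f} to {high:,.0f} claims per investigator per day")
```

Under that assumption, each investigator would face 8,000 to 10,000 claims a day, which is why flagging only the highest-risk claims for human review matters.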
At the Department of Veterans Affairs, AI spans 367 use cases. VA GPT, the department’s internal generative AI tool, has over 95,000 users who report saving two to three hours per week. The Automated Decision Support system helped cut average claims processing time from 141 days to 81 — a 42% reduction — and the claims backlog dropped below 100,000 for the first time since 2020.
Those gains, however, come against a backdrop of strain. For every agency showing measurable returns, others are struggling to balance AI’s efficiency with its propensity for errors.
Accuracy Issues Persist for Some Agencies Using AI
The results of agencies’ AI use are real but not universal. An IBM survey of 2,000 CEOs found only 25% of AI initiatives delivered expected ROI. Federal adoption is similarly lopsided: Larger agencies such as the Department of Health and Human Services, VA, NASA and the Department of Justice account for the bulk of deployed use cases, while smaller agencies remain in pilot mode.
And at VA, speed has drawn scrutiny.
The agency reported 94% issue-level accuracy, but only 83% claims-based accuracy. Ryan Gallucci, executive director of the Veterans of Foreign Wars, testified during an April House Veterans’ Affairs hearing that “quality must be treated more substantially than speed.”
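One way to see how a claim-level rate can sit well below an issue-level rate is that a single claim can contain several issues, and one error on any of them can make the whole claim count as inaccurate. The sketch below is a hypothetical illustration, not VA's methodology: it assumes every issue is decided independently at the same accuracy and that a claim is correct only if all of its issues are. The number of issues per claim is an assumed input.

```python
def claim_level_accuracy(issue_accuracy: float, issues_per_claim: int) -> float:
    """Share of claims with every issue correct.

    Assumes issues are independent and equally accurate, which is a
    simplification made for illustration only.
    """
    return issue_accuracy ** issues_per_claim


# 0.94 is the issue-level rate from the article; the issue counts are assumed.
for issues in (1, 2, 3, 4):
    rate = claim_level_accuracy(0.94, issues)
    print(f"{issues} issue(s) per claim: {rate:.1%} of claims fully correct")
```

Under these simplified assumptions, the claim-level rate falls as the number of issues per claim rises, so the gap between the two reported figures does not by itself imply an inconsistency.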
Why Data Readiness Determines Federal AI Success
The agencies seeing results share a key trait: They did the unglamorous work before deploying anything. CMS built a “fraud war room” that paired AI outputs with human judgment — legal counsel, the Office of Inspector General and investigators spending hours on the highest-risk cases.
The IRS invested in AI-powered code translation to modernize legacy systems before layering on new capabilities.
The use cases from the IRS and CMS show that AI has the potential to help agencies fill gaps left by departed workers, but only if the foundation is there, if humans stay in the loop and if speed doesn’t become the only metric that matters.
