May 12, 2021

New Data Analysis Tools Help Agencies Move Decisively During Crises

Increasingly sophisticated technology provides government agencies with additional avenues for understanding information.

As COVID-19 took hold in the U.S. in March 2020, federal agencies quickly shifted to remote work. The U.S. Food and Drug Administration, for instance, put many in-person inspections on hold and used remote assessments when possible.

But mission-critical inspections — particularly those pertaining to the virus, such as inspections of hand sanitizer facilities — couldn’t wait for the pandemic to subside.

Combining resources from within the FDA as well as the White House, the Centers for Disease Control and Prevention and the Department of Health and Human Services, the FDA analyzed county-by-county virus case-load data, guidelines on reopening businesses and algorithms on keeping workers safe.

With that information, the agency was able to create an advisory matrix map that field officers could use to prioritize in-person inspections. The matrix, which could have taken months or even years to develop, was available within six weeks.
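The article does not describe the matrix's internals, but the general idea — combining county case-load data with inspection urgency to rank field work — can be sketched as follows. The tier thresholds, field names, and sort rule here are invented for illustration; they are not the FDA's actual advisory matrix.

```python
# Illustrative sketch only -- not the FDA's actual advisory matrix.
# Buckets a county's case rate into an advisory tier, then ranks
# pending inspections so mission-critical work in lower-risk
# counties comes first.

def advisory_level(cases_per_100k: float) -> str:
    """Map a county's case rate to a hypothetical advisory tier."""
    if cases_per_100k < 10:
        return "low"
    if cases_per_100k < 50:
        return "moderate"
    return "high"

def prioritize(inspections: list[dict]) -> list[dict]:
    """Mission-critical inspections first, then lower case-load counties."""
    tier_rank = {"low": 0, "moderate": 1, "high": 2}
    return sorted(
        inspections,
        key=lambda i: (
            not i["mission_critical"],
            tier_rank[advisory_level(i["cases_per_100k"])],
        ),
    )

queue = prioritize([
    {"site": "A", "mission_critical": False, "cases_per_100k": 5},
    {"site": "B", "mission_critical": True, "cases_per_100k": 80},
])
print([i["site"] for i in queue])  # mission-critical site B is ranked first
```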

It’s one example of a portfolio of data analysis “driver projects” at the heart of the FDA’s Data Modernization Action Plan (DMAP), announced March 3, 2021. They are initiatives that can be rolled out quickly and that provide tangible value across the agency — and in some cases, outside of it.

“We’ve been looking for projects that build our data skills and data muscles so that we can improve the overall capability in the organization,” says FDA Chief Data Officer Ram Iyer.

The FDA is in good company. As agencies collect ever-growing stores of data, many are turning to the increasingly sophisticated tools available to help them cull insights from those resources.

These data analysis projects — which often utilize new technologies and adapt existing capabilities — have been particularly useful in helping agencies move quickly during difficult times, such as the COVID-19 crisis and the growing cybersecurity threat landscape.

“The pandemic showed us the need to improve our data capabilities,” says Iyer. “While we are very, very good in-pocket, I think we have an opportunity to do a better job across centers and use it for complex problem sets. It also showed the need for the scaling of the agency so that we’re not blindsided by a new process that takes us away from our assigned work.”

More Data Means More Capability for Agencies

Data initiatives such as those at the FDA and Sandia National Laboratories, one of three U.S. nuclear weapons stockpile maintenance agencies, are built on needs as well as improved capabilities.

“We are producing so much data,” says Vince Urias, cybersecurity research scientist and distinguished member of the technical staff at Sandia. “The advent of cheaper storage and better indexing, and the performance of Splunk and other emerging tools, have led to our ability to ask questions about fairly large data sets very quickly, which allows us to store and pivot off of data at a rate that I don’t think has been possible in the last 10 years.”

Business analytics, machine learning tools and artificial intelligence are not new, points out Laura DiDio, principal of ITIC, a technology research firm. Organizations have been using them for years to collect data. But only 30 to 40 percent of them were actually analyzing that data to make it actionable.

That’s changed under COVID. “Now there’s a compelling need to not only collect the data but also use it to streamline operations and take meaningful action right away,” DiDio says.


Data Drives Action at the FDA

DMAP, which builds upon FDA’s 2019 Technology Modernization Action Plan, was announced this year, but many driver projects were already in place.

For instance, in spring 2019, the FDA began piloting an artificial intelligence capability to screen imported seafood. The initiative uses machine learning to analyze decades of data about past shipments so that it can more quickly identify which shipments can move faster through clearance and which should receive extra scrutiny.
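The FDA has not published the model behind this screening capability, but a minimal sketch of risk-based routing conveys the idea. Here a smoothed historical violation rate per (product, origin) pair stands in for the actual machine-learning model; the data, threshold, and field names are assumptions for illustration.

```python
# A minimal sketch of risk-based import screening. A simple historical
# violation rate stands in for the FDA's real model, whose details are
# not public; all data and thresholds here are invented.
from collections import defaultdict

history = [  # (product, origin, violation_found) from past shipments
    ("shrimp", "X", True), ("shrimp", "X", False),
    ("tuna", "Y", False), ("tuna", "Y", False),
]

stats = defaultdict(lambda: [0, 0])  # key -> [violations, total shipments]
for product, origin, violated in history:
    record = stats[(product, origin)]
    record[0] += int(violated)
    record[1] += 1

def risk_score(product: str, origin: str) -> float:
    """Smoothed violation rate, so unseen pairs get a neutral prior."""
    v, n = stats[(product, origin)]
    return (v + 1) / (n + 2)  # Laplace smoothing

def route(product: str, origin: str, threshold: float = 0.4) -> str:
    """High-risk shipments get extra scrutiny; the rest clear faster."""
    return "extra scrutiny" if risk_score(product, origin) >= threshold else "fast-track"

print(route("shrimp", "X"))  # past violations push this over the threshold
print(route("tuna", "Y"))    # clean history clears quickly
```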

6,073

The average number of daily reports made to the FDA Adverse Event Reporting System in 2020

Source: FDA Adverse Event Reporting System Public Dashboard

That project addresses the challenge at hand — ensuring food safety — but it can also help the FDA accelerate the use of artificial intelligence for future projects at the agency, Iyer explains.

DMAP does not depend on one technology. “We need the entire stack, from collecting the data to cleaning it, making the data available for analysis, and then visualizing, modeling and reproducing the results,” says Iyer. “And with the emergence of AI, we are now talking about managing the algorithms that are built on top of the data.”

In addition to AI, the FDA is looking at the growing use of wearable devices to collect and analyze data. DMAP will also encompass many of the agency’s existing resources, such as the FDA Adverse Event Reporting System Public Dashboard, which uses the Qlik data analytics platform to track unexpected physical reactions to medications and vaccines.

“Our job is not to create yet another new set of tools, but to understand what is going on, pick the one that is working really well and then apply that progress elsewhere,” says Iyer. 

“When people collaborate, our resources are elastic, and we can be creative in how we use our data.”


Luring Cyberattackers In to Monitor Them

Sandia National Laboratories’ data initiative began with a browser exploit. From there, the hacker pivoted to a process, elevated privileges, grabbed credentials, then used them to move within the network and look for financial information.

What the hacker didn’t know was that the team at Sandia had created the deceptive environment to lure adversaries and study and learn from their actions and movements.

“We have a variety of ways of getting adversaries into that environment, and we’re able then to watch them transparently and allow them to move freely,” explains Urias. “But as they do that, we collect data about how they’re pivoting.”

How does the hacker verify what he’s looking for? How does he exfiltrate data out of the network? “All these things are now pivotal points for active threat intelligence,” says Urias. His agency can now build proactive sensors that detect adversaries and protect the network.
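The telemetry a deception environment records as an intruder pivots might look something like the sketch below. The event names, fields, and actor labels are invented for illustration; they are not Sandia’s actual schema.

```python
# A hypothetical sketch of deception-environment telemetry: each
# attacker action is logged, and the ordered chain of actions becomes
# threat intelligence. Event names are invented, not HADES internals.
import time

class DeceptionLog:
    def __init__(self):
        self.events = []

    def record(self, actor: str, action: str, target: str) -> None:
        """Append one observed attacker action with a timestamp."""
        self.events.append({
            "ts": time.time(),
            "actor": actor,
            "action": action,
            "target": target,
        })

    def pivot_chain(self, actor: str) -> list[str]:
        """The attacker's observed tradecraft, in order."""
        return [e["action"] for e in self.events if e["actor"] == actor]

log = DeceptionLog()
log.record("intruder-1", "browser_exploit", "workstation-7")
log.record("intruder-1", "privilege_escalation", "workstation-7")
log.record("intruder-1", "credential_theft", "workstation-7")
log.record("intruder-1", "lateral_movement", "file-server-2")
print(log.pivot_chain("intruder-1"))
```

A chain like this is what makes the intelligence “active”: each observed step can seed a detection rule for production networks.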


Sandia’s High-Fidelity Adaptive Deception and Emulation System (HADES), the basis for the deceptive research environments, feeds the data it collects into Splunk’s data analysis platform. It was first deployed in 2016 with the goal of changing the conversation with the adversary.


Sandia Moves to Predictive Analytics on Security Threats 

“What we had done in the past is, if there was a vulnerability or an exploit found in a system, we would pull the plug and do forensics,” says Urias. 

With HADES, they can collect data about every process that’s created and every registry key, transaction or file that’s opened, closed or moved.

“All that gets fused with the networking information that we’re collecting, so we’re able to do deep packet inspection on every bit that moves across the network,” says Urias. “We can turn it into defensible indicators.”
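Fusing host telemetry with network capture into “defensible indicators” can be sketched as a join on the process that generated the traffic. The correlation rule, field names, and threshold below are assumptions for illustration, not HADES internals.

```python
# A rough sketch of host/network event fusion: join host process
# records with network flow records on (host, pid), then flag large
# outbound transfers as a possible-exfiltration indicator. All field
# names and the byte threshold are invented for illustration.

host_events = [
    {"host": "ws-7", "pid": 4242, "event": "process_created", "image": "dump.exe"},
]
net_events = [
    {"host": "ws-7", "pid": 4242, "dst": "198.51.100.9", "bytes_out": 10_485_760},
]

def fuse(host_events: list[dict], net_events: list[dict],
         exfil_bytes: int = 1_000_000) -> list[dict]:
    """Correlate host and network records; emit indicators for big outbound flows."""
    indicators = []
    for h in host_events:
        for n in net_events:
            same_process = (h["host"], h["pid"]) == (n["host"], n["pid"])
            if same_process and n["bytes_out"] > exfil_bytes:
                indicators.append({
                    "indicator": "possible_exfiltration",
                    "process": h["image"],
                    "dst": n["dst"],
                })
    return indicators

print(fuse(host_events, net_events))
```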

HADES has evolved into providing more predictive analysis. “We spend a lot of time defending internally,” he says. “As we look more broadly, we can start stitching together events and context to better predict and block things that may be emerging.”

illustration by Jacey