Mar 13 2020

Predictive Analytics: What Is It and How Can It Help the Federal Government?

Powerful data analytics tools can help agencies save money and make more informed decisions.

The federal government sits on mounds of data and often organizes that information into data lakes to gain insights from it. Many agencies have also been busy deploying another powerful tool in their IT arsenals: predictive analytics.

Predictive analytics tools allow the government to get ahead of problems before they waste money, harm IT systems or cost lives. Such data analytics platforms can provide agency leaders, IT leaders and analysts with actionable insights they can use to enhance their missions, improve their cybersecurity, save money on maintenance costs and generally make more informed decisions.

Agencies can also take advantage of open data to glean insights for and from one another, or open up data to the public and give them the opportunity to do the same. 

“From spotting fraud to combatting the opioid epidemic, an ounce of prevention really is worth a pound of cure — especially in government,” Deloitte notes in a report on predictive analytics in government. “Predictive analytics is now being applied in a wide range of areas including defense, security, health care, and human services, among others.”

What Is Predictive Analytics?

For years, federal agencies employed traditional statistical analytics software (SAS) to build predictive models, but the workers who built those models were usually sequestered in back rooms without access to policymakers, notes Andrew Churchill, vice president of federal sales at analytics firm Qlik. “But now data science is in vogue and it’s the cool job,” he says.

The most basic way to understand predictive analytics is to ask, “How do I take what I can clearly see is happening and begin to, through trained models, describe what will happen based on the variables that we are feeding the machine?” Churchill says.

Mohan Rajagopalan, senior director of product management at Splunk, notes that predictive analytics involves the ability to aggregate data from a variety of sources and then predict future trends, behaviors and events based on that data. That can include identifying anomalies in data logs and predicting failures in data centers or machines on the agency’s network. It can also be used to forecast revenues, understand buying behaviors and predict demand for certain services.

“The outcome of predictive analytics is the prediction of future behaviors,” Rajagopalan says.

Adilson Jardim, area vice president for public sector sales engineering at Splunk, says that predictive analytics exists on a spectrum. On one end are basic statistical or mathematical models that can be used to predict trends, such as the average of a certain type of behavior. On the other end are more advanced forms of predictive analytics that involve the use of machine learning, in which data models are asked to infer different predictive capabilities, Jardim says.
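To make the simpler end of that spectrum concrete, here is a minimal sketch in Python. It is not drawn from Splunk or any other product; the function name, the threshold of three standard deviations and the login counts are all illustrative assumptions. It uses the average of past behavior as a baseline and flags new values that fall far from it.

```python
from statistics import mean, stdev

def flag_anomalies(history, recent, z_threshold=3.0):
    """Return the values in `recent` that lie more than `z_threshold`
    standard deviations away from the mean of `history`."""
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history gives no scale for "unusual",
        # so any departure from the baseline is flagged.
        return [v for v in recent if v != baseline]
    return [v for v in recent if abs(v - baseline) / spread > z_threshold]

# Hypothetical daily login counts for one account (made-up numbers).
history = [42, 39, 45, 41, 40, 44, 43, 38, 42, 41]
recent = [40, 43, 97]

print(flag_anomalies(history, recent))  # prints [97]
```

A machine learning approach would replace the fixed average and threshold with a model that learns what normal looks like from the data.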

Some customers are ingesting up to 5 petabytes of data per day, and that data can be used to understand not only what has happened but also what could happen or is likely to happen, he says.

Predictive analytics can be applied across “a broad range of data domains,” Churchill says. 


Defining the Predictive Analytics Process

There are numerous elements of the predictive analytics process, as Predictive Analytics Today notes. Here is a quick breakdown:

  • Define project: Agencies first must define the scope of the analysis and what they hope to get out of it.
  • Data collection: Getting the data itself and mining it can be a challenge, according to Rajagopalan. One of the big challenges federal agencies and other organizations face these days is the volume, variety and velocity of data. “A model in the absence of trustworthy, validated and available data doesn’t yield much of a result,” Churchill adds.
  • Data analysis: Another core element of the process involves algorithms that can inspect, clean, transform and analyze data to derive insights and make conclusions.
  • Statistics: Predictive analytics tools need to then use statistical analysis to validate the assumptions and hypotheses and run them through statistical models.
  • Modeling: Another key element is the modeling that is used to define how the data will be processed to automatically create accurate predictive models, Rajagopalan says. The algorithms can be as simple as rules that can be applied to understand a particular situation or understand data in the context of a particular scenario. There are also supervised algorithms and models that use machine learning techniques to build hypotheses around trends in the data and constantly refine themselves based on the data they are presented with.
  • Deployment: IT leaders then have the outputs of the model, such as a visualization, report or chart. The results of the predictive analysis are then given to decision-makers.
  • Model monitoring: The models are continuously monitored to ensure they are providing the results that are expected. 
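The steps above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the monthly service-request figures are invented, a straight-line fit stands in for whatever model an agency would actually choose, and the error limit of 10 requests is an arbitrary placeholder.

```python
def clean(rows):
    """Data analysis: drop records with missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def fit_line(rows):
    """Modeling: ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_abs_error(model, rows):
    """Statistics: average absolute gap between predictions and actuals."""
    slope, intercept = model
    return sum(abs(y - (slope * x + intercept)) for x, y in rows) / len(rows)

# Data collection: hypothetical (month, service requests) pairs, one with a gap.
raw = [(1, 110), (2, 118), (3, None), (4, 141),
       (5, 149), (6, 162), (7, 171), (8, 178)]

rows = clean(raw)
train, holdout = rows[:5], rows[5:]   # hold back recent months for validation

model = fit_line(train)
error = mean_abs_error(model, holdout)

# Deployment: report a forecast for a future month to decision-makers.
slope, intercept = model
print(f"Forecast for month 9: {slope * 9 + intercept:.0f} requests")

# Model monitoring: flag the model for review if its error drifts too high.
needs_review = error > 10
print(f"Holdout error: {error:.1f}, needs review: {needs_review}")
```

Holding back the most recent months lets the validation step check the model against data it has not seen, which is the same comparison the monitoring step repeats as new data arrives.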

Before, Rajagopalan says, agencies had specialized units to apply SAS, but those models were expensive to create. The democratization and consumerization of data and of analytics tools have made it easier to create simple and succinct summaries of data that visualize outputs.


What Is Open Data?

Joshua New, formerly a policy analyst at the Center for Data Innovation and now a technology policy executive at IBM, tells FedTech that open data is best thought of as “machine-readable information that is freely available online in a nonproprietary format and has an open license, so anyone can use it for commercial or other use without attribution.”

On May 9, 2013, former President Barack Obama signed an executive order that made open and machine-readable data the new default for government information.

“Making information about government operations more readily available and useful is also core to the promise of a more efficient and transparent government,” the Obama administration noted.

On Jan. 14, 2019, the OPEN Government Data Act, as part of the Foundations for Evidence-Based Policymaking Act, became law. The OPEN Government Data Act makes data.gov a requirement in statute, rather than a policy. It requires agencies to publish their information online as open data, using standardized, machine-readable data formats, with their metadata included in the data.gov catalog. May 2019 marked the 10th anniversary of data.gov, the federal government’s open data site.

The General Services Administration launched the site with a modest 47 data sets, but the site has grown to over 200,000 data sets from hundreds of data sources including federal agencies, states, counties and cities. “Data.gov provides easy access to government datasets covering a wide range of topics — everything from weather, demographics, health, education, housing, and agriculture,” according to data.gov.


What Are Examples of Open Data?

There are numerous federal open data programs. FarmPlenty helps farmers better analyze Agriculture Department open data on crops grown within a 5-mile radius of their farms. The application that supports the program was built as part of the USDA-Microsoft Innovation Challenge and is supported by the USDA’s National Agricultural Statistics Service CropScape and Quickstats application programming interfaces.

Where are the Jobs? is an app that uses data from the Census Bureau and the Bureau of Labor Statistics to allow users to interactively explore the salary and job statistics for various occupations at national, state and regional levels. Home Energy Saver is an interactive consumer application used to estimate residential energy use and plan home energy efficiency upgrades that uses open data from the Energy Information Administration.

Predictive Analytics Examples in Government

Federal agencies are using predictive analytics for a wide range of use cases, including cybersecurity. Specifically, agencies are using these tools to predict insider threats, Splunk’s Jardim says. The models look at users’ backgrounds, where they have worked, how often they have logged in to networks at certain times and whether that behavior actually is anomalous. The goal of such tools is to make a good prediction of whether the security events should be tracked by human analysts, Jardim says.

“You only want to surface the events that are very clear insider threats,” he says. “The analyst is focused on high-probability events, not low-probability events.”
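The triage idea Jardim describes can be illustrated with a short Python sketch. Everything here is hypothetical: the `ScoredEvent` fields, the 0.9 cutoff and the sample events are assumptions for illustration, and the probabilities are presumed to come from some upstream model that is not shown.

```python
from dataclasses import dataclass

@dataclass
class ScoredEvent:
    user: str
    description: str
    probability: float  # upstream model's estimate that the event is a threat

def surface_for_analysts(events, min_probability=0.9):
    """Keep only high-probability events, most likely threats first."""
    high = [e for e in events if e.probability >= min_probability]
    return sorted(high, key=lambda e: e.probability, reverse=True)

# Made-up events with made-up scores.
events = [
    ScoredEvent("user_b", "login from a new workstation", 0.35),
    ScoredEvent("user_c", "repeated access to a restricted share", 0.92),
    ScoredEvent("user_a", "bulk download outside normal hours", 0.97),
]

for event in surface_for_analysts(events):
    print(f"{event.probability:.2f}  {event.user}: {event.description}")
```

Raising the cutoff shortens the analysts' queue at the cost of missing more borderline cases, so the right value depends on how much review capacity a team has.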

Predictive analytics can also be used for agencies’ data center maintenance by applying algorithms that look at compute capacity, the number of users accessing services and throughput for mission-critical applications, Jardim says. Such tools can predict when a particular server will become overloaded and can help agencies preempt those events to ensure users have access to vital applications.
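One simple way to estimate when a server might become overloaded is to fit a trend line to its recent utilization and see where that line crosses a limit. The Python sketch below assumes one reading per day, a 90 percent threshold and steady linear growth; the function name and the sample readings are invented, and production tools would typically account for seasonality and more variables.

```python
def days_until_threshold(utilization, threshold=90.0):
    """Fit a straight line to daily utilization readings (percent) and
    estimate how many days after the last reading the trend reaches
    `threshold`. Needs at least two readings. Returns None when the
    trend is flat or falling."""
    n = len(utilization)
    days = range(n)
    mean_x = sum(days) / n
    mean_y = sum(utilization) / n
    sxx = sum((d - mean_x) ** 2 for d in days)
    sxy = sum((d - mean_x) * (u - mean_y) for d, u in zip(days, utilization))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_day = (threshold - intercept) / slope
    # Report 0 if the trend line has already reached the threshold.
    return max(0.0, crossing_day - (n - 1))

# Hypothetical CPU utilization for one server over the past week.
cpu = [52.0, 54.0, 56.0, 58.0, 60.0, 62.0, 64.0]

print(days_until_threshold(cpu))  # prints 13.0
```

With readings that climb two points a day from 64 percent, the trend reaches 90 percent in 13 days, which gives operators a window to add capacity or shift load.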

The Defense Department can also use predictive analytics to ensure that soldiers have enough of the right munitions and supplies in particular theaters of war and enough support logistics. “Logistics and operational maintenance take on a life-or-death consequence if I cannot ship enough munitions or vehicles into a specific theater,” Jardim says.

Qlik’s Churchill says that a customer within the Army is using predictive analytics tools to build models that support force enablement. The models predict which capabilities will be needed in the future, which will be diminished and which will be required if certain scenarios arise.

The Pentagon is also working on predictive analytics tools for financial management via the Advanta workflow tool, which has brought together roughly 200 of the DOD’s enterprise business systems, Churchill says.

“How can they use predictive models to understand the propensity to have to de-obligate funds from a particular program in the future?” Churchill says. “As I am evaluating the formulation and execution of budgets, technologies like this have the ability to help those decision-makers identify the low-hanging fruit. How do I put those insights in front of people that they wouldn’t have gotten before?”

Predictive maintenance is also a key use case, especially for vehicles and other heavy equipment. Models can ingest data such as the weather and operating conditions of vehicles, not just how many hours they have been running, to determine when they will break down, Churchill says.
