Oct 07 2024
Cloud

Agencies Save the Day with Effective Backup and Recovery Solutions

When something goes wrong, a strong disaster mitigation strategy can make it right again.

In April 2022, U.S. Navy IT leaders received notification from a deployed aircraft carrier that the ship’s entire unclassified network was inoperable.

“The impact is tremendous,” says Jeff Myers, deputy program manager for PMW 160, the Navy’s tactical networking program office.

The office oversees the Consolidated Afloat Networks and Enterprise Services program, or CANES, which provides a tactical edge network for all ships and submarines in the Navy. Although the network does not govern navigation or warfighting applications, an outage can bring huge portions of a ship’s operations to a standstill.

“All email, all chat, all sailor learning and training, all teleconference capabilities were unavailable,” Myers says. “This was a very severe loss of network capacity.”


The outage, caused by a hardware failure, took days of intensive effort to correct. But officials knew they had to find the solutions and systems they needed to restore operations and data.

Incidents such as this illustrate the critical nature of backup and recovery, especially for agencies that store huge amounts of mission-critical and sensitive data. Organizations use a range of tools; for example, the Treasury Department’s Financial Crimes Enforcement Network uses Commvault, the National Institute of Allergy and Infectious Diseases within the National Institutes of Health relies on AvePoint, and the Navy uses Veeam to back up data and systems on CANES. But just as important as any of these solutions are the people and processes behind them, says Nicole Burdette, principal at MeriTalk.

“It’s important to think about backup and recovery in the context of an overall data strategy,” Burdette says. “Not all data or applications are equally important. Knowing what’s mission critical, where data lives, how much downtime a mission can tolerate — that’s all part of backup and recovery planning.”


Strategizing for Backup and Recovery of CANES

CANES is effectively a LAN data center at the tactical edge, with more than 133 installations afloat, each aboard a Navy ship or submarine. Myers notes that the network features nearly 200 commercial, off-the-shelf solutions, with Veeam used to create backups on an on-premises appliance.

“CANES has to be available 24/7 for every level of mission, from daily peacetime activities all the way up through conflict,” Myers says. “Everyone from the junior-most sailor up to the senior-most admiral needs to use our network.”

As Navy IT leaders strategize for backup and recovery, Myers says, they are planning to protect CANES not only from hardware outages but also from cyberattacks. He also notes that the sailors maintaining networking equipment on individual ships are relatively young and inexperienced, making it important for backup and recovery processes to be as straightforward as possible.

“The average age of the sailors who are operating our network is 20 to 22 years old,” Myers says. “They come into the Navy with a high school education, maybe a little bit of college, and so they have to go through the training process that we’ve established for them to not just operate a very complex data center, but also to repair it and maintain it. When a ship is tied up to a pier, it is much easier for us to send our subject matter experts out to assist. When they’re underway, it’s a lot harder, and those sailors need to know how to take action mostly independently.”

66%

The percentage of federal IT decision-makers who are concerned that their organization’s data infrastructure may not be resilient enough to recover all of its data

Source: meritalk.com, “The Federal Data Maturity Report,” May 2024

It took about two weeks to fully restore the ship’s network in the 2022 outage, but ultimately, all of the data up to the time of the hardware failure was restored.

“We have to know that we can always either roll back or very quickly recover the data that was already in our applications,” Myers says. “It’s a key function, and one that we absolutely expect our network to be able to perform.”

How the Census Bureau Restored 81 Million Files Quickly

The U.S. Census Bureau maintains more than 90 petabytes of backup data, spread among on-premises and public cloud environments.

“Protecting the data that the Census Bureau collects and uses to report on our nation’s people and economy is critical to its mission,” says Jan Dickerson, chief of the enterprise backup branch in the agency’s Computer Services Division. “Backup is the last resort for getting data back, so it has to be right. Whether we are protecting from human error, faulty software releases, hardware issues, malware or cyberattacks, or natural disasters — all are important.”

Dickerson says the Census Bureau has a “central, well-established” backup and recovery environment that uses a combination of cloud-native features and commercial, off-the-shelf software to protect data in both the agency’s own data centers and its resources in the public cloud. “This approach gives us operational consistency wherever our data resides, and it also allows us to recover our data on-premises to the cloud, or vice versa, or between cloud providers, or between on-premises sites,” she says.

The bureau, Dickerson says, seeks out backup and recovery tools with features including deduplication (which helps optimize the use of disk storage), auto image replication (which creates identical copies across multiple sites to aid with disaster recovery) and immutable backups (which remove cyberattackers’ ability to seize control over both production and backup environments simultaneously).
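To make the deduplication idea concrete, here is a minimal sketch of a content-addressed backup store in Python. It is an illustration only and does not describe how the Census Bureau's tools work: the `DedupStore` class, its method names, the fixed-size chunking and the tiny chunk size are all assumptions chosen for readability. Each chunk is keyed by its SHA-256 digest, so a chunk that appears in several backups is written to storage only once.

```python
import hashlib


class DedupStore:
    """Hypothetical toy backup store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4):
        # Assumption: fixed-size chunks. Real products often use other schemes.
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes
        self.files = {}   # file name -> ordered list of chunk digests

    def backup(self, name, data):
        """Split data into chunks and store only chunks not seen before."""
        digests = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            digests.append(digest)
        self.files[name] = digests

    def restore(self, name):
        """Rebuild a file by concatenating its chunks in order."""
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        """Bytes actually held on 'disk' after deduplication."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore(chunk_size=4)
store.backup("a.txt", b"AAAABBBBCCCC")
store.backup("b.txt", b"AAAABBBBDDDD")  # shares two chunks with a.txt

logical_bytes = 24  # total size of both files as submitted
print(store.stored_bytes(), "bytes stored for", logical_bytes, "bytes backed up")
```

The two files share their first eight bytes, so the store holds four unique chunks rather than six, and either file can still be restored in full.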

While the agency periodically tests its entire backup environment and processes, Dickerson says, her team fields daily requests from different Census Bureau program areas to restore everything from single files to entire systems. “It could be for any number of reasons,” she says. “A patch that needs to be rolled back or something deleted by accident.”

Over the past year, Dickerson says, the agency successfully restored 81 million files. “That is a good indicator that our backup environment is working end to end,” she says. “Those restore requests are our main priority. We get them done quickly. We want to make sure our users know we’re there for them.”

How the USPTO Is Protecting Its Data

Annually, the U.S. Patent and Trademark Office takes in more than 600,000 new filings for patents alone, and the organization runs more than 7,000 servers on-premises and nearly 4,000 more in the public cloud. USPTO officials must not only back up all of that data, but also ensure uptime for the agency’s various systems — many of which are interdependent and some of which are public-facing.

In 2018, the organization faced an outage of its Patent Application Locating and Monitoring database. As a result, the agency had to temporarily shut down more than a dozen systems, including those that allowed patent applicants to pay fees. But thanks to effective backup tools and processes, the database was fully restored within several days.

“It was very important for us to make sure that we recovered all of that data and then restored that database to normal operations,” says Debbie Stephens, deputy CIO. “The good news is, we didn’t lose the data. But we lost time. That is our primary database for patent application data, so that was easily petabytes worth of information. It had dependencies for things like being able to submit payments, because the system goes to that database to check on the status of an application to allow someone to pay a fee.”

Stephens notes that USPTO uses Cohesity software for data and server backups while relying on Commvault to back up individual workstations. Both solutions feature immutable backups.

As federal IT leaders design their backup and recovery environments, Stephens says, they must keep their overarching mission in mind.

“It seems very basic: What are the drivers of your work, what is important to the business?” she says. “The mission for us, which is constitutional, is to issue patents and register trademarks. So, you have to ask, what are your drivers and how do you protect them? You have to determine your most important driver, and whatever it is, that should be your No. 1 system to protect.”
