The Office of Personnel Management considers every day an opportunity to test its preparedness for an emergency situation.
“Our workforce is highly mobile,” OPM CIO Donna Seymour says. “Employees who telework, and require remote access to applications and data, drive disaster recovery and continuity of capabilities into the infrastructure.”
Disaster recovery and continuity of operations touch every branch of the government. IT leaders must ensure that employees can access critical data and communicate with one another at all times.
Seymour’s team prioritizes email, chat and video conferencing applications, as well as records management for reporting and analysis. Collaboration applications that support document sharing are essential for continuity.
“We design our services so that they are highly mobile and highly available,” Seymour says. “Collaboration and communication tools allow our users to gather virtually when they can’t physically to still accomplish their jobs.”
Prioritizing systems is a major part of preparing for a crisis, she says.
For example, during a national emergency, a hiring application could go offline without consequence. “There is only so much bandwidth coming into the building. You have to be agile in the way you utilize it and balance the load.”
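The prioritization Seymour describes can be illustrated with a minimal sketch. It is not OPM's actual tooling: the application names, priority scale and bandwidth figures below are all hypothetical, chosen only to show how a fixed amount of bandwidth might be handed to the most critical systems first while a lower-priority system, such as a hiring application, is taken offline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class App:
    name: str
    priority: int       # hypothetical scale: lower number = more critical
    demand_mbps: float  # hypothetical bandwidth demand


def allocate_bandwidth(apps, capacity_mbps):
    """Grant bandwidth to apps in priority order.

    Each app receives its full demand if it still fits in the
    remaining capacity; otherwise it is deferred (taken offline).
    Returns (granted, deferred).
    """
    granted = {}
    deferred = []
    remaining = capacity_mbps
    for app in sorted(apps, key=lambda a: a.priority):
        if app.demand_mbps <= remaining:
            granted[app.name] = app.demand_mbps
            remaining -= app.demand_mbps
        else:
            deferred.append(app.name)
    return granted, deferred


# Illustrative inventory only; none of these figures come from OPM.
APPS = [
    App("email", 1, 40.0),
    App("chat", 1, 10.0),
    App("video_conferencing", 2, 80.0),
    App("records_management", 3, 30.0),
    App("hiring_application", 9, 60.0),
]

granted, deferred = allocate_bandwidth(APPS, capacity_mbps=170.0)
print("granted:", granted)
print("deferred:", deferred)
```

With these assumed numbers, the communication and records systems fit within the available capacity and the hiring application is the one deferred. A real deployment would more likely throttle traffic than switch an application off entirely; the all-or-nothing rule here is a simplification.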
Every Day Is a Test
Like OPM, the Environmental Protection Agency uses remote access and telework to gauge the organization’s readiness for everything from a major earthquake to pandemic influenza.
“Remote users work from their alternate work location or while on travel on a daily basis, and the system is constantly monitored and tested to validate capabilities and connectivity,” says Liz Purchia, the agency’s press secretary.
The EPA’s Office of Environmental Information established a mobile computing policy that provides secure access to the agency’s intranet. This protected infrastructure allows remote users to reach cloud-based productivity, collaboration and communications solutions such as Adobe Connect and Microsoft Office 365, which includes Lync and Outlook. Purchia says virtual meetings can be conducted via conference calls or video teleconferencing at certain locations.
Although the EPA has provided remote access capabilities since 2004, it wasn’t until 2011 that disaster recovery infrastructure was added to provide more redundant and fault-tolerant configurations.
“This initiative ensures that agency remote access capabilities are available during an outage of the primary site,” Purchia says.
Finding Opportunities to Test Remote Access
Snow days and other unplanned events provide the perfect chance to examine how technology staff and employees respond to working remotely, Seymour says.
“In theory, if mobility has been institutionalized as part of your workers’ day-to-day routine, then they should be able to conduct business as they normally would,” Seymour says. “In a time of crisis, you don’t want users to need to figure out how to do things differently.”
Following disruptive events, Seymour polls staff on their remote access experience. She wants to confirm that employees who need access are able to log on to the network, so that problems are caught before the next disruption rather than during it.
“We can’t prevent disruptions, but we can be agile and flexible enough to continue delivering services to our workforce,” she says.
Planning for Long-Term Disruptions
In addition to short-term disruptions, the EPA is focused on incidents that “render office space unusable for a period of time,” Purchia says. “All offices within EPA have individual plans for instructing employees how to conduct business in the wake of an emergency, including disaster recovery.”
The EPA is also part of a working group, led by the Federal Emergency Management Agency, that addresses disaster recovery issues for agencies.
EPA employees are required to take annual, agencywide awareness training for continuity of operations. The agency uses FEMA’s national planning scenarios for its internal exercises, teaching employees how to deal with a range of incidents, including nuclear detonation, chemical and biological threats and cyberattacks.
“We coordinate closely with FEMA when appropriate and ensure that internal communication and coordination are maintained,” Purchia says. “Best practices are used in the development, testing and implementation of agency continuity of operations plans.”
OPM’s long-term disaster plan centers on mobility.
“If an area or city is shut down, we have to empower employees to move to another office or work from a residence,” Seymour says. “Tech teams must remain calm because everyone around us is going to be in a chaotic state,” she adds.
The Benefit of Setting Expectations
At the Social Security Administration, disaster recovery and business continuity are made easier when user expectations are properly set.
“We train across the agency for this, and our employees know their roles and responsibilities in a time of crisis,” says Jonas Garland, associate commissioner for SSA’s Office of Security and Emergency Preparedness. “They have two critical missions: paying benefits and processing applications for Social Security cards.”
SSA participates in FEMA’s national-level exercises as well. Garland says the agency has offices across the country, enabling it to easily move employees and work from place to place if needed.
“All of our field sites access data from the same data centers,” he says. “We can shift workloads to different locations.” Those plans have been tested many times, including during hurricanes Katrina and Sandy.
Garland acknowledges that employee training at all levels has helped.
“Leadership is committed to mandatory training at headquarters and in the field,” he says.
Disaster-Proofing Government IT
Tom Fellona, assistant associate commissioner in SSA’s Office of Systems, points to the data center as an area that should be equally ready for the challenges disasters present.
“In our performance testing, we size the environment so we have the exact configuration from network bandwidth and mainframe capacity at both data centers,” he says.
The data centers each feature idle backup engines that can be brought online for failover if necessary.
“They are preloaded with appropriate capacity and have the ability to be as robust as each other,” Fellona says.
The goal is to keep employees connected to the virtual private network that gives them access to systems and support programs along with Microsoft Office applications.
Fellona says support systems are split between the agency’s data centers, giving the agency redundancy in case of a failure.
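The failover arrangement Fellona describes, with idle backup capacity at each site, can be sketched in a few lines. This is an illustration under stated assumptions, not SSA's implementation: the `DataCenter` class, the `route_workload` function and the site names are hypothetical, and real health checks would be far more involved than a single flag.

```python
from dataclasses import dataclass


@dataclass
class DataCenter:
    name: str                     # hypothetical site name
    healthy: bool                 # stand-in for a real health check
    standby_online: bool = False  # idle backup engines, off by default


def route_workload(primary, secondary):
    """Return the data center that should serve the workload.

    If the primary is down, bring the secondary's standby capacity
    online and route there. Raises if neither site is available.
    """
    if primary.healthy:
        return primary
    if secondary.healthy:
        secondary.standby_online = True
        return secondary
    raise RuntimeError("no healthy data center available")


# Simulate an outage at the primary site.
site_a = DataCenter("site_a", healthy=False)
site_b = DataCenter("site_b", healthy=True)

active = route_workload(site_a, site_b)
print("active site:", active.name)
```

Because both sites are sized to the same configuration, the sketch treats them as interchangeable: whichever one is healthy can take the full workload once its standby capacity is switched on.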