The hurricanes and tropical storms that frequently batter the Gulf Coast may be the most dramatic threat to the Bay Pines Veterans Affairs Healthcare System (VAHS), near St. Petersburg, Fla., but they’re far from the only threat. Even a brief outage that interrupts communications could jeopardize the lives of patients in the facility, says CIO Dawn Genzlinger, so preparing for a disaster is critical.
That’s why, about three years ago, the IT staff at Bay Pines VAHS installed an F4W emergency telephony system as part of the facility’s disaster recovery (DR) and continuity of operations (COOP) planning.
“During an emergency, we desperately need communications for data, voice, video and technology resources in order to keep treating patients,” Genzlinger says. “When you’re making your COOP plan, the first thing to consider is what business operations you’re trying to create continuity for, and virtually everything we do requires that we be able to communicate easily.”
Before the F4W implementation, Bay Pines lacked telephone switch redundancy, says Genzlinger. Once in place, F4W hardware and software provided the medical center with immediate failover capability, enabling working IP phones, satellite communications and access to the Internet, as well as interfacility and intrafacility data communications.
Those capabilities filled a crucial gap in Bay Pines’ contingency plan. Understanding what’s important to an organization and its mission has to be the starting point for any effective DR and COOP initiative, says Roberta Witty, a Gartner analyst who focuses on business continuity. Reaching that understanding isn’t always easy. Agencies such as the Army IT Agency (ITA) and the National Finance Center (NFC) have found they must take into account how complex systems work together and find ways to test how they’ll react under disaster conditions.
“Know what you have to keep up and running, and what you’ll need to recover,” Witty says. “After that, you examine the risks to the organization and set priorities according to what processes are most at risk, and what the loss would likely be in a disaster. Then you decide what it will take to protect or recover essential assets and processes, and cost out those solutions.”
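The prioritization Witty describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the process names, likelihoods, loss figures and protection costs are hypothetical, and ranking by likelihood times loss is one common way to weigh risk, not a method attributed to Witty or Gartner.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str
    annual_likelihood: float  # assumed chance of disruption in a year (0 to 1)
    loss_if_down: float       # assumed dollar loss if the process fails
    protection_cost: float    # assumed annual cost of protecting it

    @property
    def expected_loss(self) -> float:
        # Simple risk score: likelihood of disruption times likely loss.
        return self.annual_likelihood * self.loss_if_down


def prioritize(processes):
    """Return processes ordered from highest to lowest expected loss."""
    return sorted(processes, key=lambda p: p.expected_loss, reverse=True)


# Hypothetical inventory; every figure here is made up for illustration.
processes = [
    Process("voice communications", 0.30, 500_000, 40_000),
    Process("payroll reporting", 0.05, 200_000, 15_000),
    Process("patient records access", 0.10, 2_000_000, 90_000),
]

for p in prioritize(processes):
    worth_it = p.protection_cost < p.expected_loss
    print(f"{p.name}: expected loss ${p.expected_loss:,.0f}, "
          f"protection ${p.protection_cost:,.0f}, cost justified: {worth_it}")
```

A real assessment would weigh factors that don't reduce to dollars, such as patient safety, so a score like this is a starting point for discussion rather than a decision rule.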
Where to Start
The Army IT Agency builds its DR and COOP plans for the Pentagon and National Capital Region on nine objectives laid out in the 2004 Federal Preparedness Circular, according to Gregg Meserve, director of the ITA’s Defense Continuity Integrated Network. The list is comprehensive, including issues such as protecting lives and property, succession, preserving essential operations and quickly returning full service to customers. For Meserve, contingency planning should cover every potential loss or vulnerability.
“A good COOP plan consists of many factors, including people, facilities and technology,” Meserve says.
Identifying the essential requirements for a DR and COOP plan can be a straightforward exercise. But taking into consideration myriad interwoven systems and processes, along with a wide variety of customers, can be dauntingly complex, says Gilbert Hawk, director of the IT Services Division of the NFC, an agency of the Agriculture Department that provides personnel and payroll services to an array of federal entities. At the NFC, as is the case with many federal agencies, contingency planning begins with the results of a business impact analysis (BIA), which is updated annually.
“Businesses change, customer needs change and risks change,” Hawk says. “The BIA leads to recovery time objectives and knowing acceptable downtimes — which may be zero. From there, you build a plan that allows you to meet the goals you’ve set to protect the organization and the mission.”
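One way to picture how recovery time objectives feed a plan is to compare each system's objective with the recovery time measured in the most recent drill. The Python sketch below assumes hypothetical system names and timings; none of the figures come from the NFC.

```python
from datetime import timedelta

# Hypothetical recovery time objectives (RTOs) from a business impact analysis.
# An RTO of zero means no downtime is acceptable.
objectives = {
    "payroll processing": timedelta(hours=4),
    "emergency telephony": timedelta(0),
    "personnel records": timedelta(hours=24),
}

# Hypothetical recovery times measured in the most recent drill.
measured = {
    "payroll processing": timedelta(hours=3),
    "emergency telephony": timedelta(minutes=2),
}


def rto_gaps(objectives, measured):
    """Return systems that missed their RTO or were never tested.

    The value is the measured recovery time, or None if untested.
    """
    gaps = {}
    for system, objective in objectives.items():
        actual = measured.get(system)
        if actual is None or actual > objective:
            gaps[system] = actual
    return gaps


for system, actual in rto_gaps(objectives, measured).items():
    status = "not tested" if actual is None else f"recovered in {actual}"
    print(f"{system}: {status} (objective {objectives[system]})")
```

Systems that show up in the result are candidates for the action items that come out of a drill: either the recovery approach changes or the objective is revisited in the next BIA.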
Revisit to Stay Relevant
Hawk cautions that at a minimum the DR and COOP plan must be reviewed annually to keep it relevant, and that any new system will require that the plan be revisited, no matter where the installation falls in the contingency planning cycle.
In addition to adjusting a contingency plan to match changes in the organization’s infrastructure, it’s important to keep up with the technologies that can help agencies deal with disaster, says Bay Pines Assistant CIO Michael Giurbino. The Veterans Affairs Department provides standard tools as part of an effort to promote consistency and efficiency across the agency, but the Bay Pines IT staff also evaluates new technologies that could potentially fill the medical center’s specific DR and COOP needs.
“We meet with a wide variety of vendors and explore the technologies that exist from a technical perspective, and decide what best fits with our facility,” Giurbino says.
For example, the Bay Pines IT staff recognized that adopting Cisco Systems’ Unified Communications System had broken the medical center’s dependency on traditional public switched telephone network access. That let the agency use the hybrid F4W emergency communications solution, which offered the immediate failover Bay Pines was looking for, Genzlinger says.
Testing and Tweaking
Bay Pines has avoided serious disaster since the F4W installation. But the emergency communications system, like the rest of the center’s DR and COOP plan, has been subjected to frequent, varied testing, Genzlinger says. Along with weekly tests of the IT systems, the staff regularly schedules tabletop discussions and designs catastrophe simulations that include all of the medical center’s departments, and sometimes emergency services in the surrounding community.
“Some of the most important tweaks to our systems have come from feedback from other departments, nursing or engineering,” Giurbino says.
The point of testing is to act on the feedback from the exercises so that the DR and COOP plan keeps up with a changing organization and shifting threat landscape, Hawk says. Any major disaster drill must be planned carefully, with input from all the stakeholders and formal mechanisms for gathering results and formulating an action plan, he says.
“If you don’t test, why bother to plan?” he says. “Testing is the only way to validate the plan, but you have to develop the drill so that you come out of it with action items. The point is continuous improvement.”
Meserve says all of ITA’s customers routinely perform COOP exercises, with the agency handling a critical part of the testing.
“We constantly monitor the customer’s failover state, and in the last six months alone, we have had over 2,185 systems failed over and failed back,” he says.