Cloud data management helped the Government Accountability Office get a better picture of its storage environment, says Engineering Manager Manjot Singh.

HCI Helps Feds Find New Ways to Store and Analyze Data

Labor, NIH, NCI and GAO store massive data lakes in networks organized with hyperconverged infrastructure.

The Government Accountability Office, like many agencies, had a data storage challenge: its solutions were spread across on-premises systems and the cloud. At the same time, the agency was a heavy user of virtual desktop infrastructure technology as well as hyperconverged infrastructure.

“As we continued to build on HCI, we ran into challenges with existing backup solutions that had difficulties integrating with cutting-edge tech, or just difficulty integrating with new technology as a whole,” explains Manjot Singh, the engineering manager for GAO. 

“That’s when we realized we needed a better solution — something that’s more modern, that integrates better with our existing technology.”

So GAO implemented Cohesity DataPlatform to consolidate its secondary data and apps on one platform, as well as DataProtect, a backup and recovery offering, to better protect its data. 

The agency moved to cloud data management, says Singh, so that GAO could get a better picture of its overall storage environment and improve backup and recovery. Traditional backup methods work less well in hyperconverged environments, which often run backup on the same infrastructure as the apps.

“The platform checked a lot of boxes for us, and so far, we have been pleased. It’s made things a lot easier. It has a lot more capabilities, such as deduplication and compression,” he says. 

“We have interconnectivities between our data centers and the platform, which is great because I can send deduplicated deltas across my data center links so that I don’t have to consume so much bandwidth,” he adds.
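The bandwidth savings Singh describes come from deduplication: before replicating, each site fingerprints its data chunks and ships only the chunks the other side does not already hold. As a rough sketch of the idea (not Cohesity's actual protocol, which uses more sophisticated variable-size chunking), hash-based dedup can be illustrated in a few lines of Python:

```python
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunks for simplicity; real systems often chunk on content boundaries

def chunk_hashes(data: bytes) -> list[tuple[str, bytes]]:
    """Split data into chunks and pair each with its SHA-256 fingerprint."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return [(hashlib.sha256(c).hexdigest(), c) for c in chunks]

def delta_to_send(local: bytes, remote_index: set[str]) -> list[tuple[str, bytes]]:
    """Return only the chunks whose fingerprints the remote site lacks."""
    return [(h, c) for h, c in chunk_hashes(local) if h not in remote_index]

# The remote data center already holds the first backup ...
backup_1 = b"A" * 8192
remote_index = {h for h, _ in chunk_hashes(backup_1)}

# ... so a second backup that appends new data ships only the new chunk.
backup_2 = backup_1 + b"B" * 4096
delta = delta_to_send(backup_2, remote_index)
print(len(delta))  # 1 -- only the new 4 KB chunk crosses the data center link
```

Because identical chunks hash to the same fingerprint, repeated data anywhere in the backup set is transferred and stored only once.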


How Cloud Data Management Can Help Feds

Cloud data management — typically implemented as part of an overall business transformation plan — can help agencies get more out of their data. 

The move to cloud data management began as agencies started to adopt HCI and realized they needed a different way to access, share, archive and manage their data in that environment. The technology is already improving productivity and adherence to policies and mandates, says Steven Hill, senior analyst for applied infrastructure and storage technologies at research firm 451 Research.

“Data is all about security and policy, especially within the military and government. There are very specific rules in terms of protecting that data, securing it from unauthorized access and making certain that it is archived in the appropriate manner,” says Hill.

“This is where we’re seeing this whole need for data management that can span any kind of environment,” he adds.


Upgrades to Storage Help with Data Management, Version Control

The National Institutes of Health’s Office of Research Services provides the underpinnings for the multifaceted agency, delivering training, safety and other important services. Its IT infrastructure was a traditional tiered environment with server compute, a VMware hypervisor and Dell EMC tiered SAN technology with Data Domain, and it had outlived its usefulness.

“It was all just sitting out there pretty unmanaged, no archive except for using backup. And the problem with the older system was all the maintenance it needed. With the human capital it required, it became outrageous,” says Mark Rein, the agency’s former CIO. 

19.5%

Percentage of IT pros who plan to make data management a priority in 2019

Source: searchstorage.techtarget.com, “Cloud, data management top 2019 enterprise data storage options,” Jan. 8, 2019

NIH ORS implemented Cohesity DataPlatform and DataProtect in a bid to improve the agency’s data management and visibility. Cohesity is providing better Tier 2 storage management for the agency and, most important, improving its overall data resiliency. 

“It’s data that’s been at rest for years, multiple data stores that continue to grow,” Rein says. “There’s been no analysis of the data, no tagging, no metadata. So now I’m looking at all the data and understanding what it is and what the retention requirements are.”

The U.S. Trade and Development Agency started using cloud backup for its HCI environment about five years ago, says Benjamin Bergersen, the CIO and senior agency official for information security risk management.

The USTDA, which helps companies create jobs with exports of U.S. goods, is seeing the benefit of cloud data management. It helps with version control, Bergersen says, and takes the onus off IT when it comes to backup and restore. 

“If you’ve got a user who says, ‘I blew up that document accidentally, let me go to the one that was saved an hour ago,’ he or she can do that alone. At the user level, you just right-click the file, go to the version history and pull up what you need, wherever it is. 

“I did that just yesterday, and it saved me a lot of work — plus, you don’t need someone from IT or the help desk to get involved,” he says.
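The self-service restore Bergersen describes rests on a simple idea: every save keeps a snapshot, so any earlier version can be pulled back without involving IT. A minimal in-memory sketch of such a version-history store (a toy illustration, not any vendor's implementation) might look like this:

```python
import time

class VersionedStore:
    """Toy version-history store: every save keeps a timestamped snapshot."""

    def __init__(self):
        self._history: dict[str, list[tuple[float, str]]] = {}

    def save(self, path: str, content: str) -> None:
        self._history.setdefault(path, []).append((time.time(), content))

    def versions(self, path: str) -> list[tuple[float, str]]:
        """List all saved versions of a file, oldest first."""
        return list(self._history.get(path, []))

    def restore(self, path: str, index: int) -> str:
        """Self-service restore: bring back any prior version by index."""
        _, content = self._history[path][index]
        self.save(path, content)  # restoring creates a new version, so history is never lost
        return content

store = VersionedStore()
store.save("report.docx", "good draft")
store.save("report.docx", "accidentally blown up")
print(store.restore("report.docx", 0))  # good draft
```

Note that a restore appends rather than overwrites, so even the "blown up" version stays recoverable, which is exactly what lets users fix their own mistakes safely.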


Labor, NCI Explore Hyperconvergence and Cloud Storage

The more information an agency has about its data, Hill says, the more flexibility it has in handling and automating it.

“This is really about the re-emergence of object storage as the ideal framework for policy-based management because of its metadata capabilities, as well as its massive scalability,” he says. 
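The policy-based management Hill describes works because object storage attaches rich metadata to every object, and automated rules can then act on those tags instead of on file paths. As a hedged illustration, here is a small Python sketch in which a hypothetical retention rule is driven entirely by a `classification` metadata tag (the tag names and retention periods are invented for the example):

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class StoredObject:
    """An object plus the metadata tags a policy engine can act on."""
    key: str
    created: date
    metadata: dict = field(default_factory=dict)  # e.g. {"classification": "cui"}

# Hypothetical retention periods keyed on a classification tag
RETENTION_YEARS = {"public": 1, "cui": 7, "records": 30}

def eligible_for_archive(obj: StoredObject, today: date) -> bool:
    """Policy decision driven entirely by the object's metadata, not its location."""
    tag = obj.metadata.get("classification", "public")
    years = RETENTION_YEARS.get(tag, 1)
    return obj.created + timedelta(days=365 * years) <= today

obj = StoredObject("scan-001.dcm", date(2015, 1, 1), {"classification": "cui"})
print(eligible_for_archive(obj, date(2019, 5, 2)))  # False -- 7-year retention not yet elapsed
```

Because the rule reads only metadata, the same policy engine can span on-premises arrays and cloud buckets alike, which is the "any kind of environment" flexibility Hill points to.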

One example: the Labor Department, which is looking for solutions for future data center relocations and closures. The agency is studying the benefits of hyperconvergence, which may be valuable for legacy application migrations and situations in which large amounts of stored data are accessed regularly, says spokesperson Laura McGinnis.

This means that agency IT experts must also find compatible data storage and management solutions. The department is piloting three off-premises locations as an alternative to data centers.

“By moving to or using cloud-based storage, the department will also benefit by eliminating data safeguarding concerns, the fear of losing data or being dependent on backing up data,” McGinnis says. “Additionally, cloud-based data storage provides for ease of use in reconnecting data onsite and eliminates the need for expensive infrastructure going forward.”

The National Cancer Institute has its users in mind with its own cloud data management program. The agency has 10 petabytes of data that reside in Dell EMC Isilon network-attached storage.

But that data wasn’t completely accessible; NCI had an unspoken policy that whoever created the data also managed it. This was impractical, though, because of the size and scale of its projects, the amount of data generated and the fact that many scientists and grant winners do not work on the National Institutes of Health campus, where NCI is located. 


“Take a big project we’ve got going on right now, The Cancer Genome Atlas. TCGA is a map of all the genes implicated in cancer and associated with cancer types. These are gigantic files,” says Jeff Shilling, acting CIO at NCI. “We had to get some of it into the cloud because as goes the data, so goes the compute.” 

About four years ago, the agency launched a cloud-based object storage project to centralize data management and improve analysis. It also focused on creating data standards to better serve the doctors and researchers who need access to the CT scans, genomes and imaging files used on a daily basis.

“We’re no longer just storing data,” Shilling adds. “It’s all about providing a complete interface to the stored data.” 

Now data can be better analyzed, not just by scientists, but also by the supercomputers that are changing the future of cancer research and treatment. 

“So we’re going from a state of everything being done so that humans could see it to a state of everything being done so machines can analyze it,” he says.

Photography by Ryan Donnell
May 02 2019
