Agencies often work with more than one cloud service provider, Parashar says, to take advantage of the various capabilities each offers.
“Tensor processing units, which are good for certain types of applications, are unique to Google Cloud, as an example,” he says. “That’s part of the reason why, depending on what research you’re trying to do, you need to have access to a range of resources. It doesn’t make sense to go to only one vendor, or even only to cloud services.”
How Agencies Can Seamlessly Manage Data
Agencies may be able to use an architecture that combines commercial cloud and on-premises resources to streamline their internal data management.
Before the National Institutes of Health began disseminating data through cloud services, potential research collaborators in separate locations shared information via FTP or as an email attachment.
If a file was too large to email or transfer over the internet, some researchers shared data on thumb drives or CDs, which resulted in a lot of data duplication, says Nick Weber, program manager of NIH’s Science and Technology Research Infrastructure for Discovery, Experimentation and Sustainability (STRIDES) initiative.
Rather than having collaborators shuffle items between research centers — or submit NIH-affiliated data that relates to more than one discipline to several applicable repositories — contributors can now send the information to a general repository, allowing other researchers to access it through commercial cloud services, along with the computational tools the providers offer.
“Cloud has really flipped data sharing on its head,” Weber says. “Major data sets are located in cloud environments to allow people to use the data and collaborate with others there.
“That was a major driver for the STRIDES initiative and our partnership with Amazon, Google and Microsoft — to be able to say, how can we make this even simpler for researchers? How can we bring some additional ways to use the technologies the cloud offers to accelerate their research?”
Prior to implementing the Open Data Dissemination Program, the National Oceanic and Atmospheric Administration used a number of segmented paths to distribute information, according to CTO Frank Indiviglio.