Many federal agencies are moving data off legacy systems and into the cloud, and even into data lakes. However, to truly harness the power of that data, they need to develop and enact data governance policies across their enterprises, according to federal IT leaders.
That was a key takeaway from keynote addresses by Energy Department CIO Max Everett and Small Business Administration CIO Maria Roat delivered on Jan. 31 at MeriTalk’s “Creating a Data Driven Government” event in Washington, D.C.
Data lakes, which can exist in the cloud, are repositories with flat architectures that can hold data in a wide variety of formats, including unstructured data, allowing users to transform the data into new structures and visualize it when needed.
“People say, ‘We can just take all our data and dump it into a data lake. Problem solved,’” Everett said at the event. “Well, those data lakes quickly become stagnant ponds if you don't have the governance, if you haven't thought ahead and don't have all the pieces.”
Energy Department Moves Toward the Cloud
One of the key drivers of government transformation in the President’s Management Agenda is data accountability and transparency, Everett noted. Data is the driver for making informed decisions, setting priorities and managing a 21st-century workforce, he added.
The Energy Department manages 17 national research labs with a wide range of missions, and the department is always creating data from sensors and high-performance computers. The agency wants to be able to “see that and cross-pollinate that inside and outside government,” Everett said. Agency data drives public-private partnerships and technology transfers forward, he said. The agency is the steward of its data, and that requires governance.
“We have to have a strategy across the whole agency,” he said, along with “having data governance standards across a very federated department.”
Everett said he is an advocate of the government’s new Cloud Smart program, particularly its emphasis on performing a business case analysis to decide when data belongs in a cloud. Cloud does help with data management, Everett said, but it is not a “panacea.”
The agency just put its first commodity workloads in the cloud last year, Everett said, acknowledging that DOE is “a little bit behind the eight ball” on cloud adoption, but is “catching up real quick.”
Cloud brings better availability, stronger disaster recovery capabilities and a greater ability to collaborate in mobile environments, Everett said.
Agencies need to think through data access and change control management issues as they move to the cloud, especially if it is a public cloud environment, Everett noted. He also said agencies must think through interoperability and potential vendor lock-in challenges before they migrate data.
SBA Focuses on Data Governance
SBA’s Roat indicated she is excited to get moving on a number of fronts now that the partial government shutdown is over. “Now that everyone's back from furlough, I’m antsy and ready to get going,” said Roat. “I’m excited to have my team back.”
SBA has data on 30 million entrepreneurs and small businesses across the country, Roat noted, adding that the agency has been working to get its arms around who is using its data, how it is being used and where it is moving across the agency. There is a great deal of data duplication across the agency’s program offices, she said, and SBA needs to continue to mature its data governance by securing the data and managing how it is moved, including into the cloud.
“We have laid out our stack,” she said, adding that the next step is “working through the governance piece of it.”
SBA created an enterprise data manager about a year ago, as well as a community of practice with all of the agency’s data practitioners. SBA has introduced business intelligence tools to allow those practitioners to run data analytics and pull data from different databases and program offices. That will allow the agency to reduce duplication in the long term, Roat said.
SBA is exploring the possibility of deploying more robotic process automation and artificial intelligence tools for its program offices and business operations. Understanding the data layer is key to deciding when to move forward on RPA, she said.
“We want to get some automation and AI. But you have to get the data layer right before you can do that,” she said.