Keep the Garbage Data out of Agency Systems
The term “garbage in, garbage out” has become a cliché because it’s true. Technologies deployed at the edge, such as artificial intelligence and machine learning, are only as good as the data they ingest.
The challenge is that data collected at the edge is raw and must be cleaned and normalized before it can be analyzed effectively. The work is tedious and time-consuming, but necessary.
The federal government is relying heavily on AI and machine learning to improve outcomes. Its trust in those technologies will be rewarded only if the data is of high integrity. Data cleansing produces more accurate, more trustworthy recommendations.
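The cleansing step described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline; the field names (`sensor_id`, `temp_c`, `timestamp`), the valid temperature range, and the duplicate rule are all hypothetical.

```python
# A minimal sketch of cleansing raw edge readings before analysis.
# Field names and the valid range (-40 to 125 C) are illustrative assumptions.

def clean_readings(raw):
    """Drop malformed records, coerce types, and deduplicate."""
    cleaned = []
    seen = set()
    for rec in raw:
        sensor = rec.get("sensor_id")
        value = rec.get("temp_c")
        if sensor is None or value is None:
            continue  # discard incomplete records
        try:
            value = float(value)  # normalize to a numeric type
        except (TypeError, ValueError):
            continue  # discard non-numeric garbage
        if not -40.0 <= value <= 125.0:
            continue  # discard physically implausible readings
        key = (sensor, rec.get("timestamp"))
        if key in seen:
            continue  # discard duplicate submissions
        seen.add(key)
        cleaned.append({"sensor_id": str(sensor), "temp_c": value,
                        "timestamp": rec.get("timestamp")})
    return cleaned

raw = [
    {"sensor_id": "a1", "temp_c": "21.5", "timestamp": 100},
    {"sensor_id": "a1", "temp_c": "21.5", "timestamp": 100},  # duplicate
    {"sensor_id": "a2", "temp_c": "oops", "timestamp": 101},  # garbage value
    {"sensor_id": "a3", "temp_c": 999, "timestamp": 102},     # out of range
]
print(clean_readings(raw))  # only the first record survives
```

Analytics run downstream of a filter like this never see the garbage, which is what keeps the recommendations trustworthy.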
DIVE DEEPER: Find out how agencies can make the most of unstructured data.
Change the Way Data Is Stored
Once the cleansing process is complete, the data must be stored in a highly scalable database. Traditional storage appliances are not appropriate, as they are not built to manage very high data volumes.
Increasing the speed of temporary storage is essential for faster processing and analysis of high data volumes. Most systems today ingest data into temporary storage before passing it along to long-term storage for processing. This process can be sped up by employing Non-Volatile Memory Express (NVMe) drives, which offer consistent performance and high read and write throughput.
Bypassing temporary storage altogether is an even better option. Persistent memory offers enormous scalability, making it ideal for high data volumes. Data can be ingested directly into the persistent memory module without first passing through temporary storage, allowing it to be stored and analyzed more quickly.
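The difference between the two ingestion paths can be sketched as follows. This is a simplified illustration: a plain memory-mapped file stands in for persistent memory, which applications typically access through the same memory-mapping model (for example, via DAX-mounted filesystems); the record format and sizes are made up.

```python
import mmap
import os
import tempfile

# Contrast staged ingestion (write to a temporary buffer, then copy to
# long-term storage) with direct ingestion into a memory-mapped region,
# which approximates the persistent-memory access model.
RECORD = b"sensor=a1,temp=21.5\n"
N = 1000

def staged_ingest(path):
    staging = bytearray()            # hop 1: accumulate in temporary storage
    for _ in range(N):
        staging += RECORD
    with open(path, "wb") as f:
        f.write(staging)             # hop 2: copy to long-term storage

def direct_ingest(path):
    size = len(RECORD) * N
    with open(path, "wb") as f:
        f.truncate(size)             # pre-size the persistent region
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), size) as m:
        for i in range(N):
            off = i * len(RECORD)
            m[off:off + len(RECORD)] = RECORD  # single hop: written in place
        m.flush()                    # make the writes durable

tmpdir = tempfile.mkdtemp()
staged_ingest(os.path.join(tmpdir, "staged.log"))
direct_ingest(os.path.join(tmpdir, "direct.log"))
```

Both paths end with identical data on durable media; the direct path simply avoids the intermediate copy, which is the time savings the article describes.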
MORE FROM FEDTECH: How does edge computing enable a faster, more resilient government?
Minimize High-Volume Data Transfers
It’s even better to decrease the amount of data sent back to the data center in the first place. This is where the true promise of edge analytics lies: by pushing analytics to the edge, only the most critical data needs to be sent to the central core for more in-depth processing.
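That filtering step can be sketched in a few lines: analyze readings locally and forward only those that matter. The alert threshold and record shape here are hypothetical.

```python
# A minimal sketch of edge-side filtering: keep routine readings at the
# edge and forward only critical ones to the central data center.
THRESHOLD_C = 80.0  # assumed alert threshold, chosen for illustration

def filter_at_edge(readings, threshold=THRESHOLD_C):
    """Return only the readings worth sending to the core."""
    return [r for r in readings if r["temp_c"] >= threshold]

readings = [{"sensor_id": "a1", "temp_c": t}
            for t in (21.0, 85.5, 22.3, 90.1)]
critical = filter_at_edge(readings)
print(f"forwarding {len(critical)} of {len(readings)} readings")
```

Here only two of the four readings cross the threshold, so only those two travel over the network; the rest are handled, and can be summarized or discarded, at the edge.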