Reducing AI Hallucinations with Retrieval-Augmented Generation
Hallucinations are a major challenge for all organizations adopting GenAI solutions. In a hallucination, a large language model produces a result that sounds reasonable but is based on factually incorrect information. In a study by Carnegie Mellon University, researchers found that LLMs hallucinate as frequently as 1 in 10 responses.
The military potentially could reduce hallucinations with retrieval-augmented generation (RAG), a technique that ensures LLMs receive the most current and relevant data.
Lack of Explainability Hinders Trust in AI
In a Ponemon Institute survey, researchers found that 57% of cybersecurity professionals cited “lack of explainability” as a barrier to trusting AI solutions. Lack of explainability stems from the sometimes-opaque design-making processes of GenAI. While processing data, LLMs may not exercise correct judgement in characterizing it and may identify benign behavior as malicious. In response, AI solutions providers have developed explainable AI frameworks to enhance transparency.
DISCOVER: New cyber solutions are found by looking for hidden patterns.
Prompt Injection, Jailbreaking, “Cloudborne” Attacks and Cloud-Jacking
Among the security vulnerabilities inherent in LLMs are prompt injection and jailbreaking. According to the Open Web Application Security Project, “Prompt injection involves manipulating model responses through specific inputs to alter its behavior,” while jailbreaking “is a form of prompt injection where the attacker provides inputs that cause the model to disregard its safety protocols entirely.” RAG and other methods can guard against these vulnerabilities.
The Defense Department also must protect against “Cloudborne” attacks, which exploit a vulnerability in bare-metal cloud servers to implant a malicious backdoor in its firmware, and cloud-jacking, in which bad actors exercise control over cloud resources, possibly including cloud-based LLM deployments.
Expanding GenAI Test and Evaluation Techniques
The National Institute of Standards and Technology has moved to address limited avenues for GenAI test and evaluation by establishing the NIST GenAI evaluation program through the NIST Information Technology Laboratory. The program provides a platform for test and evaluation with the goal of assessing GenAI technologies. Program goals include creation of benchmark datasets; developing detection technologies that can authenticate content; conducting comparative analyses with relevant metrics; and promoting technologies that can identify bad information.
LEARN MORE: Predictive AI is essential to zero-trust security.
Plan and Manage AI Initiatives With Third-Party Partners
The CDW Artificial Intelligence Research Report describes additional challenges with AI management, noting that organizations may face difficulties in finding highly skilled IT staff, whether building their own AI capabilities or adopting cloud services. Government agencies and other groups also may find it challenging to ensure data quality and availability in addition to implementing and scaling AI solutions.
Third-party partners can help federal officials overcome these obstacles when planning and managing their AI initiatives. An experienced managed services provider can assist agencies in navigating the four prominent GenAI risks identified by Task Force Lima, at the Pentagon and beyond.