Bar-El Tayouri – Mend

All About RAG: What It Is and How to Keep It Secure

Bar-El Tayouri — Thu, 24 Oct 2024 11:52:00 +0000

AI is growing in power and scope and many organizations have moved on from “simply” training models. In this blog, we will cover a common system of LLM use called Retrieval-Augmented Generation (RAG).

RAG adds some extra steps to typical use of a large language model (LLM) so that instead of working off just the prompt and its training data, the LLM has additional, usually more up-to-date, data “fresh in mind”.

It’s easy to see how huge this can be for business; being able to reference current company data without having to actually train an AI model on it has many, many useful applications.

How does RAG work?

RAG requires orchestration of two models, an embedder and a generator. A typical RAG system starts with a user query and a corpus of data such as company PDFs or Word documents.

Here’s how a typical architecture works:

During a pre-processing stage, the corpus is processed by an AI model called an embedder which transforms the documents into vectors of semantic meaning instead of plain words. Technically speaking, this stage is optional, but it makes things a lot faster if the documents are pre-processed and accessed from a vector database, rather than processed at runtime.

When a user query comes in, the prompt is also fed to the embedder, for the same reason.

Next, the embedded user query is used by a retrieval system to pull relevant pieces of text from the pre-embedded corpus. The retrieval system returns with a ranked set of relevant vectors.

The embedded user query and relevant documents are fed into a generative AI model, specifically a pre-trained large language model (LLM), which then combines the user query and retrieved documents to form a relevant and coherent output.

Security risks with RAG

The two biggest risks associated with RAG systems are poisoned databases and the leakage of sensitive data or personally identifiable information (PII). We’ve already seen instances where malicious actors manipulate databases by inserting harmful data. Attackers can skew the system’s outputs by making their data disproportionately influential, effectively controlling the AI’s responses, which poses a serious security threat.

When implementing RAG, it’s essential to ask key questions: What models are you using for embedding and generation, and where are you storing your data?

Choosing the right models is crucial because different models handle security, accuracy, and privacy differently. Ensuring that these models are fine-tuned for security and privacy concerns or that services are blocking malicious behavior is key, as poorly selected models and third-party services can introduce vulnerabilities.

If you’re using a vector database like Pinecone or LlamaIndex, you must ensure that your data storage complies with security and privacy regulations, especially if you’re working with sensitive data. These databases store the map between the embedding and text, and ensuring that they are properly encrypted and access-controlled is vital to prevent unauthorized manipulation. Developers often choose platforms like OpenSearch, a low-code vector database solution, because it offers easier management of these security aspects, with built-in monitoring, access control, and logging to help avoid data poisoning and leakage.

In addition to model selection and secure data storage, all AI systems operate with a system prompt—a hidden instruction set that initializes every task or conversation. Adjusting this system prompt can help mitigate security issues, such as preventing the model from generating harmful or sensitive content. However, while strengthening the system prompt can help reduce certain risks, it’s not a comprehensive solution. A strong system prompt serves as the first line of defense, but addressing AI vulnerabilities requires a broader approach, including fine-tuning the models for safety, ensuring data compliance, and implementing real-time monitoring, code sanitizers, and guardrails.

In summary, securing a RAG system involves more than just selecting the right models and storage solutions. It requires robust encryption, data governance policies, and continuous oversight to protect against data poisoning, information leakage, and other evolving security threats.

How to protect RAG systems

Protecting AI systems, including RAG systems, requires a multi-layered approach that combines proactive testing, security mechanisms, and safeguards to prevent vulnerabilities from being exploited.

One effective strategy is to red-team your model. Red-teaming RAG systems involves simulated attacks to identify weaknesses in your AI system, such as prompt injection or data poisoning, before they can be exploited in real-world scenarios.

To protect RAG systems, there are several key approaches to consider:

1. Firewalls

In AI, firewalls act as monitoring layers that evaluate both input and output. They can use heuristic techniques to detect suspicious activity, such as attempts to inject harmful prompts or commands. For example, if a user tries to manipulate the AI to ignore its initial instructions (via prompt injection) and generate unintended or harmful output, the firewall can flag this as a potential attack. While firewalls provide an extra layer of security, they aren’t foolproof and may miss more sophisticated attacks that don’t match known patterns.

2. Guardrails

Guardrails are predefined rules or constraints that limit the behavior and output of AI systems. These can be customized based on the use case, ensuring the AI follows certain safety and ethical standards.

NVIDIA NeMo Guardrails offers several types of guardrails:

Input rails filter and control what kinds of inputs are acceptable, ensuring sensitive data (like names or email addresses) is not processed.
Dialog rails shape conversational flows to ensure AI responds appropriately, based on predefined conversation structures.
Retrieval rails ensure the AI retrieves only trusted and relevant documents, minimizing the risk of poisoned data entering the system.
Execution rails limit the types of code or commands the AI can execute, preventing improper actions.
Output rails restrict the types of outputs the model can produce, protecting against hallucinations or inappropriate content.

NVIDIA Garak, another tool from NVIDIA, is an open-source red-teaming tool for testing vulnerabilities in large language models (LLMs). Garak helps identify common vulnerabilities, such as prompt injection or toxic content generation. It learns and adapts over time, improving its detection abilities with each use. Promptfoo is another tool that might be used.

3. Fact-checking and hallucination prevention

RAG systems can also incorporate self-checking mechanisms to verify the accuracy of generated content and prevent hallucinations—instances where the AI produces false information. Integrating fact-checking features can reduce the risk of presenting incorrect or harmful responses to users.

4. Shift-left security

A shift-left approach focuses on integrating security practices early in the development process. For RAG systems, this means ensuring that the data used for training and fine-tuning is free of bias, sensitive information, or inaccuracies from the start. Additionally, many RAG vulnerabilities may be in the code itself, so it’s worth scanning the code and organizing for fixes to take place before the production stage. By addressing these issues early, you minimize the risk of the system inadvertently sharing PII or being manipulated by malicious input.

Conclusion

As AI systems like RAG become more advanced, it’s critical to implement these protective measures to guard against an increasing array of security threats. Combining firewalls, guardrails, fact-checking, early security practices, and robust monitoring tools creates a comprehensive defense against potential vulnerabilities.

Shining a Light on Shadow AI: What It Is and How to Find It

Bar-El Tayouri — Thu, 29 Aug 2024 18:09:01 +0000

After speaking to a wide spectrum of customers ranging from SMBs to enterprises, three things have become clear:

Virtually everyone has moved on from using AI solely as an internal tool and are already deploying AI models.
Many are experimenting with AI agents.
Developers don’t ask the application security team what to do when it comes to AI use.

Add that together, and we get Shadow AI. This refers to AI usage that is not known or visible to an organization’s IT and security teams. Shadow AI comes in many forms, but in this blog we’ll stick to a discussion of Shadow AI as it pertains to applications.

Application security teams are well aware that AI models come with additional risk. What they’re less aware of is how much AI is already in the applications they’re tasked to protect. To get a sense of the scope of shadow AI, one of our customers uncovered more than 10,000 previously unknown AI projects in the organization’s code base.

And realistically speaking, this will only accelerate, because developers are moving fast when it comes to implementing this technology. In big organizations, there may be 100 new repos added every day, many including AI, that all need to be checked for compliance and protected.

Moreover, developers rarely (if ever) ask permission to use AI, and they often fail to routinely disclose AI projects to application security teams. But outside of areas under heavy government regulation, developers have the strongest voice when it comes to technical decisions. Any attempt to completely ban AI is a lost cause that will only lead to more shadow AI.

Before any kind of response to the risk AI poses can be forged, shadow AI must first be brought out into the sun.

The risks of the unknown

The cost of building an extremely secure AI model from scratch is, if not actually impossible, prohibitively expensive. Developers seeking to benefit from the latest AI technologies instead either build upon models found on Hugging Face or interface via APIs with existing large language models (LLMs) like ChatGPT or Claude. In either case, the attack surface is large and unwieldy. The non-deterministic, plain language nature of AI makes it easy to attack and difficult to secure.

Here are just a few of the risks that come with AI use in applications:

Major data breach. Generative AI models trained on company data or connected to Retrieval-Augmented Generation (RAG) components with access to company data can leak large amounts of sensitive information.
Unauthorized actions. AI agents, which are built to leverage AI output and take autonomous actions, that have access to important assets like databases or the ability to execute code may be manipulated to make unauthorized changes to data or execute malicious code.
Intellectual property (IP) disputes. Open source AI models may have noncompliant licenses that put the company’s IP in dangerous territory.
Bad decisions. While less of a traditional concern for AppSec practitioners, overreliance on AI, especially LLMs with a tendency to hallucinate information and sources, can lead to poor outcomes.

Prompt injection is the biggest vulnerability inherent to generative AI— and the most difficult to close. Prompt injection plus autonomous AI agents is an especially risky combination. LLMs are trained at a deep level to respond to commands. When you tell an LLM to do something it shouldn’t, it’s not shocking that it often complies. Prompt injection is not fought by changing an LLM’s code, but by building unseen-by-the-end-user prompts that command and remind the LLM not to do certain things throughout every user interaction. Unfortunately, these system prompts are not impossible to circumnavigate and are often leaked themselves.

That said, many AI risks can be minimized with a shift-left approach to development. That said, none can be addressed if the models and agents remain in the shadows.

Fighting AI with AI

The good news is that we can build tools that detect AI in applications, which is exactly what Mend.io is doing right now. The files and code of AI models and agents have certain characteristics that can be discovered by other AI models trained for that task. Likewise, these models can also detect the licenses of open source AI models.

With information on where and what AI models are used, AppSec teams can take steps to ensure the following:

Open source license adherence to company policies
Proper sanitization of inputs and outputs
Utilization of SOC 2-compliant services

Closing thoughts

Many governments are looking to tame the Wild West of AI development. The EU AI Act, which prohibits some types of AI applications and restricts many, went into effect on August 1, 2024, and imposes heavy compliance obligations onto the developers of AI products. To stay both in compliance and secure, organizations must take the first step of bringing all shadow AI out into the open.