MosaicLeaks: Can your research agent keep a secret?

Key Takeaways

MosaicLeaks is a research benchmark and paper highlighting a critical privacy vulnerability in AI research agents.
It demonstrates how AI agents can inadvertently leak sensitive private information through seemingly harmless external web queries, a phenomenon called the "mosaic effect."
The research introduces Privacy-Aware Deep Research (PA-DR), a training method that significantly reduces data leakage while improving task performance.
For AI practitioners and businesses, MosaicLeaks underscores the urgent need for privacy-by-design, robust security protocols, and specialized training for AI agents handling confidential data.

AI agents are becoming far more capable. They can sift through vast amounts of information, connect to various tools, and carry out complex research tasks. Imagine an AI assistant helping a law firm analyze case documents or a financial institution conduct market research. The potential is huge, but so is the responsibility, especially when sensitive information is involved.

This brings us to a crucial question: can your AI research agent truly keep a secret? A recent research effort, dubbed "MosaicLeaks," examines exactly this concern and reveals a subtle yet significant privacy risk. It's not about malicious intent from the AI; it's about how these agents, by their very design, can inadvertently expose confidential data. This article explains what MosaicLeaks is, why it matters, how the "mosaic effect" works, and what it means for anyone building or using AI agents.

What Is a Deep Research Agent (and Why Is It Tricky)?

First, let's clarify what we mean by a "deep research agent." These are advanced AI systems, often powered by large language models (LLMs), designed to perform complex, multi-step information gathering and analysis. Unlike a simple chatbot that answers direct questions, a deep research agent can:

Access and process private, internal documents (e.g., company reports, confidential datasets).
Interact with external tools, such as web search engines, databases, or APIs, to gather public information.
Synthesize information from both private and public sources to answer complex, multi-hop questions.
Make decisions on what information to retrieve and how to formulate subsequent queries.

The appeal of these agents is clear: they can automate tedious research, accelerate insights, and free up human experts for higher-level tasks. However, their ability to bridge the gap between private, internal data and the vast, open internet introduces a unique set of privacy challenges. The agent needs to understand your confidential context to ask intelligent questions of the outside world, but this very process can become a leakage channel.

The "Mosaic Effect" Explained: When Small Pieces Reveal the Big Picture

At the heart of the MosaicLeaks research is a concept known as the "mosaic effect." This isn't a new idea in intelligence or cybersecurity, but its application to AI agents highlights a novel vulnerability. Here's how it works in the context of AI:

Imagine an AI research agent working for a healthcare company, tasked with a complex question that requires both internal company documents and external web searches. Let's say the agent needs to figure out the timeline of a competitor's recent cloud migration. To do this, it might first consult an internal document that mentions "MediConn had migrated 70% of its infrastructure to the cloud by January 2025."

Based on this private information, the agent then formulates several seemingly innocuous web queries. One query might reference "cloud-migration milestone," another "January 2024 security disclosure," and a third might try to narrow down "which vendor got hit." Individually, these web searches might appear harmless. They don't explicitly state "MediConn" or reveal the 70% migration figure.

However, an adversary monitoring the agent's outbound web traffic could collect these fragments. By piecing together the timeline, specific terms, and the context of the queries, the adversary could reconstruct the private information: "MediConn had migrated 70% of its infrastructure to the cloud by January 2025." This is the mosaic effect in action: no single piece of public information gives away the secret, but the combination of several pieces, especially when informed by private context, makes the secret inferable.

The adversary never sees the internal documents or the agent's reasoning process, only the cumulative log of its external queries. Yet from those queries alone, they can infer sensitive enterprise information.
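
To make the aggregation step concrete, here is a toy Python sketch. It is only an illustration under stated assumptions: the facet names, the keyword lists, and the `covered_facets` helper are hypothetical, and the query strings are loosely adapted from the example above. The adversary in the actual research is an LLM, not a keyword matcher.

```python
# Toy illustration of the mosaic effect: each query hints at only one facet of
# a private fact, but the combined query log covers all of them.
# Facet names and keywords are hypothetical, chosen only for this example.

SECRET_FACETS = {
    "topic": ["cloud-migration"],
    "event": ["security disclosure"],
    "target": ["vendor"],
}


def covered_facets(queries):
    """Return the set of secret facets hinted at by a list of queries."""
    found = set()
    for query in queries:
        text = query.lower()
        for facet, keywords in SECRET_FACETS.items():
            if any(keyword in text for keyword in keywords):
                found.add(facet)
    return found


# What an observer of outbound traffic would collect (illustrative strings).
query_log = [
    "cloud-migration milestone",
    "security disclosure timeline",
    "which vendor got hit",
]

# Facets exposed by each query on its own, then by the whole log together.
per_query = [covered_facets([query]) for query in query_log]
combined = covered_facets(query_log)

print("per query:", per_query)
print("combined:", sorted(combined))
```

Each query alone touches a single facet, while the full log touches all three. That gap between per-query and aggregate exposure is what the mosaic effect describes.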

Introducing MosaicLeaks: A Benchmark for AI Privacy

Recognizing this growing risk, a team of researchers, including Alexander Gurung, Spandana Gella, Alexandre Drouin, Issam H. Laradji, Perouz Taslakian, and Rafael Pardinas (associated with ServiceNow), developed the MosaicLeaks benchmark. The official paper, "MosaicLeaks: Privacy Risks in Querying-in-the-Open for Deep Research Agents," was published around May-June 2026.

The goal of MosaicLeaks is to systematically quantify this privacy risk. It's not a commercial tool you can buy, but rather a carefully constructed testing environment designed to evaluate how well AI agents maintain confidentiality. Here's what makes the MosaicLeaks benchmark significant:

Multi-Hop Research Chains: The benchmark contains 1,001 complex, multi-hop research tasks. Each task requires the agent to combine information from both private, local enterprise documents and a controlled public web corpus.
Interleaving Private and Public Data: The tasks are structured so that the answer to one sub-question (often from private data) becomes a crucial "bridge entity" for formulating the next web query. This forces the agent to make external queries that inherently depend on local, sensitive information.
Synthetic Datasets: The enterprise documents used in MosaicLeaks are synthetic, ensuring that no real-world sensitive data is involved in the benchmark itself. The web corpus is also fixed and controlled.
Adversarial Evaluation: Leakage is measured by using an "adversary LLM" that only observes the agent's external queries. This adversary then attempts to infer private information at three distinct levels:
- Intent Leakage: Can the adversary infer the private research questions or goals the agent was trying to answer?
- Answer Leakage: Can the adversary infer answers to specific private questions about the enterprise documents?
- Full-Information Leakage: Can the adversary predict verifiable true claims about the enterprise documents?

The benchmark's controlled nature allows researchers to measure leakage hop-by-hop, providing a granular understanding of where and how information is exposed.
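
One way to picture a task chain and what the adversary gets to see is the Python sketch below. The `Hop` fields, the `adversary_view` helper, and the example chain (including the name "ExampleVendor") are assumptions made for illustration; they are not the benchmark's actual data format.

```python
from dataclasses import dataclass
from enum import Enum


class LeakageLevel(Enum):
    """The three levels at which the adversary tries to infer private data."""
    INTENT = "intent"
    ANSWER = "answer"
    FULL_INFORMATION = "full_information"


@dataclass
class Hop:
    source: str  # "private" (enterprise documents) or "public" (web corpus)
    query: str   # what the agent asked at this hop
    answer: str  # may serve as the bridge entity for the next hop


def adversary_view(hops):
    """Only queries sent to the public corpus are visible to the adversary."""
    return [hop.query for hop in hops if hop.source == "public"]


# Hypothetical two-hop chain: the private answer becomes the bridge entity
# that the agent carries into its next public query.
chain = [
    Hop("private", "Which vendor handles our cloud migration?", "ExampleVendor"),
    Hop("public", "ExampleVendor security disclosures", "a disclosure notice"),
]

visible = adversary_view(chain)
print(visible)
```

In this sketch the private question never leaves the enterprise, yet its answer shows up inside the public query. That dependency is what the interleaved task design forces the agent to handle.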

The Alarming Findings: Why "Prompting Privacy In" Isn't Enough

The findings from the MosaicLeaks research are a wake-up call. The researchers found that:

Frequent Leakage: Across various AI models tested, agents frequently leaked private information at all three levels (intent, answer, and full-information leakage).
Performance vs. Privacy Trade-off: Training agents solely for task performance made privacy leakage worse. Optimizing for results alone, with no privacy objective, works against confidentiality.
Zero-Shot Prompting Limitations: "Zero-shot privacy prompting" (e.g., instructing the agent to be careful with private data) can reduce leakage but does not eliminate it. As the paper states, "You can't prompt privacy in. You have to train it in."

These results clearly show that traditional approaches to securing AI, such as simple instructions or general security measures, are insufficient for the unique challenges posed by deep research agents. The inherent flexibility and decision-making capabilities of these agents mean they can find unexpected ways to expose data.

Privacy-Aware Deep Research (PA-DR): A Step Towards Secure Agents

To address the significant privacy risks identified, the MosaicLeaks researchers proposed a novel solution called Privacy-Aware Deep Research (PA-DR). This is a reinforcement learning (RL) framework designed to train AI agents to internalize privacy constraints while still performing their research tasks effectively.

PA-DR works by jointly optimizing for both task performance and privacy preservation. It combines rewards for successful task completion with a penalty derived from a learned privacy classifier. This classifier acts like an "adversary" during training, observing the agent's external calls and attempting to infer private information. By penalizing potential leaks at both the individual query level and the aggregate "mosaic" level, PA-DR encourages the agent to formulate external queries more carefully.
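
The general shape of such a joint objective can be sketched in a few lines of Python. This is a minimal sketch, not the paper's formulation: the function name, the linear combination, the use of the worst single query, and the two weights are all assumptions made for illustration. The leak probabilities stand in for the output of a learned privacy classifier.

```python
def pa_dr_style_reward(task_success, per_query_leak_probs, mosaic_leak_prob,
                       query_weight=0.5, mosaic_weight=0.5):
    """Combine a task reward with query-level and aggregate privacy penalties.

    per_query_leak_probs: assumed classifier scores in [0, 1], one per query.
    mosaic_leak_prob: assumed classifier score in [0, 1] for the whole query log.
    The weights and the linear form are illustrative assumptions.
    """
    task_reward = 1.0 if task_success else 0.0
    # Query-level penalty: driven by the single most revealing query.
    worst_query = max(per_query_leak_probs, default=0.0)
    # Aggregate penalty: what the full set of queries reveals together.
    penalty = query_weight * worst_query + mosaic_weight * mosaic_leak_prob
    return task_reward - penalty


# Two successful episodes: one with cautious queries, one with revealing ones.
careful = pa_dr_style_reward(True, [0.1, 0.2], 0.4)
leaky = pa_dr_style_reward(True, [0.9, 0.8], 0.95)
print(careful, leaky)
```

Under these assumptions, an agent that solves the task with revealing queries earns far less than one that solves it discreetly, so the training signal rewards careful query formulation rather than task success alone.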

The results of applying PA-DR were promising. For example, when training the Qwen3-4B-Instruct model with PA-DR, the researchers observed a notable improvement:

Task accuracy (strict chain success) increased from 48.7% to 58.7%.
Answer leakage and full-information leakage significantly decreased from 34.0% to 9.9%.

This demonstrates that it is possible to build AI agents that are both effective at their tasks and significantly more private, though it requires a specialized training approach rather than relying on simple prompts or generic security.
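
For a sense of scale, the reported figures can be restated as absolute and relative changes. The short Python calculation below uses only the four numbers quoted above; the variable names are ours.

```python
# Figures quoted above for Qwen3-4B-Instruct, before and after PA-DR training.
baseline_accuracy, pa_dr_accuracy = 48.7, 58.7
baseline_leakage, pa_dr_leakage = 34.0, 9.9

# Absolute changes, in percentage points.
accuracy_gain_points = round(pa_dr_accuracy - baseline_accuracy, 1)
leakage_drop_points = round(baseline_leakage - pa_dr_leakage, 1)

# Relative reduction in leakage, as a whole-number percentage.
relative_leakage_reduction = round(100 * (baseline_leakage - pa_dr_leakage) / baseline_leakage)

print(accuracy_gain_points)        # 10.0
print(leakage_drop_points)         # 24.1
print(relative_leakage_reduction)  # 71
```

In other words, accuracy rose by 10 percentage points while the reported leakage fell by roughly 71% relative to the baseline.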

Broader Implications: What MosaicLeaks Means for AI Practitioners and Businesses

The MosaicLeaks research has profound implications for anyone involved with AI, from developers building new agentic systems to businesses and freelancers deploying them.

For AI Developers and Researchers:

Prioritize Privacy-Aware Training: The key takeaway is that privacy needs to be "trained in," not just "prompted in." Future AI agent development must integrate privacy as a core objective during the model training and fine-tuning phases, using methods similar to PA-DR.
Develop Better Benchmarks: MosaicLeaks itself is a crucial step in providing a standardized way to measure privacy risks. More such benchmarks are needed to evaluate agents in diverse scenarios and with different types of sensitive data.
Rethink Tool Interaction: Developers need to design how AI agents interact with external tools and APIs with privacy in mind. This includes careful scoping of permissions and filtering of outputs.
Focus on Explainability and Auditability: Understanding why an agent makes a particular external query is vital for identifying and preventing potential leaks. Robust logging and audit trails are essential.
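
As one possible starting point for auditability, a search tool can be wrapped so that every outbound query is recorded together with the agent's stated reason for making it. The Python sketch below is hypothetical: `AuditedSearchTool`, its `reason` argument, and the `fake_search` stub are illustrative names, not part of any specific agent framework.

```python
import json
import time


class AuditedSearchTool:
    """Wraps a search function and logs every outbound query before sending it."""

    def __init__(self, search_fn):
        self._search_fn = search_fn
        self.audit_log = []

    def search(self, query, reason):
        # Record the query and the agent's stated reason before it leaves.
        self.audit_log.append(
            {"timestamp": time.time(), "query": query, "reason": reason}
        )
        return self._search_fn(query)

    def export_log(self):
        """Serialize the log as JSON Lines for later review."""
        return "\n".join(json.dumps(entry) for entry in self.audit_log)


def fake_search(query):
    # Stand-in for a real web search call, used only in this example.
    return [f"stub result for: {query}"]


tool = AuditedSearchTool(fake_search)
results = tool.search("cloud migration best practices", reason="public background")
print(tool.export_log())
```

Because the log holds the complete sequence of queries, a reviewer can look at the same aggregate view an outside observer would have, not just individual requests.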

For Businesses and Freelancers Using AI Agents:

The risks highlighted by MosaicLeaks are not theoretical; they represent real-world vulnerabilities for organizations handling confidential data. Here's what you need to consider:

Adopt a "Privacy by Design" Approach: Don't treat security as an afterthought. Build privacy into your AI systems from the very beginning. This means collecting only necessary data, anonymizing or pseudonymizing sensitive information, and restricting access.
Implement Strict Access Controls and Least Privilege: AI agents should only have the minimum permissions required for their specific tasks. Granting broad access significantly increases the risk of data exposure. Regularly review and revoke unnecessary privileges.
Be Wary of "Shadow AI": Unsanctioned use of public AI tools by employees can lead to significant data leakage, as these tools might store or reuse sensitive company data. Implement clear policies and training.
Monitor Agent Activity and Data Flows: It's critical to have visibility into what data your AI agents are accessing, what external queries they are making, and what information they are returning. Implement runtime monitoring and robust audit trails.
Input and Output Filtering: Implement mechanisms to filter both the data fed into the AI agent and the data it outputs, especially when interacting with external systems or users. PII redaction is a key defense.
Understand Vendor Security: If you're using third-party AI agent solutions, thoroughly vet your vendors. Understand their data handling practices, security certifications, and whether they will sign business associate agreements (BAAs) if dealing with regulated data like PHI.
Compartmentalize Data: Avoid giving one AI agent access to all sensitive data. Instead, use different AI tools or agents for different purposes, compartmentalizing data exposure.
Stay Informed: The field of AI security is constantly evolving. Keep up-to-date with new research, vulnerabilities, and best practices.
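
As a minimal sketch of output filtering, the Python snippet below redacts known sensitive terms from a query before it leaves the organization. The `redact_query` helper and the term list are hypothetical. Note the limitation: term matching only blocks direct mentions and does nothing about the mosaic effect, where individually clean queries leak in combination. Treat it as a baseline layer, not a fix.

```python
import re


def redact_query(query, sensitive_terms):
    """Replace each sensitive term in an outbound query with a placeholder.

    Matching is case-insensitive and literal (terms are escaped, not regexes).
    """
    redacted = query
    for term in sensitive_terms:
        if not term:
            continue  # skip empty strings, which would match everywhere
        redacted = re.sub(re.escape(term), "[REDACTED]", redacted,
                          flags=re.IGNORECASE)
    return redacted


# Hypothetical term list, e.g. derived from entities found in private documents.
outbound = redact_query(
    "MediConn cloud migration 70% complete",
    ["MediConn", "70%"],
)
print(outbound)
```

A filter like this is best paired with the monitoring and compartmentalization measures listed above, since it cannot judge what a sequence of redacted queries still reveals.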

The challenge of securing AI agents is not just about preventing direct attacks; it's also about understanding the subtle ways these intelligent systems can inadvertently compromise confidentiality through their autonomous actions. The "mosaic effect" is a prime example of such a subtle yet powerful risk.

The Road Ahead: Building Trustworthy AI Agents

MosaicLeaks serves as a vital reminder that while AI agents offer immense potential, their deployment requires a deep understanding of their unique security and privacy implications. The research highlights that simply instructing an AI to be private is not enough; privacy must be engineered and trained into the very fabric of these systems. As AI agents become more integrated into critical workflows, the insights from projects like MosaicLeaks will be indispensable in building AI systems that are not only powerful and efficient but also inherently trustworthy and secure.

Frequently Asked Questions

What is the "mosaic effect" in the context of AI agents?

The "mosaic effect" describes how an AI agent can inadvertently leak sensitive private information by making a series of seemingly harmless external queries. While no single query reveals the secret, an adversary monitoring these queries can combine the fragments to infer confidential data that the agent was processing internally.

Is MosaicLeaks a commercial AI tool or a software product?

No, MosaicLeaks is not a commercial tool or a software product. It is a research benchmark and a concept presented in a scientific paper titled "MosaicLeaks: Privacy Risks in Querying-in-the-Open for Deep Research Agents." Its purpose is to study and quantify privacy leakage risks in deep research AI agents.

What is PA-DR and how does it help prevent data leakage?

PA-DR stands for Privacy-Aware Deep Research. It's a reinforcement learning (RL) framework proposed by the MosaicLeaks researchers to train AI agents to be both effective at their tasks and more private. PA-DR achieves this by combining rewards for task success with penalties for potential privacy leaks, guided by a privacy classifier during training.

What are the key takeaways for businesses and freelancers using AI agents?

Businesses and freelancers should adopt a "Privacy by Design" approach, implement strict access controls and the principle of least privilege for AI agents, and actively monitor agent activity. It's also crucial to be aware of "shadow AI" (unsanctioned tool use) and understand that simply prompting an AI for privacy is insufficient; privacy must be built into the system's training and architecture.