Breaking the Protocol: Security Analysis of the Model Context Protocol Specification and Prompt Injection Vulnerabilities in Tool-Integrated LLM Agents

Authors: N. Maloyan, D. Namiot
Published: Modern Information Technologies and IT-education 21 (3), 2026
MCP Security Prompt Injection Agentic AI


Abstract

This paper presents a systematic security analysis of the Model Context Protocol (MCP) specification, an emerging standard for connecting large language models to external tools and data sources. We identify critical prompt injection vulnerabilities that arise when LLM agents interact with tool ecosystems through MCP, demonstrating how malicious tool descriptions, poisoned context, and crafted responses can compromise agent behavior across the protocol boundary.


Background

The Model Context Protocol has emerged as a widely adopted open standard for integrating LLMs with external tools, databases, and services. Originally developed to provide a uniform interface between AI assistants and the growing ecosystem of developer tools, MCP has seen rapid adoption across major AI platforms and development environments. However, as with many protocols designed primarily for functionality, security considerations were not the central focus of its initial specification.

Tool-integrated LLM agents represent a paradigm shift from standalone language models to autonomous systems capable of reading files, executing code, querying databases, and interacting with external APIs. This expanded capability surface dramatically increases both the utility and the risk profile of these systems. When an LLM agent can take real-world actions through tool calls, a successful prompt injection attack is no longer limited to generating misleading text -- it can result in data exfiltration, unauthorized code execution, or system compromise.

Prior work on prompt injection has focused primarily on direct interactions between users and models, or on retrieval-augmented generation (RAG) pipelines. The unique security challenges posed by standardized tool protocols like MCP -- where trust boundaries between the model, the protocol layer, and tool servers become blurred -- have received comparatively little attention. This paper addresses that gap with a formal analysis of MCP's attack surface.

Methodology

We conducted a specification-level security audit of the Model Context Protocol, examining each protocol message type, capability negotiation mechanism, and tool invocation pattern for potential injection vectors. Our analysis covered the full lifecycle of an MCP session: initialization, tool discovery, context assembly, tool invocation, and response processing. For each stage, we identified points where untrusted data could influence model behavior.

We then constructed a taxonomy of attack vectors specific to tool-integrated LLM agents operating over MCP. These included: malicious tool descriptions that embed hidden instructions in tool metadata, poisoned tool responses that inject directives into the model's context, cross-tool escalation attacks where one tool's output manipulates subsequent tool calls, and server impersonation scenarios where a compromised MCP server serves adversarial content.

Each attack vector was validated through proof-of-concept implementations against multiple MCP-compatible agent frameworks. We measured attack success rates, the conditions required for exploitation, and the effectiveness of existing mitigations. Our evaluation spanned both open-source and commercial agent platforms to ensure broad applicability of findings.

Results

Our analysis revealed several classes of vulnerabilities inherent to the MCP specification's current design. Tool description injection proved particularly effective: because MCP tool descriptions are passed directly into the model's context during tool discovery, an attacker controlling a tool server can embed arbitrary instructions that the model treats as authoritative. In our experiments, this attack vector achieved high success rates across all tested agent platforms.

Cross-tool poisoning attacks -- where the output of one tool call contains instructions that alter the model's behavior on subsequent tool calls -- were similarly effective. These attacks exploit the shared context window that tool-integrated agents maintain across multiple tool interactions. The sequential nature of agent reasoning means that poisoned data introduced early in a chain of tool calls can influence all subsequent decisions.

We also identified protocol-level weaknesses in MCP's capability negotiation, where a malicious server can misrepresent its capabilities to gain access to sensitive operations. The absence of cryptographic authentication between MCP clients and servers in the base specification further compounds these risks, as there is no built-in mechanism to verify that a tool server is who it claims to be.

Discussion

The vulnerabilities identified in this work point to a fundamental tension in the design of tool integration protocols for LLM agents: the same flexibility that makes MCP useful -- its ability to dynamically discover and invoke arbitrary tools -- also creates a broad attack surface that is difficult to secure. Unlike traditional API protocols where inputs and outputs have well-defined types and validation rules, MCP passes natural language descriptions and responses that the model must interpret, creating an inherent injection surface.

We propose several mitigation strategies, including tool description sandboxing, output sanitization layers between tool responses and model context, capability-based access control with explicit user approval for sensitive operations, and cryptographic attestation of tool server identity. However, we note that fully eliminating prompt injection in tool-integrated agents likely requires architectural changes beyond what protocol-level mitigations alone can achieve.

Related Topics

Prompt Injection in Agentic Coding Assistants · Prompt Injection in Defended Systems · LLM-as-a-Judge Vulnerabilities

Access the Paper: View on Google Scholar

Cite as

Maloyan, N., Namiot, D. Breaking the Protocol: Security Analysis of the Model Context Protocol Specification and Prompt Injection Vulnerabilities in Tool-Integrated LLM Agents. Modern Information Technologies and IT-education 21 (3), 2026.