Toutes les vulnérabilités
HIGHAI/LLMexploited in the wild

AI-AGENT-INDIRECT-PROMPT-INJECTION-2025

AI coding · AI coding agents (Cursor, GitHub Copilot, Claude Code, Windsurf)

Résumé

Coding agents that autonomously read project and external content are vulnerable to indirect prompt injection, where hidden instructions placed in untrusted material the agent ingests hijack its behavior. The injection surface is broad: a poisoned README, source-code comment, GitHub issue or PR comment, a dependency's files, a fetched web page, or an MCP tool description, with instructions often concealed using invisible Unicode characters so a human reviewer never sees them, as Pillar Security demonstrated with the 'Rules File Backdoor' technique. Because the agent cannot distinguish trusted developer instructions from attacker text in the data it processes, the injected commands can direct it to insert a backdoor, weaken security controls, exfiltrate secrets, or run shell/MCP commands. Johann Rehberger (Embrace The Red) proved the data-exfiltration variant in Cursor with CVE-2025-54132 (disclosed June 30, 2025, fixed in v1.3): a comment-embedded payload made Cursor render a Mermaid diagram containing an attacker image URL, auto-firing an outbound request that leaked API keys and agent memory without confirmation. When the developer merges or runs the agent's resulting output unmonitored, the attacker-controlled changes land directly in the codebase or on the developer's machine.

Comment l’éviter dans votre code

  • Scan repository content, issues, PRs and dependency files for hidden or invisible-Unicode instructions before the agent ingests them.
  • Sandbox agent execution and require explicit human approval before it runs any shell or MCP command.
  • Block automatic outbound requests (image/diagram rendering, link fetches) that can serve as exfiltration channels.
  • Treat agent diffs as untrusted: mandatory human review plus security scanning, never auto-merge.
  • Pin and vet MCP servers, tool descriptions and rules/config files the agent reads.

Références

Vulnérabilités liées

Tout AI/LLM →