Technical Context
Homoglyph (or homograph) attacks are a class of vulnerabilities where an attacker replaces characters in text with visually similar Unicode characters from a different alphabet or range. A human, and (worse) an LLM-based autonomous agent, often perceive the string as "the same," even though byte-wise and semantically they are different tokens, domains, or commands. A classic example: microsoft.com vs miсrosoft.com, where the "c" might be the Cyrillic "с" (U+0441).
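The difference is easy to demonstrate: the two strings render identically but compare unequal at the code-point level. A minimal Python check:

```python
# The two strings look identical but differ at the code-point level.
latin = "microsoft.com"
mixed = "mi\u0441rosoft.com"  # U+0441 CYRILLIC SMALL LETTER ES in place of "c"

print(latin == mixed)                                 # False
print([hex(ord(c)) for c in mixed if ord(c) > 127])   # ['0x441']
```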
Why is this particularly dangerous for agentic systems (autonomous agents, tool-use, RPA+LLM)? Because they have "hands"—access to the browser, terminal, Git, CRM, payments, and internal APIs. If an agent incorrectly recognizes a URL or command, it can navigate to a link, download a binary, execute a script, or send secrets to a fake endpoint on its own.
The Technical Cause: Divergence of Visual and Machine Representation
- Unicode confusables: Different code points look identical (Cyrillic/Latin, mathematical symbols, fullwidth forms).
- Lack of strict normalization before tokenization: The LLM tokenizes the visually deceptive string one way, while the environment (bash/browser/URL parser) interprets the raw bytes another way, differently than the developer expected.
- Merging with RAG/Agent Memory: A "compromised thread" or a note in the knowledge base may contain substituted links/commands that the agent later reproduces as "verified."
- Failure of naive filters: Simple regexes to "ban Cyrillic" do not catch fullwidth or mathematical variants; allowing Unicode without rules opens the door to script mixing.
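The last point can be shown directly: NFKC normalization folds fullwidth and mathematical variants back to ASCII, but leaves Cyrillic lookalikes untouched, which is why normalization and script checks are both needed. A sketch:

```python
import unicodedata

fullwidth = "\uff47\uff4f\uff4f\uff47\uff4c\uff45.com"  # fullwidth Latin letters
cyrillic = "g\u043e\u043egle.com"                       # Cyrillic o (U+043E) twice

# NFKC repairs compatibility variants (fullwidth, mathematical)...
print(unicodedata.normalize("NFKC", fullwidth) == "google.com")  # True
# ...but Cyrillic lookalikes survive normalization untouched.
print(unicodedata.normalize("NFKC", cyrillic) == "google.com")   # False
```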
Where the Vulnerability Manifests in Agent Architecture
- URL handling: The LLM generates or selects a link; the browser/HTTP client goes to a different domain (IDN/punycode).
- CLI / bash: The agent copies a "safe" command from text, but some characters are non-ASCII; result—a different command/parameters/path.
- Tool calling: Parameters for a tool (e.g., webhook URL, git remote, S3 bucket) look correct visually but actually point to an attacking resource.
- RAG ingestion: Documents with homoglyphs poison the index; upon retrieval, the agent receives an "authoritative" action that leads to a substitution.
Practical Signs in Data and Logs
- In agent action logs, the domain/string looks correct, but when copied into a hex/Unicode view, different code points are revealed.
- DLP/antivirus triggers "out of the blue" because the agent downloaded a payload from a doppelganger domain.
- DNS anomalies: Requests to punycode domains (xn--...), especially if strictly corporate zones were expected.
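For investigations, a log-rendering helper that makes non-ASCII code points explicit is a cheap win. A minimal sketch (the helper name is illustrative, not from any specific library):

```python
import unicodedata

def canonical_log_form(s: str) -> str:
    """Render a string for logs: ASCII passes through unchanged,
    everything else becomes an explicit U+XXXX escape with its
    Unicode name, so substitutions are visible at a glance."""
    parts = []
    for ch in s:
        if ch.isascii():
            parts.append(ch)
        else:
            parts.append(f"[U+{ord(ch):04X} {unicodedata.name(ch, '?')}]")
    return "".join(parts)

print(canonical_log_form("mi\u0441rosoft.com"))
# mi[U+0441 CYRILLIC SMALL LETTER ES]rosoft.com
```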
A separate nuance concerns binary artifacts. If your agent executes artifacts (scripts, binaries, WASM, containers), byte-level signatures and allowlisted hashes are more resilient to visual spoofing than text checks. This does not eliminate the need to sanitize input strings, however, since most attacks start precisely with text (a URL, command, or path).
Business & Automation Impact
For business, this isn't just "another rare Unicode vulnerability," but a direct risk to AI automation and processes where an agent performs actions without constant human supervision. Homoglyphs turn "verified" text into a delivery channel for phishing and command execution, in a way that neither an employee nor a model may notice.
Which Scenarios Are Most Dangerous
- Finance and Procurement: The agent processes invoices/emails, extracts payment links/vendor portals—and goes to a doppelganger domain.
- Legal Processes and e-sign: Fake links to documents and signatures where the domain visually matches.
- IT/DevOps Agents: Tools that "fix prod," execute terminal commands, change configs, connect repositories—an ideal attack surface.
- Customer Support: The agent offers the user an "official link," but pulls it from a poisoned knowledge base.
What Changes in Architecture If You Are Serious About Agentic AI
While LLM security was previously often reduced to prompt injection and tool access policies, homoglyph attacks require another layer: strict string processing at input and output before passing to browser/CLI/API.
- Gate before LLM: Unicode normalization, script mixing detection, invisible character filtering, threshold rules.
- Gate before Tools: Separate validation for URLs/paths/commands, not a "universal sanitizer."
- Domain Policy: Allowlist of corporate and partner domains, blocking IDN navigation without manual confirmation.
- Observability: Logging in canonical form (e.g., punycode + list of code points for suspicious strings) for investigations.
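The first gate above can be sketched in a few lines. This is a simplified illustration that buckets scripts by Unicode character name, not a substitute for the full UTS #39 confusables data:

```python
import unicodedata

INVISIBLES = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}  # zero-width chars

def gate_check(text: str) -> list:
    """Minimal pre-LLM gate: flag invisible characters and Latin/Cyrillic
    mixing within a single whitespace-delimited token."""
    findings = []
    for tok in text.split():
        scripts = set()
        for ch in tok:
            if ch in INVISIBLES:
                findings.append(f"invisible char U+{ord(ch):04X} in {tok!r}")
            if ch.isalpha():
                # Crude script bucket: first word of the Unicode name
                name = unicodedata.name(ch, "")
                if name:
                    scripts.add(name.split()[0])
        if {"LATIN", "CYRILLIC"} <= scripts:
            findings.append(f"Latin/Cyrillic mix in {tok!r}")
    return findings

print(gate_check("visit mi\u0441rosoft.com now"))
```

A production gate would also cover other confusable script pairs (Greek, Armenian) and check identifiers rather than whitespace tokens, but the threshold-rule shape stays the same.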
Who benefits from proper protection: Companies that have already launched AI implementation into business processes and want to scale without increasing risk. Who is at risk: Teams that gave the agent "access to everything" expecting the LLM to "figure it out like a human," limiting security only to the system prompt.
In practice, companies stumble here precisely because the issue falls between layers: developers think about the LLM, security teams about the network and endpoints, and homoglyphs exploit the gap in text processing. This will persist until a dedicated discipline emerges, AI Solution Architecture, in which strings (URLs, CLI commands, identifiers) are treated as critical data and pass through a strict pipeline.
Expert Opinion: Vadym Nahornyi
The main mistake in agentic projects: assuming text is "safe by default" if it looks safe. Homoglyph attacks are not about model intelligence; they are about engineering hygiene: normalization, strict URL/CLI rules, and separate trust contours for human UI and machine execution.
At Nahornyi AI Lab, we see a recurring pattern: companies quickly assemble an agent PoC, give it a browser and terminal, connect RAG—and only then start thinking about string validation. As a result, a single "compromised thread" in a corporate chat or one infected page in the knowledge base is enough for the agent to start reproducing malicious links as "recommended."
What Really Works (and What Doesn't)
- Works: Unicode normalization (NFC/NFKC as appropriate), script mixing detection (Latin+Cyrillic), blocking invisible characters, IDNA/punycode policy, and a strict URL allowlist for critical tools.
- Works: Separate validators for data types: URL, email, hostname, path, CLI arguments. Universal "text cleaning" almost always gives a false sense of security.
- Works: Dual-channel verification for high-risk actions, e.g., the agent shows the user the canonical (punycode) form of the domain and asks for confirmation.
- Does Not Work: "Ban Cyrillic everywhere." You will break legitimate processes (names, addresses, documents) but still miss fullwidth/mathematical analogs and some bypasses.
- Does Not Work: Hoping for a "smarter model." Even if individual models are more robust, attacks often hit the preprocessing and tools around the LLM.
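The dual-channel verification idea above can be sketched with Python's built-in IDNA codec (which implements IDNA 2003; a production system would likely prefer the third-party `idna` package for IDNA 2008/UTS #46):

```python
def canonical_domain(host: str) -> str:
    """Show the wire-level (punycode) form of a hostname so a human
    reviewer can spot doppelganger domains before confirming."""
    return host.encode("idna").decode("ascii")

for host in ("microsoft.com", "mi\u0441rosoft.com"):  # second has Cyrillic U+0441
    print(host, "->", canonical_domain(host))
```

An all-ASCII domain passes through unchanged, while the lookalike is exposed as an `xn--` punycode label that no human would mistake for the real brand.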
Practical Minimum for Production Autonomous Agents
- Input Normalization before LLM and output normalization before tool execution.
- Confusables Mapping (Unicode confusables tables) + rules on script mixing in identifiers.
- URL Policies: IDN ban by default or permission only for approved domains; logging in punycode.
- CLI Safe-Mode: Command templates, ban on arbitrary shell, argument escaping, execution via API wrappers instead of raw bash.
- RAG Hygiene: Sanitizing during ingestion, source marking, quarantine for external documents.
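As one concrete instance of the URL policy, here is a hedged sketch of an allowlist gate that compares hostnames in canonical punycode form; the allowlist entries are hypothetical placeholders:

```python
from urllib.parse import urlsplit

# Hypothetical corporate allowlist; in production this would be config-driven.
ALLOWED_HOSTS = {"github.com", "api.internal.example.com"}

def url_allowed(url: str) -> bool:
    """Allowlist gate before any tool navigation: hostnames are compared
    in their canonical ASCII (punycode) form, so visual lookalikes that
    encode to a different xn-- label fail the check."""
    host = urlsplit(url).hostname or ""
    try:
        ascii_host = host.encode("idna").decode("ascii")
    except UnicodeError:
        return False  # un-encodable hostname: reject outright
    return ascii_host in ALLOWED_HOSTS

print(url_allowed("https://github.com/org/repo"))        # True
print(url_allowed("https://g\u0456thub.com/org/repo"))   # False (Cyrillic U+0456)
```

The same shape works for the CLI safe-mode item: validate each argument type (host, path, command template) separately instead of running one generic sanitizer over the whole command line.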
My forecast: this won't be a "hype vulnerability for a week," but a permanent class of problems for years, because agent systems inevitably work with text from untrusted sources. The winners will be those who build protection into the pipeline and turn it into a component standard, just as WAF and SAST became the norm.
Theory is useful, but results require practice. If you are building autonomous agents, RAG, or automating with AI in finance, support, or DevOps—let's discuss your project at Nahornyi AI Lab. We will design trust contours, sanitization, and tool execution rules so that AI accelerates business rather than creating a new compromise channel. I personally vouch for the quality of implementation — Vadym Nahornyi.