How does ClawShield handle false positives in prompt detection?

The three-tier detection system allows granular tuning through confidence thresholds and pattern disabling. You can run in audit-only mode to baseline traffic before enforcement. The canary token system provides high-confidence detection with minimal false positives. SQLite logging helps identify which specific patterns trigger falsely so you can refine policies without disrupting operations.

Can ClawShield protect non-OpenClaw AI agents?

Yes. While optimized for OpenClaw, ClawShield functions as a generic HTTP/WebSocket reverse proxy capable of inspecting any JSON-based API traffic. You can deploy it in front of LangChain applications or custom Python agents. The policy engine works with any endpoint accepting structured JSON, though some OpenClaw-specific features may require manual configuration for other frameworks.

What performance overhead does ClawShield add?

Benchmarks show sub-5ms latency for typical requests, with the Go-based proxy adding minimal overhead. The eBPF kernel monitor adds microseconds to monitored system calls. Memory footprint remains under 100MB for moderate loads. Streaming parsers avoid buffering large messages in memory, keeping resource usage predictable even with large file uploads or long conversation contexts.

Is the eBPF component required for basic protection?

No. The core security features including prompt injection detection and secrets scanning work entirely within the Go-based proxy without kernel modules. The eBPF component provides additional runtime security monitoring for Linux systems but is optional. You can deploy ClawShield effectively on macOS, Windows, or Linux without eBPF support.

ClawShield: Open-Source Security Proxy for OpenClaw AI Agents

Does ClawShield replace existing OpenClaw security tools?

ClawShield complements rather than replaces tools like AgentWard or Rampart. While AgentWard focuses on file system enforcement, ClawShield specializes in network traffic inspection and message content analysis. You should deploy it as an additional layer in your defense-in-depth strategy, using it to catch network-based attacks while other tools handle host-level protections.

ClawShield launched this week as an open-source security proxy specifically architected for AI agent frameworks, with first-class support for OpenClaw deployments. Built in Go and leveraging optional eBPF kernel probes, this approximately 6,000-line HTTP/WebSocket reverse proxy addresses a critical gap in the current AI ecosystem: the absence of standardized traffic inspection for multi-agent systems. While frameworks like OpenClaw provide sophisticated routing and tool-use capabilities, they historically lacked a dedicated security layer to inspect the actual content flowing between agents, tools, and users. ClawShield fills this void by intercepting all inbound and outbound messages to enforce deny-by-default security policies through four specialized scanners. The project ships with ten cross-compiled binaries supporting Linux, macOS, and Windows across both amd64 and arm64 architectures, enabling deployment across diverse infrastructure. Currently running in production at clawshield.sleuthco.ai, the tool represents a practical, shipping solution rather than theoretical research.

What is ClawShield and Why It Matters for AI Agent Security?

ClawShield functions as a mandatory inspection layer between your AI agents and the outside world. Unlike traditional web application firewalls that struggle with LLM-specific threats, it understands the unique attack surface of AI agents: prompt injection, tool misuse, and dynamic code generation. It is a reverse proxy written in Go that handles both HTTP and WebSocket traffic while keeping added latency under 5ms for typical requests. It implements LLM-aware scanning that understands context windows, instruction boundaries, and encoding schemes specific to AI interactions. The author built this after contributing security patches to OpenClaw and realizing the framework needed a standardized way to inspect and control agent traffic without modifying core agent logic. Every decision gets logged to SQLite for audit trails, providing the observability required for production deployments. For builders shipping OpenClaw agents to production, this closes the security gap between “it works locally” and “it is safe to expose to users.”

How ClawShield Fits Into the OpenClaw Ecosystem

OpenClaw provides multi-agent orchestration and routing capabilities, but historically left the security boundary between agents and external systems as an exercise for the implementer. ClawShield plugs into this architecture as a sidecar or gateway component, sitting in front of OpenClaw instances to sanitize all traffic before it reaches agent logic. This complements existing security tools like AgentWard (runtime enforcement) and Rampart (security layer) by focusing specifically on network traffic inspection rather than file system or process-level controls. The integration requires no modifications to OpenClaw core; you point your agent’s HTTP client at the ClawShield proxy, configure your policy YAML, and the proxy handles the rest. For builders running managed OpenClaw hosting or local deployments like MCClaw, ClawShield offers a drop-in security upgrade that works with existing infrastructure. The proxy understands OpenClaw’s message formats and can apply per-tool filters, allowing you to restrict specific agents from accessing sensitive endpoints while permitting others.

The Four-Layer Scanner Architecture Explained

ClawShield implements defense through specialized scanners that handle distinct threat categories. The prompt injection scanner operates across three analytical tiers to detect attempts to override system instructions. The secrets and PII scanner prevents accidental or malicious data exfiltration, targeting obfuscation techniques like unicode escape sequences. The vulnerability scanner monitors for traditional injection attacks including SQL injection, SSRF, command injection, and path traversal that become possible when agents dynamically construct queries. The malware detection scanner identifies malicious binaries, scripts, and archive bombs that agents might download or execute. Each scanner operates independently, allowing you to enable or disable specific protections based on your threat model. The architecture uses streaming parsers to minimize memory overhead, and every detection event gets logged to SQLite with full context including timestamp, agent identifier, and the specific rule triggered. This modular design allows the system to evolve as new threats emerge.

Prompt Injection Detection: Beyond Simple Regex

Most security tools treat prompt injection as a regex problem, but ClawShield recognizes that modern attacks use encoding, fragmentation, and context manipulation. The first tier uses regex heuristics to catch immediate threats: strings like “ignore previous instructions” or attempts to break out of delimiters using markdown code blocks. The second tier performs structural analysis on decoded content, looking for base64 blobs that contain imperative verbs or instruction-like patterns when decoded. It also scores text for command density, flagging sequences of verbs followed by objects that resemble system instructions rather than natural language. The third tier implements canary token leak detection, embedding specific strings in system prompts and monitoring if those tokens appear in outbound traffic, indicating successful data exfiltration. This multi-layer approach catches direct attacks, encoded bypasses, and subtle extraction attempts that single-pass scanners miss. For OpenClaw deployments where agents process untrusted user input and then call tools, this prevents the “confused deputy” problem.
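The three tiers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not ClawShield's actual rule set: the phrase patterns, the 24-character base64 threshold, and the function names scan_inbound and leaks_canary are all hypothetical, and the command-density scoring from the second tier is omitted.

```python
import base64
import re

# Tier 1: illustrative phrase heuristics (hypothetical, not ClawShield's rules).
TIER1_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"disregard\s+(the\s+)?system\s+prompt", re.IGNORECASE),
]

# Tier 2 candidates: runs of 24+ base64-alphabet characters with optional padding.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def decode_base64_runs(text: str) -> list[str]:
    """Return the UTF-8 text of every run that decodes as strict base64."""
    decoded = []
    for match in BASE64_RUN.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            decoded.append(raw.decode("utf-8"))
        except (ValueError, UnicodeDecodeError):
            continue  # not valid base64, or not text: ignore this run
    return decoded


def scan_inbound(text: str) -> list[str]:
    """Apply tier 1 to the raw text and tier 2 to any decoded base64 blobs."""
    findings = []
    if any(p.search(text) for p in TIER1_PATTERNS):
        findings.append("tier1:phrase")
    for blob_text in decode_base64_runs(text):
        if any(p.search(blob_text) for p in TIER1_PATTERNS):
            findings.append("tier2:encoded-instruction")
            break
    return findings


def leaks_canary(outbound_text: str, canaries: set[str]) -> bool:
    """Tier 3: a canary planted in the system prompt showed up in outbound traffic."""
    return any(token in outbound_text for token in canaries)
```

Reusing the tier 1 patterns on decoded content keeps the sketch short; a real second tier would likely score the decoded text with its own heuristics.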

Secrets and PII Scanning: Defeating Unicode Escape Bypasses

AI agents frequently handle sensitive data including API keys, database credentials, and personal information. When agents generate tool outputs or log their reasoning, they risk accidentally leaking these secrets through message content. ClawShield’s secrets scanner applies regex argument filters to decoded JSON values specifically to catch obfuscation attempts. Attackers and accidental leakers alike use unicode escape sequences like \u0070assword to bypass simple string matching. The scanner decodes these representations before applying pattern matching, ensuring that escaped versions of sensitive keywords still trigger alerts. The scanner covers standard patterns including AWS keys, GitHub tokens, database connection strings, email addresses, and phone numbers. You can extend it with custom regex patterns for organization-specific secrets. Scanning happens on both inbound traffic, preventing poisoned training data or prompt injection containing secrets, and outbound traffic, preventing exfiltration. Every detection gets logged with context about which agent and tool triggered the leak.
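To see why decoding has to come first, consider the sketch below. It parses the JSON body so that escapes such as \u0070 become ordinary characters, then matches patterns against every decoded key and string value. The pattern table and the name scan_json_body are assumptions made for illustration; ClawShield's real patterns are not shown in this article.

```python
import json
import re

# Illustrative patterns only; a production rule set would be far larger.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "password_assignment": re.compile(r"password\s*[=:]\s*\S+", re.IGNORECASE),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def iter_strings(node):
    """Yield every key and string value in a decoded JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def scan_json_body(raw_body: str) -> list[str]:
    """Return the sorted names of all patterns that match a decoded string."""
    document = json.loads(raw_body)  # decoding resolves \uXXXX escapes
    hits = set()
    for value in iter_strings(document):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(value):
                hits.add(name)
    return sorted(hits)
```

A scanner that ran the same patterns over the raw bytes would miss the escaped form, because the literal text "password" never appears on the wire.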

Vulnerability Scanning: Comprehensive SQLi to SSRF Coverage

When AI agents use tools, they often construct SQL queries, HTTP requests, or file paths dynamically based on LLM output. This creates injection vulnerabilities identical to traditional web applications. ClawShield scans for SQL injection patterns including UNION-based attacks, tautologies like “OR 1=1”, and blind injection techniques using SLEEP or BENCHMARK functions. For SSRF protection, it blocks requests to private IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), cloud metadata endpoints at 169.254.169.254, and alternative IP encodings like decimal or hexadecimal. The scanner also detects path traversal using double URL-encoding and null byte injection, plus command injection through shell metacharacters and backtick execution. Unlike static analysis tools that check code, this operates at runtime on the actual strings the agent generates and sends to tools. If your OpenClaw agent connects to a database or makes HTTP requests to user-provided URLs, this layer prevents the agent from becoming a pivot point for attacking internal infrastructure.
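The SSRF check is the easiest of these to get subtly wrong, because the same address has several spellings. The following sketch normalizes dotted, decimal, hexadecimal, and octal IPv4 literals before comparing them with the blocked ranges listed above. The function names are hypothetical, and the sketch deliberately leaves out IPv6 and DNS resolution, both of which a real deployment would also need to cover.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Ranges named in the text: RFC 1918 networks plus the cloud metadata address.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "169.254.169.254/32")
]

NUMERIC_PART = re.compile(r"0[xX][0-9a-fA-F]+|[0-9]+")


def parse_ipv4_literal(host: str):
    """Normalize an IPv4 literal in any classic spelling; None for hostnames."""
    parts = host.split(".")
    if not 1 <= len(parts) <= 4:
        return None
    values = []
    for part in parts:
        if not NUMERIC_PART.fullmatch(part):
            return None
        if part[:2].lower() == "0x":
            base = 16
        elif len(part) > 1 and part[0] == "0":
            base = 8  # a leading zero means octal in the classic notation
        else:
            base = 10
        try:
            values.append(int(part, base))
        except ValueError:
            return None
    *head, tail = values
    # Leading parts are single bytes; the last part fills the remaining bytes.
    if any(v > 255 for v in head) or tail >= 256 ** (4 - len(head)):
        return None
    packed = tail
    for index, value in enumerate(head):
        packed += value << (8 * (3 - index))
    return ipaddress.IPv4Address(packed)


def is_blocked_url(url: str) -> bool:
    """True when the URL targets a blocked IPv4 address in any encoding."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no host to inspect: deny by default
    address = parse_ipv4_literal(host)
    if address is None:
        return False  # a hostname; resolving and re-checking it is not shown here
    return any(address in network for network in BLOCKED_NETWORKS)
```

For example, http://2852039166/ and http://0xA9FEA9FE/ both normalize to 169.254.169.254 and are rejected, while a plain string comparison against the dotted form would let them through.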

Malware Detection: YARA-like Rules and Entropy Analysis

AI agents that process files or execute code face malware risks including poisoned model files, malicious scripts disguised as data, or reverse shells embedded in “benign” attachments. ClawShield implements file type detection through magic bytes rather than extensions, identifying Windows PE, Linux ELF, and macOS Mach-O executables regardless of filename. The YARA-like signature system matches patterns associated with known reverse shells, C2 frameworks, and common exploit kits. For archive files, it calculates compression ratios to detect zip bombs designed to cause denial of service. Shannon entropy analysis identifies encrypted payloads or packed executables that might bypass signature detection; high entropy in otherwise text-heavy files indicates potential steganography or encryption. This protects OpenClaw deployments where agents might download files from the internet, process user uploads, or generate code that gets executed in sandboxed environments. The scanner runs on message content and attachments, quarantining or stripping suspicious payloads before they reach agent logic.
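Three of these checks are simple enough to sketch with the Python standard library. The code below is an illustration under assumptions, not ClawShield's implementation: the signature table covers only a few well-known magic numbers, the function names are hypothetical, and any thresholds for "too high" entropy or "too large" a ratio are left to the caller.

```python
import io
import math
import zipfile
from collections import Counter

# Leading bytes of common executable formats (a small, non-exhaustive table).
MAGIC_SIGNATURES = {
    b"MZ": "windows-pe",
    b"\x7fELF": "linux-elf",
    b"\xcf\xfa\xed\xfe": "macos-macho",  # 64-bit Mach-O, little-endian
    b"\xce\xfa\xed\xfe": "macos-macho",  # 32-bit Mach-O, little-endian
}


def detect_executable(data: bytes):
    """Classify content by magic bytes, ignoring any filename or extension."""
    for magic, kind in MAGIC_SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return None


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def compression_ratio(archive: bytes) -> float:
    """Declared uncompressed size divided by compressed size for a ZIP archive."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        infos = zf.infolist()
        compressed = sum(info.compress_size for info in infos)
        uncompressed = sum(info.file_size for info in infos)
    return uncompressed / compressed if compressed else 0.0
```

The ratio here relies on sizes declared in the archive headers, which an attacker can forge, so a hardened scanner would also cap the number of bytes it is willing to decompress.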

The Policy Engine: Deny-by-Default YAML Configuration

Security tools fail when they rely on developers remembering to enable protections. ClawShield uses deny-by-default YAML configuration where you explicitly define what is permitted rather than what is forbidden. The policy engine supports tool allowlists restricting which specific tools an agent may invoke, per-tool argument filters restricting parameters based on regex patterns, and domain allowlists restricting outbound HTTP requests to specific hosts. You can apply policies per-agent or per-channel, allowing different security postures for a customer-facing agent versus an internal automation. Every decision gets logged to SQLite with full context including timestamp, agent ID, tool name, and matched rule. The configuration syntax supports includes and inheritance, letting you maintain base policies and environment-specific overrides. This approach ensures that new tools or capabilities start in a restricted state until explicitly approved, preventing accidental exposure through misconfiguration.

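# Only the tools listed here may be invoked; everything else is denied.
tools:
  allowlist:
    - web_search
    - file_read
  filters:
    file_read:
      # Single-quoted so YAML passes backslashes through to the regex unchanged.
      path_regex: '^/safe/(.*)$'
      block_regex: '\.\./|/etc/passwd'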

This YAML snippet demonstrates how to define an allowlist for specific tools, in this case, web_search and file_read. It also shows how to apply regular expression filters to the arguments of a specific tool, file_read, to enforce safe path access and prevent directory traversal or access to sensitive system files like /etc/passwd. Such fine-grained control is essential for managing the capabilities of AI agents that interact with file systems or external services.
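A deny-by-default evaluator for a policy of this shape might look like the sketch below. The policy is written as a Python dictionary to avoid a YAML dependency, and two details are assumptions rather than documented behavior: that both filters apply to an argument named path, and that block_regex is checked before path_regex. The function name evaluate and the reason strings are hypothetical as well.

```python
import re

# Mirrors the YAML policy above as a plain dictionary.
POLICY = {
    "tools": {
        "allowlist": ["web_search", "file_read"],
        "filters": {
            "file_read": {
                "path_regex": r"^/safe/(.*)$",
                "block_regex": r"\.\./|/etc/passwd",
            }
        },
    }
}


def evaluate(policy: dict, tool: str, args: dict) -> tuple[bool, str]:
    """Return (allowed, reason); anything not explicitly permitted is denied."""
    tools = policy.get("tools", {})
    if tool not in tools.get("allowlist", []):
        return False, "tool-not-allowlisted"
    rules = tools.get("filters", {}).get(tool)
    if rules is None:
        return True, "allowed"  # allowlisted tool with no argument filters
    path = str(args.get("path", ""))  # assumption: filters target the "path" argument
    if "block_regex" in rules and re.search(rules["block_regex"], path):
        return False, "block_regex"
    if "path_regex" in rules and not re.search(rules["path_regex"], path):
        return False, "path_regex"
    return True, "allowed"
```

Note that /safe/../etc/shadow satisfies path_regex on its own, which is why the sketch needs the block_regex check as well.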

eBPF Integration: Kernel-Level Monitoring for AI Agents

While the core proxy handles application-layer traffic, ClawShield includes an optional eBPF kernel monitor for runtime security at the system call level. Written in Python using BCC, it attaches to kernel probes for execve (process execution), tcp_v4_connect (network connections), openat2 (file access), and setuid (privilege escalation). This detects fork bombs, unauthorized privilege escalation attempts, and port scanning behavior that might indicate a compromised agent exploring the host. The eBPF component operates outside the agent process, making it resistant to tampering even if the agent itself gets compromised. It correlates kernel events with specific agent actions, providing a complete audit trail from HTTP request to system call. For OpenClaw deployments running on Linux hosts, this adds a second layer of defense beyond the application proxy. The overhead is minimal; eBPF programs run in kernel space without context switches. This provides a robust, low-overhead method for detecting anomalous behavior that might bypass application-level controls.
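The kernel probes themselves need root and BCC to run, so they are not reproduced here. What can be sketched is the user-space logic that might sit behind them. The example below assumes the execve probe delivers (parent PID, timestamp) pairs and flags a parent that spawns processes faster than a threshold, which is one plausible way to recognize a fork bomb. The class name, the default threshold, and the event shape are all hypothetical.

```python
from collections import defaultdict, deque


class ExecRateDetector:
    """Flag a parent process that triggers too many execve events in a window."""

    def __init__(self, max_events: int = 50, window_seconds: float = 1.0):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # parent PID -> recent timestamps

    def observe(self, parent_pid: int, timestamp: float) -> bool:
        """Record one event; return True when parent_pid exceeds the limit."""
        recent = self._events[parent_pid]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_events
```

Keeping one deque per parent PID bounds memory by the window size rather than by the total event count.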

Deployment Options: From Docker to Production Environments

ClawShield ships with a three-command Docker quickstart suitable for local development. For production, the project provides ten cross-compiled binaries covering Linux, macOS, and Windows on both amd64 and arm64. You can deploy it as a standalone binary on bare metal, as a sidecar container in Kubernetes alongside your OpenClaw pods, or as a gateway instance handling traffic for multiple agent services. The iptables egress firewall component generates validated rules from your YAML policy, creating kernel-level packet filtering that blocks traffic before it reaches the proxy layer. Configuration hot-reloading lets you update policies without restarting the proxy, ensuring zero-downtime policy updates. The SQLite logging backend requires no external dependencies, though you can configure log shipping to external SIEMs. This flexibility allows ClawShield to integrate seamlessly into various infrastructure setups, from development workstations to large-scale cloud deployments.

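# Pull the published image.
docker pull clawshield/clawshield:latest
# Mount the policy file, publish port 8080, and forward to the upstream OpenClaw instance.
docker run -v "$(pwd)/policy.yaml:/config/policy.yaml" \
  -p 8080:8080 clawshield/clawshield:latest \
  --upstream http://openclaw:3000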

These commands pull the latest ClawShield image, mount a policy file, expose the proxy on port 8080, and configure it to forward requests to an upstream OpenClaw instance. The hostname openclaw must be resolvable from inside the container, for example through a shared Docker network.

Performance Characteristics: Go and eBPF Efficiency

Built in Go with approximately 6,000 lines of code, ClawShield maintains low latency through efficient concurrency patterns and minimal allocation on the hot path. The HTTP/WebSocket reverse proxy uses Go’s net/http with custom middleware for scanning, keeping overhead under 5ms for typical requests. The scanner architecture uses streaming parsers where possible, avoiding the need to buffer entire messages in memory. This is crucial for handling large language model contexts or file uploads without excessive memory consumption. For the eBPF components, kernel probes execute without context switches or user-space roundtrips, adding microseconds of overhead to system calls. Memory usage remains bounded through pooled buffers and aggressive connection reuse. In production at clawshield.sleuthco.ai, the system handles hundreds of concurrent agent connections without degradation. The SQLite logging uses WAL mode for concurrent reads and writes, preventing I/O bottlenecks during high-traffic periods. Keeping overhead this low matters for interactive AI applications, where added latency is immediately visible to users.
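The streaming idea can be illustrated with a chunked search that never holds the whole message. The sketch below looks for a literal byte string rather than a regex, and carries over just enough of the previous chunk to catch a match that straddles a chunk boundary. It is written in Python for brevity, the function name is hypothetical, and ClawShield's Go parsers are not shown in this article.

```python
import io


def stream_contains(stream, needle: bytes, chunk_size: int = 4096) -> bool:
    """Search a binary stream for needle while holding at most
    chunk_size + len(needle) - 1 bytes in memory."""
    if not needle:
        raise ValueError("needle must not be empty")
    keep = len(needle) - 1  # bytes that could begin a match spanning two chunks
    tail = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return False  # end of stream without a match
        window = tail + chunk
        if needle in window:
            return True
        tail = window[-keep:] if keep else b""
```

With a 4096-byte chunk and a 20-byte needle, the peak buffer is 4115 bytes regardless of whether the upload is one megabyte or one gigabyte.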

Comparison with Existing Security Layers in the OpenClaw Ecosystem

The OpenClaw ecosystem already includes security tools targeting different attack surfaces. AgentWard focuses on runtime file system enforcement, preventing unauthorized file deletions through syscall interception. Rampart provides a broader security layer with different architectural assumptions. Raypher offers eBPF runtime security and hardware identity verification. ClawShield differentiates itself by specializing in network traffic inspection and message content analysis at the application layer. Where AgentWard prevents file deletion incidents and Raypher monitors kernel behavior, ClawShield prevents data exfiltration through outbound traffic scanning and blocks injection attacks in tool parameters. SkillFortify offers formal verification for agent skills, complementing ClawShield’s runtime enforcement. Unlike Gulama or Hydra, which position themselves as OpenClaw alternatives with built-in security, ClawShield enhances standard OpenClaw without requiring framework migration. This makes ClawShield a unique and valuable addition to a multi-layered security strategy.

Feature	ClawShield	AgentWard	Rampart	SkillFortify
Primary Focus	Network proxy, message content inspection	Runtime file/system enforcement	General security framework	Formal skill verification
Technology Stack	Go, eBPF (optional)	eBPF/Kernel modules	Varies (often Python-based)	Formal methods, static analysis
Prompt Injection Detection	Yes, 3-tier detection	No	Limited (depends on integration)	No (pre-runtime)
Secrets/PII Scanning	Yes, with obfuscation bypass	No	Limited	No
SSRF/SQLi Protection	Yes, runtime detection	No	Partial (depends on integration)	No
Malware Detection	Yes, YARA-like rules, entropy	No	No	No
Deployment Model	Sidecar/Gateway proxy	Kernel module, host agent	Agent wrapper, library	Development pipeline tool
Runtime Enforcement	Yes, network traffic	Yes, system calls	Yes, agent actions	No (pre-runtime verification)
Observability	SQLite logs, Prometheus metrics	Kernel logs, custom events	Framework logs	Reports, verification results
Compatibility	OpenClaw, generic JSON APIs	Linux hosts	OpenClaw, other agent frameworks	OpenClaw skill definitions

This comparison table highlights ClawShield’s unique position in the OpenClaw security landscape. While other tools focus on host-level protection or pre-deployment verification, ClawShield provides critical runtime network and message content security, directly addressing threats that emerge from dynamic AI agent interactions.

Threat Model: What ClawShield Protects Against and Its Limitations

Understanding what ClawShield does not protect against is as important as knowing what it does. It protects against prompt injection attacks that attempt to override system instructions, exfiltrate data through encoded channels, or manipulate agents into unauthorized tool use. It blocks accidental or malicious secrets leakage through outbound messages, including unicode escape bypasses. It prevents vulnerability exploitation when agents construct dynamic queries or HTTP requests, stopping SQL injection, SSRF, and command injection at the network boundary. It detects malware in file attachments or code snippets that agents might download or execute.

However, it does not protect against compromised LLM providers, side-channel attacks on the host system, or physical security breaches. It assumes the OpenClaw framework itself is trusted; if the framework is compromised, ClawShield can only limit the blast radius. Likewise, if the underlying operating system is compromised, the eBPF component might offer some detection, but a full system compromise is beyond its primary scope. The tool is designed for the “AI agent as confused deputy” threat model, where the agent has legitimate access to sensitive data but might be tricked into misusing it. Knowing these boundaries tells you which other layers to pair it with.

Integration Patterns for OpenClaw Builders

Integrating ClawShield into your OpenClaw workflow requires minimal code changes. The simplest pattern is the gateway deployment: point your OpenClaw instance at ClawShield as its upstream proxy, configure ClawShield to forward to your actual tool endpoints, and the proxy handles inspection transparently. For microservice architectures, deploy ClawShield as a sidecar container in each agent pod, ensuring that compromised agents cannot bypass inspection by talking directly to the network. This sidecar pattern provides isolation and ensures that every agent’s traffic goes through the security checks.

The policy YAML supports environment variable substitution, letting you inject secrets like API keys into allowlists without committing them to version control. This is a critical feature for maintaining secure development practices. For CI/CD pipelines, run ClawShield in audit-only mode initially to observe traffic patterns and tune your policies before switching to enforce mode. The SQLite logs export to standard formats for analysis in tools like Splunk or Datadog, providing valuable insights for security operations teams. If you are building custom OpenClaw tools, include canary tokens in your system prompts and configure ClawShield to monitor for them. This creates a high-confidence detection mechanism for specific types of prompt injection or data exfiltration.
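Two of these ideas, variable substitution and audit-only mode, fit in a short sketch. The ${NAME} placeholder syntax and the mode names audit and enforce are assumptions made for illustration; the article does not specify ClawShield's exact syntax. A custom pattern is used instead of Python's string.Template because policy regexes such as '^/safe/(.*)$' contain a bare dollar sign that Template would reject.

```python
import re

# Hypothetical placeholder syntax: ${NAME}. A bare "$" is left untouched.
ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(policy_text: str, env: dict) -> str:
    """Replace ${NAME} placeholders; fail closed when a variable is undefined."""

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"policy references undefined variable {name}")
        return env[name]

    return ENV_REF.sub(replace, policy_text)


def should_forward(policy_allows: bool, mode: str) -> bool:
    """Audit mode forwards everything (violations are only logged);
    enforce mode forwards only what the policy allows."""
    if mode == "audit":
        return True
    if mode == "enforce":
        return policy_allows
    raise ValueError(f"unknown mode: {mode}")
```

In practice you would call expand_env(text, dict(os.environ)) when loading the policy, so a missing variable stops startup instead of silently producing an empty allowlist entry.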

The Missing Piece: Why AI Agents Need Network Proxies

Traditional security models assume applications have fixed behaviors and known attack surfaces. AI agents violate these assumptions; they generate dynamic content, choose tools at runtime based on context, and process unstructured input from users. Static analysis cannot predict what SQL queries an agent will generate tomorrow, nor can it foresee every possible combination of tool calls and user input. Network proxies like ClawShield provide the runtime enforcement layer that static security misses. They inspect the actual bytes moving between agents and tools, catching attacks that only manifest in specific contexts. Without this layer, every tool an agent uses becomes a potential exfiltration channel or attack vector. The OpenClaw ecosystem has matured rapidly on the orchestration side but lacked standardized traffic inspection. ClawShield fills this gap without forcing builders to abandon their existing OpenClaw investments. It treats AI agents as the dynamic, code-executing, network-accessing systems they actually are, providing a much-needed layer of adaptive security.

Production Readiness and Observability Features

ClawShield runs in production at clawshield.sleuthco.ai, which gives it real-world validation beyond a proof of concept. The observability stack includes structured logging to SQLite with fields for agent ID, conversation thread, tool name, matched rule, and decision outcome. You can query this directly for incident investigation or stream it to external log aggregation systems. Health check endpoints allow load balancers to verify proxy status. Prometheus metrics export request counts, latency percentiles, and block rates for monitoring dashboards, providing a real-time view of the proxy’s performance and security posture. The deny-by-default policy engine ensures that misconfigurations result in blocked traffic rather than accidental exposure. Hot-reload configuration prevents downtime during policy updates, allowing security teams to respond to new threats without impacting user experience.
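As an illustration of querying the log directly, the sketch below creates a table with the fields listed above and counts blocked requests per rule for one agent, which is also a quick way to spot a pattern that fires falsely. The table name, column names, and function names are assumptions; ClawShield's actual schema is not documented in this article.

```python
import sqlite3

# Hypothetical schema built from the fields named in the text.
SCHEMA = """
CREATE TABLE IF NOT EXISTS decisions (
    ts        REAL NOT NULL,
    agent_id  TEXT NOT NULL,
    thread_id TEXT,
    tool      TEXT,
    rule      TEXT,
    outcome   TEXT NOT NULL CHECK (outcome IN ('allow', 'block'))
)
"""


def open_log(path: str) -> sqlite3.Connection:
    """Open the decision log; WAL lets readers proceed while a write is in flight."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # has no effect on in-memory databases
    conn.execute(SCHEMA)
    return conn


def record(conn, ts, agent_id, thread_id, tool, rule, outcome) -> None:
    """Append one decision inside its own transaction."""
    with conn:
        conn.execute(
            "INSERT INTO decisions VALUES (?, ?, ?, ?, ?, ?)",
            (ts, agent_id, thread_id, tool, rule, outcome),
        )


def blocks_by_rule(conn, agent_id: str) -> list[tuple[str, int]]:
    """Count blocked requests per matched rule for one agent, busiest rule first."""
    rows = conn.execute(
        "SELECT rule, COUNT(*) FROM decisions "
        "WHERE agent_id = ? AND outcome = 'block' "
        "GROUP BY rule ORDER BY COUNT(*) DESC, rule",
        (agent_id,),
    )
    return rows.fetchall()
```

A rule that dominates this list during audit-only mode is a good candidate for threshold tuning before you switch to enforcement.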

Conclusion

ClawShield is an open-source security proxy, built in Go with an optional eBPF kernel monitor, that protects OpenClaw AI agents through network traffic inspection and message scanning.