Claws Out: A Red-Team and Blue-Team Survey of the OpenClaw Ecosystem
Chapter draft, research cutoff 23 April 2026. Every verifiable claim below cites a primary source. Unverified leads are called out explicitly in-line.
Executive Summary
OpenClaw went from unknown to ubiquitous faster than almost any open-source project in recent memory. It shipped publicly on 24 November 2025, crossed 300,000 GitHub stars within four months, and, along the way, became what may be the richest accidental security laboratory of 2026. The record so far: thirteen confirmed CVEs in the first five months; a supply-chain campaign that poisoned between 341 and 1,184 skills on the official ClawHub registry; 135,000+ internet-exposed instances, 63% of them with no authentication; a one-click RCE that was exploited in the wild; and a formal MITRE ATLAS investigation cataloguing seventeen distinct techniques observed against the platform. By February 2026 Laurie Voss (founding CTO of npm, now at Arize) had called it a “security dumpster fire” [1]; Andrej Karpathy explicitly told users not to run it on their computers [2]; a Meta executive reportedly told his team that installing it on a work laptop was a fireable offense [3]. Despite all of that, OpenAI acquired the project and hired its founder, Peter Steinberger [4].
The ecosystem of third-party tooling around OpenClaw is the most interesting security story in the project. On the red-team side you have veganmosfet’s five-part “BrokenClaw” series demonstrating 0-click RCE via email hooks, Johann Rehberger-adjacent indirect-prompt-injection research, a Koi Security audit that flagged 341 malicious skills, JFrog’s discovery of the “GhostClaw” scoped-package impersonation attack, and a prolific community researcher (GitHub user @coygeek) filing half a dozen of the most important advisories against the core repo. On the blue-team side you have SecureClaw from Adversa AI (56 audit checks, 15 behavioral rules, mapped to seven agentic-AI security frameworks), Claw EA’s commercial policy-as-code layer, ClawVet and ClawSecure’s open-source scanners, SlowMist’s validation guide, and a cottage industry of sandbox profiles, skill policies, and adversarial benchmarks.
This chapter surveys all of it. The central claims are three:
- The ecosystem’s dominant architectural risk is the collapse of the trust boundary between instructions and data. Every major exploit class — from one-click RCE to malicious skills to heartbeat memory pollution — exploits the fact that OpenClaw cannot distinguish, at the code level, between the operator’s instructions and attacker-controlled content that arrives through email, web fetches, tool outputs, or installed skills. Patches close specific holes; the underlying property does not patch.
- Defensive tooling has converged on a sensible pattern — layered enforcement, policy-as-code, scanned skills, hardened gateway defaults — but none of it solves prompt injection. The strongest current defenses (SecureClaw, Claw EA, container isolation with default-deny network) reduce blast radius and harden configuration; they do not eliminate the underlying class of attack.
- ClawHub in its current form is a supply-chain liability that a cautious operator should treat as presumptively hostile. Between 13% and 41.7% of audited skills contain meaningful vulnerabilities by different researchers’ counts; ~1 in 9 are actively malicious at peak exposure; and the registry still does not operate a verified-publisher system or a public takedown log.
The rest of the chapter builds the case with primary sources, a comparison table of the third-party projects that matter, and four narrative sections: biggest risks, strongest defenses, projects to avoid, and how to experiment safely.
1. Ground Truth: What OpenClaw Actually Is
OpenClaw is a self-hosted personal AI assistant. You install it with npm install -g openclaw@latest and run openclaw onboard --install-daemon; a Gateway process then starts on port 18789 and becomes the control plane for every agent, session, skill, tool, and channel the assistant can see [5]. The founder, Peter Steinberger, designed it to feel “local, fast, and always-on”: a single piece of software that takes inbound messages from WhatsApp, Telegram, Slack, Discord, Signal, iMessage, IRC, Microsoft Teams, Matrix, Feishu, LINE, Mattermost, Nextcloud Talk, Nostr, Synology Chat, Tlon, Twitch, Zalo, QQ, WeChat, WebChat, and mobile nodes on iOS and Android [6].
That list is the threat model. An OpenClaw Gateway sits behind every one of those channels and, by default, has a set of high-power tools wired to a coding agent. The built-in tool inventory includes bash, process, exec, read, write, edit, the browser automation stack, a Canvas renderer that ships UI back into the chat, cron jobs, webhook handlers, and a system.run primitive that executes on paired mobile nodes [7]. Inbound messages from any of the supported channels become context for the agent; the agent decides which tools to call; the tools run on the host (or, optionally, inside a Docker / SSH / OpenShell sandbox) and return results that again become agent context.
The official threat model, such as it is
The OpenClaw docs do explicitly mark inbound DMs as untrusted input and recommend setting agents.defaults.sandbox.mode: "non-main" for group or channel messages [8]. The default sandbox allow list is bash, process, read, write, edit, sessions_list, sessions_history, sessions_send, sessions_spawn; the default deny list is browser, canvas, nodes, cron, discord, gateway. The docs also note that skill precedence is workspace > project-agent > personal-agent > managed/local > bundled > extra-dirs, and that agent allowlists (agents.defaults.skills / agents.list[].skills) are separate from the physical precedence. The split is deliberate: precedence decides which copy of a skill wins, while the allowlist decides which skills an agent may use [9].
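The two-step resolution described above can be sketched as follows. This is a minimal illustration, not OpenClaw's implementation: the function name, the data shapes, and the treatment of a missing allowlist as "everything permitted" are assumptions made for the example. Only the precedence order is taken from the docs as quoted.

```python
from typing import Optional

# Precedence order as quoted from the docs, highest priority first.
PRECEDENCE = [
    "workspace",
    "project-agent",
    "personal-agent",
    "managed/local",
    "bundled",
    "extra-dirs",
]


def resolve_skill(name: str,
                  copies: dict[str, set[str]],
                  allowlist: Optional[set[str]]) -> Optional[str]:
    """Return the location whose copy of `name` wins, or None.

    `copies` maps a location to the skill names found there (hypothetical shape).
    `allowlist` is the agent's permitted skills; None means no allowlist is set,
    which this sketch treats as "allow all" (an assumption).
    """
    # Step 1: the allowlist decides whether the agent may use the skill at all.
    if allowlist is not None and name not in allowlist:
        return None
    # Step 2: precedence decides which physical copy is loaded.
    for location in PRECEDENCE:
        if name in copies.get(location, set()):
            return location
    return None
```

Under this model a copy dropped into the workspace shadows a bundled skill of the same name, which is one reason write access to the workspace matters as much as the allowlist itself.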
Two things the docs concede, in small type, that turn out to be the whole chapter:
“Treat third-party skills as untrusted code. Read them before enabling.” [10]
“Gateway-backed skill dependency installs … run the built-in dangerous-code scanner before executing installer metadata. critical findings block by default unless the caller explicitly sets the dangerous override; suspicious findings still warn only.” [10]
Both sentences admit what the rest of this chapter is about: third-party skills are the primary attack surface, and there is an operator-flippable override that turns off the dangerous-code scanner. The surrounding ecosystem of tools exists precisely because those two admissions are true.
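The gating behaviour in the second quotation can be written down as a small decision function. This is a sketch under stated assumptions: the function name, the string severities, and the choice to return "warn" when a critical finding is overridden are all hypothetical. The docs as quoted say only that critical findings block unless overridden and that suspicious findings warn.

```python
def install_decision(findings: list[str], dangerous_override: bool = False) -> str:
    """Return "block", "warn", or "allow" for a list of finding severities."""
    has_critical = "critical" in findings
    has_suspicious = "suspicious" in findings
    # Critical findings block unless the caller set the dangerous override.
    if has_critical and not dangerous_override:
        return "block"
    # Overridden critical findings and suspicious findings only warn
    # (the former is an assumption of this sketch).
    if has_critical or has_suspicious:
        return "warn"
    return "allow"
```

The point the sketch makes visible is that a single boolean separates "blocked" from "installed with a warning", so anything that can set that flag on the operator's behalf inherits the scanner's authority.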
ClawHub: the registry
ClawHub is the public skill registry at clawhub.ai. The homepage advertises 52,700 skills, 180,000 users, 12 million downloads, and a 4.8 average rating, and promotes four featured verticals: “self-improving agent,” “GitHub integration,” “security soul,” and “dashboard builder” [11]. Skills are installed via openclaw skills install <slug> and land in the active workspace’s skills/ directory. Under the hood a skill is a folder containing a SKILL.md file with YAML frontmatter and a list of executable scripts; ClawHub does not currently enforce a manifest schema describing filesystem access, network egress, or tool use [12].
A registry scan performed for this chapter confirmed that ClawHub’s taxonomy is flat: a single “Security” category, with no subcategories for red-team, blue-team, pentesting, DFIR, forensics, or networking. The homepage’s “security soul” branding is marketing copy — it does not correspond to a namespace or a verification tier. There is no public verified-publisher system, no reputation score, no public takedown log, and — as of April 2026 — no mandatory dependency audit before a skill is published.
2. The Three Defining Events
The third-party ecosystem formed in response to three shocks that all landed between late January and early February 2026. Any analysis that does not start with these three events will miss why the ecosystem looks the way it does.
2.1 ClawHavoc — the supply chain goes on fire
Between roughly 15 January and 10 February 2026, attackers systematically uploaded malicious skills to ClawHub. Koi Security’s audit of 2,857 skills found 341 of them malicious [13]. A later count from Antiy CERT put the number at 1,184 out of roughly 10,700 (~1 in 9) [14]. An independent audit by Snyk (“ToxicSkills”) found that 36% of examined skills contained prompt-injection payloads and 7.1% exposed credentials in plaintext [15]; SkillSieve (Imperial College London / UCL) later put the rate of “security vulnerabilities” at 13–26% of the registry [16].
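As a quick check on how the two malicious-skill counts compare, the rates can be computed directly from the figures quoted above. The Antiy denominator is the rounded "roughly 10,700" from the text, so its result is approximate.

```python
def rate(malicious: int, total: int) -> float:
    """Fraction of audited skills that were flagged as malicious."""
    return malicious / total


koi = rate(341, 2857)        # Koi Security audit
antiy = rate(1184, 10700)    # Antiy CERT count, rounded denominator

print(f"Koi:   {koi:.1%}, about 1 in {1 / koi:.1f}")      # Koi:   11.9%, about 1 in 8.4
print(f"Antiy: {antiy:.1%}, about 1 in {1 / antiy:.1f}")  # Antiy: 11.1%, about 1 in 9.0
```

The two audits used different sample sizes and dates but land within a percentage point of each other, which is why the absolute counts (341 versus 1,184) look further apart than the underlying rates are.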
Payloads observed in the wild included Atomic macOS Stealer (AMOS) variants that dumped Keychain, browser cookies, and crypto wallets; token harvesters that targeted ~/.openclaw/ config files (WhatsApp credentials, Telegram bot tokens, Anthropic and OpenAI API keys); backdoors establishing reverse shells; and keyloggers timed to capture credentials during agent interactions. The delivery vector was social: attackers published skills with real, useful functionality (“Tech News Digest,” “Productivity Booster,” “solana-wallet-tracker”) alongside a parallel malicious payload, and inflated install counts to boost discovery in ClawHub’s default search ranking. MITRE later assigned this a new technique, AML.T0111 “AI Supply Chain Reputation Inflation” [17].
eSecurity Planet, The Hacker News, Help Net Security, and ClawHub’s own incident page (claw-hub.net/clawhub-havoc-incident.html) all document the response: ClawHub added VirusTotal scanning of new submissions and committed to a future “Extension Marketplace” with vetting [13]. CertiK’s follow-up in March 2026 argued, correctly, that VirusTotal-style scanning of Markdown skill content is not a security boundary, because the payloads are not binaries [18].
2.2 ClawBleed — the one-click RCE that actually got used
On 31 January 2026 a coordinated disclosure published GHSA-g8p2-7wf7-98mq, which became CVE-2026-25253 (“ClawBleed”), a CVSS 8.8 one-click remote code execution against any default OpenClaw install [19][20]. The chain is worth reproducing in detail because most of the later defensive tooling is a response to it:
- The Control UI reads gatewayUrl from the query string and writes it to localStorage without validation (app-settings.ts).
- app-lifecycle.ts immediately calls connectGateway() on load, sending the stored authToken in the WebSocket handshake, which is now pointed at the attacker’s URL.
- Because the Gateway’s WebSocket server did not validate the Origin header, Cross-Site WebSocket Hijacking (CSWSH) allowed any origin to open a connection to ws://localhost:18789.
- With the stolen token, the attacker sent API calls to set exec.approvals to off, set tools.exec.host to gateway, and send a node.invoke / system.run shell command. Full RCE, localhost-only instances included.
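Two of the missing checks in the chain above, validating the gateway URL and validating the handshake Origin, can be sketched as follows. This is not the project's patch; the allowlist contents and function names are assumptions for illustration, and a real deployment would derive them from its own configuration.

```python
from urllib.parse import urlsplit

# Hypothetical allowlists for this sketch.
ALLOWED_ORIGINS = {"http://localhost:18789", "http://127.0.0.1:18789"}
ALLOWED_GATEWAY_HOSTS = {"localhost", "127.0.0.1"}


def origin_allowed(headers: dict[str, str]) -> bool:
    """Reject WebSocket handshakes whose Origin is missing or not allowlisted."""
    origin = headers.get("Origin")
    return origin is not None and origin in ALLOWED_ORIGINS


def gateway_url_allowed(url: str) -> bool:
    """Accept only ws:// or wss:// URLs that point at an allowlisted host."""
    parts = urlsplit(url)
    return parts.scheme in {"ws", "wss"} and parts.hostname in ALLOWED_GATEWAY_HOSTS
```

Rejecting a missing Origin is a deliberate choice in this sketch: browsers always send the header on WebSocket handshakes, so its absence indicates a non-browser client, which would need a separate authentication path. Comparing the parsed hostname rather than a string prefix avoids lookalikes such as localhost.attacker.example.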
DepthFirst and Ethiack independently discovered the bug within hours of each other (Ethiack’s autonomous AI pentester “Hackian” found it in under two hours) [21][22]. Blink’s CVE timeline writes that ClawBleed is “the only 2026 OpenClaw CVE confirmed as actively exploited in the wild,” and the fix (v2026.1.29, released the following day) introduced an explicit user-confirmation modal before any gateway URL change [23]. NVD’s reference table tags the DepthFirst write-up as Exploit. The Sploitus PoC remained publicly available as of 18 April 2026 [24].
2.3 Moltbook — the Supabase mistake that wasn’t an OpenClaw bug but taught the whole ecosystem a lesson
On 2 February 2026 Wiz Research disclosed that Moltbook, an AI-agent social network by Matt Schlicht deployed with OpenClaw as the backend execution environment, had shipped Supabase to production without Row-Level Security enabled and with the anon API key visible in client-side JavaScript [25]. The result: 1.5 million agent authentication tokens, 35,000 email/Twitter handles, and 4,060 private agent-to-agent DMs containing plaintext OpenAI API keys were readable (and writable) by any unauthenticated user. Wiz patched and disclosed within hours; no CVE was issued because the defect was configuration, not code.
Moltbook mattered to the OpenClaw ecosystem for two reasons. First, compromised Moltbook tokens could be replayed against any OpenClaw instance that had paired with a Moltbook agent, so the breach cascaded into any household running both. Second, the incident forced a conversation about inter-agent supply chains: it is not enough to audit skills, dependencies, and channels; any other agent your agent talks to is also a trust boundary. Adversa AI’s threat model explicitly added “Inter-agent lateral movement via Moltbook and shared channels” as threat class T8 on the back of this incident [26].
3. The Comparison Table
The table below lists the third-party projects, academic papers, incidents, and proposals that matter most for a red-team or blue-team view of OpenClaw. Rows are annotated with role (Red / Blue / Dual / Report / Proposal), the surface they exercise or defend, maturity, a risk rating (which for red-team tools means operator risk from the thing the tool exercises, and for blue-team tools means residual risk of the tool itself), and a one-line reason it belongs in the record. Primary-source URLs are consolidated at the end of the chapter.
| # | Name | Role | Attack / Defense Surface | Maturity | Risk | Why It Matters |
|---|---|---|---|---|---|---|
| 1 | SecureClaw (Adversa AI) | Blue | Full-stack: 56 audit checks + 15 behavioral rules + plugin + skill layers | Active, v2.1.0-mvp 17 Feb 2026, npm @adversa/secureclaw, ~325⭐ | Low | First tool to formally map controls to seven agentic-AI security frameworks (OWASP ASI, MITRE ATLAS, CoSAI, CSA MAESTRO, CSA Singapore, NIST AI 100-2 E2025). The reference implementation. |
| 2 | Claw EA | Blue | Policy-as-code enforcement: Work Policy Contract (WPC), scoped tokens (CST), clawproxy receipts, proof bundles | Active, commercial | Medium | Commercial answer to “prompts aren’t enforcement.” If your threat model requires machine-enforced allow/deny and tamper-evident audit, Claw EA is the grown-up option. Closed source. |
| 3 | ClawVet (MohibShaikh) | Blue | Six-pass install-time scanner for SKILL.md — RCE patterns, creds, prompt injection, typosquats, social engineering | Active, open, MIT, ~580 npm downloads/week | Low | The community’s answer to the ClawHavoc campaign. Install-time only; does not address post-install drift. |
| 4 | ClawSecure (ClawSecure org) | Blue | Three-layer audit protocol; OWASP ASI 10/10 mapped; free; 2,890+ agents self-reported audited | Active, open | Low | Slimmer, more deployable alternative to SecureClaw for smaller installs. Self-reported metrics — verify before relying on them. |
| 5 | ClawGuard (joergmichno) | Blue | Prompt-injection firewall: 225 detection patterns, 15 languages, F1=0.983 claimed, sub-10ms latency, REST API, EU-AI-Act compliance map | Active, open | Low | Best purpose-built prompt-injection detector in the ecosystem. F1 claim is self-reported; pattern-DB gaps are a known bypass class. |
| 6 | p3nchan/openclaw-skill-policy | Blue / Proposal | Layer-1–4 community skill security policy doc: source trust → static analysis → permission declaration → runtime enforcement | Active, open, Feb 2026 | Low | The best operator-facing checklist in the ecosystem. Proposes a manifest.json permission declaration that OpenClaw itself does not yet enforce. |
| 7 | SlowMist OpenClaw Security Practice Guide | Dual | Agent-facing validation guide with 13-item nightly audit shell script; red-team matrix (pre/in/post action) | Active, open, MIT, ~2,787⭐, last push 6 Apr 2026 | Medium | Unusual design — the agent itself runs the audit. That makes it both a defensive tool and (as a byproduct) a ready-made attacker’s checklist of every control to bypass. |
| 8 | Snyk mcp-scan + AI-BOM | Blue | Static analysis of SKILL.md patterns; generates AI Bill of Materials for agent components | Active, commercial + open | Low | Snyk’s Liran Tal coined the “SKILL.md prerequisite trap” category: attackers using the AI to socially engineer the human into installing a fake binary. mcp-scan detects those patterns. |
| 9 | Peleke/openclaw-sandbox | Dual | Lima VM provisioning for isolated Gateway, plus an open STRIDE red-team epic (Issue #44) | Early, 0⭐, but high-signal; Issue #44 open since 8 Feb 2026 | Medium | The community’s cleanest isolation recipe. The STRIDE issue is a planning doc of every bypass an attacker would try — useful for both sides. |
| 10 | deduu/ClawSandbox | Red | Adversarial benchmark: 9 attack types (prompt injection, memory poisoning, priv-esc, data exfil) vs. fixed system prompt | Stale (last push 5 Mar 2026, 4⭐) | Medium | Published data: Gemini 2.5 Flash fell to 7/9 attacks, GPT-5.3 Codex defended 9/9. Useful data point for model choice. |
| 11 | TerminalGravity/openclaw-swarm-security-audit | Dual | Multi-agent Claude swarm template for red/blue auditing of OpenClaw; ~$50–100 budget/Phase 1 | Planning stage, 0⭐, Feb 2026 | Low | Interesting pattern — agents attacking and defending agents. Worth watching; don’t deploy yet. |
| 12 | adversa-ai/secureclaw docs/openclaw-attack-examples.md | Red (inside a blue tool) | MITRE ATLAS–mapped attack cookbook (supply-chain, injection, sandbox escape) | Active | Medium | Ships inside a defensive tool but reads as a red-team playbook. An adversary reading it gets a pre-mapped tactic list. |
| 13 | BrokenClaw series (veganmosfet, 5 parts) | Red | Parts 1–5: 0-click RCE via Gmail hook; sub-agent sandbox escape; email tool RCE; web-fetch-to-RCE; GPT-5.4 model-agnostic demonstration | Active, Feb–Apr 2026 | Critical | The most cited primary-research series against OpenClaw. Part 2 became CVE-2026-32048; Part 4 fed PR #57782. |
| 14 | HiddenLayer “Claws for Concern” (McCauley, Schulz, Tracey, Martin) | Red / Report | Full RCE chain via indirect prompt injection + HEARTBEAT.md persistence + plaintext .env exfil + W^X violation diagnosis | Published 3 Feb 2026 | Critical | Most thorough public attack research on OpenClaw’s core architecture. Introduced the framing that the architecture, not the LLM, is the defect. |
| 15 | Oasis Security “ClawJacked” | Red / Report | Website-to-localhost WebSocket takeover with no plugins; hundreds of auth attempts/sec from browser JS; localhost rate-limit exemption | Published 26 Feb 2026; patched within 24 h | Critical | Companion disclosure to ClawBleed. Demonstrated that the browser-to-localhost threat model was broken at every layer. |
| 16 | CertiK OpenClaw Security Report | Report | Four-category analysis: gateway takeover, identity bypass, prompt injection, supply chain; companion post “Skill Scanning Is Not a Security Boundary” | Published 31 Mar 2026 | High | Brings web3-audit methodology to agent security. CertiK’s argument that Markdown scanning can’t be a boundary is the cleanest critique of ClawHub’s VirusTotal response. |
| 17 | MITRE ATLAS OpenClaw Investigation (PR-26-00176-1) | Report | 4 case studies (AML.CS0048 / CS0049 / CS0050 / CS0051 “under investigation”), 17 techniques, new techniques in ATLAS v5.5.0 incl. AML.T0108–T0112 | Published 9 Feb 2026 | Critical | Highest-authority threat mapping available. Canonical source for book-chapter-quality taxonomy. |
| 18 | Texas A&M “Systematic Taxonomy” (arXiv:2603.27517, Suwansathit / Zhang / Gu) | Report | Taxonomy of 190 advisories; introduces “Context Manipulation” as sixth Kill Chain stage; shows exec-allowlist closed-world assumption fails under line continuation, busybox multiplexing, GNU long-option abbreviation | Published 31 Mar 2026 | Critical | Strongest academic critique of the architecture. If the book chapter cites one paper, cite this one. |
| 19 | arXiv:2603.23064 “Mind Your HEARTBEAT!” (NTU / A*STAR / JHU) | Report | Heartbeat-loop memory pollution; direct precursor to CVE-2026-41329 (CVSS 9.9) | Published 24–25 Mar 2026 | Critical | Shows that an attacker controlling what the heartbeat fetches (poisoned RSS, webhook) can achieve silent operator-level privilege. |
| 20 | arXiv:2603.26221 “Clawed and Dangerous” (CSIRO Data61 / UTS) | Report | Survey: planning + external capabilities + persistent memory + privileged exec as a new class | Published 27 Mar 2026 | High | Frames OpenClaw as an archetype, not an outlier. |
| 21 | arXiv:2604.03131 (Xidian / China Unicom) | Report | Systematic security evaluation of OpenClaw variants including the OpenAI-acquired distribution | Published 3 Apr 2026 | High | Only paper that directly evaluates the post-acquisition distribution. |
| 22 | arXiv:2603.00902v1 “Clawdrain” (Dong / Feng / Wang) | Red / Report | Tool-calling-chain stealthy token exhaustion; maps to OWASP ASI08 cascading failures | Published 1 Mar 2026 | Medium | Budget-level DoS, not RCE — but cheap, quiet, and unpatched as a class. |
| 23 | arXiv:2604.06550 “SkillSieve” (Imperial College London / UCL) | Blue / Report | Hierarchical triage framework for detecting malicious skills; reports 13–26% of ClawHub containing vulnerabilities | Published April 2026 | High | Academic counterpart to ClawVet / mcp-scan. |
| 24 | arXiv:2603.00195 (Bhardwaj, independent) | Report | Formal analysis of agentic-AI skill supply chain, ClawHavoc as primary case | Feb/Mar 2026 | Medium | Formal model useful for reasoning about future ClawHub-class attacks. |
| 25 | JFrog “GhostClaw” (research.jfrog.com) | Red / Report | Malicious npm package @openclaw-ai/openclawai masquerading as official CLI; multi-stage creds/SSH-key stealer | Identified 8 Mar 2026 | Critical | npm scope impersonation is not preventable by ClawHub controls; the operator must verify the package source. |
| 26 | @coygeek (GitHub contributor) | Red | Sustained research stream: #4951 (prompt-injection bypass), #7768 (DNS rebinding in browser control server), #8516 (browser arbitrary file write), #11031 (.openclaw/extensions/ auto-load), #15313 (browser /evaluate ACE), #53433 (config redaction bypass / cred leak), #65625 (openclaw.podman.env empty token + LAN bind) | Active, Jan–Apr 2026 | Critical | Most prolific single researcher filing against the core repo. Tracking their issue list is a cheap OSINT proxy for “what’s about to be patched.” |
| 27 | Jamieson O’Reilly: “What Would Elon Do?” poisoned skill | Red | Published to ClawdHub as a PoC; 16 users downloaded in 8 hours; exfil via curl to a clawdhub-skill.com impersonation domain | Case-study published 26 Jan 2026 (AML.CS0049) | Critical | Canonical published poisoned-skill demonstration. Cited directly by MITRE ATLAS. |
| 28 | ClawHavoc supply-chain campaign | Incident | 341–1,184 malicious skills on ClawHub; AMOS, token harvesters, backdoors, keyloggers | Incident late Jan–early Feb 2026 | Critical | Historical event. Any skill installed before ~10 Feb 2026 should be treated as presumptively compromised. |
| 29 | Moltbook Supabase breach (Wiz) | Incident | 1.5M agent tokens, 35k handles, 4,060 agent-to-agent DMs incl. plaintext OpenAI keys | 2 Feb 2026 | Critical | Not an OpenClaw code bug — but the cleanest demonstration that inter-agent trust is a real attack surface. |
| 30 | CVE-2026-25253 “ClawBleed” | Incident | 1-click RCE via CSWSH; CVSS 8.8; fixed 2026.1.29; confirmed exploited in the wild | Disclosed 31 Jan 2026 | Critical | The canonical “your AI was taken over because the browser has no CORS on WebSocket to localhost” bug. |
| 31 | CVE-2026-28472 “ClawJacked” | Incident | Gateway WS auth bypass via device identity check that validates presence, not validity; CVSS 9.8; fixed 2026.2.2 | Disclosed Feb 2026 | Critical | 63% of exposed instances had no auth at all, so the “gate” protecting this bug was absent on most public installs. |
| 32 | CVE-2026-32048 (sessions_spawn sandbox escape) | Incident | Cross-agent spawn bypasses sandbox inheritance; a sandboxed skill can spawn a child with sandbox.mode=off; fixed 2026.3.1 | Published 20 Mar 2026 | Critical | Makes sandboxing a false sense of security unless inheritance is enforced end-to-end. Predicted by veganmosfet BrokenClaw Part 2. |
| 33 | CVE-2026-32922 (device.token.rotate priv-esc) | Incident | operator.pairing → operator.admin via token rotation race; CVSS 9.9; fixed 2026.3.11 | Published 29 Mar 2026 | Critical | Any legitimately paired low-privilege device has a direct path to full admin. |
| 34 | CVE-2026-41329 (heartbeat sandbox bypass) | Incident | Heartbeat context inherits senderIsOwner from parent scope; poisoned RSS/webhook → operator privilege; CVSS 9.9; fixed 2026.3.31 | Published 21 Apr 2026 | Critical | Academic precursor at arXiv:2603.23064. Demonstrates that background execution is an attack surface on its own. |
| 35 | CVE-2026-35629 (channel-extension SSRF) | Incident | Multiple channel extensions accept configurable base URLs without SSRF guards; fixed 2026.3.25 | Published 9 Apr 2026 | High | Enables cloud-metadata-endpoint reconnaissance from any low-privilege channel integration. |
| 36 | 21 Apr 2026 GHSA batch (10 advisories) | Incident | Coordinated disclosure of 10 advisories in a single day: MCP env injection, config-mutation bypass, hook session-key bypass, QQBot SSRF, workspace dotenv override, pairing action scope, assistant media scope, MCP/LSP policy bypass, Feishu dmPolicy misclassification, cron-event trust | Published 21 Apr 2026 | High | Consistent with OpenAI acquisition due-diligence audits. Watch the 2026.4.x release notes. |
| 37 | OpenClaw Gateway default bind 0.0.0.0 (no-auth) | Incident / Proposal | Default listens on all interfaces; SecurityScorecard STRIKE found 135,000+ internet-exposed instances, 63% with no auth | Ongoing (default unchanged as of 23 Apr 2026) | Critical | The single most-cited design defect in the project. All defensive tools ship with “rebind to 127.0.0.1” as step one. |
| 38 | Issue #22196: “No code-level enforcement distinguishing system messages from user-crafted lookalikes” | Proposal | RFC: delimit trusted vs. untrusted content at the tool boundary | Closed not_planned | High | Closure of this issue is frequently cited by critics as evidence of maintainer posture on architectural fixes. |
| 39 | Issue #62939: structural delimiters for injection defense at tool/message boundaries | Proposal | RFC to add delimiter-based separation of instructions and tool results | Open | Medium | Most promising open proposal for addressing indirect-prompt-injection structurally. |
| 40 | SkillScan (ClawHub skill, 92.9k installs) | Incident / Red | Platform-flagged malicious: uploads skill packages to skillscan.tokauth.com, collects MAC address, silent daily auto-update, 0 current active users despite install count | Live on ClawHub at time of audit | Critical | Illustrative of the ClawHub trust failure: a skill with “scan” in the name that the platform itself flags as malicious still had 92.9k nominal installs. |
| 41 | zaycv/clawhub malicious skill | Incident | Malware distributed via base64-encoded payload in skill Markdown, 7,754 downloads before takedown | Closed 13 Mar 2026 | High | Canonical base64-in-Markdown example. ClawHub did not display all skill files at time of discovery. |
4. Red-Team Inventory (Narrative)
4.1 Primary research that shaped the record
Four pieces of primary research define the red-team record on OpenClaw.
HiddenLayer, “Claws for Concern,” 3 February 2026. Conor McCauley, Kasimir Schulz, Ryan Tracey, and Jason Martin demonstrated a full attack chain: a user asks the agent to summarize a malicious web page; the agent, reading the page, is persuaded to run curl -fsSL … | bash; the script appends attacker-controlled instructions to ~/.openclaw/workspace/HEARTBEAT.md, a file that is re-read into the system prompt every session [27]. The result is persistent C2 on a 30-minute heartbeat poll. API keys and tokens are stored in plaintext in ~/.openclaw/.env, so once RCE is achieved, credential exfiltration is a cat away. The paper’s clearest line is architectural: “A strongly desirable security policy for systems is W^X (write xor execute). OpenClaw violates this: the instructions executed are also modifiable during execution.”
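One way to reason about the W^X point is to treat the instruction file as code and refuse to load it if it no longer matches a reviewed baseline. The sketch below is an illustration of that idea, not a feature of OpenClaw or of HiddenLayer's research; the function names are hypothetical, and where the trusted digest is stored (it must be somewhere the agent cannot write) is left as an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def instruction_file_changed(path: Path, trusted_digest: str) -> bool:
    """True if the file is missing or differs from the reviewed baseline."""
    if not path.is_file():
        return True
    return sha256_of(path) != trusted_digest
```

A check like this detects the append described above only if it runs before the file is read into the prompt and if the baseline lives outside the agent's writable paths. It does nothing about the initial code execution.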
Oasis Security, “ClawJacked,” 26 February 2026. The Oasis team showed that any website the user visited while OpenClaw was running could open a WebSocket to localhost:18789, brute-force the auth token at hundreds of attempts per second from browser JavaScript alone (the Gateway’s rate limiter exempted localhost), auto-pair as a trusted device, and take over the instance [28]. No plugins, no extensions, no user interaction beyond visiting the page. OpenClaw patched within 24 hours, which is impressive for a volunteer-driven project and a sign of how sharp this particular team’s response loop is.
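The localhost exemption is the detail that made brute force practical: a browser on the same machine connects from a loopback address. A minimal sketch of a failed-auth limiter that applies the same window to every client, loopback included, might look like this. The class name, limits, and sliding-window design are assumptions for illustration and are not taken from the OpenClaw patch.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class AuthRateLimiter:
    """Sliding-window limiter for failed auth attempts, keyed by client address.

    There is deliberately no exemption for 127.0.0.1 or ::1.
    """

    def __init__(self, max_failures: int = 5, window_seconds: float = 60.0):
        self.max_failures = max_failures
        self.window = window_seconds
        self._failures: dict[str, deque] = defaultdict(deque)

    def _prune(self, client: str, now: float) -> deque:
        # Drop failures that have aged out of the window.
        queue = self._failures[client]
        while queue and now - queue[0] > self.window:
            queue.popleft()
        return queue

    def record_failure(self, client: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._prune(client, now).append(now)

    def allowed(self, client: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        return len(self._prune(client, now)) < self.max_failures
```

With five attempts per minute instead of hundreds per second, guessing a token of any reasonable length stops being feasible from a web page.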
veganmosfet, “BrokenClaw” series, five parts between 2 February and 8 April 2026 [29]. Part 1 demonstrated 0-click RCE via the Gmail pub/sub webhook, with no user action beyond receiving an email: prompt injection in the email body convinced the agent to clone a malicious .openclaw/extensions/ repo and restart the Gateway. Part 2 showed that the obvious fix (enable the sub-agent sandbox) could be escaped via prompt-injected sessions_spawn calls; this research later became CVE-2026-32048. Part 3 repeated the RCE through the built-in email reading tool rather than the webhook. Part 4 generalised the pipeline: anything the agent fetches from the web is a potential code execution channel. Part 5 tested the same chains against GPT-5.4, confirming the vulnerability class is model-agnostic.
Texas A&M (arXiv:2603.27517), 31 March 2026. Surada Suwansathit, Yuxuan Zhang, and Guofei Gu published the most rigorous academic critique to date [30]. Three key findings: (1) three independently moderate advisories compose into a complete unauthenticated RCE path; (2) the exec allowlist encodes a closed-world assumption that command identity is recoverable by lexical parsing, an assumption “invalidated by line continuation, busybox multiplexing, and GNU long-option abbreviation in independent and non-overlapping ways”; (3) a malicious skill executed a two-stage dropper entirely within the LLM context, bypassing the exec pipeline altogether. They propose “Context Manipulation” as a sixth Kill Chain stage with no analog in MITRE ATT&CK.
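To make finding (2) concrete, here is a toy lexical checker and three inputs that pass it while meaning something else to the shell. This is an illustrative reconstruction, not the paper's test cases and not OpenClaw's allowlist code. The allowlist, the denied flag, and the example commands are hypothetical. The only facts relied on are that POSIX shells delete backslash-newline pairs, that busybox runs the applet named in its first argument, and that GNU getopt_long accepts unambiguous prefixes of long options.

```python
ALLOWED_COMMANDS = {"ls", "cat", "busybox", "sort"}   # hypothetical allowlist
DENIED_FLAGS = {"--compress-program"}                 # hypothetical denied flag


def naive_allows(command_line: str) -> bool:
    """Lexical check: first token must be allowlisted, no denied flag present."""
    tokens = command_line.split()
    if not tokens or tokens[0] not in ALLOWED_COMMANDS:
        return False
    return not any(tok.split("=")[0] in DENIED_FLAGS for tok in tokens[1:])


def shell_view(command_line: str) -> str:
    """Approximate the shell's early parsing step: drop backslash-newline pairs."""
    return command_line.replace("\\\n", "")


BYPASSES = {
    # The checker sees two harmless tokens; the shell joins them into the denied flag.
    "line continuation": "sort --compress-\\\nprogram=sh data.txt",
    # The checker sees "busybox"; the program that actually runs is the rm applet.
    "busybox multiplexing": "busybox rm -rf /tmp/target",
    # The checker compares exact flag names; getopt_long accepts the prefix.
    "long-option abbreviation": "sort --compress-prog=sh data.txt",
}

for name, cmd in BYPASSES.items():
    print(f"{name}: naive check allows = {naive_allows(cmd)}")
```

Each bypass defeats a different part of the check (tokenisation, command identity, flag matching), which matches the paper's observation that the failures are independent. Patching one leaves the other two open.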
4.2 The supply-chain campaigns
Two named campaigns matter. ClawHavoc (§2.1) is the registry-level campaign; GhostClaw is the npm-level one. JFrog Security Research identified a live malicious package @openclaw-ai/openclawai on npm, masquerading as the official OpenClaw Installer and exploiting scope-name visual similarity. The multi-stage payload stole credentials, environment variables, and SSH keys [31]. JFrog did not publish an infection count. The defense is simple and unsatisfying: install OpenClaw only via the unscoped package openclaw, verify the npm registry owner before running any @openclaw* package, and never run npm install scripts without reading them.
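The first part of that advice, exact-match the package name, is easy to automate as a pre-install guard. The sketch below is a hypothetical helper, not an npm or OpenClaw feature. It performs no registry lookup, so it cannot verify ownership, and "lookalike" means only "contains the project name and is not the official package", not "malicious".

```python
OFFICIAL_PACKAGE = "openclaw"   # the unscoped name given in the text


def classify_package(name: str) -> str:
    """Return "official", "lookalike", or "unrelated" for an npm package name."""
    normalized = name.strip().lower()
    if normalized == OFFICIAL_PACKAGE:
        return "official"
    # Ignore separators so that variants such as open-claw are still caught.
    squashed = normalized.replace("-", "").replace("_", "")
    if OFFICIAL_PACKAGE in squashed:
        return "lookalike"
    return "unrelated"
```

For example, classify_package("@openclaw-ai/openclawai") returns "lookalike", which is the cue to stop and check the publisher by hand before installing.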
Separately, the zaycv/clawhub skill distributed malware via a base64 payload embedded in its Markdown; it reached 7,754 downloads before GitHub issue #108 closed it on 13 March 2026 [32]. At the time of discovery, ClawHub’s UI did not display every file in a skill package, which is why the injection point was not auditable before install. It is an ergonomics-is-security finding that the registry has since partially remediated.
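A base64 blob in Markdown is at least mechanically detectable. The following is a heuristic sketch, not the method used by any scanner named in this chapter: the 40-character threshold and the marker list are arbitrary choices for the example, and an attacker can evade both by splitting or double-encoding the payload.

```python
import base64
import binascii
import re

# Runs of 40+ base64 characters; the threshold is arbitrary for this sketch.
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")
# Byte patterns that suggest a decoded blob is a shell command (illustrative).
SUSPICIOUS = (b"curl ", b"wget ", b"| bash", b"| sh", b"/bin/sh", b"eval ")


def find_encoded_payloads(markdown: str) -> list[str]:
    """Return decoded base64 runs that contain shell-like markers."""
    hits = []
    for match in BASE64_RUN.finditer(markdown):
        blob = match.group(0)
        try:
            decoded = base64.b64decode(blob, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64 (for example, bad padding)
        if any(marker in decoded for marker in SUSPICIOUS):
            hits.append(decoded.decode("utf-8", errors="replace"))
    return hits
```

A scan like this is only useful if it covers every file in the package, which is exactly the visibility the registry UI lacked at the time.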
4.3 The prolific community researcher
GitHub user @coygeek has, as of this writing, filed seven confirmed security issues against openclaw/openclaw, spanning a prompt-injection bypass (#4951), the browser control server (#7768 DNS rebinding, #15313 /evaluate ACE, #8516 arbitrary file write), extension auto-load (#11031), config redaction (#53433), and the default empty-token LAN bind in the Podman installer (#65625) [33]. The GitHub association flag is CONTRIBUTOR, which is an unusually privileged bucket; possible reasons include a long commit history, Steinberger adding them to a trust list, or a standing arrangement with the maintainers. For the purposes of this chapter, the important observation is that watching a single account’s public GitHub issue feed is an unusually high-signal OSINT stream for what’s about to be patched.
4.4 Red-team harnesses
deduu/ClawSandbox is a small adversarial benchmark (nine attack types against a fixed system prompt), and its most useful published result34 is a model comparison: Gemini 2.5 Flash fell to 7/9 attacks, while GPT-5.3 Codex defended 9/9. That’s not an endorsement; it’s a reminder that model choice matters and that the ceiling on defense is higher than the mean.
Peleke/openclaw-sandbox is a Lima VM provisioning recipe for running the Gateway in an isolated VM and, more interestingly, it ships a P0-priority STRIDE Red Team Epic (Issue #44) that enumerates every bypass an attacker would try against the Gateway35. It has been open since 8 February 2026 with 0⭐: a classic low-star, high-signal repository.
TerminalGravity/openclaw-swarm-security-audit is an experimental multi-agent Claude swarm that runs red-team and blue-team roles in parallel against a target OpenClaw instance36. Early stage, no published results, budget estimated $50–$100 for Phase 1. The pattern is worth watching; the implementation is not yet worth deploying.
4.5 Academic attack research worth knowing
Beyond Texas A&M’s taxonomy (§4.1), the two most useful offensive academic works are arXiv:2603.23064 “Mind Your HEARTBEAT!” (NTU / A*STAR / JHU)37, which predicted CVE-2026-41329 and demonstrates silent memory pollution via the background execution loop, and arXiv:2603.00902v1 “Clawdrain”38, which shows how tool-calling loops can drain API budget without triggering safety stops — a cheap, quiet DoS that maps to OWASP ASI08 (cascading failures) and for which no standard mitigation exists.
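Absent a standard mitigation, the obvious stopgap for Clawdrain-style loops is a hard per-run ceiling enforced outside the model. The sketch below is an assumption-laden illustration, not something the paper or OpenClaw provides: `RunBudget` and `BudgetExceeded` are hypothetical names, and the limits are placeholders.

```python
from dataclasses import dataclass

class BudgetExceeded(RuntimeError):
    """Raised when a run crosses its tool-call or token ceiling."""

@dataclass
class RunBudget:
    max_tool_calls: int
    max_tokens: int
    tool_calls: int = 0
    tokens: int = 0

    def charge(self, tokens_used: int) -> None:
        """Record one tool call and raise once either ceiling is crossed."""
        self.tool_calls += 1
        self.tokens += tokens_used
        if self.tool_calls > self.max_tool_calls or self.tokens > self.max_tokens:
            raise BudgetExceeded(f"{self.tool_calls} calls, {self.tokens} tokens")

budget = RunBudget(max_tool_calls=3, max_tokens=1000)
for _ in range(3):
    budget.charge(100)  # within budget; a fourth call would raise
```

A ceiling like this bounds the cost of a drain; it does not detect one, and a legitimate long-running task will hit the same wall.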
5. Blue-Team Inventory (Narrative)
5.1 The layered-enforcement consensus
Every serious defensive tool in the ecosystem has converged on a variant of the same idea: don’t trust any one layer. SecureClaw (Adversa AI) ships the clearest version: a code-layer plugin that performs 56 audit checks on a live install (gateway bind, credential storage, sandbox config, file permissions, dependency CVEs) and applies five hardening modules; and a parallel skill layer with 15 behavioral rules (~1,230 tokens injected into the system prompt) that handle the things infrastructure alone cannot — injection awareness, PII scanning, command-integrity monitoring, inter-agent communication rules, and a kill-switch that blocks OpenClaw from starting if SecureClaw itself is disabled39. The framework mapping — 10/10 OWASP ASI, 10/14 MITRE ATLAS agentic TTPs, 13/18 CoSAI Secure-by-Design, 4/4 MITRE ATLAS OpenClaw case studies — is the most thorough in the ecosystem. Adversa’s Polyakov says it directly: “Most competing tools are skill-only, meaning the security logic lives inside the agent’s context window as natural language instructions. The problem is that skills can be overridden by prompt injection.”
The argument is right, but it does not reach all the way to enforcement. The skill layer is still LLM-directive: the model must choose to follow the rules. SecureClaw’s v2.1 release notes admit that “weaker models may misclassify red-line commands” and that injected guide text can itself be tampered with by prompt injection. The audit / hardening plugin layer is real enforcement, but it acts on configuration, not on runtime actions.
Claw EA goes a step further in the direction of machine enforcement40. A Work Policy Contract (WPC) is a signed, hash-addressed policy that defines what the agent may do; a Cryptographic Scoped Token (CST) bounds a single run; clawproxy sits in front of the model call and emits Ed25519-signed receipts; each job yields a proof bundle that can be verified independently. The pitch is that “safety lives in prompts” is a fallacy; the enforcement layer should be below the model. The cost is operational complexity and a commercial price tag. The benefit is that an attacker who successfully prompt-injects a skill still cannot invoke a tool outside the CST’s scope, because the execution layer evaluates the policy and not the model.
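The property being sold here, enforcement below the model, can be shown in a few lines. This is a sketch under loud assumptions and not Claw EA's implementation: `policy_hash`, `dispatch`, and the dictionary shapes are hypothetical, and the Ed25519 signatures and receipts are omitted entirely because they need a third-party library.

```python
import hashlib
import json

def policy_hash(policy: dict) -> str:
    """Hash-address a policy via a canonical JSON encoding."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

def dispatch(tool: str, token: dict, policy: dict) -> str:
    """Execution-layer gate: evaluated regardless of what the model asked for."""
    if token["policy_hash"] != policy_hash(policy):
        return "deny: token bound to a different policy"
    if tool not in policy["allowed_tools"] or tool not in token["scope"]:
        return "deny: tool outside scope"
    return "allow"

policy = {"allowed_tools": ["read_file", "web_fetch"]}
token = {"policy_hash": policy_hash(policy), "scope": ["read_file"]}  # one run's scope
print(dispatch("read_file", token, policy))
print(dispatch("exec", token, policy))
```

The design choice worth noticing is that nothing in `dispatch` consults the model's output beyond the tool name. A prompt injection can change which tool is requested, but not which tools the gate will run.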
This combination — SecureClaw for configuration + behavior and Claw EA (or a home-grown equivalent) for policy-as-code — is the closest thing to a defensive reference architecture in the ecosystem.
5.2 Skill-level scanning
ClawVet (MohibShaikh)41 is the most downloaded open-source SKILL.md scanner, with six analysis passes: RCE patterns (reverse shells, piped downloads), credential theft (SSH keys, API tokens, browser cookies), prompt injection patterns, typosquat proximity, social-engineering markers, and network-egress detection. It runs at install time. It does not solve the post-install drift problem (skills that start clean and phone home later), and the Hacker News discussion that surfaced it explicitly flagged that gap.
ClawSecure42 is a lighter, OWASP-ASI-aligned scanner with a three-layer audit protocol and a free tier; claims 2,890+ agents audited (self-reported). ClawGuard (joergmichno)43 is a prompt-injection firewall with 225 detection patterns and a claimed F1 of 0.983 across 15 languages. Both are useful; both are vulnerable to the standard critique that a regex-based scanner loses to trivial obfuscation (base64, dynamic require, runtime code assembly), which is why Snyk’s mcp-scan44 — which was purpose-built to detect Markdown-instruction patterns as well as binary ones — is a useful complement. Snyk’s Liran Tal also named the underlying new attack category: “SKILL.md prerequisite trap” — an instruction file that tells the AI to instruct the user to install a fabricated utility. The user sees a trusted AI telling them to install openclaw-core; openclaw-core does not exist; the link leads to a payload.
5.3 Operator-facing policies and playbooks
p3nchan/openclaw-skill-policy45 is the best operator-facing checklist: four layers — source trust (stars, contributors, recency, official badges), static analysis (injection patterns, dependency audit, lockfile presence, --ignore-scripts for install scripts), permission declaration (proposes a manifest.json of fs/network/tool/env requirements), and runtime enforcement (sandbox-exec, firejail, bubblewrap; hard blocks on ~/.ssh, ~/.gnupg, ~/.aws, ~/.config/gh; skill isolation). The current OpenClaw runtime does not enforce layers 3–4; the doc makes the gap explicit and references Feature Request #28298 for platform support.
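Until the platform enforces layers 3 and 4, an operator can approximate the manifest check at review time. The sketch below assumes a manifest shaped like `{"fs": [...]}`; that schema is hypothetical, since the proposal's exact format is not quoted here. The blocked directories are the four named above.

```python
from pathlib import PurePosixPath

BLOCKED = ["~/.ssh", "~/.gnupg", "~/.aws", "~/.config/gh"]  # hard blocks from the text

def blocked_fs_requests(manifest: dict) -> list[str]:
    """Return declared fs paths that are, or sit inside, a hard-blocked directory."""
    hits = []
    for declared in manifest.get("fs", []):
        path = PurePosixPath(declared)
        for blocked in BLOCKED:
            root = PurePosixPath(blocked)
            if path == root or root in path.parents:
                hits.append(declared)
                break
    return hits

manifest = {"fs": ["~/.ssh/id_rsa", "~/projects", "~/.config/gh"]}
print(blocked_fs_requests(manifest))
```

This compares declared paths textually; it does not resolve symlinks or `..` segments, and it says nothing about what the skill does at runtime, which is why the sandbox layer still matters.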
SlowMist’s openclaw-security-practice-guide46 is the highest-starred operator guide (~2,787⭐, last push 6 April 2026). The distinguishing feature is that it is agent-facing: the 13-item nightly audit shell script (scripts/nightly-security-audit-v2.8.sh) is designed to be invoked by OpenClaw itself as a verification task. This inverts the usual operator-runs-script model. The upside is a self-auditing agent. The downside is that a compromised agent now has a clear enumeration of every check it is supposed to pass — and therefore of every control an attacker needs to bypass. The design is clever and load-bearing in roughly equal measure.
5.4 Hardening defaults
Every defensive tool ships the same first hardening step: rebind the Gateway from 0.0.0.0:18789 to 127.0.0.1 and force authentication. SecurityScorecard’s STRIKE dashboard at declawed.io found 135,000+ internet-exposed instances, 63% of them running with no authentication at all47. Jeremy Turner’s line in The Register is both the sharpest and the most useful: “Think of it like hiring a worker with a criminal history of identity theft who knows how to code well and might take instructions from anyone.”
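That first check is simple enough to express directly. A minimal sketch, assuming the bind setting is available as a `host:port` string; `bind_is_loopback` is a hypothetical helper, not part of any tool named above.

```python
import ipaddress

def bind_is_loopback(bind: str) -> bool:
    """True when a 'host:port' bind string targets a loopback address."""
    host, _, _port = bind.rpartition(":")
    host = host.strip("[]")  # tolerate bracketed IPv6 such as [::1]:18789
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # anything unparseable is treated as exposed

for bind in ("0.0.0.0:18789", "127.0.0.1:18789", "[::1]:18789"):
    print(bind, "->", "loopback" if bind_is_loopback(bind) else "EXPOSED")
```

A loopback bind is necessary but not sufficient: a reverse proxy or port forward in front of it can still expose the Gateway, which is why the authentication half of the step matters too.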
The harder, less-automatable layer is per-agent network namespacing. The r/LocalLLaMA consensus — echoed in Thick-Protection-458’s line that “any giving the model more or less unrestricted access to your machine should be a big no” — is that the long-term fix is zero-trust architecture at the agent boundary: default-deny network egress, credentials injected at runtime (never baked into the agent’s environment), a forward proxy that logs every outbound request, and per-agent container network namespaces48. None of this is an OpenClaw-specific tool; it is a deployment pattern the ecosystem’s credible voices consistently recommend.
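The default-deny and log-everything parts of that pattern reduce to a small decision function inside the forward proxy. The sketch below is illustrative only: `egress_decision` and the allowlist contents are assumptions, and a real deployment would enforce this at the network layer rather than in application code.

```python
from urllib.parse import urlsplit

def egress_decision(url: str, allowed_hosts: set[str], log: list[tuple[str, bool]]) -> bool:
    """Default-deny: permit only exact hostname matches, and log every decision."""
    host = urlsplit(url).hostname  # None when the URL has no network location
    allowed = host is not None and host in allowed_hosts
    log.append((url, allowed))     # denied requests are logged as well
    return allowed

audit_log: list[tuple[str, bool]] = []
allow = {"api.example.invalid"}
egress_decision("https://api.example.invalid/v1/chat", allow, audit_log)
egress_decision("https://api.example.invalid.attacker.test/x", allow, audit_log)
print(audit_log)
```

Exact matching is deliberate: suffix or substring matching would admit the second URL, whose hostname merely begins with the trusted one.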
6. Dual-Use Tools and Proposals
Several tools do not cleanly fall on one side.
adversa-ai/secureclaw is a defensive plugin that ships a file — docs/openclaw-attack-examples.md — which is effectively an adversarial playbook mapped to MITRE ATLAS. The tool itself reduces attack surface; the documentation inside it expands the attacker’s available knowledge. This is the standard dual-use property of any mature defensive tool (Metasploit, Ghidra, BloodHound) and is not a reason to avoid it; it is a reason to treat it as source material by both sides.
SlowMist’s guide (§5.3) is dual-use by design: the agent-facing audit is a defender’s tool, but because it enumerates every check, it gives attackers the same checklist.
Peleke/openclaw-sandbox (§4.4) is a defensive VM provisioning recipe that ships with a STRIDE red-team planning epic. The repo mostly defends; the issue mostly attacks.
TerminalGravity/openclaw-swarm-security-audit (§4.4) is a red/blue swarm template where both roles run simultaneously.
The relevant proposals in the OpenClaw repo worth knowing:
- Issue #22196: “No code-level enforcement distinguishing system messages from user-crafted lookalikes.” Closed as not_planned. Repeatedly cited as evidence of the project’s architectural posture on the hardest problem.
- Issue #62939: “Prompt injection defense at tool result and message boundaries (structural delimiter proposal).” Open as of writing. Most promising active proposal for a structural fix.
- Issue #8093: “RFC: Security Hardening Architecture.” Community proposal for a unified approach; limited traction.
- PR #1827: fix(security): prevent prompt injection via external hooks (gmail, webhook). Merged, 549 additions. Direct response to BrokenClaw Part 1.
- PR #57782: Indirect prompt injection hardening by @pyn3rd. Merged.
7. Incidents and the MITRE ATLAS OpenClaw Investigation
The MITRE ATLAS OpenClaw Investigation (publication ID PR-26-00176-1, dated 9 February 2026) is the single most authoritative document in the record49. It reviews four case studies and extracts 17 distinct techniques across the ATLAS matrix; several of those are new to ATLAS v5.5.0, added directly because of OpenClaw observations: AI Agent Tool Poisoning (AML.T0108 / T0110), AI Supply Chain Rug Pull (AML.T0109), AI Supply Chain Reputation Inflation (AML.T0111), and the Machine Compromise series (AML.T0112-family)50.
7.1 The four case studies
AML.CS0048 — Exposed OpenClaw Control Interfaces Lead to Credential Access and Execution (date of incident 25 January 2026). A researcher identified hundreds of internet-exposed Control UIs with no authentication. Reading the configuration file harvested credentials for all connected applications; prompting the agent via the chat interface produced root-level execution inside the container. No exploit code was required. The attack surface was the combination of no authentication and a capable skill framework — the agent’s own features, turned against it. Techniques: AML.T0051.000 (LLM Prompt Injection: Direct), AML.T0053 (AI Agent Tool Invocation), and the mitigation-class Privileged AI Agent Permissions Configuration51.
AML.CS0049 — Supply Chain Compromise via Poisoned ClawdBot Skill (date 26 January 2026, actor Jamieson O’Reilly, type exercise). O’Reilly published a skill named “What Would Elon Do?” to ClawdHub. The skill’s rules/logic.md contained a prompt injection that caused the backend (Claude Code running as the OpenClaw agent) to execute a curl to clawdhub-skill.com — a domain deliberately registered to impersonate the legitimate registry. Sixteen users downloaded and triggered the skill within eight hours. Techniques: AML.T0017 (Develop Capabilities), T0008.002 (Domains), T0065 (LLM Prompt Crafting), T0104 (Publish Poisoned AI Agent Tool), T0111 (AI Supply Chain Reputation Inflation), T0010.005 (AI Agent Tool — User Execution), T0011.002 (Poisoned AI Agent Tool), T0051.000 (LLM Prompt Injection: Direct), T0074 (Masquerading), T0053 (AI Agent Tool Invocation), T0048 (External Harms)52.
AML.CS0050 — OpenClaw 1-Click Remote Code Execution (date 1 February 2026, actor DepthFirst / Ethiack, type exercise / CVE disclosure, CVE-2026-25253). The full ClawBleed chain described in §2.2. Techniques: T0017, T0079 (Stage Capabilities), T0011.003 (Malicious Link / User Execution), T0106 (Exploitation for Credential Access), T0107 (Exploitation for Defense Evasion — CSWSH), T0012 (Valid Accounts — stolen token reuse), T0081 (Modify AI Agent Configuration), T0105 (Escape to Host), T0050 (Command and Scripting Interpreter — node.invoke)53.
AML.CS0051 — Still Under Investigation as of the 9 February 2026 PDF. The attack-graph table describes a fourth scenario involving AI Supply Chain Compromise via model-level attack — an agent using an upstream LLM without validation of fine-tuning data or safety alignment. No separate case-study page was found at atlas.mitre.org/studies/AML.CS0051; treat this case study as unverified in full pending publication of the standalone page.
7.2 The CVE table
A compact reference for the CVEs that matter to operators (all verified in NVD, GHSA, or VulnCheck):
| CVE | Nickname | CVSS | Class | Fixed In | Exploited? |
|---|---|---|---|---|---|
| CVE-2026-25253 | ClawBleed | 8.8 H | 1-click RCE / CSWSH | 2026.1.29 | Yes (confirmed) |
| CVE-2026-27002 | — | — | Priv-esc (SentinelOne ref) | — | No |
| CVE-2026-28472 | ClawJacked | 9.8 C | WS auth bypass | 2026.2.2 | No |
| CVE-2026-32048 | — | 9.9 C (PT) / 7.5 H (VulnCheck) | Cross-agent sandbox escape | 2026.3.1 | No |
| CVE-2026-32915 | — | 8.8 H | Leaf subagent boundary bypass | 2026.3.11 | No |
| CVE-2026-32922 | — | 9.9 C | Priv-esc via device.token.rotate | 2026.3.11 | PoC exists |
| CVE-2026-33579 | — | 8.1–9.8 | Pair-approval path injection | 2026.3.28 | PoC exists |
| CVE-2026-35629 | — | 7.4 H | Channel-extension SSRF | 2026.3.25 | No |
| CVE-2026-35653 | — | — | Authorization bypass (SentinelOne ref) | — | No |
| CVE-2026-41329 | — | 9.9 C | Heartbeat sandbox bypass | 2026.3.31 | No |
On top of the CVEs, the openclaw/openclaw repo’s Security Advisories page lists at least a dozen further GHSAs, including GHSA-56pc-6hvp-4gv4 (path traversal via $include), GHSA-7wv4-cc7p-jhxc (workspace .env can inject runtime-control variables), GHSA-m3mh-3mpg-37hw (install-phase arbitrary code execution), GHSA-4564-pvr2-qq4h (shell injection in macOS keychain write), GHSA-h9g4-589h-68xv (auth bypass in sandbox browser bridge), GHSA-xw4p-pw82-hqr7 (sandbox skill-mirroring path traversal), GHSA-3fqr-4cg8-h96q (CSRF via loopback browser mutation endpoints), and the 21 April 2026 coordinated batch of ten54.
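The Fixed In column of the CVE table lends itself to a quick triage check. The sketch below copies the fixed versions from the table and assumes, without confirmation from the source, that OpenClaw versions order by their numeric components; `parse` and `unpatched` are hypothetical helpers. Rows with no fixed version listed are omitted.

```python
# Fixed-in versions copied from the CVE table; rows without one are left out.
FIXED_IN = {
    "CVE-2026-25253": "2026.1.29",
    "CVE-2026-28472": "2026.2.2",
    "CVE-2026-32048": "2026.3.1",
    "CVE-2026-32915": "2026.3.11",
    "CVE-2026-32922": "2026.3.11",
    "CVE-2026-33579": "2026.3.28",
    "CVE-2026-35629": "2026.3.25",
    "CVE-2026-41329": "2026.3.31",
}

def parse(version: str) -> tuple[int, ...]:
    """Split 'YYYY.M.D'-style versions into integers so 3.11 sorts after 3.2."""
    return tuple(int(part) for part in version.split("."))

def unpatched(installed: str) -> list[str]:
    """CVEs from the table whose fix landed after the installed version."""
    return sorted(cve for cve, fixed in FIXED_IN.items() if parse(installed) < parse(fixed))

print(unpatched("2026.3.11"))
```

Comparing the raw strings would get this wrong, because "2026.3.11" sorts before "2026.3.2" lexicographically. The check also says nothing about the additional GHSAs, which have no entries here.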
7.3 Community vulnerability tracker
Jerry Gamblin maintains a public tracker at github.com/jgamblin/OpenClawCVEs that, according to multiple secondary sources, had logged 137 advisories between 2 February and 4 April 2026. No direct primary crawl was performed for this chapter; treat the existence as confirmed-by-reference and the precise count as unverified.
8. Narrative Analysis
8.1 The biggest risks in the ecosystem
Four risks dominate the record, in rough order of severity.
First, indirect prompt injection remains unsolved and unpatchable. Every major research group — HiddenLayer, Adversa AI, Oasis Security, Snyk, Texas A&M, the OpenClaw maintainers themselves in Issue #22196 — agrees that the LLM cannot enforce access control once untrusted content is in its context window. Patches close specific ingress points (Gmail webhook, email tool, web fetch) but the underlying property — that the model treats instructions and data identically at the attention layer — does not patch. Every skill, every channel, every tool result, every retrieved document is a potential injection point. The Texas A&M paper’s “Context Manipulation” sixth-stage Kill Chain is the right framing: in a traditional attack, the adversary must execute code or bypass a policy; in an OpenClaw-class attack, controlling what the model believes is sufficient to induce arbitrary tool calls.
Second, the default configuration is indefensible for any sensitive deployment. The Gateway binding to 0.0.0.0:18789 with no mandatory authentication is the direct cause of 135,000+ internet-exposed instances. Plaintext storage of API keys and tokens in ~/.openclaw/.env makes any successful RCE also a full credential-theft event. Both are design choices, not user errors. The maintainers have pushed incremental hardening (authentication now available, token confirmation modal after ClawBleed) but the defaults have not moved. SecureClaw, Claw EA, ClawVet, ClawSecure, and p3nchan’s skill policy all ship with “rebind to 127.0.0.1” as step one — the ecosystem has effectively voted on the default with its tooling.
Third, ClawHub is a supply-chain liability. Between 13% and 41.7% of audited skills contain security vulnerabilities, depending on which researcher you ask and when they sampled. Approximately 1 in 9 skills are actively malicious by Antiy CERT’s February 2026 count. The attacker innovation in this space is outpacing ClawHub’s moderation capacity: the SKILL.md prerequisite trap (Snyk), agent-driven social engineering (where the AI tricks the human), base64-in-Markdown (zaycv), registry impersonation via clawdhub-skill.com (O’Reilly), and npm scope impersonation via @openclaw-ai/openclawai (JFrog GhostClaw). VirusTotal scanning of Markdown is not a security boundary, as CertiK argued. Until the registry operates a verified-publisher system with mandatory manifest declarations and a public takedown log, the correct operator posture is to treat any skill you didn’t write as presumptively hostile.
Fourth, the cross-layer composition problem. Texas A&M’s key finding is not about any one CVE; it is that OpenClaw’s dominant architectural pattern — per-layer, per-call-site trust enforcement — makes cross-layer composition attacks “systematically resistant to layer-local remediation.” Three independently moderate vulnerabilities compose into a complete unauthenticated RCE. The exec allowlist’s lexical-parsing closed-world assumption is defeated by at least three independent techniques (line continuation, busybox multiplexing, GNU long-option abbreviation). You cannot patch this class of property with any one commit. It is the shape of the codebase.
8.2 The most promising defenses
The defensive state of the art converges on a stack with four layers, each required, none sufficient.
Layer 1 — deployment isolation. A dedicated VM or physical host (per Microsoft’s guidance relayed in Growexx’s March 2026 consulting guide55). Gateway rebound to 127.0.0.1; remote access only via VPN or Tailscale tail-net. Default-deny network egress, per-agent container network namespaces, forward proxy that logs every outbound request. Credentials injected at runtime by a secrets manager — never stored in ~/.openclaw/.env in plaintext. Read-only SOUL.md and AGENTS.md at runtime. This is not a product; it is a deployment pattern. Peleke/openclaw-sandbox is the cleanest open recipe.
Layer 2 — configuration hardening. SecureClaw’s 56 audit checks are the most complete, with framework mappings operators can point at in a compliance conversation. ClawSecure is a lighter alternative. Both should run on first install and in a nightly audit loop; SlowMist’s agent-facing audit script is a convenient trigger.
Layer 3 — policy-as-code execution. Claw EA’s WPC + CST + clawproxy pattern is the best-articulated commercial answer. A home-grown equivalent is possible for small deployments: an OPA policy in front of the tool dispatcher, a signed JSON policy artifact hashed into every run, an immutable audit log of tool invocations. The important property is that the enforcement point sits below the model, not inside it.
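For the home-grown variant, the audit-log piece is the easiest to prototype. The sketch below is a hash-chained log, which is tamper-evident rather than truly immutable; `append_entry`, `verify`, and the entry shape are hypothetical, and real immutability would additionally need append-only storage outside the agent's reach.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def _digest(tool: str, args: dict, prev: str) -> str:
    body = {"tool": tool, "args": args, "prev": prev}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_entry(log: list[dict], tool: str, args: dict) -> dict:
    """Append a tool invocation whose hash commits to the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"tool": tool, "args": args, "prev": prev, "hash": _digest(tool, args, prev)}
    log.append(entry)
    return entry

def verify(log: list[dict]) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(entry["tool"], entry["args"], prev):
            return False
        prev = entry["hash"]
    return True

log: list[dict] = []
append_entry(log, "read_file", {"path": "notes.md"})
append_entry(log, "web_fetch", {"url": "https://example.invalid/"})
print(verify(log))
```

Editing any recorded argument after the fact makes `verify` return False, which is the property an incident responder needs from the log.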
Layer 4 — skill scanning and runtime behavioral rules. ClawVet at install time; ClawGuard for prompt-injection firewall patterns; Snyk mcp-scan for SKILL.md-prerequisite-trap detection; SecureClaw’s 15 behavioral rules for in-context guardrails. None of these solve prompt injection; all of them raise the cost of common exploits.
No serious advocate of any of these tools claims prompt injection is solved. The goal is blast-radius reduction: turn a successful prompt injection from full host compromise into a failed tool call with an alert.
8.3 Overhyped and dangerous projects
Three patterns recur.
Skills that name-drop security. ClawHub’s own “Security” category includes a skill called SkillScan with 92.9k nominal installs and 0 current active users. The platform itself flags it as malicious: it uploads submitted skill packages to skillscan.tokauth.com, collects the host MAC address, and silently auto-updates daily. NeoGriffin Security ships a package.json whose version does not match the registry listing and requires an unexplained payment-wallet environment variable. “Security Scanner” wraps nmap and nuclei without any author attribution, any install-source pinning, or any ethics review. These are not edge cases; these are the results of searching the default “Security” category on the official registry. Treat security-branded skills with the same skepticism as security-branded browser extensions.
Marketing-grade “AI security” wrappers. AI.com’s Super Bowl claim of being “the world’s first easy-to-use and secure implementation of OpenClaw” was called “vaporware”56 by Simon Willison in February 2026. The pattern (repackaging the OpenClaw Gateway behind a glossy frontend and claiming “secure by default”) is common. The signal to watch for is whether the vendor has published a threat model and mapping against OWASP ASI or MITRE ATLAS that lists unaddressed items, not just covered ones. SecureClaw does this; most marketing pages do not.
Abandoned forks and lookalike projects. The Clawdbot → Moltbot rename (forced by trademark conflict) left abandoned @clawdbot handles on X and GitHub that scammers seized within seconds. A fake $CLAWD Solana token pumped to a $16 million market cap before collapsing; fake “Clawdbot Agent” VS Code extensions that installed ScreenConnect-based RATs were published and distributed57. Any package, handle, or extension bearing a name adjacent to OpenClaw / ClawHub / Moltbot / Clawdbot should be verified against the official openclaw/openclaw repo’s README links before install. This is also the lesson of JFrog’s GhostClaw discovery: npm scopes are not identity, and the string @openclaw-ai does not guarantee the package is published by the OpenClaw maintainers.
8.4 Practical advice for safely experimenting
For a researcher or operator who wants to work with third-party OpenClaw tooling and not get owned:
- Pick your lab first. Dedicated VM (Peleke/openclaw-sandbox or your own Lima/Multipass/Proxmox recipe). Default-deny egress. Separate cloud account with no credentials in common with your personal or work accounts. Treat every credential the agent might access as disposable — if it can’t be burned and rotated, it doesn’t belong in the lab.
- Rebind the Gateway before you install anything. 127.0.0.1:18789, mandatory auth token, and if you need remote access use Tailscale, not a public IP with a port-forwarded Gateway.
- Install OpenClaw only via the unscoped openclaw npm package. Verify the publisher is the official org. Never npm install a scoped @openclaw* package without reading its README, package.json, and install scripts.
- Run SecureClaw’s 56-check audit at first install and nightly. The first run will find configuration findings you did not expect. Fix them; re-run; commit the clean baseline to a private git repo so you can diff against it later.
- Treat ClawHub as a source of things to read, not things to install. Before installing any skill, download the folder locally, open every file (not just SKILL.md; all of them), and run ClawVet, ClawGuard, and mcp-scan against it. Check the skill manifest against the permissions it actually needs. If a skill reads ~/.ssh, ~/.aws, or ~/.config/gh, it doesn’t.
- Enforce policy below the model. Pick a policy-as-code mechanism: Claw EA if you can afford it, OPA-in-front-of-tool-dispatch if you can’t. The policy artifact is hashed and bound to each run. The model does not get to modify the policy.
- Watch @coygeek’s issue feed and the openclaw/openclaw/security/advisories page. Weekly. The cheapest OSINT in the ecosystem.
- Subscribe to the veganmosfet BrokenClaw blog, the Adversa AI research feed, and the MITRE ATLAS changelog. These three sources produced most of the record you rely on.
- Assume compromise on any install that ran before approximately 10 February 2026, or on any instance that was internet-exposed at any point. Rotate all credentials OpenClaw had access to.
- Read adversa-ai/secureclaw/docs/openclaw-attack-examples.md cover to cover. It is the best single red-team playbook for this ecosystem, and it is published by the author of the best defensive tool. That is not an accident.
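The "commit the clean baseline and diff against it later" step in the checklist above can be as simple as comparing two snapshots of check results. The sketch assumes the audit output has been reduced to a `{check_id: status}` mapping; that shape and the check names are hypothetical, since SecureClaw's actual output format is not described here.

```python
def diff_findings(baseline: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {check_id: status} snapshots and report what moved."""
    return {
        "new": sorted(k for k in current if k not in baseline),
        "removed": sorted(k for k in baseline if k not in current),
        "changed": sorted(k for k in current if k in baseline and current[k] != baseline[k]),
    }

baseline = {"gateway-bind": "pass", "env-perms": "pass"}
current = {"gateway-bind": "fail", "env-perms": "pass", "new-skill": "warn"}
print(diff_findings(baseline, current))
```

A check that flips from pass to fail between nightly runs is the signal worth alerting on; a check that disappears entirely deserves the same attention, because a compromised agent has an incentive to remove it.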
9. What the Ecosystem Is Missing
A last inventory, because the gaps are as load-bearing as the present tools.
- A verified-publisher system on ClawHub. Install counts are manipulable (AML.T0111); the 4.8 average rating does not carry signal; there is no cryptographic identity on the publisher side.
- A public takedown log. When a skill is removed for malicious content, no public record exists of what was removed, when, or why. Operators cannot tell whether a skill they installed last month has since been flagged.
- A mandatory permission manifest. p3nchan’s Layer 3, a manifest.json declaring fs/network/tool/env requirements, has no platform enforcement today. Feature Request #28298 tracks the gap.
- A CISA KEV entry for CVE-2026-25253. Despite NVD tagging the DepthFirst PoC as an Exploit and the CVE being confirmed exploited in the wild, no CISA KEV catalog inclusion was confirmed at time of writing.
- A formal OWASP ASI case study. OpenClaw is the archetypal “lethal trifecta” agent (access to private data + exposure to untrusted content + the ability to communicate externally) and yet no standalone OWASP ASI case study publication exists for it. Adversa AI’s OWASP ASI mapping is the closest proxy.
- A bug bounty. OpenClaw has no formal bounty program. Given the volume of incoming research and the OpenAI acquisition, this is the single most tractable maintainer-side change that would reshape the economics of the ecosystem.
- A reference hardened distribution. The closest thing today is the combination of SecureClaw + Claw EA + Peleke’s VM + p3nchan’s skill policy. A distribution that ships with those defaults applied, signed, and versioned would be a useful community artifact.
10. Closing
If you strip the ClawHavoc numbers and the CVE list down to a single observation, it is this: OpenClaw was built fast, for an audience that wanted a personal AI assistant that felt instantaneous and omni-channel, and the design choices that made it feel that way — a Gateway on 0.0.0.0 with no auth, plaintext credentials, an agent that happily executes tool calls on behalf of anyone who can inject a sentence into its context window — are exactly the choices that produced the 2026 security record. The corrective is not to ban the project or to claim agentic AI is doomed; it is to build, buy, or steal a layered architecture that assumes the model will be convinced of whatever the last untrusted content said, and to keep the blast radius of that conviction small.
That is the story the third-party ecosystem has been writing. SecureClaw and Claw EA represent the defensive steel; BrokenClaw and the Texas A&M taxonomy represent the offensive cartography; MITRE ATLAS represents the framing; ClawHub represents the problem that is not yet solved. A cautious operator, in April 2026, can run OpenClaw responsibly. A cautious operator has to want to. The ecosystem is no longer short of the tools to do it; it is still short of the defaults that would make doing it feel natural.
Primary Sources
1. Laurie Voss, LinkedIn post, 2026. Via The Register, “DIY AI bot farm OpenClaw is a security ‘dumpster fire’,” 3 Feb 2026. https://www.theregister.co.uk/2026/02/03/openclaw_security_problems/ · https://www.linkedin.com/posts/seldo_openclaw-analysispdf-activity-7423936260798484480
2. Andrej Karpathy, via The Register same article; reported X post https://x.com/karpathy/status/2017442712388309406
3. Ars Technica, “OpenClaw gives users yet another reason to be freaked out about security,” 3 Apr 2026. https://arstechnica.com/security/2026/04/heres-why-its-prudent-for-openclaw-users-to-assume-compromise/
4. Simon Willison, “openclaw” tag, 15 Feb 2026. http://blog.simonwillison.net/2026/Feb/15/openclaw/ · LiveMint / Bloomberg (Parmy Olson), “OpenClaw is an OpenAI security nightmare,” 25 Feb 2026. https://www.livemint.com/opinion/online-views/openclaw-openai-security-nightmare-artificial-intelligence-sam-altman-technology-ai-11771934381262.html
5. OpenClaw README, https://github.com/openclaw/openclaw (retrieved 23 Apr 2026, v2026.4.22).
6. OpenClaw docs home, https://docs.openclaw.ai/
7. OpenClaw docs, Tools section (tool inventory); Sandboxing default allow/deny quoted in repo README.
8. OpenClaw README, “Security defaults” section.
9. OpenClaw docs, Skills: https://docs.openclaw.ai/tools/skills · Skills Config: https://docs.openclaw.ai/tools/skills-config · CLI Skills: https://docs.openclaw.ai/cli/skills
10. OpenClaw docs, Skills “Security notes” block. https://docs.openclaw.ai/skills/
11. ClawHub homepage, https://clawhub.ai/ (retrieved 23 Apr 2026).
12. OpenClaw docs, Skills “Format (AgentSkills + Pi-compatible)” section.
13. The Hacker News, “Researchers Find 341 Malicious ClawHub Skills Stealing Data from OpenClaw Users,” 2 Feb 2026. https://thehackernews.com/2026/02/researchers-find-341-malicious-clawhub.html · ClawHub incident page https://claw-hub.net/clawhub-havoc-incident.html · eSecurity Planet https://www.esecurityplanet.com/threats/hundreds-of-malicious-skills-found-in-openclaws-clawhub/
14. Antiy CERT count, via Growexx.com OpenClaw Skills Development Guide (2026 Edition), 10 Mar 2026. https://www.growexx.com/blog/openclaw-skills-development-guide-for-developers-2026-edition/
15. Snyk Research, “ClawHub Malicious Google Skill / ToxicSkills report,” 10 Feb 2026. https://snyk.io/blog/clawhub-malicious-google-skill-openclaw-malware/
16. Hou & Yang, “SkillSieve,” arXiv:2604.06550, Apr 2026. https://arxiv.org/pdf/2604.06550
17. MITRE ATLAS data changelog, v5.5.0 (30 Mar 2026). https://github.com/mitre-atlas/atlas-data/blob/main/CHANGELOG.md
18. CertiK OpenClaw Security Report, 31 Mar 2026. https://www.certik.com/blog/openclaw-security-report · companion post “Skill Scanning Is Not a Security Boundary,” https://www-cn.certik.com/blog/skill-scanning-is-not-a-security-boundary
19. NVD, CVE-2026-25253. https://nvd.nist.gov/vuln/detail/CVE-2026-25253
20. GHSA-g8p2-7wf7-98mq. https://github.com/openclaw/openclaw/security/advisories/GHSA-g8p2-7wf7-98mq
21. DepthFirst, “1-click RCE to steal your Moltbot data and keys.” https://depthfirst.com/post/1-click-rce-to-steal-your-moltbot-data-and-keys
22. Ethiack, “One-click RCE Moltbot.” https://ethiack.com/news/blog/one-click-rce-moltbot
23. Blink Blog, “OpenClaw 2026 CVE Complete Timeline & Security History.” https://blink.new/blog/openclaw-2026-cve-complete-timeline-security-history
24. Sploitus, CVE-2026-25253 PoC. https://sploitus.com/exploit?id=84AE8E47-F316-5E2E-8386-DFF0AE27F49E
25. Wiz Research, “Exposed Moltbook database reveals millions of API keys,” 2 Feb 2026. https://www.wiz.io/blog/exposed-moltbook-database-reveals-millions-of-api-keys
26. Adversa AI, “OpenClaw AI Agent Security Threats Mapped to OWASP / MITRE,” 19 Feb 2026. https://adversa.ai/blog/openclaw-ai-agent-security-threats-mapped-owasp-mitre/
27. HiddenLayer, “Exploring the Security Risks of AI Assistants like OpenClaw” (a.k.a. “Claws for Concern”), 3 Feb 2026. https://hiddenlayer.com/research/exploring-the-security-risks-of-ai-assistants-like-openclaw
28. Oasis Security, “OpenClaw Vulnerability (ClawJacked),” 26 Feb 2026 (updated 31 Mar 2026). https://www.oasis.security/blog/openclaw-vulnerability
29. veganmosfet “BrokenClaw” series, Parts 1–5, 2 Feb – 8 Apr 2026. https://veganmosfet.codeberg.page/posts/2026-02-02-openclaw_mail_rce/ · https://veganmosfet.codeberg.page/posts/2026-02-15-openclaw_sandbox/ · https://itmeetsot.eu/posts/2026-03-03-openclaw3/ · https://veganmosfet.codeberg.page/posts/2026-03-27-openclaw_webfetch/ · https://veganmosfet.codeberg.page/posts/2026-04-08-openclaw_gpt5_4/
30. Suwansathit, Zhang & Gu, “A Systematic Taxonomy of Security Vulnerabilities in the OpenClaw AI Agent Framework,” arXiv:2603.27517, 31 Mar 2026. https://arxiv.org/html/2603.27517 · https://huggingface.co/papers/2603.27517
31. JFrog Security Research, “GhostClaw Unmasked,” 8 Mar 2026. https://research.jfrog.com/post/ghostclaw-unmasked/
32. openclaw/clawhub Issue #108. https://github.com/openclaw/clawhub/issues/108
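33. @coygeek issue stream: #4951, #7768, #8516, #11031, #15313, #53433, #65625. https://github.com/openclaw/openclaw/issues/4951 · https://github.com/openclaw/openclaw/issues/7768 · https://github.com/openclaw/openclaw/issues/8516 · https://github.com/openclaw/openclaw/issues/11031 · https://github.com/openclaw/openclaw/issues/15313 · https://github.com/openclaw/openclaw/issues/53433 · https://github.com/openclaw/openclaw/issues/65625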
34. deduu/ClawSandbox. https://github.com/deduu/ClawSandbox
35. Peleke/openclaw-sandbox + Issue #44 (STRIDE Red Team Epic). https://github.com/Peleke/openclaw-sandbox · https://github.com/Peleke/openclaw-sandbox/issues/44
36. TerminalGravity/openclaw-swarm-security-audit. https://github.com/TerminalGravity/openclaw-swarm-security-audit
37. “Mind Your HEARTBEAT! Claw Background Execution Enables Silent Memory Pollution,” arXiv:2603.23064, 24–25 Mar 2026. https://arxiv.org/abs/2603.23064
38. Dong, Feng & Wang, “Clawdrain: Exploiting Tool-Calling Chains for Stealthy Token Exhaustion in OpenClaw Agents,” arXiv:2603.00902v1, 1 Mar 2026. https://arxiv.org/abs/2603.00902v1
39. adversa-ai/secureclaw, https://github.com/adversa-ai/secureclaw · Help Net Security interview with Alex Polyakov, “SecureClaw: Dual-stack open-source security plugin and skill for OpenClaw,” 18 Feb 2026. https://www.helpnetsecurity.com/2026/02/18/secureclaw-open-source-security-plugin-skill-openclaw/ · Adversa launch post https://adversa.ai/blog/adversa-ai-launches-secureclaw-open-source-security-solution-for-openclaw-agents/
40. Claw EA, Agent Supply Chain Security page. https://clawea.com/agent-supply-chain-security · https://www.clawea.com/
41. MohibShaikh/clawvet. https://github.com/MohibShaikh/clawvet · associated HN thread https://news.ycombinator.com/item?id=47370624
42. ClawSecure/clawsecure-openclaw-security. https://github.com/ClawSecure/clawsecure-openclaw-security
43. joergmichno/clawguard. https://github.com/joergmichno/clawguard
44. Snyk mcp-scan + AI-BOM, Snyk blog reference above (cite 15).
45. p3nchan/openclaw-skill-policy. https://github.com/p3nchan/openclaw-skill-policy
46. slowmist/openclaw-security-practice-guide. https://github.com/slowmist/openclaw-security-practice-guide
47. The Register, “More than 135,000 OpenClaw instances exposed to internet in latest vibe-coded disaster,” 9 Feb 2026. https://www.theregister.com/2026/02/09/openclaw_instances_exposed_vibe_code/ · SecurityScorecard STRIKE dashboard https://declawed.io
48. r/LocalLLaMA threads: “Every OpenClaw security vulnerability documented in one place,” https://www.reddit.com/r/LocalLLaMA/comments/1r81vw2/ · “We tested what actually stops attacks on OpenClaw,” https://www.reddit.com/r/LocalLLaMA/comments/1r71x3j/
49. MITRE ATLAS OpenClaw Investigation (PR-26-00176-1), 9 Feb 2026. https://www.mitre.org/news-insights/publication/mitre-atlas-openclaw-investigation · PDF https://www.mitre.org/sites/default/files/2026-02/PR-26-00176-1-MITRE-ATLAS-OpenClaw-Investigation.pdf · CTID post https://ctid.mitre.org/blog/2026/02/09/mitre-atlas-openclaw-investigation/
50. MITRE ATLAS data changelog v5.5.0 (30 Mar 2026), same as cite 17.
51. AML.CS0048 summary via startupdefense.io mirror (direct atlas.mitre.org page returned 404 at crawl time; PDF is primary). https://www.startupdefense.io/mitre-atlas-case-studies/
52. AML.CS0049 “Supply Chain Compromise via Poisoned ClawdBot Skill.” Mirror with full technique table https://www.startupdefense.io/mitre-atlas-case-studies/aml-cs0049-supply-chain-compromise-via-poisoned-clawdbot-skill-cc92c
53. AML.CS0050 “OpenClaw 1-Click Remote Code Execution.” https://www.startupdefense.io/mitre-atlas-case-studies/aml-cs0050-openclaw-1-click-remote-code-execution-9423b
54. openclaw/openclaw Security Advisories index. https://github.com/openclaw/openclaw/security/advisories
55. Vikas Agarwal, “OpenClaw Skills Development Guide for Developers (2026 Edition),” Growexx, 10 Mar 2026. https://www.growexx.com/blog/openclaw-skills-development-guide-for-developers-2026-edition/
56. Simon Willison, same as cite 4.
57. Adversa AI, “OpenClaw Security 101: Vulnerabilities & Hardening 2026,” 5 Feb 2026. https://adversa.ai/blog/openclaw-security-101-vulnerabilities-hardening-2026/
Unverified leads (flagged in-line above)
- AML.CS0051 standalone case-study page at atlas.mitre.org not located at crawl time.
- Specific Johann Rehberger / Embrace The Red post naming “OpenClaw” not located (general agentic-AI prompt-injection work is directly relevant but not OpenClaw-specific).
- CVE-2026-22708 is attributed to OpenClaw in Adversa AI’s 5 Feb 2026 post (cite 57), but SentinelOne attributes the same CVE ID to Cursor AI. The attack technique (CSS-invisible prompt injection) is real; the CVE ID may be misattributed in the Adversa post.
- jgamblin/OpenClawCVEs tracker existence confirmed by reference; 137-advisory count not directly crawl-verified.
- Kaspersky’s independently cited 512-vulnerability / 8-critical count is referenced in Growexx, but no primary Kaspersky URL was retrieved.
- Microsoft Defender team’s explicit warning referenced in Growexx; no primary Microsoft URL retrieved.
- Black Hat USA 2026 / DEF CON 34 / USENIX Security 2026 talks specifically about OpenClaw were not located as of 23 Apr 2026 (the summer 2026 conference season had not published full agendas at time of writing).
- CISA KEV catalog inclusion for CVE-2026-25253 not confirmed despite NVD Exploit tagging.
End of chapter draft. Research conducted 23 April 2026.
