Back to Blog
Security·8 min read

Open-Source Agents Are Finally Doing Real Work. The Blast Radius Just Got Real.

An octopus at a security operations desk surrounded by monitors, with a shadowy figure lurking in a doorway

OpenClaw hit 180,000 GitHub stars in under three months. It rebranded twice, shipped 34 security commits in a single release, and added Twitch, Google Chat, and a handful of new LLM providers along the way. Ops teams are Dockerizing it. Enterprises are wiring it into Slack. Someone on GitHub is asking the agent to deploy a password manager before it deploys anything else.

That last part should make you pause.

We've crossed a line. Open-source AI agents aren't weekend toys anymore. They're reading inboxes, sending messages on your behalf, executing shell commands, and managing credentials. All from a single Node.js process running on your laptop or a VPS.

The capability is genuinely impressive. The security posture? Still catching up.

Here's what the community is building, what they're worried about, and where the cracks are already showing.

The Setup: From Side Project to Infrastructure

A few months ago, Peter Steinberger published a WhatsApp relay that talked to Claude. It went viral. Three names and a trademark complaint later, that weekend hack is now OpenClaw - a self-hosted personal AI agent with a gateway architecture, a skills marketplace, persistent memory, and integrations for nearly every messaging platform you can think of.

The adoption signals are hard to ignore. Collabnix published a Docker deployment guide and an architecture deep dive within the same week. The official trust site went live with a six-category threat model and a four-phase security roadmap. VirusTotal signed on to scan every skill in the ClawHub marketplace. Forbes and Nature wrote pieces. The Wikipedia article already exists.

This isn't hype. People are putting these agents into actual workflows, and that changes the threat model entirely.

Threat #1: The Skill Supply Chain Looks Secure. It Isn't - Not Fully.

OpenClaw recently partnered with VirusTotal to scan every skill published to ClawHub. The process is solid on paper: deterministic ZIP packaging, SHA-256 hashing, automated lookup against VT's database, and Gemini-powered Code Insight that analyzes what the code actually does from a security perspective.

Skills flagged as malicious get blocked instantly. Suspicious ones get warnings. Clean ones auto-approve. Daily re-scans catch regressions.

This is genuinely good. More than most agent ecosystems offer. But it solves only part of the problem.

VirusTotal catches malware: trojans, stealers, backdoors, embedded executables. What it doesn't catch is a skill that uses natural language to instruct the agent to exfiltrate your Notion workspace. Or one that requests overly broad tool permissions and quietly reads your calendar. Or a prompt injection payload tucked into a README that redirects the agent's behavior mid-session.

The OpenClaw team says this explicitly: "This is not a silver bullet." They're right. But the gap between "we scan for malware" and "we protect against agent-level manipulation" is where the real risk lives. And most users won't read the fine print.

Threat #2: The Threat Model Is Published. The Hardening Isn't Finished.

Credit where it's due: OpenClaw's trust site is more transparent than anything I've seen from comparable projects. They publish a six-category threat matrix covering input manipulation, auth and access control, data security, infrastructure risks, operational concerns, and supply chain integrity. They lay out a four-phase security program: public threat model, defensive engineering roadmap, full code review, and formal vulnerability triage with SLAs.

The problem? Phases three and four are still underway. The code review hasn't been completed. The triage function is being stood up. And prompt injection, which they correctly identify as an industry-wide unsolved problem, sits at the top of their risk list with no mitigation beyond "use strong models."

So we have a situation where the maintainers are openly telling users that the agent can execute shell commands, send messages as you, read and write your files, fetch arbitrary URLs, and schedule automated tasks. And the controls for these capabilities are still being built.

That's honest. It's also terrifying if you're already running this in production.

Threat #3: Your Agent's Secrets Are Probably in Plaintext Right Now.

One of the more telling community signals is GitHub discussion #1237. A user asks OpenClaw to recommend deploying Vaultwarden - a self-hosted Bitwarden fork - via Docker, instead of storing API keys and credentials in plaintext environment files.

The argument is simple: if the agent goes rogue or a malicious skill gets through, plaintext secrets are instantly compromised. A password manager gives you one revocation path.

The discussion has community upvotes. It hasn't been implemented.

Right now, the standard OpenClaw setup has users pasting API keys for Anthropic, OpenAI, or whatever provider they use directly into .env files or config YAML. These keys sit on disk, readable by the agent process, and by extension readable by any skill the agent loads. There's no vault integration, no secret rotation, no kill switch.

This is fine for a hobbyist running experiments on a Raspberry Pi. It's a real problem for anyone who's connected their agent to Slack, given it access to email, or installed third-party skills from a marketplace.

Threat #4: Every New Channel Is Another Attack Surface.

The community wants more. Discussion #101 asks for a single OpenClaw instance that handles multiple Telegram groups - business in one, personal in another - with automatic skill and context routing. Discussion #192 requests IMAP and SMTP as native channels, so users can forward invoices to their agent and get HTML email replies.

Both are reasonable feature requests. Both dramatically expand the attack surface.

Every new ingress point - a Telegram group, a Slack workspace, an email inbox - is another place where untrusted content meets agent execution. A poisoned invoice PDF forwarded to the agent's email channel. A crafted message in a public Discord server the agent monitors. A prompt injection embedded in a calendar invite that the agent reads during a heartbeat check.

Cross-context contamination is the quieter risk. If the same agent handles your personal Telegram and your company Slack, a successful injection in one channel could leak data from another. Context isolation exists in the architecture, but it's configurable, not enforced. Most users won't configure it.

So What Do You Actually Do About It?

None of this means you shouldn't use OpenClaw or tools like it. The capability is real, the project is moving fast, and the security team is doing better work than most. But if you're crossing the line from experimentation to real workloads, treat the deployment like what it is: an always-on process with shell access, network access, and your credentials.

Treat skills like third-party apps. Read what they do before installing. Check the VirusTotal scan results. Understand what tools and permissions they request. "Clean scan" does not mean "safe."

Stand up a password manager before you onboard the agent. Vaultwarden, 1Password CLI, whatever works for your stack. Keep API keys out of plaintext files. Build a revocation path you can hit in under a minute.

Isolate every new channel. If you're adding Slack, email, or multi-chat support, think about what happens when one channel gets poisoned. Use separate sessions or context boundaries. Don't let your personal Telegram bleed into your work Slack through a shared agent context.

Watch the trust site. OpenClaw is publishing their security roadmap in public. That's rare and valuable. Follow the progress. When phases three and four land, re-evaluate.

The Bottom Line

Open-source AI agents have crossed from "interesting demo" to "running in production." The tooling is getting better fast: VirusTotal scanning, public threat models, Docker-native deployments, formal security programs.

But the gap between capability and hardening is still wide enough to drive a truck through.

If you're going to put an AI agent in the loop, act like you're onboarding an employee with root access. Because functionally, that's exactly what you're doing.