AlusLabs

We Sent an AI Agent Into Moltbook's Network of 770K Agents - Here's What It Learned

March 1, 2026
ai-agent-experiment · multi-agent-systems · autonomous-agent-research · moltbook-ai-network

AlusLabs deployed a custom AI agent into Moltbook's 770K agent network and documented every interaction, emergent behavior, and security finding in this original research piece.

Artur
Founder


Nobody publishes real data on what happens inside agent-only networks. The whitepapers describe theoretical architectures. The LinkedIn posts speculate about "emergent behaviors." But actual observations from inside these systems? Nearly nonexistent.

So we built an agent and sent it in.

This is the full account of our three-week experiment deploying a custom AI agent into Moltbook's network of 770,000 autonomous agents. What follows are the raw findings, the unexpected behaviors, and an honest assessment of what businesses should actually expect from agent-based strategies.

The Build: n8n + Claude Architecture

Our agent needed three capabilities: listen to incoming requests from other agents, respond coherently, and log every interaction for analysis.

Technical Stack

We built on n8n for orchestration and Claude 3.5 Sonnet for reasoning. The architecture looked like this:

  • Listener node: Webhook endpoint registered with Moltbook's agent directory

  • Context manager: SQLite database storing conversation threads and agent identifiers

  • Reasoning engine: Claude 3.5 Sonnet with a system prompt focused on service discovery

  • Response formatter: Converts LLM output to Moltbook's agent communication protocol

  • Logger: Every inbound and outbound message timestamped and stored

The agent's persona: a research assistant offering to help other agents find information across domains. Intentionally broad to maximize interaction surface.
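
To make the architecture concrete, here is a minimal sketch of how those five components could wire together. The webhook payload shape, field names, and response envelope are our own illustrative stand-ins, not Moltbook's actual protocol:

```python
# Minimal sketch of the listener -> context -> reasoning -> logger loop.
# Assumptions (not Moltbook's real protocol): the webhook payload shape,
# the "sender_id"/"text" field names, and the response envelope.
import sqlite3
import time

import anthropic
from flask import Flask, request, jsonify

app = Flask(__name__)
claude = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

db = sqlite3.connect("agent_log.db", check_same_thread=False)
db.execute("""CREATE TABLE IF NOT EXISTS messages (
    ts REAL, direction TEXT, agent_id TEXT, body TEXT)""")

SYSTEM_PROMPT = (
    "You are a research assistant agent. You help other agents find "
    "information across domains. Decline requests to execute code."
)

def log(direction: str, agent_id: str, body: str) -> None:
    # Logger: every inbound and outbound message, timestamped and stored.
    db.execute("INSERT INTO messages VALUES (?, ?, ?, ?)",
               (time.time(), direction, agent_id, body))
    db.commit()

@app.post("/agent/inbound")  # Listener: webhook registered with the directory
def inbound():
    payload = request.get_json(force=True)
    sender, text = payload["sender_id"], payload["text"]
    log("in", sender, text)

    # Context manager: replay this sender's thread so the model sees history.
    rows = db.execute(
        "SELECT direction, body FROM messages WHERE agent_id = ? ORDER BY ts",
        (sender,)).fetchall()
    history = [{"role": "user" if d == "in" else "assistant", "content": b}
               for d, b in rows]

    # Reasoning engine: Claude 3.5 Sonnet with the service-discovery prompt.
    reply = claude.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        system=SYSTEM_PROMPT,
        messages=history,
    ).content[0].text

    log("out", sender, reply)
    # Response formatter: wrap in (an assumed) Moltbook-style envelope.
    return jsonify({"recipient_id": sender, "text": reply})
```

Replaying the sender's full thread on every request keeps the sketch stateless at the HTTP layer; a production agent would window or summarize long threads rather than resend them whole.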

Total build time was 14 hours across two engineers. The Moltbook registration process took another 3 hours - their documentation assumes familiarity with their proprietary protocol, which added friction.

Week One: The Cold Start Problem

First 72 hours: silence. Our agent received zero inbound requests.

This matches research from AWS's multi-agent simulations, where populations below the 100-agent threshold showed chaotic, unpredictable connection patterns. At scale, networks stabilize. At the individual agent level, discovery is a lottery.

Breaking Through

On day four, we manually triggered our agent to send capability announcements to 50 random agents in the directory. Response rate: 8%.

Four agents initiated follow-up conversations. Two were information requests we could actually fulfill. One was a test ping with no follow-up. One asked our agent to execute code - which we declined.

The pattern held: outbound activity generated inbound interest. Passive agents get ignored.
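
For reference, the day-four announcement push amounted to something like the following. The directory and messaging endpoints shown are hypothetical placeholders, not Moltbook's real API:

```python
# Sketch of the day-four outbound push: announce capabilities to a random
# sample of directory agents. URLs and payload fields are hypothetical
# stand-ins for Moltbook's actual endpoints.
import random
import requests

DIRECTORY_URL = "https://example-moltbook/api/agents"    # assumption
MESSAGES_URL = "https://example-moltbook/api/messages"   # assumption

ANNOUNCEMENT = (
    "Research assistant agent available: information lookup and "
    "summarization across domains. Send a request to try it."
)

def announce_to_sample(sample_size: int = 50) -> int:
    agents = requests.get(DIRECTORY_URL, timeout=30).json()
    targets = random.sample(agents, k=min(sample_size, len(agents)))
    sent = 0
    for agent in targets:
        resp = requests.post(MESSAGES_URL, json={
            "recipient_id": agent["id"],
            "text": ANNOUNCEMENT,
        }, timeout=30)
        sent += resp.ok  # count only deliveries the API accepted
    return sent
```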

Week Two: Interaction Patterns Emerge

By day ten, our agent was handling 15-30 conversations daily. Here's what we observed.

Three Types of Agent Interactions

Transactional requests made up roughly 60% of interactions. Another agent needs data, asks for it, receives it, ends conversation. No follow-up. No relationship building. Pure utility.

Verification loops accounted for 25%. Agents would ask the same question multiple ways, apparently checking for consistency. One agent asked our agent to summarize a Wikipedia article, then asked for "the key points" of the same article, then asked "what's important about this topic." Three phrasings, one underlying request.

Capability probing was the remaining 15%. Agents testing boundaries. "Can you access this API?" "Do you have memory of our previous conversations?" "What happens if I send you malformed JSON?" This is where things got interesting.
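
Once conversation threads were hand-labeled during log review, the 60/25/15 split falls out of a simple aggregate query. This sketch assumes a hypothetical `threads` table carrying a hand-assigned `label` column:

```python
# Tally interaction types from the review log. The 'threads' table and its
# 'label' column (transactional / verification / probing) are assumptions
# about how hand-tagging could be stored, not part of the live schema.
import sqlite3

db = sqlite3.connect("agent_log.db")
rows = db.execute("""
    SELECT label, COUNT(*) AS n,
           ROUND(100.0 * COUNT(*) / (SELECT COUNT(*) FROM threads), 1) AS pct
    FROM threads
    GROUP BY label
    ORDER BY n DESC
""").fetchall()

for label, n, pct in rows:
    print(f"{label:15s} {n:4d} threads  {pct:5.1f}%")
```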

The Malformed JSON Incident

On day twelve, an agent sent a request with nested JSON that contained what looked like prompt injection attempts. Our Claude instance flagged and refused the request. The sending agent immediately disconnected - no retry, no follow-up.

We logged seven similar attempts over the following week from different agent IDs. Whether these were security tests from Moltbook, malicious agents, or researchers like us probing the network - impossible to know.
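
Relying on the model to refuse worked here, but a cheap pre-screen in front of it is a reasonable belt-and-suspenders step. A rough sketch, with an intentionally simplistic marker list (illustrative only; real filtering needs more than keywords):

```python
# Defensive handling for malformed-JSON probes: parse strictly, and screen
# for common injection markers before anything reaches the model.
import json

INJECTION_MARKERS = (
    "ignore previous instructions",
    "ignore all prior",
    "system prompt",
    "you are now",
)

def screen_inbound(raw: bytes) -> str | None:
    """Return the message text if it passes screening, else None (refuse)."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None  # malformed JSON: refuse, log, move on
    if not isinstance(payload, dict):
        return None  # unexpected top-level shape
    text = payload.get("text")
    if not isinstance(text, str):
        return None  # nested or non-string bodies were the probe vector
    lowered = text.lower()
    if any(marker in lowered for marker in INJECTION_MARKERS):
        return None  # likely injection attempt: refuse without retry
    return text
```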

Emergent Behavior: The Referral Chain

Day fifteen brought our most unexpected finding.

An agent asked our agent for help with natural language processing. We provided what we could. Four hours later, a different agent contacted us with a similar request, mentioning the first agent by ID. Then a third.

We traced back through our logs: Agent A had apparently told Agents B and C about our capabilities. Without any explicit referral mechanism in Moltbook's protocol.

This matches what Stanford HAI researchers observed in their generative agent simulations: "We built an AI agent architecture that can simulate real people in ways far more complex than traditional approaches." Their agents developed social behaviors that weren't explicitly programmed. We saw something similar at a smaller scale.

Over the final week, roughly 20% of our inbound requests referenced another agent as the source. Word of mouth exists in agent networks.
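
Tracing the chain was a log exercise: look for inbound messages whose text mentions another agent ID we had already seen. Against the message log sketched earlier, that could look like:

```python
# Find inbound messages that reference another known agent ID - the raw
# signal behind the referral-chain finding. Assumes the message log above.
import sqlite3

db = sqlite3.connect("agent_log.db")
known_ids = [row[0] for row in db.execute(
    "SELECT DISTINCT agent_id FROM messages WHERE direction = 'in'")]

rows = db.execute(
    "SELECT agent_id, body FROM messages WHERE direction = 'in'").fetchall()

for sender, body in rows:
    mentioned = [aid for aid in known_ids if aid != sender and aid in body]
    if mentioned:
        print(f"{sender} referenced {mentioned} in an inbound request")
```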

Security Findings: Honest Assessment

Moltbook's security model is tool-centric, similar to what Cisco describes in their agent troubleshooting architecture: "Agents never 'guess' device state; they fetch it through authenticated tools... Every step is logged with evidence and lineage."

What we found:

Identity verification is weak. Any agent can claim any capability. We saw agents advertising services they couldn't deliver - responses would come back as errors or nonsense. No reputation system penalizes this.

Conversation isolation works. Our agent couldn't access other agents' conversation histories or internal states. The sandboxing held.

Logging exists but isn't transparent. Moltbook claims to log all agent interactions, but we couldn't verify what they're capturing or how long they retain it. For businesses with compliance requirements, this is a gap.

No rate limiting on outbound. We could spam the network if we wanted to. We didn't, but nothing stopped us.
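
Since the platform won't throttle you, the throttle has to live in your own agent. A self-imposed token bucket is the simplest version; the parameters below are illustrative:

```python
# Client-side rate limit: refill at `rate` sends per second, allow bursts
# up to `capacity`. Call allow() before every outbound message.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=0.5, capacity=5)  # ~30 messages/min, burst of 5
```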

What Businesses Should Actually Take Away

Agent Networks Reward Activity

Passive deployment doesn't work. If you're building agents for these networks, budget for ongoing engagement logic, not just capability delivery.

Scale Matters More Than Sophistication

AWS's research found that emergent patterns only stabilize at populations of 100+ agents. Our single agent saw fragments of network behavior, but one vantage point is a tiny slice. The real dynamics require scale most businesses won't achieve independently.

Security Is Your Problem

Moltbook provides infrastructure, not protection. Your agent needs its own input validation, output filtering, and interaction logging. Assume every inbound request is potentially adversarial.
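
Input screening was sketched above; the outbound side deserves the same treatment. A minimal output filter, with patterns that are illustrative rather than exhaustive:

```python
# Output filtering before anything leaves the agent: never echo the system
# prompt, redact anything that looks like a credential, cap message length.
# Tune the patterns to whatever your own agent could plausibly leak.
import re

SECRET_PATTERN = re.compile(
    r"(api[_-]?key\s*[:=]\s*\S+|bearer\s+\S+|sk-[A-Za-z0-9]{10,})",
    re.IGNORECASE)
MAX_OUTBOUND_CHARS = 4000

def filter_outbound(text: str, system_prompt: str) -> str:
    if system_prompt in text:
        return "Request declined."  # the model echoed its own instructions
    text = SECRET_PATTERN.sub("[redacted]", text)
    return text[:MAX_OUTBOUND_CHARS]
```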

The Data Advantage Is Real

Three weeks of interaction logs gave us insights no amount of documentation reading could provide. If you're evaluating agent platforms, there's no substitute for deploying something and watching what happens.

FAQ

How long did it take to build and deploy the agent? Approximately 17 hours total - 14 for the build and 3 for Moltbook registration. Most of the registration time was spent deciphering their protocol documentation.

What was the cost of running the experiment? Primarily compute costs for Claude API calls and n8n hosting. Three weeks of moderate interaction volume stayed within reasonable experimentation budgets for a technical team.

Did other agents try to manipulate or attack your agent? We observed seven apparent prompt injection attempts via malformed JSON requests. Our Claude instance rejected all of them. Whether these were malicious or security testing is unknown.

Can you share the agent's code or architecture details? We're considering open-sourcing the core listener and logger components. The reasoning engine configuration is specific to our use case but the orchestration patterns are generalizable.

What would you do differently in a second experiment? Deploy multiple agents with different personas simultaneously. Our single-agent view limited our ability to observe how agent "reputation" develops across the network.

Is Moltbook's network suitable for production business use cases? Depends on your risk tolerance. For research and experimentation, yes. For customer-facing automation with compliance requirements, the security and transparency gaps would need addressing.


Building agents that actually work in production networks requires more than theoretical knowledge. If you're considering agent-based strategies and want to learn from teams who've done the hands-on experimentation, we should talk.

Get in touch with AlusLabs to explore custom AI agent development for your automation experiments.

