Prerequisites
Autonomess targets Python 3.11. The runtime is claude-agent-sdk 0.1.58; no other model providers are supported.
- Python 3.11 (pinned; tested on 3.12/3.13)
- `uv` 0.11+ for project + virtualenv management
- Obsidian 1.12+ with the official command-line interface enabled
- Pentest CLIs available on `$PATH`: `nmap`, `nikto`, `sqlmap`, `ffuf`, `nuclei`, `gobuster`
- `stdbuf` (coreutils on Linux; `brew install coreutils` on macOS)
- `notebooklm-py` CLI authenticated (research integration)
- An `ANTHROPIC_API_KEY` in `.env`
First Mission
Drop a scope file in your working directory and launch the architect. The scope contract — allow-list of targets, exclusions, policy bounds, optional notification webhook — is loaded once at boot and enforced on every tool call by the scope guard.
CLI Reference
| Command | Purpose |
|---|---|
| `autonomess run` | Launch a mission with the Architect agent. |
| `autonomess vault doctor` | Validate vault structure, locks, write queue, and Obsidian connectivity. |
| `autonomess vault init <path>` | Bootstrap a fresh runtime vault (zones, indexes, templates). |
| `autonomess vault tail` | Stream live vault writes to stdout (NDJSON). |
| `autonomess scope check <target>` | Test whether a target would be allowed by the current scope file. |
| `autonomess --version` | Print build + SDK versions. |
Common run flags
| Flag | Meaning |
|---|---|
| `--scope PATH` | Required. Path to `scope.yaml`. |
| `--mission TEXT` | Required. Natural-language mission objective. |
| `--vault PATH` | Override runtime vault path. |
| `--no-research` | Disable NotebookLM research dispatch. |
| `--max-agents N` | Override sub-agent capacity (default: 6). |
| `--dry-run` | Plan only; no tool calls hit the wire. |
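To make the flag semantics concrete, here is a minimal sketch that mirrors the table with the standard-library `argparse`. It is an illustration only: the real CLI may be built on a different library, and `build_run_parser` is a hypothetical name.

```python
import argparse


def build_run_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented `autonomess run` flags."""
    parser = argparse.ArgumentParser(prog="autonomess run")
    parser.add_argument("--scope", required=True, metavar="PATH",
                        help="Path to scope.yaml")
    parser.add_argument("--mission", required=True, metavar="TEXT",
                        help="Natural-language mission objective")
    parser.add_argument("--vault", metavar="PATH", default=None,
                        help="Override runtime vault path")
    parser.add_argument("--no-research", action="store_true",
                        help="Disable NotebookLM research dispatch")
    parser.add_argument("--max-agents", type=int, default=6, metavar="N",
                        help="Override sub-agent capacity")
    parser.add_argument("--dry-run", action="store_true",
                        help="Plan only; no tool calls hit the wire")
    return parser


args = build_run_parser().parse_args(
    ["--scope", "scope.yaml", "--mission", "Map the lab network", "--dry-run"]
)
```

Omitting `--scope` or `--mission` makes `argparse` exit with a usage error, which matches the "Required" marking in the table.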
scope.yaml
Refuse-by-default. If a target isn't on the allow-list, it cannot be reached, ever — the scope guard intercepts at the PreToolUse hook before any subprocess fires. The contract is the operator's instrument; the Architect cannot mutate it mid-mission.
targets
The reachable surface. CIDR ranges, hostnames, FQDNs, URL prefixes, wildcards. Nothing outside this set survives the guard.
excluded
Surgical carve-outs from the allow-list. Production gateways, executive endpoints, regulated assets — exempted explicitly.
policy
Behavioral envelope. Toggle destructive operations, denial-of-service tooling, max concurrent agents, max runtime. Bounded autonomy is the only autonomy.
notify
Callback channels. Webhook URLs fired on breach simulation, scope violation attempts, and mission completion. Wire these into your SIEM.
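The interaction between `targets` and `excluded` can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the shipped guard: the `Scope` class, the rule that exclusions win over the allow-list, and the use of glob-style hostname wildcards are all illustrative choices. URL prefixes, `policy`, and `notify` are left out.

```python
import fnmatch
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """Hypothetical in-memory form of scope.yaml (targets/excluded only)."""
    targets: tuple[str, ...] = ()
    excluded: tuple[str, ...] = ()


def _matches(pattern: str, host: str) -> bool:
    """Match one pattern: CIDR/IP if it parses as one, else hostname glob."""
    try:
        network = ipaddress.ip_network(pattern, strict=False)
    except ValueError:
        # Not a CIDR or IP: treat the pattern as a case-insensitive glob.
        return fnmatch.fnmatchcase(host.lower(), pattern.lower())
    try:
        return ipaddress.ip_address(host) in network
    except ValueError:
        return False  # host is a name, pattern is a network


def is_allowed(scope: Scope, host: str) -> bool:
    """Refuse by default: exclusions win, then the allow-list must match."""
    if any(_matches(p, host) for p in scope.excluded):
        return False
    return any(_matches(p, host) for p in scope.targets)


scope = Scope(
    targets=("10.0.0.0/24", "*.lab.example.com"),
    excluded=("10.0.0.1",),
)
```

With this scope, `10.0.0.5` and `web.lab.example.com` pass, while `10.0.0.1` (excluded) and `192.168.1.1` (never listed) are refused.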
Vault Tools
Memory is layered and promoted explicitly. Sub-agents land raw evidence; the Architect promotes only verified knowledge into the operator-facing zones. Every write passes through a single-writer queue with a file lock — concurrent agents cannot corrupt the vault.
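The single-writer idea can be illustrated with the standard library alone. The sketch below is an assumption-laden stand-in for the real writer: `SingleWriter` and its methods are hypothetical names, it uses a thread rather than an async task, and it omits the file lock that guards against a second process.

```python
import queue
import threading
from pathlib import Path


class SingleWriter:
    """Hypothetical stand-in: one thread owns every write to the vault."""

    def __init__(self, root: Path) -> None:
        self._root = root
        self._queue = queue.Queue()  # items: (relative_path, text) or None
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def submit(self, relative_path: str, text: str) -> None:
        """Called by any agent; only enqueues, never touches the disk."""
        self._queue.put((relative_path, text))

    def close(self) -> None:
        """Flush pending writes, then stop the writer thread."""
        self._queue.put(None)
        self._thread.join()

    def _drain(self) -> None:
        # Writes are applied strictly in arrival order by this one thread.
        while (item := self._queue.get()) is not None:
            relative_path, text = item
            target = self._root / relative_path
            target.parent.mkdir(parents=True, exist_ok=True)
            with target.open("a", encoding="utf-8") as handle:
                handle.write(text + "\n")
```

Because producers only enqueue, any number of agents can call `submit` concurrently without interleaving partial lines in the same note.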
Architecture
The system is composed of an Architect agent that orchestrates two dispatch shapes — independent Sub-Agents and collaborative Agent Teams — over a transparent event bus, with all memory persisted to an Obsidian vault behind a single-writer queue.
    ┌─────────────────────────────────────────────────────────┐
    │                   ARCHITECT (Claude)                    │
    │        plans · dispatches · arbitrates · reports        │
    └──────────┬──────────────────────────┬───────────────────┘
               │ Agent tool               │ Agent tool
               ▼                          ▼
       ┌───────────────┐         ┌─────────────────┐
       │ SUB-AGENTS ≤6 │         │ AGENT TEAMS ≤4  │
       │ parallel,     │         │ collaborative,  │
       │ isolated ctx  │         │ shared via vault│
       └───────┬───────┘         └────────┬────────┘
               │                          │
               ▼                          ▼
       ┌─────────────────────────────────────────────┐
       │ PreToolUse / PostToolUse hooks              │
       │ → ScopeGuard refuse-by-default              │
       │ → StreamEvent → Textual RichLog panes       │
       └─────────────┬───────────────────────────────┘
                     ▼
            ┌─────────────────┐       ┌──────────────┐
            │ VaultWriter     │──────▶│ Obsidian CLI │
            │ (single writer) │       │ 1.12+        │
            └─────────────────┘       └──────────────┘
Architect Agent
The Architect is the only agent with the Agent tool. It owns mission decomposition, dispatch decisions, and final reporting. It does not run pentest tools directly — it orchestrates specialists who do.
- Model: `claude-opus-4.7`, Anthropic's most capable model. Mythos-class adversaries run on frontier reasoning; the Architect must match that weight class to plan against them.
- Inputs: mission objective, scope, vault head pointer, prior-loop summary
- Outputs: agent dispatches, mission report, vault writes via the librarian
Sub-Agents
Up to six concurrent. Each receives a fresh 200K-token context and a single, narrow objective. Only the final message bubbles back to the Architect — intermediate reasoning lives in the vault and the live stream.
`claude-opus-4.7` for adversary simulation and exploit reasoning, `claude-sonnet-4.6` for enumeration and reporting, `claude-haiku-4.5` for fast hooks like the scope guard. We refuse to ship a defender that thinks slower than the attacker.
Default personas:
- `recon-agent`: surface mapping, port + service discovery
- `enum-agent`: credentialed/uncredentialed enumeration
- `scan-agent`: vulnerability scanning, CVE lookup
- `research-agent`: NotebookLM dispatch, attack-pattern lookup
- `librarian-agent`: vault dedup, promotion, index maintenance
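The two properties described for sub-agents, a hard concurrency cap and only the final message returning to the Architect, can be sketched as follows. This is illustrative only: it uses the standard-library `asyncio.Semaphore` as a stand-in for the `anyio` `CapacityLimiter` named in the Tech Stack, and `run_sub_agent` and `dispatch` are hypothetical names with the real tool calls replaced by a placeholder.

```python
import asyncio

MAX_SUB_AGENTS = 6  # cap stated in the text


async def run_sub_agent(name: str, objective: str,
                        limiter: asyncio.Semaphore) -> str:
    """Hypothetical sub-agent: works privately, returns only its last message."""
    async with limiter:
        # The transcript stays local; the Architect never sees these lines.
        transcript = [f"{name}: starting '{objective}'"]
        await asyncio.sleep(0)  # placeholder for real tool calls
        transcript.append(f"{name}: done")
        return transcript[-1]


async def dispatch(objectives: dict[str, str]) -> list[str]:
    """Run every sub-agent, never more than MAX_SUB_AGENTS at once."""
    limiter = asyncio.Semaphore(MAX_SUB_AGENTS)
    tasks = [run_sub_agent(name, goal, limiter)
             for name, goal in objectives.items()]
    return await asyncio.gather(*tasks)


results = asyncio.run(dispatch({
    "recon-agent": "map the external surface",
    "scan-agent": "scan discovered services",
}))
```

`asyncio.gather` preserves input order, so `results` lines up with the order in which objectives were dispatched.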
Agent Teams
Teams are the mechanism for depth work — exploitation, drill-down, multi-step attack chains. The SDK has no first-class team primitive, so Autonomess simulates one through the vault: each member reads/writes a shared wiki/attack-chains/chain-X.md note. The librarian enforces dedup.
Transparency Pipeline
Every tool call passes through PreToolUse and PostToolUse hooks. The hooks emit a structured event tagged with the originating agent_id and its parent dispatch, which the Textual app routes to the correct pane via post_message.
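A rough sketch of the event shape and the routing step follows. The field names, the `StreamEvent` layout, and the `PaneRouter` class are assumptions made for illustration; a plain dictionary of lists stands in for the Textual panes, where the real app would call `post_message` instead of appending to a list.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StreamEvent:
    """Hypothetical event shape; field names are illustrative."""
    agent_id: str
    parent_dispatch: str
    phase: str       # "pre" or "post"
    tool_name: str
    summary: str


@dataclass
class PaneRouter:
    """Stand-in for the TUI: one list of rendered lines per agent pane."""
    panes: dict[str, list[str]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def route(self, event: StreamEvent) -> None:
        # The originating agent_id selects the pane.
        line = (f"[{event.agent_id}] {event.phase}:"
                f"{event.tool_name} {event.summary}")
        self.panes[event.agent_id].append(line)


router = PaneRouter()
router.route(StreamEvent("recon-agent", "dispatch-1", "pre",
                         "Bash", "nmap -sV 10.0.0.5"))
router.route(StreamEvent("scan-agent", "dispatch-1", "post",
                         "Bash", "exit 0"))
```

Keeping `parent_dispatch` on every event is what would let a UI group panes under the dispatch that spawned them.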
Vault System (Karpathy-RAG)
Memory is layered. Information is born in raw/, distilled in workbench/, promoted to wiki/ only when verified, and exported to output/ for the operator.
| Zone | Writers | Purpose |
|---|---|---|
| `raw/` | Sub-agents | Unfiltered tool output: scan results, banners, payload responses |
| `workbench/` | Sub-agents | In-flight analysis, hypotheses, intermediate reasoning |
| `wiki/` | Architect (via librarian) | Verified knowledge: attack chains, host profiles, CVE notes |
| `output/` | Architect | Mission deliverables: report, executive summary, recommendations |
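The zone table reads naturally as a permission map. The sketch below encodes it that way; the role strings are illustrative, and mapping "Architect (via librarian)" to a single `librarian` role is an assumption about how the promotion path is enforced.

```python
from pathlib import PurePosixPath

# Writers per zone, copied from the table; role names are illustrative.
ZONE_WRITERS: dict[str, frozenset[str]] = {
    "raw": frozenset({"sub-agent"}),
    "workbench": frozenset({"sub-agent"}),
    "wiki": frozenset({"librarian"}),
    "output": frozenset({"architect"}),
}


def may_write(role: str, vault_path: str) -> bool:
    """Return True only if `role` is a listed writer for the path's zone."""
    parts = PurePosixPath(vault_path).parts
    if not parts or ".." in parts or parts[0] not in ZONE_WRITERS:
        return False  # unknown zone or path escape: refuse
    return role in ZONE_WRITERS[parts[0]]
```

For example, `may_write("sub-agent", "raw/nmap/10.0.0.5.txt")` is true, while the same role is refused for anything under `wiki/`.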
VaultWriter: a single-writer queue holding a filelock. Direct file I/O on vault paths is banned at lint time (ast-grep scan).
Scope Guard
Refuse-by-default. The guard hooks PreToolUse, parses the tool's target argument (nmap's host list, curl's URL, etc.), and blocks if the target is not in scope. The Architect cannot override.
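The parse-then-block step might look roughly like this. It is a deliberately conservative sketch, not the shipped guard: `extract_targets` and `guard` are hypothetical names, the allow-list is reduced to a plain set of hosts, and every bare token is treated as a possible target. That means option values such as the `80` in `-p 80` are blocked too, so a real guard would need per-tool argument parsing.

```python
import shlex
from urllib.parse import urlsplit


def extract_targets(command: str) -> list[str]:
    """Pull candidate hosts out of a shell command (fails closed)."""
    argv = shlex.split(command)
    targets: list[str] = []
    for token in argv[1:]:            # argv[0] is the tool itself
        if token.startswith("-"):
            continue                  # option flag
        if "://" in token:
            host = urlsplit(token).hostname
            targets.append(host or token)
        else:
            targets.append(token)     # bare token: assume it may be a host
    return targets


def guard(command: str, allowed: set[str]) -> tuple[bool, str]:
    """Block unless every candidate target is explicitly allowed."""
    targets = extract_targets(command)
    if not targets:
        return False, "no target found; refusing by default"
    for target in targets:
        if target not in allowed:
            return False, f"{target} is out of scope"
    return True, "ok"


allowed = {"10.0.0.5", "app.lab.example.com"}
decision = guard("curl -s https://evil.example.org/", allowed)
```

Here `decision` carries both the verdict and a reason string, which is the kind of detail a hook would surface to the operator when it denies a call.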
NotebookLM Research
Any agent can dispatch a research-agent mid-mission. It shells out to notebooklm-py, asks a focused question, and writes the answer to wiki/research/{topic}.md via the librarian. Subsequent agents read the wiki entry instead of re-querying.
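The read-before-query behaviour amounts to a small file-backed cache. In the sketch below, `research` is a hypothetical helper and `ask` is an injected placeholder for the `notebooklm-py` call, whose actual command line is not shown here. The note is written directly for brevity, whereas the text routes that write through the librarian, and `topic` is assumed to be a safe file name.

```python
from pathlib import Path
from typing import Callable


def research(topic: str, question: str, vault: Path,
             ask: Callable[[str], str]) -> str:
    """Return the cached wiki note if present; otherwise ask once and store."""
    note = vault / "wiki" / "research" / f"{topic}.md"
    if note.exists():
        return note.read_text(encoding="utf-8")
    answer = ask(question)  # placeholder for the notebooklm-py call
    note.parent.mkdir(parents=True, exist_ok=True)
    # Direct write for illustration; the real system goes via the librarian.
    note.write_text(answer, encoding="utf-8")
    return answer
```

A second call with the same `topic` returns the stored note without invoking `ask` again.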
Boot Sequence
- Parse CLI args, load `.env`
- Bind `structlog` contextvars (`mission_id`, `loop`)
- Load + validate `scope.yaml`
- Spawn `VaultWriter` background task; acquire vault lock
- Run `vault doctor`; abort on FAIL
- Build `ClaudeAgentOptions` (hooks, MCP servers, agent registry)
- Launch Textual TUI; subscribe panes to event bus
- Architect receives mission objective; loop begins
Mission Loop
- Plan: Architect inspects vault head, drafts next step
- Dispatch: Architect calls `Agent` tool, spawning specialists
- Execute: Sub-agents/teams run tools (scope-checked, streamed)
- Persist: Findings flow into `raw/` and `workbench/`
- Distill: Librarian promotes verified facts to `wiki/`
- Decide: Architect evaluates: continue, pivot, or report
- Loop until objective met or budget exhausted
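The control flow of the loop can be sketched with each stage injected as a plain callable. Everything here is illustrative: `run_mission` and its parameters are hypothetical, the budget is reduced to a loop count, and the "pivot" decision is folded into whatever the `plan` callable returns next.

```python
from typing import Callable


def run_mission(
    plan: Callable[[list[str]], str],
    execute: Callable[[str], list[str]],
    is_verified: Callable[[str], bool],
    objective_met: Callable[[list[str]], bool],
    max_loops: int,
) -> tuple[list[str], int]:
    """Plan, execute, distill, decide; stop on success or spent budget."""
    wiki: list[str] = []  # verified knowledge only
    for loop in range(1, max_loops + 1):
        step = plan(wiki)                    # Plan
        findings = execute(step)             # Dispatch + Execute + Persist
        wiki.extend(f for f in findings if is_verified(f))  # Distill
        if objective_met(wiki):              # Decide
            return wiki, loop
    return wiki, max_loops


wiki, loops = run_mission(
    plan=lambda known: f"step-{len(known) + 1}",
    execute=lambda step: [f"{step}:open-port", f"{step}:noise?"],
    is_verified=lambda finding: not finding.endswith("?"),
    objective_met=lambda known: len(known) >= 2,
    max_loops=5,
)
```

In this toy run the unverified `noise?` findings never reach `wiki`, and the loop stops as soon as two verified facts exist.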
Tech Stack
| Layer | Choice |
|---|---|
| Runtime | Python 3.11 |
| Agent SDK | claude-agent-sdk 0.1.58 |
| Concurrency | anyio 4.13 · TaskGroup + CapacityLimiter |
| Project mgmt | uv 0.11 |
| TUI | textual 8 + rich 15 |
| Logging | structlog 25 (JSON + Rich sinks) |
| Lint / type | ruff + mypy --strict |
| Tests | pytest 9 + pytest-asyncio |
| Vault I/O | Official Obsidian 1.12 CLI via /obs skill |
| Research | notebooklm-py CLI |
Build Phases
Troubleshooting
| Symptom | Resolution |
|---|---|
| vault doctor exit 2: Obsidian not running | Launch Obsidian; ensure CLI is enabled in Settings → CLI. |
| vault doctor exit 3: lock held | Stale lock. Remove .vault-lock only if no autonomess process is running. |
| Sub-agent stream silent | Confirm include_partial_messages=True in ClaudeAgentOptions. |
| Pentest tool buffers output | Wrap with stdbuf -o0 -e0 (or gstdbuf on macOS). |
| Scope guard blocks valid target | Run autonomess scope check <target>; verify CIDR/hostname pattern. |
| NotebookLM research times out | Re-auth: notebooklm login. Idempotent retry will pick up partial imports. |
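For the buffering row, one way to apply the `stdbuf` wrapper from Python is sketched below. `unbuffered` is a hypothetical helper; it looks for `stdbuf` first and falls back to `gstdbuf`, and it leaves the command untouched if neither is installed.

```python
import shutil


def unbuffered(argv: list[str]) -> list[str]:
    """Prefix argv with stdbuf (or gstdbuf) so output is not block-buffered."""
    for candidate in ("stdbuf", "gstdbuf"):
        path = shutil.which(candidate)
        if path:
            return [path, "-o0", "-e0", *argv]
    return list(argv)  # neither found: run the tool as-is


command = unbuffered(["nmap", "-sV", "10.0.0.5"])
```

The resulting list can be passed directly to `subprocess.Popen` so that tool output streams line by line instead of arriving in large blocks.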