// LLM knowledge base for baseplane-ai/duraclaw — Claude Code session orchestrator
Each entry carries an #id, a type badge, a one-paragraph description, and chips linking to what it depends on and what depends on it. When an LLM needs to reason about this repo, load this page as context rather than scanning the tree: it's the distilled graph of what exists, what it's for, and how it connects. Source of truth for the code is github.com/baseplane-ai/duraclaw.
┌──────────────────────────────┐
│ Browser · React 19 SPA       │  TanStack Router · Agents SDK client
└────────────┬─────────────────┘
             ▼
┌──────────────────────────────┐        ┌────────────────────────┐
│ CF Worker · orchestrator     │◄──────►│ D1 · duraclaw-auth     │
│ · Better Auth · static       │        │ (users · prefs · tabs) │
└────────────┬─────────────────┘        └────────────────────────┘
             ▼
┌──────────────────────────────┐
│ SessionAgent DO · 1/session  │  SQLite history · @callable RPC · useAgent
│ ├─ SessionRegistry DO        │  worktree locks · user-scoped session index
│ └─ ProjectRegistry DO        │  projects · user_preferences · Yjs drafts
└────────────┬─────────────────┘
             ▼ dial-back WebSocket (CF tunnel)
┌──────────────────────────────┐
│ agent-gateway · Bun :9877    │  VpsCommand router · bearer auth · systemd
└────────────┬─────────────────┘
             ▼ spawn
┌──────────────────────────────┐
│ session-runner · subprocess  │  wraps Claude / Codex / OpenCode SDK
└────────────┬─────────────────┘
             ▼
┌──────────────────────────────┐
│ worktree · baseplane-dev*    │  git · shell · tool execution
└──────────────────────────────┘
One DO per session. Owns SessionState, SQLite message history, the gateway WebSocket, and all RPC methods the client calls via Agents SDK useAgent. onMessage delegates to super for RPC dispatch; WS route handles /agents/.
Singleton DO. Worktree lock table, session index, user-scoped filtering (added P0.1b). Prevents two sessions from claiming the same worktree.
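The worktree lock described above can be pictured as a map from worktree name to the session that holds it. The following is a minimal in-memory sketch, not the SessionRegistry implementation: the class name `WorktreeLocks`, the `ClaimResult` shape, and the worktree and session identifiers are all hypothetical, and the real DO would persist this state rather than keep it in a `Map`.

```typescript
// Hypothetical stand-in for the SessionRegistry worktree lock table.
type ClaimResult =
  | { ok: true }
  | { ok: false; heldBy: string }

class WorktreeLocks {
  // worktree name -> id of the session that currently holds it
  private readonly locks = new Map<string, string>()

  // Grant the lock unless a different session already holds it.
  claim(worktree: string, sessionId: string): ClaimResult {
    const holder = this.locks.get(worktree)
    if (holder !== undefined && holder !== sessionId) {
      return { ok: false, heldBy: holder }
    }
    this.locks.set(worktree, sessionId)
    return { ok: true }
  }

  // Only the current holder may release the lock.
  release(worktree: string, sessionId: string): boolean {
    if (this.locks.get(worktree) !== sessionId) return false
    this.locks.delete(worktree)
    return true
  }
}

const locks = new WorktreeLocks()
const first = locks.claim('baseplane-dev1', 'session-a')
const second = locks.claim('baseplane-dev1', 'session-b')
```

Here `first` succeeds and `second` is rejected with `heldBy: 'session-a'`, which is the "two sessions cannot claim the same worktree" guarantee in miniature.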
Per-user project metadata and user_preferences (permission / model / budget / thinking / effort / hidden-projects / voice-input). Also hosts Yjs multiplayer draft state (PR #4).
Yjs collab room per session — hosts the shared Y.Text that the draft textarea binds to via useSessionCollab. Awareness fields drive the PresenceBar + TypingIndicator.
Cloudflare Worker + TanStack Router SPA (Vite 8). Better Auth on D1. Exports DO classes + the SPA handler. Path alias ~/ → ./src/.
VPS-side Bun WebSocket server on 127.0.0.1:9877. Bearer-authed, supervises session-runner subprocesses. Endpoints: GET /health, GET /worktrees. Systemd install via systemd/install.sh.
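As a rough illustration of the bearer check and the two HTTP endpoints, here is a dependency-free sketch. It is not the gateway's code: the `GatewayRequest` and `GatewayResponse` shapes, the response bodies, and the choice to require the token on every route (including /health) are assumptions made for the example.

```typescript
// Hypothetical request/response shapes; the real gateway is a Bun server.
interface GatewayRequest {
  method: string
  path: string
  authorization?: string
}

interface GatewayResponse {
  status: number
  body: unknown
}

function handleGatewayRequest(
  req: GatewayRequest,
  expectedToken: string,
  worktrees: string[],
): GatewayResponse {
  // Assumption: every route is bearer-authed, including /health.
  if (req.authorization !== `Bearer ${expectedToken}`) {
    return { status: 401, body: { error: 'unauthorized' } }
  }
  if (req.method === 'GET' && req.path === '/health') {
    return { status: 200, body: { ok: true } }
  }
  if (req.method === 'GET' && req.path === '/worktrees') {
    return { status: 200, body: { worktrees } }
  }
  return { status: 404, body: { error: 'not found' } }
}
```

Keeping the routing in a pure function like this makes the auth rule easy to unit-test without binding a port.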
Subprocess spawned per session; wraps Claude / Codex / OpenCode adapter. Decoupling from the gateway (issue #1 / PR #2) lets sessions survive gateway restarts.
32-component React library extracted from baseplane agent-orch (A.1). Ships ChatThread · PromptInput · GateResolver primitives, plus (Phase 1 of #20) VoiceInputButton.

8-mode workflow CLI: planning · implementation · research · task · debug · verify · freeform · onboard. Enforces phase tracking + stop-condition gates. Rendered as read-only KataStatePanel in the session view (Phase 5.3).
New in packages/ai-elements/src/components/voice-input.tsx (PR #36, A.5 Phase 1). Wraps browser SpeechRecognition; renders null when the API is absent or enabled=false; never auto-sends. Exposes onFinalTranscript + optional onInterimTranscript + onError.
Composer bound to the shared Y.Text via useSessionCollab. Footer wires PromptInputSubmit, image attach, and (PR #36) the mic button — final transcript appends to the draft without auto-send.
UI for resolving permission / AskUserQuestion gates. Spec #38 replaced PreToolUse hooks with canUseTool. PR #36 adds a mic button next to the free-text answer field — Approve/Deny remain user-initiated.
First-run resolver for the voiceInputEnabled preference. Defaults from window.SpeechRecognition support, persists to /api/preferences. Returns { enabled, setEnabled }.
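The defaulting rule above (use the stored preference if there is one, otherwise fall back to browser support) might look roughly like this. The function names and the `SpeechCapableWindow` type are hypothetical, the check for the prefixed `webkitSpeechRecognition` is an assumption beyond what this page states, and persisting to /api/preferences is left out.

```typescript
// Hypothetical slice of `window`, so the logic can run outside a browser.
interface SpeechCapableWindow {
  SpeechRecognition?: unknown
  webkitSpeechRecognition?: unknown // assumption: prefixed variant also counts
}

function hasSpeechRecognition(win: SpeechCapableWindow): boolean {
  return typeof win.SpeechRecognition === 'function' ||
    typeof win.webkitSpeechRecognition === 'function'
}

// A stored preference wins; with none stored, default to API support.
function resolveVoiceInputEnabled(
  stored: boolean | null | undefined,
  win: SpeechCapableWindow,
): boolean {
  return stored ?? hasSpeechRecognition(win)
}
```

A user who explicitly turned voice input off keeps it off even in a browser that supports the API, while a first-run user in an unsupported browser starts with it disabled.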
Columnar table on duraclaw-auth. Columns: userId · permissionMode · model · maxBudget · thinkingMode · effort · hiddenProjects · voiceInputEnabled (added in migration 0010 via PR #36) · updatedAt. Migrated from legacy KV shape in 0008.
Wraps @anthropic-ai/claude-agent-sdk. bypassPermissions; strips CLAUDECODE* env vars before spawn to prevent nested-session detection. Default executor.
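Stripping the CLAUDECODE* variables before spawn amounts to filtering the environment by prefix. A minimal sketch, assuming a plain prefix match is what the glob means; `stripClaudeCodeEnv` is a hypothetical helper name, not a function from the repo.

```typescript
// Return a copy of `env` without any CLAUDECODE-prefixed variables,
// so a spawned executor does not detect a nested session.
function stripClaudeCodeEnv(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const cleaned: Record<string, string> = {}
  for (const [key, value] of Object.entries(env)) {
    if (key.startsWith('CLAUDECODE')) continue
    if (value !== undefined) cleaned[key] = value
  }
  return cleaned
}

// Illustrative input; CLAUDECODE_HYPOTHETICAL is a made-up variable name.
const childEnv = stripClaudeCodeEnv({
  CLAUDECODE: '1',
  CLAUDECODE_HYPOTHETICAL: 'x',
  PATH: '/usr/bin',
})
```

After the call, `childEnv` contains only `PATH`; the result can be passed as the `env` option when spawning the subprocess.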
Codex (OAuth-verified) adapter. Same VpsCommand surface as Claude. Shipped with spec #16.
OpenCode adapter. Good local-first fallback; useful when the entitlement-backed providers are unavailable.
Emitted after the executor SDK initialises. Carries model + tool list. SessionAgent stores and relays.
Assistant message content streamed from the executor. Rendered by ChatThread.
Result of a tool execution. Paired with a preceding tool-use block in the assistant event.
AskUserQuestion gate intercepted. Session pauses awaiting answer command. Structured questions render in GateResolver.
File-change detection emitted by the post-tool-use hook. Rendered inline in ChatThread.
Session completed or failed. Carries duration + cost. Emitted once per turn.
Planned for A.5 Phase 2. Emitted by the withVoiceInput mixin after the Whisper call — lets multi-tab peers see the same transcript.
Zod-validated envelope from orchestrator → executor. Variants: execute · resume · abort · answer · rewind · compact · fork. Reverse stream is VpsEvent.
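The seven variants suggest a discriminated union keyed on a type field. The sketch below is a dependency-free stand-in for the Zod schema, not a copy of it: the field names `type`, `sessionId`, and `payload` are hypothetical, and only the variant names come from this page.

```typescript
const VPS_COMMAND_TYPES = [
  'execute', 'resume', 'abort', 'answer', 'rewind', 'compact', 'fork',
] as const

type VpsCommandType = (typeof VPS_COMMAND_TYPES)[number]

// Hypothetical envelope shape; the real schema is defined with Zod.
interface VpsCommand {
  type: VpsCommandType
  sessionId: string
  payload?: unknown
}

// Validate an untrusted value and narrow it to VpsCommand, or throw.
function parseVpsCommand(input: unknown): VpsCommand {
  if (typeof input !== 'object' || input === null) {
    throw new Error('VpsCommand must be an object')
  }
  const candidate = input as Record<string, unknown>
  const type = candidate['type']
  const known = VPS_COMMAND_TYPES as readonly string[]
  if (typeof type !== 'string' || !known.includes(type)) {
    throw new Error(`unknown VpsCommand type: ${String(type)}`)
  }
  const sessionId = candidate['sessionId']
  if (typeof sessionId !== 'string') {
    throw new Error('VpsCommand.sessionId must be a string')
  }
  return { type: type as VpsCommandType, sessionId, payload: candidate['payload'] }
}
```

Validating at the boundary means the executor side can switch on `type` exhaustively and never sees a malformed envelope.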
Return-path envelope from executor → orchestrator. Zod-validated. See the #events section for individual variants.
Turbo dev for the whole stack — orchestrator (miniflare) + agent-gateway + ai-elements tsup watch.
Turbo deploy. ship:worker deploys only the orchestrator; ship:gateway only the VPS executor.
Baseline verification: preflight → auth → gateway → session → browser → browser:session. Per-subphase verify:* layers on top.
Phase-1 delta for A.5 (PR #36). Static shape + UI wiring + migration/schema + vitest behavior with mocked SpeechRecognition.
PR #21 delta. Static devcontainer.json shape + port forwards + Bun feature + post-create.sh parse. Full Docker build deferred to the first contributor running it in a Docker-enabled env.
Drizzle-kit migration generator. Produces both the SQL and a snapshot in migrations/meta/. Rename generated file from the adjective/noun seed to a semantic NNNN_what.sql.
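The rename step keeps the generated sequence number and swaps the random seed for a description. A small sketch of that transformation, assuming generated names look like `NNNN_<seed>.sql`; the helper name and the example filenames are hypothetical.

```typescript
// Turn a generated name such as 0010_<adjective>_<noun>.sql into NNNN_what.sql.
function semanticMigrationName(generated: string, what: string): string {
  const match = /^(\d{4})_[a-z0-9_]+\.sql$/.exec(generated)
  if (match === null) {
    throw new Error(`unexpected migration filename: ${generated}`)
  }
  // Normalise the description to lowercase snake_case.
  const slug = what
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '')
  if (slug === '') {
    throw new Error('description must contain letters or digits')
  }
  return `${match[1]}_${slug}.sql`
}

const renamed = semanticMigrationName('0010_brave_wolverine.sql', 'voice input enabled')
```

With these illustrative inputs `renamed` is `0010_voice_input_enabled.sql`; the snapshot in migrations/meta/ is untouched by this sketch.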
Resolves a pending AskUserQuestion. Dispatched by GateResolver through SessionAgent RPC. Optional voice-first flow fills the answer via VoiceInputButton.
Switches the current workspace into one of the 8 kata modes. Tracks phase + tasks; stop-condition gates block premature completion.
AGENTS.md policy. Every subphase that moves spec → in-progress adds or extends a targeted verify:<cap> script and saves evidence under .kata/verification-evidence/. Names are capability-oriented, not throwaway.
feature/<issue-number>-<slug>. Rebase-only branch protection — force-push to the feature branch after review; do not merge back. Commit subjects follow type(scope): description (#N).
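Both naming rules are regular enough to check mechanically. The patterns below are a sketch of one possible check: the lowercase kebab-case slug and the lowercase commit type are assumptions, and the helper names are hypothetical.

```typescript
// feature/<issue-number>-<slug>, assuming a lowercase kebab-case slug.
const BRANCH_PATTERN = /^feature\/(\d+)-([a-z0-9]+(?:-[a-z0-9]+)*)$/
// type(scope): description (#N)
const COMMIT_PATTERN = /^([a-z]+)\(([^)]+)\): (.+) \(#(\d+)\)$/

// Extract the issue number from a branch name, or null if it does not conform.
function issueFromBranch(branch: string): number | null {
  const match = BRANCH_PATTERN.exec(branch)
  return match === null ? null : Number(match[1])
}

function isValidCommitSubject(subject: string): boolean {
  return COMMIT_PATTERN.test(subject)
}
```

For example, `issueFromBranch('feature/20-voice-input')` yields `20`, while `issueFromBranch('main')` yields `null`.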
Pre-commit hook (wired by pnpm run prepare) runs Biome on staged JS/TS/JSON files plus pnpm typecheck. Bypassing with --no-verify is strongly discouraged.
Every DO endpoint verifies userId; SessionRegistry filters by user. Baked in from P0.1b so the single-user → multi-user cutover is a data migration, not a code rewrite.
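The "verify userId, then filter by user" rule can be sketched as a guard in front of every read of the session index. The `SessionIndexEntry` shape and the function name are hypothetical; the real registry is a Durable Object, not an array.

```typescript
// Hypothetical row in the session index.
interface SessionIndexEntry {
  sessionId: string
  userId: string
  worktree: string
}

// Reject calls without a userId, then return only that user's sessions.
function listSessionsForUser(
  index: readonly SessionIndexEntry[],
  userId: string | undefined,
): SessionIndexEntry[] {
  if (userId === undefined || userId === '') {
    throw new Error('userId is required')
  }
  return index.filter((entry) => entry.userId === userId)
}
```

Because the filter is always applied, adding a second user changes the data in the index but not this code path.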
2-space indent · 100-col · LF · single quotes · no semicolons (in biome-managed files). Biome also handles import sort + snapshot format. Run pnpm exec biome check --write <paths> to auto-fix.
Use chrome-devtools-axi for browser UI verification (SPAs + hydration) instead of curl / WebFetch. Use gh-axi for GitHub issues / PRs / runs / releases instead of raw gh.
Per-subphase evidence bundles. Filename phase-<name>-<date>.md. Contents: which scripts ran, what passed, what's deferred (e.g. live-browser runs), what to append later.
Decouple session lifecycle from dial-back WebSocket. Implementation in PR #2 — session-runner subprocess supervised by agent-gateway.
Add /admin/dump-do-state endpoint for multi-user cutover rehearsal. Small, maintainer-authored, exact scope.
Unify message transport on TanStack DB — retire manual hydrate / optimistic / replace reconciliation. needs-spec; a spec PR is the right first move.
Infra: reproducible dev environment via Dev Containers / Codespaces. Closed by PR #21.
feat(mobile): voice input — withVoiceInput mixin on SessionAgent (A.5). Spec lands in PR #23; Phase 1 implementation in PR #36; Phase 2 (server Whisper) pending.
feat(infra): dev container + codespaces config. Adds .devcontainer/{devcontainer.json,post-create.sh,README.md}, scripts/verify/devcontainer.sh, verify:devcontainer script entry.
docs(spec): voice input — withVoiceInput mixin (#20). Locks six design decisions before code lands. Matches the planning/spec-templates/feature.md shape used by #38 / #16 / #24.
feat(voice): Phase 1 — Web Speech VoiceInputButton (#20). 15 files, +1400. New component + 5 mocked-SpeechRecognition behavior tests + migration 0010 + UI wiring in MessageInput & GateResolver + verify:voice:web-speech + evidence file.
From sibling gtm-autoresearch. Pattern for generating per-client evals — useful reference when building score functions for any Duraclaw sub-agent that needs to hill-climb on domain-specific signal.
Eight workflow modes the meta-agent or a contributor can enter for structured work: planning · implementation · research · task · debug · verify · freeform · onboard.
The human-readable operator guide. 13 tabs — Overview · Architecture · Durable Objects · Agent Gateway · Session Runner · Executors · Kata · Transport · Verify · Roadmap · Deploy · Contribute · Glossary.
Upstream Hermes Agent docs. Relevant because the Duraclaw executor adapters follow the same SDK shape when possible.
Runbook for the Hermes gateway as a Pi harness on claws-mac-mini. Useful context for how a Duraclaw harness graduates into a long-lived hosted agent.
Same-loop framing across harness engineering (AutoAgent) and external-system hill-climbing (autoresearch). Includes a walk-through of the Autogenesis Protocol (AGP) arXiv paper.
OpenClaw architecture reference. Source of the 5-phase visual system that a number of sibling guides reuse.