Fit-Gap Analysis & Feature Implementation Roadmap
Originally published: February 2026 · Last refreshed: 2026-05-03 (v0.5.7 cycle, v14 industry delta on top of v13 fitgap) Scope: VibeCody compared against 30+ competing AI coding products across terminal, editor, cloud-agent, code-review, completions-only, and mobile/watch categories.
This is now the single canonical roadmap — earlier iterations (v5, v6) and the 13 deep-dive fit-gap files are absorbed here and in the companion Fit-Gap Analysis. The original five-competitor delta (Codex CLI, Claude Code, Antigravity, Cursor, Windsurf) is preserved as a historical record below; phases 23–39 are summarised in the History appendices at the end of this document. The 2026-04-26 trend snapshot (§1bis) and the v13 audit-reconciliation work (Phase 53, Appendix D) are the freshest layer.
1bis. Q2 2026 industry trend snapshot (added 2026-04-26)
The eight days between the v0.5.5 fitgap refresh (2026-04-17) and today produced a denser-than-usual stream of AI-coding-tool releases. Headline shifts that the rest of this document does NOT yet absorb:
| Date | Release | What changed | Impact on VibeCody positioning |
|---|---|---|---|
| 2025-07 | Cognition acquires Windsurf | Devin + Windsurf + Cascade are now one company | §1.2 + §1.3 should treat them as the Cognition family rather than three competitors. |
| 2026-04-01 | GitHub Copilot Cloud Agent expanded | No longer PR-only; can branch-only work; CLI sessions controllable from web/mobile | New gap A9 — VibeMobile pairs with hosts but doesn’t resume CLI sessions. |
| 2026-04-02 | Cursor 3.0 | Agents Window, Design Mode (live DOM annotation), Agent Tabs (grid view), multi-repo | New gaps A5–A7. The await_tool parity claim from v8 is unchanged but the surface around it widened. |
| 2026-04-16 | Claude Opus 4.7 GA | 87.6% SWE-bench Verified (up from 80.8%); 64.3% SWE-bench Pro; 70% CursorBench | Refresh useModelRegistry.ts defaults; add to Counsel default panels. |
| 2026-04-22 | Devin 2.2 | Full Linux desktop access; agent self-verifies via computer use, auto-fixes | New gap A8 — visual_verify.rs doesn’t yet feed failures back into the agent loop. |
| 2026-04-24 | Cursor 3.2 | Async subagents, multi-root workspaces | New gap A5 (async) + A6 (multi-root). |
| 2026-04-24 | Copilot Inline Agent in JetBrains | Agent-mode inline chat in JetBrains IDEs | Our JetBrains plugin should match within one minor cycle. |
| 2026-Q1–Q2 | MCP 2026 roadmap published | MCP Apps (interactive UI), MCPB bundle format, .well-known capability discovery, stateless transport, enterprise extensions | New gaps A1–A4. Largest single ecosystem shift this quarter. |
| 2026-Mar | JetBrains Junie CLI Beta | LLM-agnostic; runs in IDE/CLI/CI/CD/GitHub/GitLab; 1-click migration from Claude Code + Codex configs | New gap A11 — migration tooling. Junie sets the bar for switching cost. |
| 2026-Q1 | OpenAI Codex CLI v0.116+ | GPT-5.3-Codex-Spark (1000+ TPS), hooks GA, plugin marketplace, AWS Bedrock auth + SigV4 | Bedrock partial; Spark drops in via existing routing layer. |
| 2026-Mar | ACP v0.11.0 | Zed + JetBrains official partnership Oct 2025; Anthropic, OpenAI, GitHub, Google all ship implementations | New gap A4 — VibeCLI/VibeUI as ACP server, not just client. |
| 2026-Apr | Antigravity 1.20.3 → 1.22.2 | AGENTS.md fallback, Linux sandboxing, MCP auth | One-line memory.rs addition for GEMINI.md fallback. |
| 2026-Apr | Augment Code 72.0% SWE-bench Verified pass@1 | Highest open-system score, no best-of-N tricks | Updates §1.4 / §9.3 SWE-bench callout. |
| 2026-Q1 | Sandbox infrastructure GA wave | Cloudflare Sandboxes GA; E2B/Northflank/Modal/Vercel/Docker microVM platforms shipped | Optional roadmap track: VibeCLI --cloud provider for users without local sandboxing. |
The 17 gaps surfaced by these releases (11 new + 6 covered by existing infra) are catalogued in §16.1 of the Fit-Gap Analysis and queued in Phase 53 below.
1ter. May 2026 weekly delta + missed-quarter items (added 2026-05-03)
The seven days between the v13 snapshot (2026-04-26) and today shipped another dense wave; a handful of Q1-Q2 items v13 missed are folded in here as well. Sources surveyed: cursor.com/changelog, GitHub Copilot blog, Anthropic Claude Code releases, OpenAI Codex/ChatGPT release notes, Cognition Devin docs, blog.modelcontextprotocol.io, a2a-protocol.org, Linux Foundation press, JetBrains Junie + Air blogs, Ollama releases, ggml-org/llama.cpp, vLLM releases, SWE-bench leaderboards, sandbox provider coverage (E2B / Daytona / Modal / Blaxel / SmolVM), and OSS coding-agent repos (Cline / OpenHands / Aider / Continue).
1ter.1 Headline shifts not yet absorbed elsewhere
| Date | Release | What changed | Impact on VibeCody positioning |
|---|---|---|---|
| 2026-05-04 | MCP experimental-ext-skills | Skills discovery & distribution as MCP primitives (modelcontextprotocol/experimental-ext-skills) | New gap B1 — VibeCody’s 711 skill files could become MCP-discoverable across every host that speaks MCP. |
| 2026-05-01 | Cursor Plugin Marketplace v2 | Plugins bundle MCP servers + skills + subagents + rules + hooks; admin install policy (Default Off / Default On / Required); Team Marketplace decoupled from any specific repo | New gap B2 — VibeCody plugins are still single-purpose; no bundle format with admin-policy tiers. |
| 2026-04-30 | Cursor Security Review (beta) | Always-on Security Reviewer + Vulnerability Scanner agents on Teams + Enterprise plans | New gap B3 — /review runs on demand; no always-on security-agent pattern. |
| 2026-04-29 | VS 2026 + VS Code Integrated Cloud Agent | “Assign a task, close the IDE, get a PR” — Copilot Cloud Agent now controllable from inside the editor | Extends A9 — cloud-agent remote-control is now IDE-native, not just web/mobile. |
| 2026-04-23 | OpenAI GPT-5.5 GA | Recommended default Codex model (replaces 5.4); GPT-5 latency at higher intelligence; fewer tokens per Codex task; computer-use focus | Add to useModelRegistry.ts; route default Codex calls to 5.5. GPT-5.3-Codex-Spark from §1bis remains the latency tier. |
| 2026-Apr | Cursor SDK (TypeScript) | Same agent runtime / harness / models as desktop, CLI, and web — exposed as a @cursor/sdk TS library | Direct competitor to packages/agent-sdk/; parity audit (subagents, hooks, plugins, skills, sandbox tiers). |
| 2026-04-21 | llama.cpp NVFP4 (PR #22196 reposted) | Blackwell-native FP4 path merged; MXFP4 progressing in ik_llama.cpp; b8196+ runs MXFP4 MoE on Blackwell | TurboQuant target list: add NVFP4 alongside MXFP4 + AWQ-Marlin. CubeCL/Burn ban scope unchanged. |
| 2026-Apr | Chinese frontier wave | DeepSeek V4-Flash $0.14/$0.28 per 1M (~7.7× cheaper than Qwen 3.6-Plus on chatbot loads); Qwen 3.6-Plus + Qwen 3.6-35B-A3B (Apache 2.0); Kimi K2.6 long-horizon agentic; MiniMax M2.7; GLM-5.1 | Counsel default panels need an open-weight slot; cost_router.rs learns the new floor; useModelRegistry.ts Ollama defaults shift to Qwen 3.6-Coder family when released. |
| 2026-Apr–May | Ollama 0.22.x | /v1/messages (Anthropic Messages API compat — Claude Code can drive Ollama-hosted open models); ollama launch registers Claude Desktop / Cowork / Code; Gemma 4 thinking + tool calls; MLX runner gains logprobs + fused top-P/K | Ollama-compat /api/* surface should mirror /v1/messages; existing routing layer absorbs this with one new route. |
| 2026-Q1 | A2A v1.2 (Linux Foundation Agentic AI Foundation) | 150+ orgs in production; signed agent cards (cryptographic signatures for domain verification); GA across Google / Microsoft / AWS | Extends A4 — VibeCody’s A2A server façade needs P-256-signed agent cards at /.well-known/agent.json; aligns with watch-pairing’s existing P-256 ECDSA constraint. |
| 2026-Q1 | ACP Registry live | Built into Zed + JetBrains; lists Claude Code, Codex CLI, GitHub Copilot CLI, OpenCode, Gemini CLI | Concrete follow-through on A4 — register VibeCLI as an ACP agent so Zed / JetBrains users discover it natively. |
| 2026-Q1 | SWE-bench Verified contamination disclosed | OpenAI stopped reporting Verified scores; audit found 59.4% of hard tasks have flawed tests; all frontier models contaminated | §9.3 leaderboard callout needs a contamination caveat; SWE-bench Pro / SWE-rebench / SWE-bench-Live become primary references. |
| 2026-Q1 | DAPO mainstreamed | OpenRLHF, verl, NeMo-RL all ship DAPO as default reasoning-RL alongside PPO/GRPO; ByteDance paper open-sourced (50% fewer training steps for AIME-class tasks) | RL-OS slice 2 should make GRPO + DAPO peers, not roll-forward; current vibe_rl/algos/ppo.py is the only algo file with recent edits. |
| 2026-Q1 | Sandbox cold-start floor | Blaxel 25 ms; Daytona 27–90 ms (Docker); E2B Firecracker microVMs ~150 ms; Modal gVisor; SmolVM debuted 2026-04-17 | Sandbox-tiers Firecracker tier should target ≤100 ms cold start; document the latency floor per backend in docs/design/sandbox-tiers/. |
| 2026-05-19 (planned) | Google I/O 2026 — Gemini 4 + Android 17 + Agentic Coding Dev Preview | I/O keynote with agentic coding as a headline track; Gemini 3.1 Pro Preview already shipping ahead of it | Calendar item: prep useModelRegistry.ts Gemini 4 entry; Android 17 may move VibeMobile API floor — re-check pubspec.yaml minSdkVersion after I/O. |
1ter.2 Context-only signals (positioning, not new gaps)
- Claude Sonnet 4.8 leaked (Mar 2026) — Anthropic skips a 4.7 designation for the Sonnet tier; pricing held at $3 / $15 per 1M (Anthropic source-code leak coverage, unverified for exact GA week). Add to useModelRegistry.ts once the API exposes it.
- Cursor Interactive Canvases (Apr 15, missed in v13) — agent responses render as dashboards / forms / charts inside the chat panel. Overlaps MCP Apps (A1 / SEP-1865); a single workstream now covers both surfaces.
- Copilot training-default opt-in (Apr 2026) — GitHub flipped users’ AI-training preference to opt-in by default; community backlash drove visible migration to Cline (58k stars), OpenHands (72k), Aider (27k). VibeCody’s “no training on user code” stance becomes a measurable sales axis.
- JetBrains Air (Mar 2026) — agentic IDE rebuilt on Fleet remnants; supports OpenAI Codex, Anthropic Claude Agent, Gemini CLI, and Junie as native agents. Adds to §1.2 IDE list as a watch item, not a direct VibeUI competitor today.
- Doe v. GitHub Copilot (ongoing, 2026 status) — DMCA claims dismissed; license / contract claims still proceeding; reshaping AI-gen-code compliance standards. Reinforces the Copilot training-default item above; informs /review’s open-source-license-scan UX.
- Devin v3 API GA (Q1 2026) — Cognition’s v3 API is now the primary surface (out of beta); HTML/PDF/SVG inline rendering in session sidebar; focus mode. Devin-family parity continues to track A8 (self-verifying loop) more than UI ergonomics.
The 6 new external gaps surfaced (B1–B6) plus the 9 informational rows are catalogued in §16.4 of the Fit-Gap Analysis and queued for Phase 54 in Appendix D. The §16.4 text also revises §15.4 (SWE-bench callout) to flag the contamination finding.
1. Competitive Landscape Summary
AI-assisted development splits into six tool categories. VibeCody is the only project that ships a competitive entry in every single one of them from a shared Rust + TypeScript monorepo — plus two surfaces (Watch, Flutter Mobile) that have no serious competitor as of 0.5.5.
1.1 Terminal / CLI agents
| Tool | Owner | Stack | Standout capability |
|---|---|---|---|
| OpenAI Codex CLI | OpenAI | Rust + TS | Reference agent loop, OS sandbox, approval tiers, MCP |
| Claude Code | Anthropic | Node.js | Agentic multi-file edits, 300+ MCP integrations, subagents, hooks, skills |
| Gemini CLI | Google | Node.js | Gemini-centric, built-in Google Search grounding, long-context |
| Aider | Paul Gauthier (OSS) | Python | Git-aware pair programming, repo-map, very low-cost, fast iteration |
| Goose | Block / Square (OSS) | Rust | MCP-native extensible agent, session replay, any-provider |
| OpenHands (ex-OpenDevin) | All Hands AI (OSS) | Python + sandbox | Full browser + shell sandbox agent, SWE-bench benchmark leader |
| Cline (ex-Claude Dev) | OSS | VS Code ext. | In-editor autonomous agent, plan/act toggle, tight terminal loop |
| Amp | Sourcegraph | TS | Terminal companion to Cody, agent over an indexed monorepo |
| Plandex | Plandex AI (OSS) | Go | Long-running planning + diff review, self-hosted |
| llm | Simon Willison (OSS) | Python | Minimal provider-agnostic CLI, plugin ecosystem |
| Warp AI | Warp | Rust terminal | AI-native terminal, command-suggest and block AI |
| Mentat | AbanteAI (OSS) | Python | Interactive code-edit REPL with repo-map |
| VibeCLI | TuringWorks (this project) | Rust | All of the above + 22 providers, TUI + REPL + --serve daemon, 711 skills, OpenTelemetry |
1.2 AI-native IDE / editor
| Tool | Owner | Stack | Standout capability |
|---|---|---|---|
| Cursor | Anysphere | Electron + VS Code fork | Tab model (next-action prediction), 8 parallel agents in git worktrees, 200k-token indexing, BugBot |
| Windsurf | Codeium | Electron + VS Code fork | Cascade agent with flow-awareness, planning agent, memory, checkpoints |
| Google Antigravity | Google | Electron + Gemini | Agent-first IDE, Manager View (5 parallel agents), Artifacts, knowledge base |
| GitHub Copilot Workspace | GitHub / Microsoft | Web + VS Code | Spec → plan → implementation workflow, deep GitHub integration |
| JetBrains AI Assistant | JetBrains | All JetBrains IDEs | Deep language tooling, on-prem option, Junie agent |
| Amazon Q Developer | AWS | VS Code / JetBrains | AWS-aware completions, IAM/infra transformations |
| Zed | Zed Industries | Rust + GPUI | Collaborative editor with multi-model AI panel, low latency |
| Cody | Sourcegraph | VS Code / JetBrains / web | Repo-wide embeddings, enterprise graph, bring-your-own-LLM |
| Continue.dev | OSS | VS Code / JetBrains | Open-source Copilot-style autocompletion + chat |
| Aide | CodeStory (OSS) | VS Code fork | Open-source Cursor alternative |
| Void | OSS | VS Code fork | Open-source agent editor |
| PearAI | OSS | VS Code fork | Cursor-alternative with marketplace |
| Melty | OSS | Electron | Structured agent + changelog first editor |
| Tabnine | Tabnine | VS Code / JetBrains | Privacy-first completions, on-prem |
| VibeUI (this project) | TuringWorks | Tauri + React + Rust | 293 panels + 42 composites, all 22 providers, CRDT multiplayer, hooks, skills, WASM extensions |
1.3 Cloud / remote-agent products
| Tool | Owner | Stack | Standout capability |
|---|---|---|---|
| Devin | Cognition | Cloud + web | Fully hosted autonomous engineer, browser + shell in VM |
| Replit Agent | Replit | Replit cloud | App-generation agent + hosted runtime, mobile companion |
| Bolt.new | StackBlitz | Web (WebContainers) | “Prompt to full-stack app” running in the browser |
| v0 | Vercel | Web | Shadcn/Next.js UI generator; deep Vercel integration |
| Lovable | Lovable | Web | Prompt-to-app with Supabase + Stripe scaffolds |
| Builder.io Visual Copilot | Builder.io | Web | Figma-to-code + visual editor |
| Sweep AI | Sweep | GitHub App | Issue → PR automation |
| VibeCLI --serve + VibeUI + agent-sdk | TuringWorks | Self-hosted | Same capability, self-hostable and open-source; pair from any device over mDNS/Tailscale/ngrok |
1.4 AI code review / CI bots
| Tool | Owner | Integration | Standout capability |
|---|---|---|---|
| CodeRabbit | CodeRabbit | GitHub / GitLab | Line-by-line PR review, chat back to author |
| Qodo (ex-Codium AI) | Qodo | GitHub / IDE | PR-Agent, tests generation, coverage |
| Greptile | Greptile | GitHub App | Repo-aware review using a graph index |
| Cursor BugBot | Anysphere | GitHub | Cursor-branded review bot |
| Ellipsis.dev | Ellipsis | GitHub | Q&A + review across the repo |
| Graphite AI | Graphite | GitHub stacked-PRs | AI review tuned for stacked-diff workflow |
| VibeCLI /review + VibeUI CIReviewPanel | this project | GitHub App + CLI | Same capability, runs locally or in CI; red-team mode + Counsel multi-LLM deliberation |
1.5 Completion-only / IDE helper
| Tool | Owner | Integration | Standout capability |
|---|---|---|---|
| GitHub Copilot | GitHub / MS | All IDEs | Industry-default inline completions |
| Tabnine | Tabnine | All IDEs | On-prem / privacy-first |
| Codeium (free tier) | Codeium | All IDEs | Free Copilot-tier completions |
| Continue.dev | OSS | VS Code / JetBrains | Self-hosted model completions |
| Supermaven | Supermaven | VS Code / JetBrains | 1M-token context window completions |
| VibeUI inline completions | this project | VS Code + Monaco | FIM-enabled local Ollama + any cloud provider |
1.6 Mobile / Watch — no serious competitor
| Tool | Owner | Platforms | Status |
|---|---|---|---|
| Replit mobile | Replit | iOS, Android | Replit-only; runs Replit cloud |
| Cursor mobile (preview) | Anysphere | iOS | Early preview; read-only chat |
| Windsurf mobile | Codeium | — | Announced, not shipped |
| Devin mobile web | Cognition | Web PWA | Read-only session viewer |
| VibeMobile | this project | iOS, Android, macOS, Linux, Windows, Web | Full-duplex pairing + chat + session control against any host |
| VibeWatch — Apple Watch | this project | watchOS 10+ | No peer — native SwiftUI client with dictated reply + approval flow |
| VibeWatch — Wear OS | this project | Wear OS 3+ | No peer — native Kotlin/Compose client with the same capability |
Coverage summary
| Category | Total tools surveyed | VibeCody ships a competitive entry |
|---|---|---|
| Terminal / CLI agents | 13 | ✅ VibeCLI |
| AI-native IDE / editor | 15 | ✅ VibeUI |
| Cloud / remote-agent | 7 | ✅ --serve + agent-sdk (self-hosted) |
| AI code review bots | 7 | ✅ /review + CIReviewPanel |
| Completion-only | 6 | ✅ VibeUI inline completions (FIM) |
| Mobile / watch | 4 mobile, 0 watch | ✅ VibeMobile + first-class VibeWatch |
| Total | 52 tools across 6 categories | VibeCody is the only project that ships in all six — and the only one with a native watch client. |
2. Current VibeCLI — Feature Inventory
| Feature | Status | Notes |
|---|---|---|
| Multi-provider (22 providers) | Yes Done | All 22 providers implemented with failover |
| TUI (Ratatui) | Yes Done | Chat, FileTree, DiffView, Agent screens |
| REPL mode (rustyline) | Yes Done | History, tab completion, 14 slash commands |
| Git context injection | Yes Done | Branch, status, diff in system prompt |
| /apply — single-file AI edits | Yes Done | Shows diff, requires confirmation |
| /exec — AI-generated shell commands | Yes Done | Confirmation gate |
| !cmd — direct shell execution | Yes Done | Config-gated approval |
| TOML config (~/.vibecli/config.toml) | Yes Done | Per-provider + safety settings |
| Syntax highlighting in REPL | Yes Done | syntect |
| Streaming responses | Yes Done | Token-by-token via CompletionStream; TUI + REPL |
| Agent loop (autonomous multi-step) | Yes Done | plan→act→observe, 30-step max, AgentLoop |
| Structured tool use framework | Yes Done | 7 tools: read/write/patch/bash/search/list/complete |
| Approval tiers (Suggest/AutoEdit/FullAuto) | Yes Done | 3-tier; --suggest/--auto-edit/--full-auto flags |
| OS sandbox for command execution | Yes Done | macOS sandbox-exec, Linux bwrap |
| Codebase indexing / semantic search | Partial | Regex + heuristic symbol index; embeddings pending |
| Multi-file editing (batch apply) | Yes Done | Agent WriteFile tool handles any number of files |
| AGENTS.md / project memory | Yes Done | Loads VIBECLI.md / AGENTS.md / CLAUDE.md + global |
| MCP server integration | Yes Done | JSON-RPC 2.0 stdio; /mcp list, /mcp tools |
| Non-interactive / CI mode | Yes Done | --exec flag; JSON/Markdown report; exit codes 0-3 |
| Multimodal input (images/screenshots) | Yes Done | [image.png] syntax; Claude + OpenAI vision |
| Trace / audit log | Yes Done | JSONL per session; /trace + /trace view <id> |
| GitHub Actions integration | Yes Done | .github/actions/vibecli/action.yml |
3. Current VibeUI — Feature Inventory
| Feature | Status | Notes |
|---|---|---|
| Monaco Editor integration | Yes Done | Full VS Code engine |
| Rope-based text buffer | Yes Done | ropey |
| Async file I/O + file watching | Yes Done | notify |
| Multi-workspace | Yes Done | Multiple root folders |
| Git panel (status, diff, commit, push, pull) | Yes Done | git2; stash, branch list/switch, history |
| Terminal panel (PTY) | Yes Done | portable-pty + xterm.js |
| AI chat panel | Yes Done | All 22 providers; streaming |
| Command palette | Yes Done | fuse.js fuzzy search |
| Dark/light theme | Yes Done | localStorage persistence |
| LSP client (completions, hover, go-to-def) | Yes Done | Wired to Monaco; lazy-start per language |
| Extension system (WASM) | Yes Done | Full wasmtime host; loads ~/.vibeui/extensions/*.wasm |
| Inline AI completions (FIM) | Yes Done | Monaco registerInlineCompletionsProvider; Ollama FIM format |
| Agent mode (autonomous multi-file edits) | Yes Done | AgentPanel: steps, approval, streaming, events |
| @ context (reference files/symbols in chat) | Yes Done | @query popup; file search + @git context |
| Flow-awareness (edit/command tracking) | Yes Done | FlowTracker ring buffer; injected into AI context |
| Memory / rules system | Yes Done | MemoryPanel; .vibeui.md + ~/.vibeui/rules.md |
| Diff preview before AI apply | Yes Done | Monaco DiffEditor; accept/reject; auto git stash |
| Checkpoint / undo AI session | Yes Done | Backend (git stash) + CheckpointPanel UI |
| Trace / audit log (History panel) | Yes Done | HistoryPanel; list + detail view; JSONL traces |
| Multimodal (screenshot in chat) | Yes Done | Backend (Claude + OpenAI) + AIChat UI |
| Codebase indexing (semantic) | Yes Done | Regex/heuristic + embedding-based vector search (Ollama/OpenAI) |
| Planning agent | Yes Done | PlannerAgent; plan generation, approval, guided execution |
| Multi-agent parallel execution | Yes Done | MultiAgentOrchestrator; git worktrees; ManagerView UI |
| Web context (@web) | Yes Done | @web:<url> in chat/agent; fetch + HTML-strip; ContextPicker autocomplete |
| Artifacts (task lists, plans, recordings) | Yes Done | ArtifactStore + ArtifactsPanel; annotations, async feedback |
| Voice input | Yes Done | Web Speech API hook + mic button in AIChat; pulse animation |
| Knowledge base (persistent snippets) | Yes Done | MemoryPanel + SkillLoader; auto-activating skills |
4. Fit-Gap Matrix
4.1 VibeCLI vs. CLI agents (broad)
Y = native; P = partial; — = not supported.
| Capability | VibeCLI | Codex CLI | Claude Code | Gemini CLI | Aider | Goose | OpenHands | Cline | Amp | Plandex |
|---|---|---|---|---|---|---|---|---|---|---|
| Full agent loop | Y | Y | Y | P | P | Y | Y | Y | Y | Y |
| Streaming TUI | Y | Y | P | Y | Y | Y | P | — | — | Y |
| Multi-file batch edits | Y | Y | Y | P | Y | Y | Y | Y | Y | Y |
| Approval tiers (3-level) | Y | Y | Y | — | P | Y | Y | Y | P | Y |
| Codebase indexing | Y | Y | P | P | P (repo-map) | P | P | P | Y | P |
| OS sandbox | Y | Y | P | — | — | — | Y | — | — | — |
| Project memory (*.md) | Y | Y | Y (CLAUDE.md) | — | P | P | — | P | Y | Y |
| MCP integration | Y | Y | Y (300+) | — | — | Y | P | — | P | — |
| Multi-provider (≥10) | Y (22) | P | — | — | Y | Y | Y | Y | Y | Y |
| Hooks system | Y | — | Y | — | — | — | — | — | — | — |
| Skills system | Y (711) | — | Y | — | — | — | — | — | — | — |
| Parallel multi-agent | Y | P | Y | — | — | — | — | — | — | — |
| Plan Mode | Y | — | Y | — | — | — | Y | Y | — | Y |
| Session resume | Y | Y | Y | P | Y | Y | Y | Y | Y | Y |
| Web search tool | Y | Y | Y | Y | — | Y | Y | — | Y | — |
| Code review agent | Y | Y | Y | — | — | — | — | — | Y | — |
| Red-team / pentest pipeline | Y | — | — | — | — | — | — | — | — | — |
| Counsel (multi-LLM debate) | Y | — | — | — | — | — | — | — | — | — |
| GitHub Actions integration | Y | Y | Y | — | — | — | — | — | — | — |
| OpenTelemetry | Y | Y | — | — | — | — | — | — | — | — |
| Daemon / REST + SSE server | Y | P | P | — | — | Y | — | — | Y | — |
| Mobile companion | Y | — | — | — | — | — | — | — | — | — |
| Watch companion | Y | — | — | — | — | — | — | — | — | — |
4.2 VibeUI vs. IDE / editor competitors
| Capability | VibeUI | Cursor | Windsurf | Antigravity | Copilot WS | JetBrains AI | Amazon Q | Zed | Cody | Continue | Aide | Void |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Inline AI completions | Y | Y | Y | Y | Y | Y | Y | Y | P | Y | Y | Y |
| Agent mode (multi-file) | Y | Y | Y | Y | Y | Y | P | P | P | P | Y | Y |
| Diff review before apply | Y | Y | Y | Y | Y | Y | P | P | Y | Y | Y | Y |
| @ context system | Y | Y | Y | P | P | Y | P | Y | Y | Y | Y | Y |
| Flow-awareness | Y | P | Y | P | — | P | — | — | P | — | P | P |
| Memory / rules | Y | Y | Y | Y | P | Y | — | — | Y | Y | Y | Y |
| LSP | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| WASM extension host | Y | — | — | — | — | — | — | — | — | — | — | — |
| Trace / audit log | Y | — | — | — | — | — | P | — | — | — | — | — |
| Checkpoint / undo | Y | P | Y | Y | P | P | — | — | — | P | Y | Y |
| Multimodal chat | Y | Y | P | P | P | P | — | P | P | Y | Y | Y |
| Semantic codebase index | Y | Y | Y | Y | Y | Y | P | — | Y | P | Y | Y |
| Planning agent | Y | P | Y | Y | Y | Y | — | — | — | — | Y | Y |
| Parallel agents | Y | Y (8) | Y | Y (5) | — | — | — | — | — | — | P | — |
| Next-edit prediction (Tab) | Y | Y | Y | P | — | P | — | — | — | — | P | P |
| Manager View (orchestration) | Y | — | — | Y | — | — | — | — | — | — | — | — |
| CI review bot | Y | Y (BugBot) | — | — | — | — | — | — | — | — | — | — |
| Artifacts | Y | — | — | Y | Y | — | — | — | — | — | — | — |
| Multiplayer CRDT | Y | — | — | — | — | — | — | Y | — | — | — | — |
| Rust native backend | Y | — | — | P | — | — | — | Y | — | — | — | — |
| Local / private AI (Ollama) | Y | P | P | — | — | — | — | P | P | Y | Y | Y |
| Open source | Y | — | — | — | — | — | — | Y | P | Y | Y | Y |
| Mobile + watch companions | Y | P | — | — | — | — | — | — | — | — | — | — |
4.3 VibeCLI --serve vs. cloud-agent products
| Capability | VibeCLI + agent-sdk | Devin | Replit Agent | Bolt.new | v0 | Sweep AI |
|---|---|---|---|---|---|---|
| Self-hostable | Y | — | — | — | — | — |
| Works offline (local model) | Y | — | — | — | — | — |
| Bring-your-own-LLM | Y | — | — | — | — | — |
| Full-stack code generation | P | Y | Y | Y | Y (UI only) | P |
| Long-horizon autonomy (hrs) | P | Y | Y | P | P | P |
| Browser / shell sandbox | Y | Y | Y | Y (WC) | — | — |
| GitHub issue → PR automation | Y | Y | P | — | — | Y |
| Mobile companion | Y | P | Y | — | — | — |
| Open source | Y | — | — | — | — | P |
4.4 VibeCLI /review vs. AI review bots
| Capability | VibeCLI /review + CIReviewPanel | CodeRabbit | Qodo | Greptile | Cursor BugBot | Ellipsis |
|---|---|---|---|---|---|---|
| Inline PR comments | Y | Y | Y | Y | Y | Y |
| Security-focused review | Y (red-team) | Y | P | P | P | P |
| Self-hosted option | Y | — | P | — | — | — |
| Bring-your-own-LLM | Y | — | — | — | — | — |
| Runs locally from CLI | Y | — | — | — | — | — |
| Multi-LLM deliberation (Counsel) | Y | — | — | — | — | — |
| Cost metering / budgets | Y | — | — | — | — | — |
| Compliance reporting | Y | — | Y | — | — | — |
4.5 VibeMobile / VibeWatch vs. mobile + watch surfaces
| Capability | VibeMobile + VibeWatch | Replit mobile | Cursor mobile (preview) | Devin web | Others |
|---|---|---|---|---|---|
| Native iOS | Y | Y | Y | PWA | — |
| Native Android | Y | Y | — | PWA | — |
| macOS / Linux / Windows / Web | Y | — | — | Web | — |
| Apple Watch native | Y | — | — | — | — |
| Wear OS native | Y | — | — | — | — |
| Pairs with self-hosted host | Y | — | — | — | — |
| Full-duplex session (not read-only) | Y | Y | P | P | — |
| Zero-config LAN / Tailscale / ngrok | Y | — | — | — | — |
| Handoff-style continuity | Y | — | — | P | — |
| Dictated reply on watch | Y | — | — | — | — |
| Open source | Y | — | — | — | — |
5. Differentiators to Exploit
VibeCody has unique advantages to lean into — refreshed for v0.5.5:
- Full Rust backend — lower memory, faster startup, native performance vs. Electron apps (Cursor, Windsurf, Antigravity, Continue, Aide, Void, Cline, PearAI).
- Ollama first-class + 22 providers — the widest provider catalog of any AI coding tool. Cursor/Windsurf treat local models as afterthoughts; Cody requires explicit configuration.
- Monorepo synergy — VibeCLI, VibeUI, VibeCLI App, VibeMobile, and VibeWatch share vibe-ai and vibe-core; one piece of agent work applies everywhere.
- Privacy by design — no telemetry, no cloud indexing, fully local option. Only OpenHands, Aider, Goose, and Cody offer a comparable story, and none of those ship a polished desktop IDE.
- Open source — full transparency, extensibility, self-hostable. Cursor, Windsurf, Antigravity, Copilot, JetBrains AI, Amazon Q, Devin, Replit, Bolt.new, v0 are all closed.
- Wrist-to-terminal coverage — the only product that lets a developer move from desktop → phone → watch within the same session. This is a category VibeCody effectively owns as of 0.5.5.
- Zero-config networking — mDNS, Tailscale Funnel, and ngrok auto-detection give developers a Dropbox-simple setup story no competitor matches.
- Counsel + red-team pipelines — native multi-LLM deliberation and security-focused review bake capabilities into the CLI that competitors position as separate paid products (CodeRabbit, Qodo).
6. Implementation Plan
Organized into 5 phases. Each phase builds on the previous and targets specific gap areas.
Status: Phases 1–5 in this document are complete as of February 2026. Subsequent phases 6–39 (spanning the v5 and v6 cycles, March–April 2026) are summarised in the History appendices at the end of this document. VibeCody has feature parity with Codex CLI, Claude Code, Cursor, Windsurf, and Antigravity across all critical capabilities.
Phase 1 — Agent Foundation Yes Complete
Goal: Give VibeCLI a real agent loop with streaming, tool use, and approval tiers. This is the most critical gap — without it, VibeCLI is just a chat wrapper.
1.1 Streaming TUI Responses
Crate: vibecli-cli/src/tui/
Why: Currently AI responses appear all at once; competitors stream token-by-token.
- In mod.rs: replace llm.chat() calls with llm.stream_chat() in the TUI event loop
- Add TuiMessage::AssistantChunk(String) variant to accumulate streaming tokens
- Render partial message with a blinking cursor indicator in ui.rs
- Wire up CompletionStream → tokio::spawn → mpsc::Sender<AppEvent::Chunk(String)>
Files: tui/mod.rs, tui/app.rs, tui/ui.rs
Estimate: 3 days
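The chunk wiring in the last bullet can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the project's code: it substitutes std::thread and std::sync::mpsc for tokio::spawn and the tokio channel, and AppEvent, stream_tokens, and collect_message are hypothetical names chosen for illustration.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Events delivered to the UI loop (hypothetical; mirrors AppEvent::Chunk).
#[derive(Debug, PartialEq)]
pub enum AppEvent {
    Chunk(String),
    Done,
}

/// Spawn a producer that forwards each token as its own event.
/// It stands in for a task that drains a CompletionStream.
pub fn stream_tokens(tokens: Vec<String>) -> Receiver<AppEvent> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for token in tokens {
            // A send error means the receiver was dropped; stop quietly.
            if tx.send(AppEvent::Chunk(token)).is_err() {
                return;
            }
        }
        let _ = tx.send(AppEvent::Done);
    });
    rx
}

/// Accumulate chunks into the partial assistant message until Done arrives.
pub fn collect_message(rx: &Receiver<AppEvent>) -> String {
    let mut message = String::new();
    for event in rx.iter() {
        match event {
            // A real TUI would redraw the partial message here.
            AppEvent::Chunk(text) => message.push_str(&text),
            AppEvent::Done => break,
        }
    }
    message
}
```

Keeping the accumulated string on the consumer side means the render code only ever sees one growing message, which is the role the AssistantChunk variant plays in the plan above.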
1.2 Tool Use Framework (vibe-ai)
Crate: vibeui/crates/vibe-ai/
Why: All competitors give the LLM structured tools. Without this, no agent loop is possible.
Add to vibe-ai:
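```rust
// src/tools.rs
// Assumes the `anyhow` and `async-trait` crates. The attribute keeps the
// async method usable behind `Arc<dyn ToolExecutor>`.
use anyhow::Result;
use async_trait::async_trait;

/// Structured actions the model can request.
pub enum ToolCall {
    ReadFile { path: String },
    WriteFile { path: String, content: String },
    ApplyPatch { path: String, patch: String },
    BashCommand { command: String },
    SearchFiles { query: String, glob: Option<String> },
    ListDirectory { path: String },
    GetGitStatus,
    GetGitDiff { file: Option<String> },
}

/// Outcome of one tool invocation, fed back to the model.
pub struct ToolResult {
    pub tool: String,
    pub output: String,
    pub success: bool,
}

#[async_trait]
pub trait ToolExecutor: Send + Sync {
    async fn execute(&self, call: &ToolCall) -> Result<ToolResult>;
}
```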
- Implement VibeTool executor in vibe-core that dispatches each variant
- Extend AIProvider trait with chat_with_tools() that sends tools in the provider’s native format (OpenAI function calling, Claude tool use, Ollama tool use)
- Parse tool call responses from each provider
Files: vibe-ai/src/tools.rs (new), vibe-ai/src/provider.rs, each providers/*.rs
Estimate: 1 week
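The "dispatches each variant" step can be illustrated with a small synchronous executor. This is a hedged sketch, not the planned VibeTool implementation: it covers only three of the ToolCall variants, uses std::fs instead of async I/O, and execute_tool and list_names are hypothetical helper names.

```rust
use std::fs;
use std::io;

/// Subset of the ToolCall variants, repeated so the sketch compiles alone.
pub enum ToolCall {
    ReadFile { path: String },
    WriteFile { path: String, content: String },
    ListDirectory { path: String },
}

pub struct ToolResult {
    pub tool: String,
    pub output: String,
    pub success: bool,
}

/// Run one call and fold any I/O error into the result,
/// so the model sees a failure as an observation rather than a crash.
pub fn execute_tool(call: &ToolCall) -> ToolResult {
    let (tool, outcome): (&str, io::Result<String>) = match call {
        ToolCall::ReadFile { path } => ("read_file", fs::read_to_string(path)),
        ToolCall::WriteFile { path, content } => (
            "write_file",
            fs::write(path, content).map(|_| format!("wrote {} bytes", content.len())),
        ),
        ToolCall::ListDirectory { path } => ("list_directory", list_names(path)),
    };
    match outcome {
        Ok(output) => ToolResult { tool: tool.to_string(), output, success: true },
        Err(err) => ToolResult { tool: tool.to_string(), output: err.to_string(), success: false },
    }
}

/// Return the entry names of a directory, sorted, one per line.
fn list_names(path: &str) -> io::Result<String> {
    let mut names = Vec::new();
    for entry in fs::read_dir(path)? {
        names.push(entry?.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names.join("\n"))
}
```

Returning a ToolResult with success set to false, instead of propagating the error, lets the agent loop hand the failure text back to the model and continue.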
1.3 Agent Loop (vibe-ai)
Why: The core of Codex CLI and Claude Code — an autonomous plan-act-observe cycle.
```rust
// src/agent.rs
use std::sync::Arc;

use anyhow::Result; // assumed error type; the roadmap does not name one

pub struct AgentLoop {
    provider: Arc<dyn AIProvider>,
    tools: Arc<dyn ToolExecutor>,
    approval: ApprovalPolicy,
    max_steps: usize,
}

pub enum ApprovalPolicy {
    Suggest,  // Show every action, require y/N
    AutoEdit, // Auto-apply file patches, prompt for commands
    FullAuto, // Execute everything autonomously
}

impl AgentLoop {
    pub async fn run(&self, task: &str, context: &AgentContext) -> Result<AgentResult> {
        // 1. Build system prompt with tools + context
        // 2. Loop: LLM → parse tool calls → approve → execute → feed result back
        // 3. Stop on: task_complete tool, max_steps, error
        todo!("loop body is not specified in this roadmap")
    }
}
```
- VibeCLI TUI: add /agent <task> command that invokes AgentLoop
- Show a live “action feed” panel listing each step as it executes
- Wire ApprovalPolicy to existing safety config + add --suggest / --auto-edit / --full-auto CLI flags
Files: vibe-ai/src/agent.rs (new), vibecli-cli/src/main.rs, vibecli-cli/src/tui/
Estimate: 1 week
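The three stop conditions in the loop outline (task complete, max_steps, error) can be shown with a small synchronous sketch. The ModelTurn and AgentResult types here are hypothetical simplifications, and the model and tool runner are passed as closures so the control flow can be exercised without a provider.

```rust
/// What the model asks for on one turn (hypothetical, simplified).
pub enum ModelTurn {
    ToolCall(String),
    TaskComplete(String),
}

#[derive(Debug, PartialEq)]
pub enum AgentResult {
    Completed { summary: String, steps: usize },
    MaxStepsReached,
    Failed(String),
}

/// Plan-act-observe loop. `model` sees the transcript so far and picks the next
/// turn; `run_tool` executes a tool call and returns its output or an error.
pub fn run_agent(
    max_steps: usize,
    mut model: impl FnMut(&[String]) -> ModelTurn,
    mut run_tool: impl FnMut(&str) -> Result<String, String>,
) -> AgentResult {
    let mut transcript: Vec<String> = Vec::new();
    for step in 0..max_steps {
        match model(&transcript) {
            ModelTurn::TaskComplete(summary) => {
                return AgentResult::Completed { summary, steps: step };
            }
            ModelTurn::ToolCall(call) => match run_tool(&call) {
                // The observation is fed back so the next turn can react to it.
                Ok(output) => transcript.push(format!("{call} -> {output}")),
                Err(err) => return AgentResult::Failed(err),
            },
        }
    }
    AgentResult::MaxStepsReached
}
```

The approval check would sit between choosing a tool call and running it; it is left out here to keep the stop conditions visible.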
1.4 Approval Tiers (3-level)
Why: Codex CLI’s most visible safety feature; binary approve/deny is not enough.
- Suggest (default): every file write and command shows diff/preview, requires y
- AutoEdit: file patches auto-applied; bash commands require approval
- FullAuto: all actions execute (only in sandbox or explicit opt-in)
Extend config:
[safety]
approval_policy = "suggest" # suggest | auto-edit | full-auto
sandbox = true # enable OS-level sandbox when full-auto
CLI flags: --suggest, --auto-edit, --full-auto
Files: vibecli-cli/src/config.rs, vibe-ai/src/agent.rs
Estimate: 2 days
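A sketch of how the three tiers could map to a prompt decision, assuming the config strings shown above and a hypothetical ActionKind split between file edits and shell commands:

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ApprovalPolicy {
    Suggest,
    AutoEdit,
    FullAuto,
}

/// Hypothetical two-way split of agent actions for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ActionKind {
    FileEdit,
    ShellCommand,
}

impl FromStr for ApprovalPolicy {
    type Err = String;

    /// Parses the `approval_policy` value from the [safety] config table.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "suggest" => Ok(Self::Suggest),
            "auto-edit" => Ok(Self::AutoEdit),
            "full-auto" => Ok(Self::FullAuto),
            other => Err(format!("unknown approval_policy: {other}")),
        }
    }
}

/// Returns true when the user must confirm the action before it runs.
pub fn needs_prompt(policy: ApprovalPolicy, action: ActionKind) -> bool {
    match (policy, action) {
        (ApprovalPolicy::Suggest, _) => true,
        (ApprovalPolicy::AutoEdit, ActionKind::ShellCommand) => true,
        (ApprovalPolicy::AutoEdit, ActionKind::FileEdit) => false,
        (ApprovalPolicy::FullAuto, _) => false,
    }
}
```

Spelling out every (policy, action) pair lets the compiler flag any new tier or action kind that has not been given an explicit decision.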
1.5 Multi-File Batch Edits
Why: /apply only handles one file. Real agent work touches many files.
- Agent tool WriteFile + ApplyPatch already handles this at the tool level
- Add BatchApply confirmation UI in TUI: show all proposed changes as a unified diff across files, single y/N to accept all or file-by-file review
- Preserve undo: create a git stash before applying batch changes
Files: vibecli-cli/src/tui/components/ (new batch_diff.rs), vibe-core/src/git.rs
Estimate: 3 days
Phase 2 — Context Intelligence (Complete)
Goal: Make VibeCLI and VibeUI context-aware at the codebase level — the core of Cursor’s competitive moat.
2.1 Codebase Indexing Engine (vibe-core)
Why: Cursor indexes 200k tokens of codebase. Currently VibeCLI only injects a truncated git diff.
New module: vibe-core/src/index/
pub struct CodebaseIndex {
    // tree-sitter parsed symbol table
    symbols: HashMap<String, Vec<SymbolInfo>>,
    // file content cache with modification times
    file_cache: HashMap<PathBuf, (SystemTime, String)>,
    // optional: vector embeddings for semantic search
    embeddings: Option<EmbeddingStore>,
}

pub struct SymbolInfo {
    pub name: String,
    pub kind: SymbolKind, // Function, Struct, Trait, Class, etc.
    pub file: PathBuf,
    pub line: usize,
    pub signature: String,
}
Implementation:
- Use tree-sitter + language grammars (Rust, TypeScript, Python, Go) to parse symbols
- Walk workspace with walkdir, skip .gitignore entries
- Incremental re-index on file-change events from notify
- Expose search_symbols(query) → ranked Vec<SymbolInfo>
- Expose search_content(regex) → Vec<(PathBuf, line, snippet)>
- For semantic search: embed code chunks using a local embedding model via Ollama (/api/embeddings) and store in an in-memory HNSW index (instant-distance crate)
Files: vibe-core/src/index/ (new directory: mod.rs, symbol.rs, content.rs, embeddings.rs)
Estimate: 2 weeks
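The roadmap does not specify how search_symbols ranks its results. One simple possibility, shown here purely as an assumption, is exact match first, then prefix, then substring, with shorter names winning ties. The SymbolInfo below is a trimmed copy of the struct above.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct SymbolInfo {
    pub name: String,
    pub file: String,
    pub line: usize,
}

/// Lower score ranks first: exact match, then prefix, then substring.
/// Matching is case-insensitive. Returns None when the name does not match.
fn match_score(name: &str, query: &str) -> Option<u8> {
    let (name, query) = (name.to_lowercase(), query.to_lowercase());
    if name == query {
        Some(0)
    } else if name.starts_with(&query) {
        Some(1)
    } else if name.contains(&query) {
        Some(2)
    } else {
        None
    }
}

pub fn search_symbols(symbols: &[SymbolInfo], query: &str) -> Vec<SymbolInfo> {
    let mut hits: Vec<(u8, &SymbolInfo)> = symbols
        .iter()
        .filter_map(|s| match_score(&s.name, query).map(|score| (score, s)))
        .collect();
    // Tie-break on shorter names, then alphabetically, for a stable order.
    hits.sort_by(|a, b| {
        a.0.cmp(&b.0)
            .then(a.1.name.len().cmp(&b.1.name.len()))
            .then(a.1.name.cmp(&b.1.name))
    });
    hits.into_iter().map(|(_, s)| s.clone()).collect()
}
```

A production index would more likely use a fuzzy matcher or BM25 over identifiers; the tiered score is only meant to show where ranking plugs in.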
2.2 Context Injection Upgrade
Why: Current context injection is naive — 2000 chars of git diff. Competitors provide full codebase understanding.
Replace the static diff injection in tui/mod.rs with a smart context builder:
pub struct ContextBuilder<'a> {
    index: &'a CodebaseIndex,
    git: &'a GitStatus,
    open_files: &'a [PathBuf],
    budget: usize, // token budget
}

impl ContextBuilder<'_> {
    pub fn build_for_task(&self, task: &str) -> String {
        // 1. Always include: branch, changed files, full diff of changed files
        // 2. Include: symbols most relevant to the task (BM25 + semantic)
        // 3. Fill budget with: content of open files
        // 4. Truncate intelligently at symbol/function boundaries
    }
}
Files: vibe-core/src/context.rs (new), vibecli-cli/src/tui/mod.rs
Estimate: 3 days
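The budget-filling step can be sketched as a greedy pass over prioritised sections. Everything here is an assumption for illustration: the Section type, the "## title" framing, and the rough four-characters-per-token estimate, which stands in for a real tokenizer.

```rust
pub struct Section {
    pub title: String,
    pub body: String,
    /// Lower value means more important (for example 0 = git diff).
    pub priority: u8,
}

/// Rough token estimate; about 4 characters per token is an assumption,
/// not a tokenizer.
fn estimate_tokens(text: &str) -> usize {
    (text.chars().count() + 3) / 4
}

/// Adds sections in priority order until the token budget is used up.
pub fn build_context(mut sections: Vec<Section>, budget: usize) -> String {
    sections.sort_by_key(|s| s.priority);
    let mut out = String::new();
    let mut used = 0;
    for s in sections {
        let block = format!("## {}\n{}\n", s.title, s.body);
        let cost = estimate_tokens(&block);
        // Skip whole sections that do not fit rather than cutting mid-section.
        if used + cost > budget {
            continue;
        }
        used += cost;
        out.push_str(&block);
    }
    out
}
```

Skipping an oversized section instead of stopping lets smaller, lower-priority sections still use the remaining budget. Truncating at symbol boundaries, as the outline suggests, would replace the skip.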
2.3 AGENTS.md / Project Memory
Why: Claude Code uses CLAUDE.md; Codex uses AGENTS.md for persistent project-specific instructions.
- On startup, look for AGENTS.md or VIBECLI.md in CWD and parent directories
- Inject contents as the first system message (before git context)
- VibeCLI command: /memory edit — opens VIBECLI.md in $EDITOR
- VibeCLI command: /memory show — prints current memory
- Support tiered memory: global (~/.vibecli/memory.md) → repo → directory
Files: vibecli-cli/src/memory.rs (new), vibecli-cli/src/main.rs, vibecli-cli/src/tui/mod.rs
Estimate: 2 days
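The parent-directory lookup can be sketched with Path::ancestors. The function name and the outermost-first ordering are assumptions; the ordering is chosen so that more specific (deeper) instructions come last and can override broader ones.

```rust
use std::path::{Path, PathBuf};

const MEMORY_FILES: [&str; 2] = ["AGENTS.md", "VIBECLI.md"];

/// Returns memory files from the outermost ancestor down to `start`,
/// so the most specific directory's instructions appear last.
pub fn find_memory_files(start: &Path) -> Vec<PathBuf> {
    let mut found = Vec::new();
    // ancestors() yields `start` first, then each parent up to the root.
    for dir in start.ancestors() {
        for name in MEMORY_FILES {
            let candidate = dir.join(name);
            if candidate.is_file() {
                found.push(candidate);
            }
        }
    }
    found.reverse();
    found
}
```

The global tier (~/.vibecli/memory.md) would be prepended separately, since the home directory is not necessarily an ancestor of the workspace.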
2.4 @ Context System (VibeUI)
Why: Cursor’s most-loved UX feature — @file, @symbol, @web, @docs in the chat box.
In vibeui/src/components/AIChat.tsx:
- Detect @ in the input box and open a fuzzy-search popup
- Options: @file:<path>, @symbol:<name>, @web:<url>, @git:diff, @git:history
- Inject the referenced content into the message before sending
- Backend: Tauri commands search_files_for_context, get_symbol_context, fetch_url_content
Files: vibeui/src/components/AIChat.tsx, vibeui/src/components/ContextPicker.tsx (new), vibeui/src-tauri/src/commands/context.rs (new)
Estimate: 1 week
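On the backend side, resolving references first requires pulling the @kind:value tokens out of the message. A minimal sketch follows; the ContextRef enum and parse_context_refs are hypothetical names, and values containing spaces are not handled.

```rust
#[derive(Debug, PartialEq)]
pub enum ContextRef {
    File(String),
    Symbol(String),
    Web(String),
    Git(String),
}

/// Extracts `@kind:value` tokens from a chat message.
/// Unknown kinds and empty values are ignored.
pub fn parse_context_refs(input: &str) -> Vec<ContextRef> {
    input
        .split_whitespace()
        .filter_map(|word| word.strip_prefix('@'))
        // Split at the first ':' only, so URLs keep their own colons.
        .filter_map(|rest| rest.split_once(':'))
        .filter(|(_, value)| !value.is_empty())
        .filter_map(|(kind, value)| match kind {
            "file" => Some(ContextRef::File(value.to_string())),
            "symbol" => Some(ContextRef::Symbol(value.to_string())),
            "web" => Some(ContextRef::Web(value.to_string())),
            "git" => Some(ContextRef::Git(value.to_string())),
            _ => None,
        })
        .collect()
}
```

Each parsed reference would then be handed to the matching Tauri command to fetch the content that gets injected into the message.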
Phase 3 — Inline Intelligence (Complete)
Goal: Wire up LSP and inline AI completions in VibeUI to match Cursor/Windsurf’s core editor experience.
3.1 LSP Client — Wire to Monaco
Why: The LSP stub exists; it needs to connect to Monaco for real editor intelligence.
Complete vibe-lsp:
- Spawn language server process (e.g., rust-analyzer, typescript-language-server, pyright)
- Bridge LSP textDocument/completion, textDocument/hover, textDocument/definition, textDocument/publishDiagnostics → Tauri events → Monaco registerCompletionItemProvider, registerHoverProvider, setModelMarkers
- Language server discovery: look for executables in PATH; show install prompt if missing
- Auto-start LSP on file open based on language detection
Files: vibe-lsp/src/client.rs (complete), vibe-lsp/src/bridge.rs (new), vibeui/src-tauri/src/commands/lsp.rs (new), vibeui/src/App.tsx
Estimate: 2 weeks
3.2 Inline AI Completions
Why: Cursor’s Tab model is the #1 reason developers pay for it.
Implementation strategy:
- Wire CompletionEngine from vibe-ai to Monaco’s registerInlineCompletionsProvider
- Debounce: trigger completion 300ms after the user stops typing
- CodeContext is built from: Monaco cursor position, surrounding 1000 chars prefix/suffix, active file language
- Render ghost text (grayed-out suggestion) in Monaco
- Accept with Tab, dismiss with Escape
- For local mode (Ollama): use FIM (fill-in-the-middle) format with <|fim_prefix|>, <|fim_suffix|>, <|fim_middle|> tokens
- For cloud models: use standard prefix+suffix prompt
Files: vibeui/src-tauri/src/commands/completion.rs (new), vibeui/src/App.tsx, vibeui/crates/vibe-ai/src/completion.rs
Estimate: 1 week
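A sketch of the FIM prompt assembly with the 1000-character window mentioned above. The prefix, suffix, middle ordering is an assumption (it is the layout several FIM-trained models expect, but the right template depends on the model served by Ollama), and the helper names are hypothetical.

```rust
const WINDOW_CHARS: usize = 1000;

/// Last `n` characters of `s`, respecting UTF-8 boundaries.
fn tail_chars(s: &str, n: usize) -> &str {
    let count = s.chars().count();
    if count <= n {
        return s;
    }
    let start = s.char_indices().nth(count - n).map(|(i, _)| i).unwrap_or(0);
    &s[start..]
}

/// First `n` characters of `s`, respecting UTF-8 boundaries.
fn head_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        Some((end, _)) => &s[..end],
        None => s,
    }
}

/// Builds a fill-in-the-middle prompt around the cursor.
/// `cursor_byte` must lie on a character boundary of `text`.
pub fn build_fim_prompt(text: &str, cursor_byte: usize) -> String {
    let (before, after) = text.split_at(cursor_byte);
    format!(
        "<|fim_prefix|>{}<|fim_suffix|>{}<|fim_middle|>",
        tail_chars(before, WINDOW_CHARS),
        head_chars(after, WINDOW_CHARS)
    )
}
```

Windowing by characters rather than bytes avoids slicing through a multi-byte character, which would panic in Rust.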
3.3 Flow Awareness Engine (VibeUI)
Why: Windsurf’s key differentiator — Cascade knows everything you’ve done. Replicate this.
New Tauri event bus: FlowTracker
Track and persist:
- Files opened/closed (with timestamps)
- Files edited (which lines)
- Terminal commands run (command + exit code)
- Clipboard content (on focus events, opt-in)
- Recent AI chat exchanges
Expose as context to AI agent:
pub struct FlowContext {
    // SystemTime rather than Instant: Instant is opaque and cannot be persisted across restarts
    pub recently_viewed: Vec<(PathBuf, SystemTime)>,
    pub recently_edited: Vec<(PathBuf, Vec<Range>)>,
    pub recent_commands: Vec<(String, i32)>, // (command, exit_code)
    pub current_file: Option<PathBuf>,
    pub cursor_position: Option<Position>,
}
This gets injected into every AI request to give the model full awareness of what the developer is doing.
Files: vibeui/src-tauri/src/flow.rs (new), vibeui/src/App.tsx
Estimate: 1 week
3.4 Diff Review Before AI Apply (VibeUI)
Why: AI edits are currently applied without review — a critical trust gap.
- Any AI-proposed file change goes through a DiffReview modal: unified diff with syntax highlighting, accept/reject per hunk
- Before applying: create git stash automatically (silent, named vibeui-pre-ai-TIMESTAMP)
- After applying: show “Changes applied — Undo all” button that pops the stash
Files: vibeui/src/components/DiffReview.tsx (new), vibeui/src-tauri/src/commands/git.rs
Estimate: 3 days
Phase 4 — Agentic Editor (Complete)
Goal: Make VibeUI a full agentic IDE — matching Antigravity’s Manager View and Cursor’s Composer.
4.1 Agent Mode in VibeUI
Why: The gap between VibeUI (chat panel) and Cursor/Windsurf (full agent) is the biggest competitive delta.
- New “Agent” tab in the AI panel (alongside “Chat”)
- User describes a high-level task: “Add OAuth2 login to the Express app”
- Agent uses the tool framework from Phase 1 to:
  - Read relevant files via ReadFile tool
  - Search for symbols via SearchFiles
  - Plan a list of changes (shown as a todo list, à la Windsurf)
  - Execute changes file by file with diff preview
  - Run tests via BashCommand to verify
  - Report result
- Show a live “Steps” panel listing each action with status (pending/in-progress/done/error)
- Each step is expandable to show tool input/output
Files: vibeui/src/components/AgentPanel.tsx (new), vibeui/src-tauri/src/agent.rs (new), vibeui/crates/vibe-ai/src/agent.rs
Estimate: 2 weeks
4.2 Memory / Rules System (VibeUI)
Why: Both Cursor (.cursorrules) and Windsurf (Cascade Memories) have persistent AI instructions.
- Support .vibeui.md in workspace root as project-level AI instructions
- Global rules in ~/.vibeui/rules.md
- Settings panel: “AI Rules” tab for editing rules inline
- Cascade-style auto-memory: after each AI session, offer to save key decisions as a memory snippet
- Knowledge base: searchable store of code snippets and past solutions, surfaced automatically in context
Files: vibeui/src/components/MemoryPanel.tsx (new), vibeui/src-tauri/src/memory.rs (new)
Estimate: 1 week
4.3 Checkpoint System (VibeUI)
Why: Windsurf’s checkpoints let you rewind the entire AI session.
- Before any agent action: create a named git snapshot (git stash push -m "vibe-checkpoint-N")
- Show checkpoint history in a timeline panel
- “Restore to checkpoint N” — pops stash, restores file state
- Checkpoints are auto-created at: session start, before each agent step
Files: vibeui/src/components/CheckpointPanel.tsx (new), vibeui/src-tauri/src/checkpoint.rs (new), vibe-core/src/git.rs
Estimate: 4 days
4.4 Planning Agent (two-level)
Why: Windsurf separates a “planner” (long-horizon) from an “executor” (single-step). This dramatically improves complex task performance.
Implement in vibe-ai/src/planner.rs:
pub struct PlannerAgent {
    planner_model: Arc<dyn AIProvider>,  // frontier model for planning
    executor_model: Arc<dyn AIProvider>, // fast model for execution steps
}

pub struct Plan {
    pub goal: String,
    pub steps: Vec<PlanStep>,
}

pub struct PlanStep {
    pub description: String,
    pub estimated_files: Vec<PathBuf>,
    pub status: StepStatus,
}
- Planner LLM: generates the full plan as structured JSON
- For each step: executor LLM performs the actual tool calls
- Planner re-evaluates after each step completes (adaptive planning)
- UI: plan shown as a todo list at the top of the Agent panel; steps update in real time
Files: vibe-ai/src/planner.rs (new), vibeui/src/components/AgentPanel.tsx
Estimate: 1 week
4.5 Multi-Agent Parallel Execution
Why: Cursor runs 8 parallel agents; Antigravity runs 5. This is the throughput multiplier.
- VibeCLI: vibecli --agent <task> --parallel N spawns N sub-processes, each an AgentLoop on a git worktree
- VibeUI: “Parallel Agents” view in the Manager tab — spawn up to 5 agents on different tasks simultaneously
- Each agent operates on an isolated git worktree (no conflicts)
- Results merged: show diff comparison of each agent’s output, user picks winner or merges
Files: vibe-ai/src/multi_agent.rs (new), vibeui/src/components/ManagerView.tsx (new), vibe-core/src/git.rs (add worktree support)
Estimate: 2 weeks
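Isolating each agent on its own worktree comes down to one git worktree add call per agent. The sketch below only builds the arguments; the vibe-agent-N branch naming and the WorktreePlan type are assumptions for illustration.

```rust
use std::path::{Path, PathBuf};

pub struct WorktreePlan {
    pub path: PathBuf,
    pub branch: String,
    /// Arguments for `git`, i.e. `git worktree add -b <branch> <path>`.
    pub args: Vec<String>,
}

/// Plans an isolated worktree and branch for one agent under `base_dir`.
pub fn plan_worktree(base_dir: &Path, agent_index: usize) -> WorktreePlan {
    let branch = format!("vibe-agent-{agent_index}");
    let path = base_dir.join(&branch);
    let args = vec![
        "worktree".to_string(),
        "add".to_string(),
        "-b".to_string(),
        branch.clone(),
        path.to_string_lossy().into_owned(),
    ];
    WorktreePlan { path, branch, args }
}
```

A caller would run this with std::process::Command::new("git").args(&plan.args) from the repository root, then start the agent with plan.path as its working directory.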
Phase 5 — Ecosystem & Polish (Complete)
Goal: Close the remaining gaps, ship differentiating features, and establish the open ecosystem.
5.1 MCP (Model Context Protocol) Integration
Why: Claude Code has 300+ MCP integrations; VibeCLI/VibeUI have zero.
- Implement MCP client in vibe-ai/src/mcp.rs — JSON-RPC 2.0 over stdio or SSE
- MCP servers auto-discovered from config:

[[mcp_servers]]
name = "github"
command = "npx @modelcontextprotocol/server-github"

[[mcp_servers]]
name = "postgres"
command = "npx @modelcontextprotocol/server-postgres"
args = ["postgresql://localhost/mydb"]

- MCP tools exposed to the agent alongside built-in tools
- MCP resources (e.g., database schema, API docs) injected into context
Files: vibe-ai/src/mcp.rs (new), vibecli-cli/src/config.rs, vibeui/src-tauri/src/mcp.rs (new)
Estimate: 1.5 weeks
5.2 OS Sandbox for Command Execution
Why: Codex CLI uses Apple Seatbelt + Linux seccomp. VibeCLI runs commands unrestricted.
- macOS: wrap command execution in sandbox-exec with a restricted profile (no network, write only to CWD)
- Linux: use bwrap (bubblewrap) for namespace isolation
- Windows: use Job Objects for process isolation
- FullAuto mode requires sandbox OR explicit --no-sandbox flag
Files: vibe-core/src/executor.rs, vibecli-cli/src/config.rs
Estimate: 1 week
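For the Linux path, the "no network, write only to CWD" policy could translate to bubblewrap arguments along these lines. This is a sketch of one plausible argument set, not the shipped profile; the bwrap_args function is a hypothetical helper.

```rust
/// Builds arguments for `bwrap` that run `command` with a read-only
/// filesystem, a writable working directory, and no network.
pub fn bwrap_args(cwd: &str, command: &[&str]) -> Vec<String> {
    let mut args: Vec<String> = [
        "--ro-bind", "/", "/", // whole filesystem visible but read-only
        "--dev", "/dev",       // minimal /dev
        "--proc", "/proc",     // fresh /proc
        "--bind", cwd, cwd,    // only the working directory is writable
        "--chdir", cwd,
        "--unshare-net",       // no network access
        "--die-with-parent",   // do not outlive the agent process
    ]
    .iter()
    .map(|s| s.to_string())
    .collect();
    // bwrap treats the first non-option argument as the command to run.
    args.extend(command.iter().map(|s| s.to_string()));
    args
}
```

The result would be passed to std::process::Command::new("bwrap"). Bind order matters: the writable bind of the working directory must come after the read-only bind of the root so that it takes effect.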
5.3 Non-Interactive / CI Mode (VibeCLI)
Why: Codex CLI supports codex exec for automation pipelines.
vibecli exec "Add docstrings to all public functions in src/" --auto-edit --output report.md
vibecli exec "Fix all clippy warnings" --full-auto --sandbox
- No TUI and no user prompts; running in suggest mode fails with an error instead of prompting
- Writes a structured JSON/markdown report of all actions taken
- Exit codes: 0 (success), 1 (partial), 2 (failed), 3 (approval required)
- GitHub Actions marketplace action: vibecody/vibecli-action@v1
Files: vibecli-cli/src/main.rs, vibecli-cli/src/ci.rs (new), .github/actions/vibecli/ (new)
Estimate: 1 week
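The exit-code contract is small enough to encode as an enum. The codes come from the list above; the classify rule (what counts as partial versus failed) is an assumption, since the roadmap does not define it.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExecOutcome {
    Success,
    Partial,
    Failed,
    ApprovalRequired,
}

impl ExecOutcome {
    /// Process exit code for CI pipelines.
    pub fn exit_code(self) -> i32 {
        match self {
            ExecOutcome::Success => 0,
            ExecOutcome::Partial => 1,
            ExecOutcome::Failed => 2,
            ExecOutcome::ApprovalRequired => 3,
        }
    }
}

/// Assumed rule: a blocked approval wins, then all-succeeded, then none-succeeded.
pub fn classify(total: usize, succeeded: usize, blocked_on_approval: bool) -> ExecOutcome {
    if blocked_on_approval {
        ExecOutcome::ApprovalRequired
    } else if succeeded == total {
        ExecOutcome::Success
    } else if succeeded == 0 {
        ExecOutcome::Failed
    } else {
        ExecOutcome::Partial
    }
}
```

A CI entry point would end with std::process::exit(classify(total, ok, blocked).exit_code()).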
5.4 Multimodal Input (VibeCLI + VibeUI)
Why: Cursor and Codex CLI support pasting screenshots for visual debugging.
- VibeCLI: detect image paths in input (/chat [image.png] explain this error)
- VibeUI: drag-and-drop or paste image into chat; encode as base64 and send with the message
- Providers: Claude and OpenAI support vision natively; add image encoding to those providers
Files: vibe-ai/src/provider.rs (add ImageContent to Message), vibe-ai/src/providers/claude.rs, vibe-ai/src/providers/openai.rs, vibeui/src/components/AIChat.tsx
Estimate: 4 days
5.5 Extension System (VibeUI) — Complete
Why: The wasmtime stub exists. Complete it so third parties can extend VibeUI.
Define the extension host API:
// Host functions exposed to WASM extensions
pub trait ExtensionHost {
    fn register_command(&self, name: &str, handler: Box<dyn Fn(&[&str]) -> Result<String>>);
    fn on_file_save(&self, handler: Box<dyn Fn(&Path)>);
    fn on_text_change(&self, handler: Box<dyn Fn(&Path, &str)>);
    fn read_file(&self, path: &Path) -> Result<String>;
    fn write_file(&self, path: &Path, content: &str) -> Result<()>;
    fn show_notification(&self, message: &str);
    fn get_ai_completion(&self, prompt: &str) -> Result<String>;
}
- Extensions loaded from ~/.vibeui/extensions/*.wasm
- Extension marketplace page on the docs site
- Example extensions: prettier-format.wasm, rustfmt-on-save.wasm
Files: vibe-extensions/src/host.rs (complete), vibe-extensions/src/api.rs (new), vibeui/src-tauri/src/extensions.rs
Estimate: 2 weeks
5.6 Trace / Audit Log
Why: Codex CLI records every action for inspection and debugging.
- Agent loop writes a structured JSONL trace: ~/.vibecli/traces/<timestamp>.jsonl
- Each entry: { timestamp, step, tool, input, output, duration_ms, approved_by }
- VibeCLI command: /trace — lists recent traces
- VibeCLI command: /trace view <id> — renders trace as a human-readable timeline in TUI
- VibeUI: “History” panel showing recent agent sessions with expandable trace
Files: vibe-ai/src/trace.rs (new), vibecli-cli/src/tui/components/trace_view.rs (new), vibeui/src/components/HistoryPanel.tsx (new)
Estimate: 3 days
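One trace line per step, with the fields listed above, could be produced as follows. The field types (for example a Unix-seconds timestamp) are assumptions, and the hand-rolled escaping is only there to keep the sketch dependency-free; a real implementation would more likely use serde_json.

```rust
pub struct TraceEntry<'a> {
    pub timestamp: u64, // Unix seconds (assumed representation)
    pub step: u32,
    pub tool: &'a str,
    pub input: &'a str,
    pub output: &'a str,
    pub duration_ms: u64,
    pub approved_by: &'a str,
}

/// Encodes `s` as a JSON string literal, escaping quotes, backslashes,
/// and control characters.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Serialises one entry as a single JSONL line (terminated by a newline).
pub fn to_jsonl(e: &TraceEntry<'_>) -> String {
    format!(
        "{{\"timestamp\":{},\"step\":{},\"tool\":{},\"input\":{},\"output\":{},\"duration_ms\":{},\"approved_by\":{}}}\n",
        e.timestamp,
        e.step,
        json_string(e.tool),
        json_string(e.input),
        json_string(e.output),
        e.duration_ms,
        json_string(e.approved_by)
    )
}
```

Because embedded newlines are escaped, each entry stays on one physical line, so the trace file can be appended to and read back line by line.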
7. Prioritized Feature Backlog
Completed — Phases 1–2 (Agent Foundation + Context Intelligence)
| # | Feature | Addresses | Status |
|---|---|---|---|
| 1 | Streaming TUI responses | Codex, Claude Code | Done |
| 2 | Tool use framework (7 tools) | All | Done |
| 3 | Agent loop (plan→act→observe) | Codex, Claude Code | Done |
| 4 | Approval tiers (Suggest/AutoEdit/FullAuto) | Codex, Claude Code | Done |
| 5 | Multi-file batch edits | All | Done |
| 6 | Codebase indexing (regex/heuristic + embeddings) | Cursor, Windsurf | Done |
| 7 | Project memory (AGENTS.md / VIBECLI.md) | Codex, Claude Code | Done |
| 8 | Diff review before apply | All | Done |
Completed — Phase 3 (Inline Intelligence)
| # | Feature | Addresses | Status |
|---|---|---|---|
| 9 | LSP in Monaco (completions, hover, go-to-def) | Cursor, Windsurf | Done |
| 10 | Inline AI completions (FIM) | Cursor, Windsurf | Done |
| 11 | @ context system | Cursor, Windsurf | Done |
| 12 | Flow-awareness engine (FlowTracker) | Windsurf | Done |
Completed — Phases 4–5 (Agentic Editor + Ecosystem)
| # | Feature | Addresses | Status |
|---|---|---|---|
| 13 | Agent mode in VibeUI (AgentPanel) | Antigravity, Cursor | Done |
| 14 | Memory / rules (MemoryPanel) | Cursor, Windsurf | Done |
| 15 | Checkpoint system (backend + UI) | Windsurf | Done |
| 16 | MCP integration (JSON-RPC 2.0 stdio) | Claude Code, Codex | Done |
| 17 | OS sandbox (sandbox-exec / bwrap) | Codex | Done |
| 18 | CI mode (--exec, JSON/Markdown reports) | Codex, Claude Code | Done |
| 19 | Multimodal input (Claude + OpenAI vision) | Cursor, Claude Code | Done |
| 20 | Extension system (WASM wasmtime) | Cursor, Windsurf | Done |
| 21 | Trace / audit log (JSONL + HistoryPanel) | Codex | Done |
| 22 | Multi-agent parallel (git worktrees + ManagerView) | Cursor, Antigravity | Done |
| 23 | Planning agent (PlannerAgent) | Windsurf, Antigravity | Done |
Completed — Phases 6–9 (see ROADMAP-v2)
| # | Feature | Addresses | Status |
|---|---|---|---|
| 24 | Hooks system (events + shell + LLM handlers) | Claude Code | Done |
| 25 | Plan Mode (PlannerAgent) | Windsurf, Claude Code | Done |
| 26 | Parallel multi-agent + git worktrees | Cursor (8), Windsurf | Done |
| 27 | Embedding-based semantic indexing | Cursor, Windsurf | Done |
| 28 | Next-edit prediction (Tab/Supercomplete) | Cursor, Windsurf | Done |
| 29 | Checkpoint UI panel | Windsurf | Done |
| 30 | Session resume | Codex, Claude Code | Done |
| 31 | Web search tool | Codex | Done |
| 32 | GitHub PR review agent (BugBot equiv.) | Cursor BugBot | Done |
| 33 | Shell environment policy / Admin policy | Codex | Done |
| 34 | Skills system | Claude Code, Windsurf | Done |
| 35 | Artifacts panel | Antigravity | Done |
| 36 | OpenTelemetry | Codex | Done |
| 37 | GitHub Actions | Codex, Claude Code | Done |
| 38 | Manager View (parallel UI) | Antigravity | Done |
| 39 | VS Code extension | All | Done |
| 40 | Agent SDK (TypeScript) | Claude Code | Done |
8. Architecture Summary (All Phases Complete)
vibecli-cli
├── REPL / TUI (streaming, hooks, /agent, /plan, /multi-agent, /review)
├── CI mode (--exec, --parallel, --review)
├── Server mode (vibecli serve — API for VS Code extension + SDK)
└── src/
├── ci.rs, review.rs, serve.rs, otel_init.rs
└── hooks (config loading)
vibe-ai
├── provider.rs (AIProvider trait + vision + tool use)
├── agent.rs (plan→act→observe loop, approval tiers)
├── planner.rs (PlannerAgent: plan generation + guided execution)
├── multi_agent.rs (parallel agents on git worktrees)
├── hooks.rs (HookRunner: command + LLM handlers, event bus)
├── skills.rs (SkillLoader: auto-activating context snippets)
├── artifacts.rs (Artifact types, annotation queue)
├── mcp.rs (McpClient JSON-RPC 2.0)
├── tools.rs (ToolCall enum + WebSearch + FetchUrl)
├── trace.rs (JSONL audit + session resume)
├── policy.rs (AdminPolicy: tool/path restrictions)
└── otel.rs (OpenTelemetry span attributes)
vibe-core
├── index/
│ ├── mod.rs, symbol.rs (tree-sitter symbol index)
│ └── embeddings.rs (HNSW vector index, Ollama/OpenAI embeddings)
├── context.rs (smart context builder: flow + semantic + git)
├── executor.rs (sandboxed execution + shell env policy)
└── git.rs (worktree: create, remove, merge)
vibe-extensions
└── loader.rs (wasmtime WASM host)
vibeui (React + Tauri)
├── AgentPanel (single-agent: steps, approval, artifacts)
├── ManagerView (multi-agent: task board, worktrees, merge)
├── CheckpointPanel (timeline, restore, auto-checkpoint)
├── ArtifactsPanel (rich cards, annotations, async feedback)
├── MemoryPanel (rules editor)
├── HistoryPanel (trace viewer)
├── ContextPicker (@ context popup)
├── GitPanel (git + PR review)
└── Terminal, AIChat, CommandPalette, ThemeToggle
vscode-extension (chat, inline completions, agent mode)
packages/agent-sdk (TypeScript SDK: @vibecody/agent-sdk)
.github/actions/vibecli (GitHub Actions marketplace action)
9. Key Differentiators (Current — v0.5.5, April 2026)
9.1 Broad feature matrix — 14 competitors
Y = native support; P = partial / limited; — = not supported. Ordered by the features we care about most.
| Dimension | VibeCody 0.5.5 | Cursor | Windsurf | Antigravity | Claude Code | Codex CLI | Aider | Goose | OpenHands | Cline | Cody | Copilot | JetBrains AI | Devin |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Open source | Y | — | — | — | — | P | Y | Y | Y | Y | P | — | — | — |
| Rust native backend | Y | — | — | P | — | Y | — | Y | — | — | — | — | — | — |
| Local AI first (Ollama) | Y | P | P | — | — | P | Y | Y | P | Y | P | — | — | — |
| Self-hostable daemon | Y | — | — | — | — | P | Y | Y | Y | Y | P | — | — | — |
| CLI + GUI unified | Y | P | — | P | P | — | — | — | — | — | Y | — | P | — |
| Terminal TUI | Y | — | — | — | P | Y | Y | Y | P | — | — | — | — | — |
| Desktop IDE (Monaco) | Y | Y | Y | Y | — | — | — | — | — | — | Y | Y | Y | — |
| Flutter mobile (6 platforms) | Y | P | — | — | — | — | — | — | — | — | — | — | — | P |
| Apple Watch native | Y | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Wear OS native | Y | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Handoff-style continuity | Y | — | — | — | — | — | — | — | — | — | — | — | — | P |
| mDNS / Tailscale / ngrok zero-config | Y | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Multi-provider (≥10) | Y (22) | P | P | P | — | P | Y | Y | Y | Y | Y | — | P | — |
| Full agent loop | Y | Y | Y | Y | Y | Y | P | Y | Y | Y | P | P | Y | Y |
| Parallel agents | Y | Y (8) | Y | Y (5) | Y | P | — | — | — | — | — | — | — | — |
| Plan Mode | Y | — | Y | Y | Y | — | — | — | Y | Y | — | Y | Y | Y |
| MCP integration | Y | Y | P | Y | Y (300+) | Y | — | Y | P | — | P | — | P | — |
| Hooks system | Y | — | — | — | Y | — | — | — | — | — | — | — | — | — |
| Skills system | Y (711) | — | Y | — | Y | — | — | — | — | — | — | — | — | — |
| OS sandbox | Y | — | — | — | P | Y | — | — | Y | — | — | — | — | Y |
| Inline completions (FIM) | Y | Y | Y | Y | — | — | — | — | — | — | P | Y | Y | — |
| Semantic codebase index | Y | Y | Y | Y | P | Y | P | P | P | P | Y | P | P | Y |
| Multi-file batch edits | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | P | P | Y | Y |
| Diff review before apply | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | P | Y | Y |
| Checkpoint / rewind | Y | P | Y | Y | — | P | — | Y | P | — | — | — | — | Y |
| Trace / audit log (JSONL) | Y | — | — | — | P | Y | — | Y | Y | — | — | — | — | P |
| OpenTelemetry | Y | — | — | — | — | Y | — | — | — | — | — | — | — | — |
| VS Code extension | Y | Y | Y | Y | Y | Y | — | — | — | Y | Y | Y | — | — |
| JetBrains plugin | Y | — | Y | — | — | — | — | — | — | — | Y | Y | Y | — |
| Neovim plugin | Y | — | — | — | — | — | — | — | — | — | P | Y | — | — |
| Agent SDK (TypeScript) | Y | — | — | — | Y | P | — | Y | — | — | — | — | — | P |
| Red-team security pipeline | Y | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Counsel (multi-LLM deliberation) | Y | — | — | — | — | — | — | — | — | — | — | — | — | — |
| Artifacts + Manager View | Y | — | — | Y | — | — | — | — | — | — | — | — | — | Y |
| Multiplayer CRDT collab | Y | — | — | — | — | — | — | — | — | — | — | — | — | — |
| CRDT / real-time sync on mobile+watch | Y | — | — | — | — | — | — | — | — | — | — | — | — | — |
9.2 Where VibeCody is unique
As of v0.5.5 (April 2026), VibeCody is the only product in our 52-tool survey that ships all of these simultaneously:
- A native Apple Watch client — SwiftUI, P-256 ECDSA via Secure Enclave, dictated reply.
- A native Wear OS client — Kotlin/Compose, Android Keystore / StrongBox attestation.
- A 6-platform Flutter mobile companion (iOS, Android, macOS, Linux, Windows, Web) with Handoff-style continuity.
- Zero-config device discovery — mDNS _vibecli._tcp.local. + Tailscale Funnel + ngrok, auto-raced.
- A Rust-native, fully open-source, self-hostable daemon that drives all of the above.
- 22 AI providers behind a single abstraction with failover — the widest catalog of any AI coding tool.
- A terminal (VibeCLI) + full desktop IDE (VibeUI) + chat desktop app (VibeCLI App) + mobile + watch — every surface built on the same crates.
- Counsel — structured multi-LLM deliberation (expert / devil’s advocate / skeptic / pragmatist) with a moderator synthesis; no other tool ships this.
- Red-team security pipeline + compliance reporting built into the CLI.
9.3 Where we still have parity gaps to close
The honest list (tracked in the consolidated Fit-Gap Analysis):
- Cursor’s Tab model — next-edit prediction quality is still best-in-class; we ship FIM completions but haven’t matched their specialized model.
- Devin-level long-horizon autonomy — Devin chains hours of work in a cloud VM; our agent loop tops out at ~50 steps.
- Claude Code’s 300+ MCP servers — we ship a compliant MCP client, but the community server catalog is still dominated by Anthropic.
- SWE-bench leaderboard — Claude Mythos Preview now leads SWE-bench Verified at 93.9% (Opus 4.7 87.6%, GPT-5.3-Codex 85%, Augment 72.0% open-system pass@1). Caveat (2026-05-03): OpenAI stopped reporting Verified scores after a contamination audit found 59.4% of hard tasks have flawed tests — SWE-bench Pro, SWE-rebench, and SWE-bench-Live are now the primary references. Tracked in the benchmark panel.
- Enterprise SSO / audit packaging — Cody and Copilot for Business are further along on SOC 2, SSO, central policy.
9.4 Headline positioning
VibeCody is the only open-source AI developer toolchain that lets you work on the same coding session from a terminal, a desktop IDE, a phone, and a watch — all running against a self-hosted Rust daemon with 22 providers and zero-config networking.
Every competitor has a better story in one dimension; none of them ship a coherent answer across all six categories (terminal, editor, cloud, review, completions, mobile/watch) the way VibeCody does as of v0.5.5.
Appendix A — History: Phases 23–32 (v5 cycle, March 2026)
Input: 22 competitive gaps catalogued in the v7 fit-gap iteration (see Fit-Gap Analysis) plus six bonus modules in Phase 32. Outcome: All 22 gaps closed; 28 new Rust modules, 9,570 unit tests (0 failures), 187 panels, 568 skills, 100+ REPL commands. Completed 2026-03-29.
Phase 23 — Dual-Protocol Agent Communication (P0)
- 23.1 A2A Protocol Support — a2a_protocol.rs (agent card, server mode, client discovery, task lifecycle, SSE streaming, capability negotiation), A2aPanel.tsx, /a2a card|serve|discover|call|tasks|status, 55+ tests.
- 23.2 Agent Skills Standard Compatibility — agent_skills_compat.rs (cross-tool skills interop, import/export, registry client, dependency resolution), /skills import|export|search|validate|publish, 35+ tests.
Phase 24 — Parallel Agent Workers (P0)
- 24.1 Worktree Pool — worktree_pool.rs with N lightweight git-worktree agents, auto-merge orchestration, per-worktree cgroup/ulimit caps; WorktreePoolPanel.tsx; /worktree spawn|list|merge|cleanup|config; 50+ tests.
- 24.2 Multi-Agent Terminal Host — agent_host.rs hosts external CLI agents (Claude Code, Gemini CLI, Aider, …) with interleaved output and shared clipboard; AgentHostPanel.tsx; /host add|list|route|remove|ask; 40+ tests.
Phase 25 — Proactive Intelligence (P0)
- 25.1 Proactive Agent — proactive_agent.rs background scanner across performance/security/tech-debt/correctness/a11y/testing categories with learning store and digest mode; ProactivePanel.tsx; /proactive scan|config|accept|reject|history|digest; 45+ tests.
- 25.2 Autonomous Issue Triage — issue_triage.rs classifier + severity estimator + auto-labeler + draft-response generator with GitHub/Linear integration; TriagePanel.tsx; /triage run|rules|labels|history|batch; 40+ tests.
Phase 26 — Agent Grounding & Context (P0/P1)
- 26.1 Web Search Grounding — web_grounding.rs with Google/Bing/Brave/SearXNG/Tavily providers, citation tracking, privacy mode; WebGroundingPanel.tsx; /search web|cache|providers|config; 40+ tests.
- 26.2 Deep Semantic Codebase Index — semantic_index.rs with call-graph, type-hierarchy, import-chain, incremental updates; SemanticIndexPanel.tsx; /index build|query|callers|callees|hierarchy|deps|stats; 55+ tests.
Phase 27 — MCP Protocol Evolution (P1)
- 27.1 Streamable HTTP + OAuth 2.1 — mcp_streamable.rs bidirectional streamable HTTP, PKCE OAuth client + server, SAML→OAuth bridge, connection pooling; /mcp serve-http|oauth|tokens|remote; 45+ tests.
Phase 28 — Smart Repair & Routing (P1/P2)
- 28.1 MCTS Code Repair — mcts_repair.rs Monte-Carlo Tree Search for code repair (UCB1, rollout tests, agentless mode, cost tracking, SWE-bench integration); MctsRepairPanel.tsx; /repair mcts|agentless|compare|config; 50+ tests.
- 28.2 Cost-Optimized Agent Routing — cost_router.rs task-complexity-aware model routing with budget enforcement and A/B tracking; CostRouterPanel.tsx; /route cost|budget|model|stats|compare; 40+ tests.
Phase 29 — Developer Experience (P1/P2)
- 29.1 Visual Verification — visual_verify.rs headless-Chrome screenshot + perceptual-diff + baseline mgmt + CI integration; VisualVerifyPanel.tsx; /verify screenshot|diff|baseline|ci; 35+ tests.
- 29.2 Next-Task Prediction — next_task.rs workflow-state-machine-driven suggestion engine; NextTaskPanel.tsx; /nexttask suggest|accept|reject|learn|stats; 40+ tests.
- 29.3 Offline Voice Coding — voice_local.rs whisper.cpp integration with model mgmt, VAD, streaming, fallback to Groq; /voice local|model|download|config; 30+ tests.
- 29.4 Living Documentation Sync — doc_sync.rs bidirectional spec↔code reconciliation with drift alerts; DocSyncPanel.tsx; /docsync status|reconcile|watch|freshness; 35+ tests.
Phase 30 — Enterprise & Ecosystem (P2/P3)
- 30.1 Native Integration Connectors — native_connectors.rs with 20 pre-built connectors (Stripe, Figma, Notion, Jira, Slack, PagerDuty, Datadog, Sentry, LaunchDarkly, Vercel, Netlify, Supabase, Firebase, AWS, GCP, Azure, GitHub, GitLab, Linear, Confluence); ConnectorsPanel.tsx; /connect list|add|test|remove|webhook; 50+ tests.
- 30.2 Enterprise Agent Analytics — agent_analytics.rs per-user/team/project metrics, ROI calculator, CSV/JSON/PDF export; AnalyticsPanel.tsx; /analytics dashboard|export|roi|compare; 40+ tests.
- 30.3 Agent Trust Scoring — agent_trust.rs 0-100 per-agent/per-domain trust with decay, auto-review thresholds, transparent explanations; TrustPanel.tsx; /trust scores|history|config|explain; 35+ tests.
- 30.4 Agentic Package Manager — smart_deps.rs dependency-graph analysis, CVE auto-patch, license compliance, monorepo-aware lockfile mgmt; SmartDepsPanel.tsx; /deps resolve|compare|patch|audit|graph; 40+ tests.
Phase 31 — Strategic Frontiers (P3)
- 31.1 RLCEF Training Loop — rlcef_loop.rs execution-based learning with outcome tracker, reward signals, mistake clustering, fine-tuning export (opt-in, local-only); RlcefPanel.tsx; /rlcef train|eval|mistakes|patterns|reset|export; 45+ tests.
- 31.2 LangGraph Bridge — langgraph_bridge.rs LangGraph-compatible REST API, checkpoint format interop, Python SDK wrapper; LangGraphPanel.tsx; /langgraph serve|connect|status|checkpoint; 35+ tests.
- 31.3 Sketch Canvas — sketch_canvas.rs freeform drawing → React/HTML/SwiftUI component generation, 3D scene export; SketchCanvasPanel.tsx; /sketch new|recognize|generate|export; 30+ tests.
Phase 32 — Advanced Agent Intelligence (Bonus)
- 32.1 context_protocol.rs (streaming long-running context), code_review_agent.rs (rule-driven review), diff_review.rs (diff-aware review).
- 32.2 code_replay.rs (reproducible past sessions), speculative_exec.rs (predictive path execution), explainable_agent.rs (interpretable reasoning).
- 32.3 TurboQuant KV-cache compression (PolarQuant + QJL, ~3 bits/dim) with benchmark panel and REPL command.
Appendix B — History: Phases 33–39 (v6 cycle, April 2026)
Input: 18 competitive gaps catalogued in the v8 fit-gap iteration (see Fit-Gap Analysis).
Outcome: All 18 gaps closed; 18 new Rust modules, ~13,270 unit tests (0 failures), 210+ panels, 122+ REPL commands, 212+ Rust modules under vibecli-cli/src/. Completed 2026-04-11.
Phase 33 — Cross-Environment Agent Execution (P0)
- 33.1 Cross-Environment Parallel Dispatch — `env_dispatch.rs` with `Local | GitWorktree | RemoteSSH | CloudVM` executors, pool pre-warming, unified progress aggregator, cost ticker; `EnvDispatchPanel.tsx`; `/dispatch local|worktree|ssh|cloud|status|pool`; 55+ tests.
- 33.2 Recursive Nested Subagents — `nested_agents.rs` DAG of parent/child agents with depth limiter, context-inheritance policies, merge strategies, real-time graph visualiser; `NestedAgentsPanel.tsx`; `/agents tree|spawn|depth|graph|cancel`; 50+ tests.
- 33.3 A2A v0.3 Update — `a2a_protocol.rs` extended with gRPC transport (tonic), Ed25519 security-card signing, v0.3 schema + v0.2 shim, Python-SDK interop tests; `/a2a grpc|sign|verify|compat`; 30+ tests.
Phase 34 — Active Desktop Computer Use (P0)
- 34.1 Active Desktop Control Agent — `desktop_agent.rs` with xdotool/AT-SPI (Linux), AXUIElement/CGEvent (macOS), UI Automation (Windows), CDP browser debugger, MJPEG live preview, video recording, allow-list safety; `DesktopAgentPanel.tsx`; `/desktop click|type|scroll|screenshot|record|stop|replay`; 45+ tests.
Phase 35 — Protocol Maturation (P1)
- 35.1 MCP Enterprise Governance — `mcp_governance.rs` with append-only audit log (SIEM-exportable), OIDC/SAML SSO, JSON-schema policy DSL for allow/deny + rate limits, versioned config portability; `McpGovernancePanel.tsx`; `/mcp audit|sso|gateway|config`; 50+ tests.
- 35.2 Microsoft Agent Framework 1.0 — `msaf_compat.rs` MSAF manifest generation, Azure-AD token validation, MCP↔MSAF envelope shim, Azure Agent Catalog registration + heartbeat; `MsafPanel.tsx`; `/msaf register|manifest|catalog|health|token`; 35+ tests.
Phase 36 — Agent Intelligence Primitives (P1)
- 36.1 Agent Await — `agent_await.rs` first-class conditional-pause tool (ProcessExit | LogPattern | FileChange | PortOpen | HttpReady | TimerElapsed | ManualResume); tokio-select poller; `/await list|cancel|status`; 40+ tests.
- 36.2 Streaming Thoughts — `thought_stream.rs` parses `<thinking>` blocks from Claude/Gemini/GPT streams, categorises Planning/Reasoning/Uncertainty/Decision/Observation, tags confidence, exports annotated Markdown; `ThoughtStreamPanel.tsx`; `/thoughts live|history|export|filter`; 35+ tests.
- 36.3 Codebase-Vocabulary Voice — `voice_vocab.rs` mines identifiers from `semantic_index.rs`, injects them into Whisper as `initial_prompt` + hotwords, tracks WER improvement; `/voice vocab build|inject|stats|test`; 30+ tests.
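The conditional-pause idea in 36.1 can be sketched with the standard library alone. The example below covers only three of the listed conditions (`FileChange`, `PortOpen`, `TimerElapsed`) and swaps the tokio-select poller named above for a plain blocking sleep loop, so it shows the shape of the tool rather than the code in `agent_await.rs`. The `AwaitCondition` fields and the `wait_for` helper are hypothetical.

```rust
use std::net::{SocketAddr, TcpStream};
use std::path::PathBuf;
use std::thread;
use std::time::{Duration, Instant, SystemTime};

/// Subset of the await conditions; field layouts are assumptions for this sketch.
pub enum AwaitCondition {
    /// Met once the file's modification time is later than `since`.
    FileChange { path: PathBuf, since: SystemTime },
    /// Met once a TCP connection to the address succeeds.
    PortOpen(SocketAddr),
    /// Met once `after` has elapsed since `start`.
    TimerElapsed { start: Instant, after: Duration },
}

impl AwaitCondition {
    /// Evaluate the condition once, without blocking for long.
    pub fn is_met(&self) -> bool {
        match self {
            AwaitCondition::FileChange { path, since } => {
                match std::fs::metadata(path).and_then(|m| m.modified()) {
                    Ok(modified) => modified > *since,
                    Err(_) => false, // a missing or unreadable file has not changed yet
                }
            }
            AwaitCondition::PortOpen(addr) => {
                TcpStream::connect_timeout(addr, Duration::from_millis(200)).is_ok()
            }
            AwaitCondition::TimerElapsed { start, after } => start.elapsed() >= *after,
        }
    }
}

/// Poll `cond` every `poll` interval; returns false if `timeout` expires first.
pub fn wait_for(cond: &AwaitCondition, poll: Duration, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    loop {
        if cond.is_met() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(poll);
    }
}
```

An async version would race each condition's future against a cancellation signal instead of sleeping, which is presumably what makes `/await cancel` responsive in the real tool.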
Phase 37 — Context & Collaboration (P2)
- 37.1 Ultra-Long Context Adapter (2M–10M tokens) — `long_context.rs` with Gemini 3.1 Pro (2M), Llama 4 Scout (10M), Claude Opus 4.6 (1M) routing, semantic-boundary chunking, sliding-window pagination, cost estimator, monorepo ingestion; `LongContextPanel.tsx`; `/ctx route|estimate|ingest|window`; 45+ tests.
- 37.2 Interactive Design Mode — `design_mode.rs` SVG annotation canvas (Arrow/Region/TextLabel/BeforeAfter/ColorSwatch/Measurement) → structured natural-language instruction generator with design-token extraction; `DesignModePanel.tsx`; `/design screenshot|annotate|generate|history`; 40+ tests.
- 37.3 VibeCLI ↔ VibeUI Context Bridge — `ide_bridge.rs` (UDS on macOS/Linux, named pipe on Windows) publishing open files, cursor, test/build output, terminal tail; VibeCLI client auto-discovers and injects `<ide_context>` into the agent window; `IdeBridgePanel.tsx`; `/ide connect|status|sync|disconnect`; 35+ tests.
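For 37.1, a minimal sketch of window-based routing and sliding-window pagination follows. The model names and window sizes come from the item above. The "smallest window that fits" policy, the `route` and `paginate` names, and the token-offset page representation are assumptions, not the behaviour of `long_context.rs`.

```rust
/// A model and the largest prompt it accepts, in tokens.
pub struct ModelWindow {
    pub name: &'static str,
    pub max_tokens: u64,
}

/// Window sizes as listed in item 37.1.
pub static WINDOWS: [ModelWindow; 3] = [
    ModelWindow { name: "Claude Opus 4.6", max_tokens: 1_000_000 },
    ModelWindow { name: "Gemini 3.1 Pro", max_tokens: 2_000_000 },
    ModelWindow { name: "Llama 4 Scout", max_tokens: 10_000_000 },
];

/// Pick the smallest window that still fits the prompt (assumed policy).
/// Returns None when the prompt exceeds every window and must be paginated.
pub fn route(prompt_tokens: u64) -> Option<&'static ModelWindow> {
    WINDOWS
        .iter()
        .filter(|m| m.max_tokens >= prompt_tokens)
        .min_by_key(|m| m.max_tokens)
}

/// Split `total_tokens` into `[start, end)` pages of at most `window` tokens,
/// where each page re-reads the last `overlap` tokens of the previous one.
pub fn paginate(total_tokens: u64, window: u64, overlap: u64) -> Vec<(u64, u64)> {
    assert!(window > overlap, "window must be larger than overlap");
    let step = window - overlap;
    let mut pages = Vec::new();
    let mut start = 0;
    while start < total_tokens {
        let end = (start + window).min(total_tokens);
        pages.push((start, end));
        if end == total_tokens {
            break;
        }
        start += step;
    }
    pages
}
```

A production router would also weigh cost and latency, which is where the cost estimator mentioned in the item would come in.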
Phase 38 — Private & Robust Intelligence (P2)
- 38.1 On-Device Private Inference — `on_device.rs` with GGUF model registry (SHA-256 verified HF download), `llama-cpp-rs` FFI + `candle` fallback, Metal/CUDA/ROCm/AVX2 backends, hardware capability probe, `--local-only` network-isolation enforcement, benchmark runner; `OnDevicePanel.tsx`; `/ondevice download|list|run|bench|enforce|hardware`; 45+ tests.
- 38.2 Hard Problem-Solving Strategy Engine — `hard_problem.rs` with task decomposition + assumption surfacer + incremental hypothesis tester + ambiguity resolver + multi-file change planner + complexity estimator; `HardProblemPanel.tsx`; `/plan decompose|assume|hypothesize|clarify|estimate`; 40+ tests.
Phase 39 — Strategic Ecosystem (P3)
- 39.1 Autonomous Deploy Pipeline Agent — `auto_deploy.rs` closed-loop plan→build→test→stage→health-check→promote pipeline over `DockerCompose | Kubernetes | Serverless | StaticHosting`, health gates, auto-rollback, dry-run; `AutoDeployPanel.tsx`; `/deploy plan|dry-run|stage|promote|rollback|status`; 50+ tests.
- 39.2 Claw Code Framework Compatibility — `clawcode_compat.rs` JSON-RPC worker protocol, registry file interop, task-type routing, capability advertisement, bidirectional client/server; `/clawcode register|serve|workers|status|call`; 35+ tests.
- 39.3 Team Onboarding Intelligence — `team_onboarding.rs` new-member detector + usage-pattern analyzer + knowledge-gap report + auto-generated ramp-up guide + hotspot map + team admin view; `TeamOnboardingPanel.tsx`; `/onboard generate|track|guide|hotspots|team`; 35+ tests.
- 39.4 Reproducibility-First Agent Architecture — `repro_agent.rs` hermetic session snapshot (lockfiles + env hash + seed), deterministic replayer, session differ, CI reproducibility gate, non-determinism tagger, portable `repro-bundle.tar.gz`; `ReproAgentPanel.tsx`; `/repro snapshot|replay|diff|verify|export|import`; 40+ tests.
Appendix C — Phases 40–52 and topic-specific deep-dives
Iterations v10 (phases 40–43), v11 (phases 45–48), v12 (phases 49–52) and the five topic-specific fit-gaps (AgentOS, Pi-mono, RL-OS, Paperclip, Code-Review+Architecture) are fully absorbed into the Fit-Gap Analysis. That document is the canonical source for the module-by-module ledger; this roadmap tracks competitive positioning and phase-level history only.
Appendix D — Phase 53: April 2026 trend delta + audit reconciliation
Input: §16 of the Fit-Gap Analysis — 11 newly-identified open gaps (A1–A11) from the 2026-04-26 industry trend survey, plus 8 audit-flagged modules + the RL-OS subsystem queued for real-I/O conversion (US-007…US-015) following the same playbook as the already-shipped US-001…US-006.
Outcome target: v0.5.6 ships with all eight audit-flagged modules converted to real I/O (US-007…US-014), one RL-OS algorithm shipped end-to-end via candle (US-015), and at least 6 of the 11 v13 trend-delta gaps shipped (A1–A4 MCP work + A11 Junie-style migration tool + A6 multi-root agent). The remaining 5 v13 items (A5/A7/A8/A9/A10) are scheduled into v0.5.7 since they require larger UX work or new dependencies.
Why these are grouped into one phase rather than split: the audit-reconciliation work and the new MCP/ACP work both need the same axum + reqwest + mock-server-BDD-harness scaffolding. Sequencing them lets each US-### conversion reuse the harness from the previous one — that’s the productivity pattern that produced the US-001…US-006 cadence in 6 weeks.
53.1 Audit reconciliation — real-I/O conversions (P0)
Same playbook as US-001 (web grounding) → US-006 (proactive scanner). Each conversion adds an axum mock-server BDD harness identical in shape to the existing ones in vibecli-cli/tests/.
- US-007 — `issue_triage.rs` GitHub/Linear HTTP — `octocrab` + Linear SDK; `VIBECLI_GITHUB_TOKEN`/`VIBECLI_LINEAR_TOKEN` env gating; mock server for issue listing + label updates + comment posting; 4 BDD scenarios.
- US-008 — `native_connectors.rs` first 5 connectors — Stripe + Slack + Linear + Notion + GitHub with real `oauth2`-crate flow; the remaining 15 connectors stay endpoint-string-only and are explicitly marked deferred; 8 BDD scenarios across the 5 connectors.
- US-009 — `langgraph_bridge.rs` REST API — `axum` server exposing LangGraph’s documented routes (`/threads/{id}/runs`, `/threads/{id}/state`); checkpoint JSON schema validation; 5 BDD scenarios using the LangGraph Python SDK as the conformance test.
- US-010 — `mcts_repair.rs` real rollouts — wire `rollout` to spawn `cargo test` / `pytest` / `npm test` per detected language; per-rollout time budget; reward = test exit code; 4 BDD scenarios on synthetic broken-test repos under `tests/fixtures/mcts/`.
- US-011 — `sketch_canvas.rs` 2D wireframe → React JSX — defer 3D entirely (mark out-of-scope in the doc + scoreboard); ship the 2D path against tldraw’s JSON schema or an OSS recognizer like `react-sketch-canvas`; 3 BDD scenarios.
- US-012 — `cost_router.rs` real routing — wire to `provider.rs` retry + circuit breaker; track per-(provider, model) latency/cost in `agent_analytics.rs`; routing decisions become a real function of observed data; 4 BDD scenarios using a mock provider with controllable latency.
- US-013 — `semantic_index.rs` tree-sitter rewrite — replace the `trimmed.starts_with("pub fn")` string-prefix check with `tree-sitter` + per-language grammars (Rust, TS, Python, Go); reuse the existing tree-sitter setup from `vibe-core/src/index/symbol.rs`; add call-graph extraction; 6 BDD scenarios.
- US-014 — `linter_aggregator` real linters — replace `simulate_linter()` with subprocess spawn for clippy / eslint / pylint / golangci-lint / shellcheck / hadolint / yamllint / mypy; parse stdout into the existing `Finding` schema; the FP-filter LLM pass already exists; 8 BDD scenarios (one per linter).
- US-015 — RL-OS one real algorithm end-to-end — ship PPO with `candle` on CPU as the proof-of-shape; expose via PyO3 bindings; the 52 type-system entries become “real” once one training loop is real and the others can follow incrementally. Test: `tests/rl_ppo_cartpole.rs` runs the loop on a small env and asserts reward improves over 100 episodes.
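The US-010 rollout contract (spawn the test runner, enforce a time budget, derive the reward from the exit code) could look roughly like the sketch below. The function names are hypothetical, and the mapping of exit code 0 to reward 1.0 and everything else (including a timeout) to 0.0 is an assumption; the item only says the reward is derived from the test exit code.

```rust
use std::io;
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Test command per detected language, as listed in US-010.
/// The language keys themselves are assumptions for this sketch.
pub fn test_command(language: &str) -> Option<Vec<&'static str>> {
    match language {
        "rust" => Some(vec!["cargo", "test"]),
        "python" => Some(vec!["pytest"]),
        "javascript" => Some(vec!["npm", "test"]),
        _ => None,
    }
}

/// Run one rollout: spawn `argv`, wait at most `budget`, and score the result.
/// Returns 1.0 when the process exits with status 0, otherwise 0.0.
pub fn rollout_reward(argv: &[&str], budget: Duration) -> io::Result<f64> {
    let (program, args) = argv
        .split_first()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "empty command"))?;
    let mut child = Command::new(program)
        .args(args)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()?;
    let started = Instant::now();
    loop {
        // try_wait returns Ok(None) while the child is still running.
        if let Some(status) = child.try_wait()? {
            return Ok(if status.success() { 1.0 } else { 0.0 });
        }
        if started.elapsed() >= budget {
            // Budget exhausted: stop the child and reap it so no zombie remains.
            let _ = child.kill();
            child.wait()?;
            return Ok(0.0);
        }
        thread::sleep(Duration::from_millis(10));
    }
}
```

Polling `try_wait` keeps the sketch dependency-free; a tokio-based version would use a timeout future instead. Note that killing the direct child does not necessarily stop processes it has spawned, so a real harness may need process groups.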
53.2 v13 trend-delta items — new builds (P0/P1)
Numbered to match §16.1 of the Fit-Gap Analysis.
P0 (this cycle):
- A1 — MCP Apps support — render `application/vnd.mcp.app+json` payloads as embedded React in `AIChat.tsx`; security: same CSP as the WASM extension host; 4 BDD scenarios.
- A2 — MCPB bundle format — `vibecli mcp install <bundle.mcpb>` extracts + verifies signature + registers the local server; round-trip with `vibecli mcp pack` for our own example servers; 3 BDD scenarios.
- A3 — MCP `.well-known` capability discovery — `vibecli serve` exposes `GET /.well-known/mcp.json` listing tool/prompt/resource catalogs; capability advertisement without a live SSE connection; 3 BDD scenarios.
- A4 — ACP server mode — VibeCLI as an ACP server callable from Zed/JetBrains/Neovim; reuse the JSON-RPC scaffolding from our existing MCP client; ship the matching JetBrains plugin update (the Neovim plugin already speaks ACP-shaped JSON-RPC); 5 BDD scenarios using the official ACP test harness.
- A6 — Multi-root workspace agent — extend `--add-dir` from read-only to read+write; agent tool calls accept a `workspace_root: <path>` field; sandbox enforces per-root permissions; 4 BDD scenarios across 2-root and 3-root configurations.
- A11 — Migration tool from Claude Code / Codex configs — `vibecli migrate from-claude-code` / `vibecli migrate from-codex`: read existing `CLAUDE.md`, `~/.claude.json`, `codex.toml`, MCP server lists; emit `VIBECLI.md` + `~/.vibecli/config.toml` + `~/.vibecli/mcp_servers.toml`; 6 BDD scenarios covering common config shapes.
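To make the A6 per-root permission check concrete, here is a minimal sketch of how a sandbox might authorize a tool call that carries a `workspace_root` field. The `Sandbox`, `WorkspaceRoot`, and `Access` types and the `authorize` method are hypothetical. The containment check is purely lexical (it rejects `..` and absolute paths) and does not resolve symlinks, which a real sandbox would also need to handle.

```rust
use std::path::{Component, Path, PathBuf};

/// Permission granted to a single workspace root.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access {
    ReadOnly,
    ReadWrite,
}

/// One root registered via the (assumed) multi-root configuration.
#[derive(Debug, Clone)]
pub struct WorkspaceRoot {
    pub path: PathBuf,
    pub access: Access,
}

/// Hypothetical sandbox holding every registered root.
pub struct Sandbox {
    roots: Vec<WorkspaceRoot>,
}

impl Sandbox {
    pub fn new(roots: Vec<WorkspaceRoot>) -> Self {
        Self { roots }
    }

    /// Resolve `relative` inside `workspace_root` and check the requested access.
    /// Returns the joined path on success, or a reason string on denial.
    pub fn authorize(
        &self,
        workspace_root: &Path,
        relative: &Path,
        write: bool,
    ) -> Result<PathBuf, String> {
        let root = self
            .roots
            .iter()
            .find(|r| r.path.as_path() == workspace_root)
            .ok_or_else(|| format!("unknown workspace root: {}", workspace_root.display()))?;

        // Lexical containment: only plain names and `.` are allowed, so `..`
        // and absolute paths cannot step outside the root.
        let stays_inside = relative
            .components()
            .all(|c| matches!(c, Component::Normal(_) | Component::CurDir));
        if !stays_inside {
            return Err(format!("path escapes root: {}", relative.display()));
        }

        if write && root.access == Access::ReadOnly {
            return Err(format!("root is read-only: {}", root.path.display()));
        }
        Ok(root.path.join(relative))
    }
}
```

Requiring an exact root match, instead of picking the longest matching prefix, keeps the permission decision explicit in each tool call, which suits the 2-root and 3-root BDD scenarios described above.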
P1 (next cycle, v0.5.7):
- A5 — Async subagents — extend `nested_agents.rs` with `await_later: bool`; long-running subagents persist state in SQLite, reachable via `/agents resume <id>`; UI surface in `NestedAgentsPanel.tsx` showing “running in background” + notification on completion.
- A7 — Browser-native UI-element annotation Design Mode — extend `desktop_agent.rs` browser-control track with DOM-element click-to-annotate, generating natural-language instructions tied to specific selectors. Patent-distance check required — design must remain distant from Cursor 3’s annotation UX; consult notes/PATENT_AUDIT_INLINE.md before building.
- A8 — Self-verifying agent loop — feed `visual_verify.rs` failures back into the agent loop as `ToolResult { success: false, output: "<diff details>" }`; bound the auto-fix loop at 3 iterations; `desktop_agent.rs` tests the running app via real clicks/typing.
- A9 — Cloud-agent remote-control protocol — extend the VibeMobile/VibeWatch pairing flow with a `/sessions/<id>/resume` endpoint; the server-side session state is already persisted in `~/.vibecli/sessions/`; needs a session-handoff UX in the watch + mobile clients.
- A10 — Skills hot-reload + real-time progress — `notify`-based watcher on the skills directory; emit `SkillEvent::Reloaded` to all attached agents; UI streams skill execution events via the existing event bus.
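The bounded auto-fix loop described in A8 can be sketched as follows. The `ToolResult` shape matches the item above. The `verify_and_fix` function, its closure parameters, and the constant name are hypothetical stand-ins for the visual verifier and the agent's repair step.

```rust
/// Result shape quoted in item A8.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolResult {
    pub success: bool,
    pub output: String,
}

/// Upper bound on repair attempts, per the "3 iterations" limit in A8.
pub const MAX_FIX_ITERATIONS: usize = 3;

/// Run `verify`; while it fails, hand the failure to `fix` and verify again,
/// stopping after MAX_FIX_ITERATIONS repair attempts.
/// Returns the last verification result and the number of fixes attempted.
pub fn verify_and_fix<V, F>(mut verify: V, mut fix: F) -> (ToolResult, usize)
where
    V: FnMut() -> ToolResult,
    F: FnMut(&ToolResult),
{
    let mut result = verify();
    let mut attempts = 0;
    while !result.success && attempts < MAX_FIX_ITERATIONS {
        // The failing ToolResult (for example the diff details) is what the
        // agent sees as input to its next repair attempt.
        fix(&result);
        attempts += 1;
        result = verify();
    }
    (result, attempts)
}
```

With this shape the verifier runs at most four times (one initial check plus one after each of three fixes), and a still-failing result is returned to the caller and not retried forever.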
53.3 Patent-distance posture
Three of the v13 items (A5 async subagents, A7 Design Mode, A10 manager-style UI consolidation hinted in §1bis) sit close to product surfaces that competitors have invested heavily in. Per the existing patent-distance posture for diffcomplete (notes/PATENT_AUDIT_INLINE.md, gitignored working doc), each of these designs must:
- Pass a patent-distance check before implementation begins (consult Phase 1 + 2 diffcomplete protocol).
- Document the chosen design’s distance from the nearest prior-art claim in a per-feature note inside `notes/`.
- Avoid copying the layout/interaction language used in the competitor’s documentation or marketing.
This is a deliberate choice to keep design freedom on AI-editing surfaces, not just a CYA posture — it’s already shaped Phase 1 + Phase 2 of diffcomplete and was validated by the user as the right trade-off (memory: feedback_patent_distance_priority).
53.4 Out-of-scope for Phase 53
To keep the phase tractable, the following are explicitly deferred:
- Sketch Canvas 3D / WebGL / three.js (US-011 ships only the 2D wireframe → JSX path). 3D scene export is removed from the gap list as a non-goal.
- The other 15 native connectors in US-008 (only 5 ship with real OAuth this cycle).
- The remaining 30+ RL algorithms in US-015 (only PPO ships end-to-end this cycle; the rest follow once the candle/PyO3 scaffolding is proven).
- Devin-level hours-long autonomy (long-horizon item from §15 of the fitgap; not addressed here).
- Cursor’s proprietary Tab model (long-horizon; we continue to use Ollama/cloud FIM).
- Enterprise SSO / audit packaging (long-horizon; tracked separately under MCP enterprise extensions in Phase 54).
Appendix E — Phase 54: May 2026 weekly delta (B1–B6 + trivial closes)
Input: §1ter of this document + Fit-Gap §16.4.
Window: v0.5.7 cycle (target completion: 2026-05-17).
Output: Six new gap closures (B1–B6) + five trivial closes already enumerated in §16.4.
54.1 New gap closures
P0 (this cycle):
- B1 — Skills as MCP primitives — expose each entry under `vibecli/vibecli-cli/skills/` via an MCP server’s `list_skills` / `get_skill` resources; reuse `skill_loader.rs`; ship a single `vibecli-skills-mcp` binary registered with the daemon; 5 BDD scenarios. Highest-leverage item this cycle — every MCP host (Claude Code, Cursor, Cline, Zed, JetBrains, …) inherits all 711 skills with no per-host work.
- B2 — Plugin bundle format with admin install policies — define `vibecli-plugin.toml` manifest bundling MCP servers + skills + subagents + rules + hooks; `vibecli plugin install <path-or-url>`; `WorkspaceStore` persists per-plugin policy (Off/On/Required); governance panel surfaces this; 6 BDD scenarios. Patent-distance check: avoid copying Cursor’s marketplace layout/UX terminology.
- B5 — NVFP4 (Blackwell native) TurboQuant target — add NVFP4 Metal+CUDA kernels to TurboQuant alongside existing MXFP4 + AWQ-Marlin paths; benchmark on RTX 5090 / B200 / GB200; CubeCL/Burn ban scope unchanged.
- B6 — A2A signed agent-card façade — serve `/.well-known/agent.json` with a P-256 ECDSA signature reused from the watch-pairing key infrastructure; register VibeCLI as an A2A server in the LF Agentic AI Foundation registry; 4 BDD scenarios covering signature verification + task lifecycle.
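The per-plugin policy in B2 (Off/On/Required) lends itself to a small enum. The sketch below is one possible interpretation, not the `WorkspaceStore` implementation: it assumes `Off` blocks the plugin regardless of user preference, `On` enables it by default but lets the user opt out, and `Required` forces it on. The `PluginPolicy` type and its methods are hypothetical.

```rust
/// Admin install policy for one plugin, using the three states named in B2.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PluginPolicy {
    Off,
    On,
    Required,
}

impl PluginPolicy {
    /// Parse a policy from a case-insensitive string, e.g. a manifest or store value.
    pub fn parse(value: &str) -> Option<Self> {
        match value.to_ascii_lowercase().as_str() {
            "off" => Some(PluginPolicy::Off),
            "on" => Some(PluginPolicy::On),
            "required" => Some(PluginPolicy::Required),
            _ => None,
        }
    }

    /// Effective state given the user's own toggle (None = user never chose).
    /// Assumed semantics: Off always wins, Required always wins, On defers to the user.
    pub fn is_enabled(self, user_preference: Option<bool>) -> bool {
        match self {
            PluginPolicy::Off => false,
            PluginPolicy::On => user_preference.unwrap_or(true),
            PluginPolicy::Required => true,
        }
    }

    /// Whether a governance UI should show a user-editable toggle.
    pub fn user_can_toggle(self) -> bool {
        self == PluginPolicy::On
    }
}
```

Keeping the admin policy separate from the user preference means a policy change from `On` to `Required` and back does not erase what the user had chosen.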
P1 (next cycle, v0.5.8):
- B3 — Always-on agent classes (security review, vuln scan) — convert `/review` from on-demand to a daemon-resident agent class; trigger on file-watcher / pre-commit / CI; route findings to the existing `Finding` schema; UI surface in `SecurityPanel.tsx`. Patent-distance check required — design must remain distant from Cursor’s Security Review UX.
- B4 — Cursor SDK parity audit — compare `packages/agent-sdk/` to `@cursor/sdk` along these axes: subagents, hooks, plugins, skills, sandbox tiers, recap/resume, multi-client (mobile/watch). Items where Cursor’s surface is wider become roadmap entries (deferred until parity assessment is complete; this is research, not implementation).
54.2 Trivial closes (one-line / one-route)
- Ollama `/v1/messages` route — one route handler in `vibecli/vibecli-cli/src/serve.rs`; the existing Anthropic provider format already matches.
- GPT-5.5 / GPT-5.4 model entries — append to `useModelRegistry.ts` `STATIC_MODELS.openai`.
- Sonnet 4.8 entry (when Anthropic exposes it) — same one-file change.
- Qwen 3.6 / DeepSeek V4 / Kimi K2.6 entries — append to the Ollama section of `useModelRegistry.ts` once GGUF / vLLM weights land.
- `GEMINI.md` fallback in `memory.rs` — already noted in v13 as one-line; remains pending.
54.3 ACP Registry follow-through
A4 from Phase 53 (ACP server mode) has a concrete deliverable now that the ACP Registry is live: register VibeCLI in the registry once A4 ships so Zed + JetBrains users discover it natively. Coordinate with the JetBrains plugin update referenced in 53.2 P0.
54.4 Patent-distance posture (carries forward)
B2 (Plugin Marketplace) and B3 (Always-on Security Review) sit close to recently-shipped Cursor surfaces. Same protocol as 53.3 — patent-distance check before implementation begins, per-feature note in notes/, and avoid copying competitor layout/interaction language. The user-validated stance from the diffcomplete cycles applies (memory: feedback_patent_distance_priority).
54.5 Out-of-scope for Phase 54
- MCP Apps generic UI host (A1 from Phase 53 — already P0 there; B2 reuses the same React embedding work, no separate item).
- Long-horizon items from §15 of the fitgap (Tab model, Devin autonomy, MCP catalog breadth, SWE-bench leaderboard entry, enterprise SSO/audit, BYOA adapters) — none addressed here.
- Computer-use feedback into the agent loop beyond `desktop_agent.rs` — A8 in Phase 53 covers it.
- Patent-audit working doc updates — handled per-feature in `notes/PATENT_AUDIT_INLINE.md`, not here.