Competitive Roadmap v2 — Surpassing the Competition
VibeCody Competitive Roadmap v2
Date: February 2026

Scope: Detailed fit-gap analysis and feature-by-feature implementation plan to surpass Codex CLI and Claude Code (VibeCLI), and Cursor, Windsurf, Google Antigravity, and Trae (VibeUI).
1. Current State (All Phases Complete)
All nine roadmap phases (1–5 original, 6–9 in this document) are complete. VibeCody has:
| Feature | VibeCLI | VibeUI |
|---|---|---|
| Agent loop (plan→act→observe) | ✅ 30-step max, streaming | ✅ Full panel UI |
| 7 tools (read/write/patch/bash/search/list/complete) | ✅ | ✅ (via Tauri commands) |
| 3 approval tiers (Suggest / AutoEdit / FullAuto) | ✅ | ✅ dropdown |
| 5 AI providers (Ollama/Claude/OpenAI/Gemini/Grok) | ✅ | ✅ |
| Streaming responses | ✅ | ✅ |
| Codebase indexing (regex/heuristic + embeddings) | ✅ | ✅ |
| Memory/rules system (VIBECLI.md, AGENTS.md) | ✅ | ✅ MemoryPanel |
| MCP client (STDIO, JSON-RPC 2.0) | ✅ | — |
| CI/non-interactive mode (--exec) | ✅ | — |
| Multimodal input (Claude + OpenAI vision) | ✅ | ✅ AIChat UI |
| OS sandbox (sandbox-exec / bwrap) | ✅ | ✅ |
| Trace/audit log (JSONL per session) | ✅ | ✅ HistoryPanel |
| Diff review before apply | — | ✅ Monaco DiffEditor |
| Inline AI completions (FIM) | — | ✅ |
| @ context system | — | ✅ |
| Flow tracker (ring buffer + auto-injection) | ✅ | ✅ |
| WASM extension system (wasmtime) | — | ✅ |
| Checkpoint system | — | ✅ backend + CheckpointPanel UI |
| LSP integration | — | ✅ |
| Hooks system (events + shell + LLM handlers) | ✅ | ✅ (via config) |
| Plan Mode (PlannerAgent) | ✅ /plan command | ✅ Agent panel toggle |
| Session resume | ✅ --resume flag | — |
| Web search tool | ✅ | ✅ |
| Shell environment policy / Admin policy | ✅ | — |
| Parallel multi-agent (git worktrees) | ✅ --parallel flag | ✅ ManagerView |
| Embedding-based semantic indexing | ✅ | ✅ |
| Code review agent | ✅ vibecli review | ✅ GitPanel review |
| Skills system | ✅ | ✅ |
| Artifacts | ✅ | ✅ ArtifactsPanel |
| OpenTelemetry | ✅ | — |
| GitHub Actions | ✅ | — |
| Red team pentest pipeline (5-stage) | ✅ --redteam + /redteam | ✅ RedTeamPanel |
| OWASP/CWE static scanner (15 patterns) | ✅ bugbot.rs | ✅ BugBotPanel |
| Code Complete workflow (8-stage) | ✅ /workflow | ✅ WorkflowPanel |
| LSP diagnostics panel | ✅ /check TUI command | — |
| Session sharing | ✅ /share | — |
| @jira context | ✅ @jira:PROJECT-123 | ✅ ContextPicker |
| MCP OAuth install flow | — | ✅ McpPanel OAuth modal |
| Custom domain / publish | — | ✅ DeployPanel domain config |
| CRDT multiplayer collab | ✅ serve.rs WS | ✅ CollabPanel + useCollab |
| Code coverage | — | ✅ detect_coverage_tool + run_coverage |
| Multi-model comparison | — | ✅ compare_models |
| HTTP Playground | — | ✅ send_http_request + discover_api_endpoints |
| Cost observatory | — | ✅ record_cost_entry + get_cost_metrics |
| AI git workflow | — | ✅ suggest_branch_name + resolve_merge_conflict + generate_changelog |
| Codemod auto-fix | — | ✅ run_autofix + apply_autofix |
| VibeCLI daemon (serve) | ✅ | — |
| VS Code extension | ✅ | — |
| Agent SDK (TypeScript) | ✅ | — |
2. Competitive Analysis
2.1 VibeCLI vs. Codex CLI (OpenAI) and Claude Code
Codex CLI Key Capabilities
- OS-level sandbox with a two-axis security model (sandbox mode × approval policy configured independently)
- Shell environment policy — `shell_environment_policy` controls exactly which env vars subprocesses inherit (all / core / none / include/exclude patterns)
- Web search tool — cached, live, or disabled; first-class tool alongside file tools
- Session resume — `codex resume` restores full session transcript, files, draft, and approvals
- Code review agent — dedicated mode that diffs against branches or commits and produces a structured review
- OpenTelemetry — native span export for enterprise CI observability
- Admin policy enforcement — `requirements.toml` org-wide enforcement; `approval_policy.reject.mcp_elicitations = true` per category
- Cloud tasks — `codex cloud` launches and manages remote agent tasks
- PTY-backed exec — more robust unified exec tool (beta)
- Per-server MCP controls — tool allowlists/denylists, startup timeouts, bearer auth on HTTP servers
- Multiple profiles — named config sets with different providers / sandbox modes
Claude Code Key Capabilities (February 2026 state)
- Hooks system — 17 event types (`PreToolUse`, `PostToolUse`, `Stop`, `TaskCompleted`, `SubagentStart`, `WorktreeCreate`, …); 3 handler types: shell command, single-turn LLM eval, full subagent (up to 50 turns); `updatedInput` allows hooks to mutate tool parameters before execution; `async: true` for non-blocking hooks
- Subagents / Parallel agents — up to 7 concurrent; built-in: Explore (Haiku, read-only, thoroughness levels), Plan (read-only), General-purpose (all tools), Bash; custom subagents defined as Markdown files with YAML frontmatter; `isolation: worktree` runs agent in auto-created git worktree
- Agent Teams (Opus 4.6) — multiple Claude Code instances with shared task list + dependency tracking, inter-agent messaging, per-agent dedicated context windows
- Persistent subagent memory — `memory: user|project|local` in frontmatter gives agent private files that survive context resets
- Skills — auto-activating context-loaded capabilities in `.claude/skills/`, activate without explicit invocation based on task context
- Plugins — distributable packages bundling commands + hooks + skills + agents + MCP servers; 12 official Anthropic plugins
- Session portability — `/teleport`, `/desktop` to move sessions between terminal, Desktop app, browser
- IDE integrations — VS Code, JetBrains, Desktop app, Web, iOS, Slack, GitHub Actions, GitLab CI/CD, Chrome extension
- Agent SDK — TypeScript (v0.2.34) + Python SDK for building custom agents
- 1M-token context via Opus 4.6
- CLAUDE.md hierarchical merging — enterprise policy → user → project → directory-specific
VibeCLI Gaps — All Closed ✅
All previously-identified gaps have been closed:
| Gap | Status | Implementation |
|---|---|---|
| Hooks system | ✅ Closed | vibe-ai/src/hooks.rs — HookRunner with shell + LLM handlers |
| Parallel multi-agent | ✅ Closed | vibe-ai/src/multi_agent.rs — git worktrees |
| Plan Mode | ✅ Closed | vibe-ai/src/planner.rs — PlannerAgent |
| Session resume | ✅ Closed | vibe-ai/src/trace.rs — SessionSnapshot + load_session |
| Web search tool | ✅ Closed | WebSearch + FetchUrl in ToolCall enum |
| Shell environment policy | ✅ Closed | vibe-ai/src/policy.rs — AdminPolicy |
| Code review agent | ✅ Closed | vibecli/src/review.rs — GitHub PR posting |
| OpenTelemetry | ✅ Closed | vibe-ai/src/otel.rs + vibecli/src/otel_init.rs |
| Admin policy enforcement | ✅ Closed | vibe-ai/src/policy.rs |
| Skills system | ✅ Closed | vibe-ai/src/skills.rs — SkillLoader |
| Cloud/remote tasks | ✅ Closed | serve.rs job persistence (~/.vibecli/jobs/), GET /jobs, cancel; BackgroundJobsPanel |
| Agent SDK | ✅ Closed | packages/agent-sdk/ — TypeScript SDK |
2.2 VibeUI vs. Cursor, Windsurf, Google Antigravity, and Trae
Cursor (v2.0, October 2025) Key Capabilities
- Tab model — proprietary always-on low-latency model; predicts multi-line edits AND next cursor position AND required imports; never stops running
- Composer model — mixture-of-experts, RL-trained in real codebases, 4x faster than comparable models; can launch integrated Chromium browser to test/debug web apps
- 8-way parallel agents — each in its own git worktree or remote machine; ensemble approach for competing solutions
- Background agents (beta) — remote, sandboxed; clone + branch + push without local IDE
- BugBot — integrates with GitHub PRs; automatic diff analysis, inline bug comments with fixes
- Embedding-based codebase index — encrypted paths, plaintext discarded after embedding; background indexing; `@folders` context injection
- `.cursorrules` — project-level persistent AI context file
Windsurf (Wave 13, December 2025) Key Capabilities
- Supercomplete — next-edit prediction: rename variable → AI suggests all subsequent renames; predicts intent not just token
- Real-time flow awareness — Cascade continuously observes file edits, cursor movements, terminal output without prompting; developer never has to re-contextualize the AI
- Persistent cross-session memory — auto-learned coding style + manual rules; survives context window resets; builds per-developer personality model
- SWE-1.5 — proprietary model: Claude 4.5-quality at 13x speed; purpose-trained for edit-run-test agentic loops; supports images
- Plan Mode — distinct planning phase before code execution; plan presented for review before execution
- Named checkpoints per conversation — full project state snapshots, revertible at any time
- Agent Skills — standardized execution templates, auto-invoked by matching prompts
- Parallel agents (Wave 13) — git worktrees, side-by-side panes, dedicated zsh terminal
- Turbo Mode — fully autonomous terminal command execution without per-command confirmation
- MCP integrations — GitHub, Slack, Stripe, Figma, databases
Google Antigravity (Public Preview, November 2025) Key Capabilities
- Manager View — dedicated high-level orchestration layer; spawn/monitor/inspect multiple agents at task level, not file level; designed for teams running many parallel workstreams
- Artifacts — structured, inspectable deliverables: task lists, implementation plans, screenshots, browser recordings, diagrams; each artifact is commentable while agent continues running
- Async feedback — comment on artifact without interrupting agent execution (the most distinctive capability in the field)
- Multi-model — Gemini 3 Pro/Flash natively; Claude Sonnet 4.5 + Opus 4.5; GPT-OSS 120B
- Free during preview — no cost barrier for adoption
Trae (ByteDance, January 2025) Key Capabilities
- AI-native IDE — VS Code fork by ByteDance with 6M+ users; three modes: Chat, Builder (agent), SOLO (fully autonomous)
- Free models — Claude 3.7 Sonnet + GPT-4o at no cost; Pro ($10/month) adds Gemini 2.5 Pro + higher rate limits
- MCP support — built-in MCP client with server manager UI and growing marketplace
- Multimodal — image upload (screenshot-to-code), voice input, @web/@docs/@codebase/@terminal context
- Browser preview — integrated web preview panel with Vercel one-click deploy
- Rules files — `.trae/rules` for project-level AI context
- Open-source agent — trae-agent framework released under MIT license
- No BYOK — users cannot bring their own API keys; locked to ByteDance-provided models
- Privacy concern — ByteDance ownership creates enterprise adoption friction (data sovereignty)
VibeUI Gaps — All Critical/High Items Closed ✅
| Gap | Status | Implementation |
|---|---|---|
| Parallel multi-agent with UI | ✅ Closed | ManagerView.tsx — multi-agent task board |
| Plan Mode in VibeUI | ✅ Closed | AgentPanel “Plan first” toggle |
| Checkpoint UI | ✅ Closed | CheckpointPanel.tsx — timeline + restore |
| Next-edit prediction | ✅ Closed | Inline completion with edit tracking |
| Real-time flow injection | ✅ Closed | FlowTracker auto-injection into prompts |
| GitHub PR integration | ✅ Closed | review.rs + GitPanel review button |
| Artifacts system | ✅ Closed | ArtifactsPanel.tsx — rich cards + annotations |
| Manager View | ✅ Closed | ManagerView.tsx — 8 parallel agents |
| Embedding-based codebase index | ✅ Closed | vibe-core/src/index/embeddings.rs |
| Background agents (remote) | ✅ Closed | serve.rs job persistence + BackgroundJobsPanel; Jobs tab in AI panel |
| Agent Skills | ✅ Closed | vibe-ai/src/skills.rs |
| Async artifact feedback | ✅ Closed | ArtifactsPanel annotation queue |
| Browser integration for web apps | ✅ Closed | BrowserPanel.tsx (iframe + quick-launch chips); Browser tab in bottom panel |
| VS Code extension | ✅ Closed | vscode-extension/src/extension.ts |
3. VibeCody Differentiators to Exploit
These are our current advantages that we must protect and amplify:
| Differentiator | Why it matters |
|---|---|
| Full Rust backend | 10x lower memory than Electron; sub-100ms startup; no V8 heap issues at scale |
| Ollama first-class | Cursor/Windsurf treat local models as afterthoughts; we should be the best local-AI dev tool |
| Privacy by design | No telemetry, no cloud indexing, fully local; growing market demand |
| Open source | Inspect everything, self-host, community extensions |
| CLI + GUI unified | VibeCLI and VibeUI share crates; agent work done once applies both |
| 17 providers | More than Cursor (3) or Windsurf (own + limited); unique for non-OpenAI shops |
| Hooks system | Shipped in Phase 6; matches Claude Code’s most differentiated feature |
4. Implementation Plan — Phases 6–9
Phase 6 — Hooks, Planning & Intelligence ✅ Complete
Goal: Deliver the two most powerful missing capabilities: a hooks system matching Claude Code’s and a planning mode matching Windsurf’s. Also covers session resume, web search, flow injection, and shell environment policy.
6.1 Hooks System
Priority: Critical — Claude Code’s most differentiated feature
The hooks system intercepts every agent event and allows shell scripts or LLM evaluations to block, modify, or react to tool calls. This enables: guaranteed lint-on-edit, format-on-save, security enforcement, test gates, and custom CI policies — all independent of model behavior.
New file: vibeui/crates/vibe-ai/src/hooks.rs
// Core types
pub enum HookEvent {
    SessionStart,
    PreToolUse { call: ToolCall },
    PostToolUse { call: ToolCall, result: ToolResult },
    Stop { reason: StopReason },
    TaskCompleted { summary: String },
    SubagentStart { name: String },
    StreamChunk { text: String },
}

pub enum HookDecision {
    Allow,
    Block { reason: String },
    ModifyInput { updated: ToolCall }, // mutate tool params
    InjectContext { text: String },    // feed text back to model
}

pub enum HookHandler {
    Command { shell: String }, // exit 0=allow, exit 2=block
    Llm { prompt: String },    // single-turn eval returning {ok, reason}
}

pub struct HookConfig {
    pub event: String,              // "PreToolUse", "PostToolUse", etc.
    pub tools: Option<Vec<String>>, // tool name filter (regex)
    pub handler: HookHandler,
    pub async_exec: bool,           // non-blocking if true
}

pub struct HookRunner {
    configs: Vec<HookConfig>,
    provider: Arc<dyn AIProvider>,
}

impl HookRunner {
    pub async fn run(&self, event: HookEvent) -> HookDecision;
}
Configuration in ~/.vibecli/config.toml:
[[hooks]]
event = "PostToolUse"
tools = ["write_file", "apply_patch"]
handler = { command = "sh .vibecli/hooks/format.sh" }
[[hooks]]
event = "PreToolUse"
tools = ["bash"]
handler = { command = "sh .vibecli/hooks/security-check.sh" }
[[hooks]]
event = "Stop"
handler = { command = "sh .vibecli/hooks/test-gate.sh" }
async = false # must pass before session ends
Hook payload via stdin (JSON):
{
"event": "PreToolUse",
"tool": "bash",
"input": { "command": "rm -rf dist/" },
"session_id": "1740000000"
}
Hook response via stdout:
{ "allow": false, "reason": "Deletion blocked by security hook" }
// or:
{ "allow": true, "updatedInput": { "command": "rm -rf dist/ --dry-run" } }
// or (PostToolUse inject):
{ "context": "Format check failed: 3 warnings. Claude should fix before completing." }
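The selection and decision steps described above can be sketched roughly as follows. This is a simplified, std-only illustration, not the shipped `HookRunner`: the `Hook` and `Decision` types and the functions `matching_hooks` and `decision_from_exit` are hypothetical names, exact tool-name matching stands in for the regex filter, and letting exit codes other than 2 through is an assumption (the source only fixes 0 = allow and 2 = block).

```rust
/// Simplified stand-ins for the roadmap's types (hypothetical, std-only).
#[derive(Debug, PartialEq)]
pub enum Decision {
    Allow,
    Block { reason: String },
}

pub struct Hook {
    pub event: String,
    /// `None` means "every tool"; exact names stand in for the regex filter.
    pub tools: Option<Vec<String>>,
    pub command: String,
}

/// Return the hooks registered for `event` whose tool filter accepts `tool`.
pub fn matching_hooks<'a>(hooks: &'a [Hook], event: &str, tool: &str) -> Vec<&'a Hook> {
    hooks
        .iter()
        .filter(|h| h.event == event)
        .filter(|h| match &h.tools {
            None => true,
            Some(names) => names.iter().any(|n| n == tool),
        })
        .collect()
}

/// Map a hook command's exit status to a decision.
/// 0 = allow and 2 = block follow the comment on `HookHandler::Command`;
/// letting every other code through is an assumption of this sketch.
pub fn decision_from_exit(code: i32, stderr: &str) -> Decision {
    match code {
        2 => Decision::Block { reason: stderr.trim().to_string() },
        _ => Decision::Allow,
    }
}
```

A runner built this way would call `matching_hooks` once per event, run each selected command with the JSON payload on stdin, and stop at the first `Block`.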
VibeUI: Add hooks configuration panel in Settings. Show hook execution timeline in HistoryPanel alongside trace entries.
Files:
- `vibe-ai/src/hooks.rs` (new)
- `vibe-ai/src/agent.rs` (integrate HookRunner into agent loop)
- `vibecli-cli/src/config.rs` (add `[[hooks]]` array)
- `vibeui/src-tauri/src/commands.rs` (add hook management commands)
- `vibeui/src/components/HooksPanel.tsx` (new — config UI)
6.2 Plan Mode (Planning Before Execution)
Priority: Critical — Windsurf Wave 13 + Claude Code differentiator
A dedicated planning phase separates reasoning from action. The model generates a structured plan; the user reviews and optionally edits it; then execution proceeds step by step against the approved plan.
New file: vibeui/crates/vibe-ai/src/planner.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct ExecutionPlan {
    pub goal: String,
    pub steps: Vec<PlanStep>,
    pub estimated_files: Vec<String>,
    pub risks: Vec<String>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PlanStep {
    pub id: usize,
    pub description: String,
    pub tool: String, // which tool will be used
    pub estimated_path: Option<String>,
    pub status: PlanStepStatus,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum PlanStepStatus { Pending, InProgress, Done, Failed, Skipped }

pub struct PlannerAgent {
    provider: Arc<dyn AIProvider>,
}

impl PlannerAgent {
    /// Generate a structured execution plan without executing anything.
    pub async fn plan(&self, task: &str, context: &AgentContext) -> Result<ExecutionPlan>;

    /// Execute a previously approved plan step by step.
    pub async fn execute(
        &self,
        plan: &ExecutionPlan,
        executor: Arc<dyn ToolExecutorTrait>,
        approval: ApprovalPolicy,
        event_tx: mpsc::Sender<AgentEvent>,
    ) -> Result<String>;
}
REPL: /plan <task> — generates plan, shows it formatted, asks “Edit plan? (y/N) → Execute? (y/N)”
VibeUI Agent Panel: Add “Plan first” toggle. When enabled: run planner → display ExecutionPlan as editable todo list → “Execute Plan” button triggers executor.
Plan prompt format (injected):
You are a planning agent. Your task: <task>
Generate a detailed execution plan as JSON matching this schema:
{"goal": "...", "steps": [{"id": 1, "description": "...", "tool": "read_file", "estimated_path": "src/..."}], "estimated_files": [...], "risks": [...]}
DO NOT execute any actions. Generate ONLY the JSON plan.
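Once a plan is approved, execution walks the steps in order and updates each step's status. The sketch below shows one way that loop could look with synchronous, std-only code. `Step`, `StepStatus`, and `execute_plan` are illustrative names, the `run_step` closure stands in for the real tool executor, and stopping at the first failure while marking the remaining steps `Skipped` is an assumption rather than documented behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StepStatus { Pending, InProgress, Done, Failed, Skipped }

pub struct Step {
    pub id: usize,
    pub description: String,
    pub status: StepStatus,
}

/// Run approved steps in order. `run_step` stands in for the real tool
/// executor and returns Err on failure. Stopping at the first failure and
/// marking the remainder Skipped is an assumption of this sketch.
pub fn execute_plan<F>(steps: &mut [Step], mut run_step: F) -> Result<usize, String>
where
    F: FnMut(&Step) -> Result<(), String>,
{
    let mut completed = 0;
    let mut failure: Option<String> = None;
    for step in steps.iter_mut() {
        if failure.is_some() {
            step.status = StepStatus::Skipped;
            continue;
        }
        step.status = StepStatus::InProgress;
        match run_step(&*step) {
            Ok(()) => {
                step.status = StepStatus::Done;
                completed += 1;
            }
            Err(e) => {
                step.status = StepStatus::Failed;
                failure = Some(format!("step {} failed: {}", step.id, e));
            }
        }
    }
    match failure {
        Some(msg) => Err(msg),
        None => Ok(completed),
    }
}
```

Keeping the statuses on the steps themselves means a UI such as the editable todo list can render progress directly from the plan.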
6.3 Session Resume
Priority: High
Extend trace storage to include full message history. vibecli --resume <session-id> restores complete conversation state.
Extend TraceWriter:
impl TraceWriter {
    /// Save full message history to <session_id>-messages.json alongside JSONL
    pub fn save_messages(&self, messages: &[Message]) -> Result<()>;

    /// Save agent context snapshot
    pub fn save_context(&self, ctx: &AgentContext) -> Result<()>;
}

/// Load a previous session's messages and context for resume
pub fn load_session(session_id: &str, dir: &Path) -> Result<SessionSnapshot>;

pub struct SessionSnapshot {
    pub messages: Vec<Message>,
    pub context: AgentContext,
    pub trace: Vec<TraceEntry>,
}
CLI: vibecli --resume <session-id> picks up where the session left off. /resume REPL command lists resumable sessions.
6.4 Web Search Tool
Priority: High — Codex has it, we don’t
Add WebSearch to ToolCall enum. Use DuckDuckGo’s JSON API (no API key) with an optional Google CSE config.
// In tools.rs:
pub enum ToolCall {
    // ... existing variants ...
    WebSearch { query: String, num_results: usize }, // NEW
    FetchUrl { url: String },                        // NEW
}
Tool system prompt addition:
web_search(query, num_results=5): Search the web for current information.
Returns: list of {title, url, snippet}
fetch_url(url): Fetch and summarize a web page.
Returns: page title + text content (truncated to 4000 chars)
Implementation: DuckDuckGo Instant Answer API for search, reqwest for URL fetching with readability-style content extraction.
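The 4000-character cap on fetched page text needs a little care with non-ASCII content, since slicing a Rust string at an arbitrary byte offset can panic. A minimal sketch, assuming the limit is counted in Unicode scalar values rather than bytes (the source does not say which); `truncate_chars` is a hypothetical helper name:

```rust
/// Limit fetched page text to `max_chars` characters without splitting a
/// multi-byte UTF-8 character. Counting Unicode scalar values (rather than
/// bytes) is an assumption of this sketch.
pub fn truncate_chars(text: &str, max_chars: usize) -> &str {
    match text.char_indices().nth(max_chars) {
        // `byte_idx` is the start of the first character past the limit.
        Some((byte_idx, _)) => &text[..byte_idx],
        None => text, // already short enough
    }
}
```

Calling `truncate_chars(page_text, 4000)` would then apply the limit named in the tool description.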
Config:
[tools.web_search]
enabled = true
engine = "duckduckgo" # "duckduckgo" | "google"
# google_cse_id = "..." # optional
# google_api_key = "..."
max_results = 5
6.5 Flow Context Auto-Injection
Priority: High — Windsurf’s core differentiator
The flow tracker already records events. The missing piece: inject recent activity into every AI prompt automatically, giving the model full awareness without the user having to re-explain.
In AgentLoop::build_system_prompt():
let flow_ctx = flow_tracker.context_string(10);
if !flow_ctx.is_empty() {
    system += &format!("\n\n## Recent Developer Activity\n{}", flow_ctx);
}
Flow context format:
## Recent Developer Activity
[2m ago] Opened src/auth/login.rs (line 42)
[3m ago] Edited src/auth/login.rs — lines 38-55 changed
[5m ago] Ran: cargo test auth -- FAILED (2 tests)
[7m ago] Opened Cargo.toml
[9m ago] Edited src/auth/mod.rs — lines 1-10 changed
This appears in every agent request, giving the model full situational awareness.
Also: Inject flow context into VibeUI’s AIChat onSubmit handler, not just the agent.
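To make the ring buffer and the "[Nm ago]" format concrete, here is a small std-only sketch. It is not the real flow tracker: the `FlowTracker` layout, the `record` method, and the extra `now_secs` parameter on `context_string` (added so output is deterministic) are assumptions for illustration.

```rust
use std::collections::VecDeque;

/// Hypothetical ring buffer of developer activity (names are illustrative).
pub struct FlowTracker {
    capacity: usize,
    events: VecDeque<(u64, String)>, // (unix seconds, description)
}

impl FlowTracker {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, events: VecDeque::with_capacity(capacity) }
    }

    /// Record an event, evicting the oldest one when the buffer is full.
    pub fn record(&mut self, at_secs: u64, description: &str) {
        if self.events.len() == self.capacity {
            self.events.pop_front();
        }
        self.events.push_back((at_secs, description.to_string()));
    }

    /// Render the `n` most recent events, newest first, as "[Nm ago] ..." lines.
    /// `now_secs` is passed in so the output is deterministic.
    pub fn context_string(&self, n: usize, now_secs: u64) -> String {
        self.events
            .iter()
            .rev()
            .take(n)
            .map(|(at, desc)| format!("[{}m ago] {}", now_secs.saturating_sub(*at) / 60, desc))
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

A fixed capacity bounds both memory use and the amount of text added to each prompt.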
6.6 Shell Environment Policy
Priority: High — Codex differentiator for CI
Fine-grained control over what environment variables subprocess tool calls inherit.
[safety.shell_environment]
inherit = "core" # "all" | "core" | "none"
include = ["CARGO_HOME", "RUSTUP_HOME", "PATH"]
exclude = ["AWS_SECRET_*", "GITHUB_TOKEN", "*_API_KEY"]
set = { VIBECLI_AGENT = "1", CI = "true" }
ToolExecutor change:
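fn build_env(policy: &ShellEnvPolicy) -> HashMap<String, String> {
    // Start from the inherited set selected by `inherit`.
    let base: HashMap<String, String> = match policy.inherit.as_str() {
        "all" => std::env::vars().collect(),
        "core" => core_env_vars(), // PATH, HOME, USER, SHELL, TERM, LANG
        _ => HashMap::new(),       // "none" (and, by assumption, any unrecognized value)
    };
    // apply include, exclude, and set rules
    base
}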
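The include, exclude, and set rules left as a comment above could be applied along these lines. This is a sketch under stated assumptions: `glob_match` and `apply_policy` are hypothetical helpers, `*` is the only wildcard supported, and the order of application (include, then exclude, then set) is a guess that makes exclusions win over inclusions while explicit `set` values always apply.

```rust
use std::collections::HashMap;

/// Match `name` against a pattern where `*` stands for any run of characters.
/// Only what the config example needs (prefix, suffix, or infix wildcards).
pub fn glob_match(pattern: &str, name: &str) -> bool {
    let parts: Vec<&str> = pattern.split('*').collect();
    if parts.len() == 1 {
        return pattern == name; // no wildcard: exact match
    }
    let (first, last) = (parts[0], parts[parts.len() - 1]);
    if name.len() < first.len() + last.len() || !name.starts_with(first) || !name.ends_with(last) {
        return false;
    }
    // Middle literals must appear in order between the prefix and suffix.
    let mut rest = &name[first.len()..name.len() - last.len()];
    for part in &parts[1..parts.len() - 1] {
        match rest.find(*part) {
            Some(i) => rest = &rest[i + part.len()..],
            None => return false,
        }
    }
    true
}

/// Apply the policy to a base environment. The order (include adds from the
/// parent env, exclude removes, set overrides last) is an assumption.
pub fn apply_policy(
    parent: &HashMap<String, String>,
    mut base: HashMap<String, String>,
    include: &[&str],
    exclude: &[&str],
    set: &[(&str, &str)],
) -> HashMap<String, String> {
    for (k, v) in parent {
        if include.iter().any(|p| glob_match(p, k)) {
            base.insert(k.clone(), v.clone());
        }
    }
    base.retain(|k, _| !exclude.iter().any(|p| glob_match(p, k)));
    for (k, v) in set {
        base.insert(k.to_string(), v.to_string());
    }
    base
}
```

With the example config, a variable such as `OPENAI_API_KEY` would be removed by the `*_API_KEY` pattern even when `inherit = "all"`.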
Phase 7 — Parallel Agents & Intelligence Upgrades ✅ Complete
Goal: Ship parallel multi-agent execution (closes the biggest throughput gap vs. Cursor/Windsurf), upgrade codebase indexing to embeddings, and ship next-edit prediction.
7.1 Parallel Multi-Agent (Git Worktrees)
Priority: Critical — both Cursor (8) and Windsurf (Wave 13) have this
VibeCLI:
# Run 3 agents in parallel, each in its own worktree
vibecli --agent "refactor auth module" --parallel 3
# Or split a complex task across specialized agents
vibecli --multi-agent tasks.json # JSON array of subtasks
Architecture: MultiAgentOrchestrator spawns N AgentLoop instances, each operating on a separate git worktree.
New file: vibe-ai/src/multi_agent.rs
pub struct MultiAgentOrchestrator {
    provider: Arc<dyn AIProvider>,
    approval: ApprovalPolicy,
    executor: Arc<dyn ToolExecutorTrait>,
    max_agents: usize,
}

pub struct AgentInstance {
    pub id: usize,
    pub task: String,
    pub worktree: PathBuf,
    pub branch: String,
    pub status: AgentStatus,
    pub steps: Vec<AgentStep>,
}

pub enum OrchestratorEvent {
    AgentStarted { id: usize, task: String },
    AgentStep { id: usize, step: AgentStep },
    AgentComplete { id: usize, summary: String, branch: String },
    AgentError { id: usize, error: String },
    AllComplete { results: Vec<AgentResult> },
}

impl MultiAgentOrchestrator {
    /// Split one task N ways and run in parallel.
    pub async fn run_parallel(
        &self,
        task: &str,
        n: usize,
        event_tx: mpsc::Sender<OrchestratorEvent>,
    ) -> Result<Vec<AgentResult>>;

    /// Run different tasks on different agents simultaneously.
    pub async fn run_tasks(
        &self,
        tasks: Vec<AgentTask>,
        event_tx: mpsc::Sender<OrchestratorEvent>,
    ) -> Result<Vec<AgentResult>>;
}
Worktree management (add to vibe-core/src/git.rs):
pub fn create_worktree(repo: &Path, branch: &str, worktree_path: &Path) -> Result<()>;
pub fn remove_worktree(repo: &Path, worktree_path: &Path) -> Result<()>;
pub fn list_worktrees(repo: &Path) -> Result<Vec<WorktreeInfo>>;
pub fn merge_worktree_branch(repo: &Path, branch: &str) -> Result<MergeResult>;
TUI: New /multi-agent command shows a split view with N panes, one per agent. Each pane streams its own steps.
VibeUI: New Parallel tab in AI panel. Shows N side-by-side agent cards. “Merge Best” button diffs all outputs and lets user pick.
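The fan-out pattern behind the orchestrator (one worker per worktree, all reporting on a shared event channel) can be illustrated with plain threads. This sketch is std-only and simulates the agent work; the `Event` type, the `vibe-agent-<id>` branch naming, and the `.worktrees` directory are hypothetical. The only external fact it relies on is the `git worktree add -b <branch> <path>` command form.

```rust
use std::path::{Path, PathBuf};
use std::sync::mpsc;
use std::thread;

#[derive(Debug)]
pub enum Event {
    Started { id: usize },
    Complete { id: usize, branch: String },
}

/// Branch and worktree location for agent `id`. The naming scheme is hypothetical.
pub fn worktree_for(root: &str, id: usize) -> (String, PathBuf) {
    let branch = format!("vibe-agent-{}", id);
    let path = PathBuf::from(root).join(".worktrees").join(&branch);
    (branch, path)
}

/// Arguments for `git worktree add -b <branch> <path>`.
pub fn worktree_add_args(branch: &str, path: &Path) -> Vec<String> {
    vec![
        "worktree".into(), "add".into(), "-b".into(),
        branch.to_string(), path.display().to_string(),
    ]
}

/// Spawn `n` workers; each reports progress on a shared channel.
/// A real worker would create its worktree and run an agent loop there;
/// here the work is simulated so the sketch stays std-only.
pub fn run_parallel(root: &str, n: usize) -> Vec<Event> {
    let (tx, rx) = mpsc::channel();
    let handles: Vec<_> = (0..n)
        .map(|id| {
            let tx = tx.clone();
            let (branch, _path) = worktree_for(root, id);
            thread::spawn(move || {
                tx.send(Event::Started { id }).unwrap();
                tx.send(Event::Complete { id, branch }).unwrap();
            })
        })
        .collect();
    drop(tx); // drop the original sender so the receiver can finish
    for h in handles {
        h.join().unwrap();
    }
    rx.into_iter().collect()
}
```

Because every agent writes to its own branch and directory, the workers never contend for the same files; merging happens afterwards as a separate step.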
7.2 Embedding-Based Codebase Indexing
Priority: High — Cursor’s core competitive moat
Upgrade from regex-based symbol search to semantic search using local embeddings.
New file: vibe-core/src/index/embeddings.rs
/// Store and query vector embeddings using an HNSW index.
pub struct EmbeddingIndex {
    index: HnswMap<Vec<f32>, EmbeddingDoc>,
    provider: EmbeddingProvider,
}

pub struct EmbeddingDoc {
    pub file: PathBuf,
    pub chunk_start: usize,
    pub chunk_end: usize,
    pub text: String,
}

pub enum EmbeddingProvider {
    Ollama { model: String, api_url: String },
    OpenAI { api_key: String, model: String },
}

impl EmbeddingIndex {
    /// Embed and index all files in workspace.
    pub async fn build(workspace: &Path, provider: &EmbeddingProvider) -> Result<Self>;

    /// Incrementally update changed files.
    pub async fn update(&mut self, changed_files: &[PathBuf]) -> Result<()>;

    /// Semantic search: return top-k most relevant chunks.
    pub async fn search(&self, query: &str, k: usize) -> Result<Vec<SearchHit>>;
}
Chunking strategy:
- Split files into 512-token chunks at function/class boundaries (using tree-sitter or heuristic line-counting)
- Overlap: 64 tokens between chunks
- Max file size: 500KB (skip larger files)
- Skip: `.git/`, `target/`, `node_modules/`, `dist/`, generated files
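The windowing arithmetic in the chunking strategy (fixed-size chunks with a fixed overlap) can be sketched as below. Whitespace-separated words stand in for model tokens, and splitting at function or class boundaries is left out, so this illustrates only the overlap logic; `chunk_with_overlap` is a hypothetical helper.

```rust
/// Split `tokens` into windows of at most `size` tokens where consecutive
/// windows share `overlap` tokens. Whitespace-separated words stand in for
/// model tokens here, and boundary-aware splitting is left out.
pub fn chunk_with_overlap<'a>(tokens: &[&'a str], size: usize, overlap: usize) -> Vec<Vec<&'a str>> {
    assert!(size > overlap, "chunk size must exceed overlap");
    let step = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = (start + size).min(tokens.len());
        chunks.push(tokens[start..end].to_vec());
        if end == tokens.len() {
            break; // last window reached the end of the input
        }
        start += step;
    }
    chunks
}
```

With the values from the list above, `chunk_with_overlap(&tokens, 512, 64)` advances 448 tokens per window.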
Config:
[index]
enabled = true
embedding_provider = "ollama"
embedding_model = "nomic-embed-text" # or "text-embedding-3-small"
rebuild_on_startup = false
max_file_size_kb = 500
Integration with agent: When an agent task starts, semantic search finds the most relevant files to include in context automatically.
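The ranking that semantic search performs can be shown with a brute-force scan. Note the swap: this sketch compares the query against every stored vector using cosine similarity instead of using an HNSW index, so it has the same ranking goal but linear cost per query. `cosine` and `top_k` are illustrative names, and the vectors would come from the embedding provider in practice.

```rust
/// Cosine similarity between two vectors; 0.0 if either has zero length.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Return indices of the `k` documents most similar to `query`, best first.
/// This is an exhaustive scan, not HNSW: same ranking goal, O(n) per query.
pub fn top_k(query: &[f32], docs: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = docs
        .iter()
        .enumerate()
        .map(|(i, d)| (i, cosine(query, d)))
        .collect();
    // Sort descending by similarity.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

An exhaustive scan like this is mainly useful as a correctness baseline when testing an approximate index.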
7.3 Next-Edit Prediction in VibeUI
Priority: Critical — Cursor Tab / Windsurf Supercomplete
The current inline completion returns a single completion at cursor. True next-edit prediction watches what you’ve edited and predicts what you’ll want to change next — in a different location.
Architecture:
- After every keystroke (debounced 150ms), capture: current file state, cursor position, last 5 edits with positions and timestamps
- Send to fast model (Ollama `qwen2.5-coder:7b` or similar) with a next-edit prediction prompt
- If the model predicts an edit at a different location than the cursor: show a ghost annotation at that location
- User presses `Tab` to jump to predicted location and accept the edit
New Tauri command: predict_next_edit
#[tauri::command]
async fn predict_next_edit(
    state: State<'_, AppState>,
    current_file: String,
    content: String,
    cursor_line: u32,
    cursor_col: u32,
    recent_edits: Vec<EditEvent>, // last 5 edits: {line, col, old, new, elapsed_ms}
    provider: String,
) -> Result<Option<NextEditPrediction>, String>;

pub struct NextEditPrediction {
    pub target_line: u32,
    pub target_col: u32,
    pub suggested_text: String,
    pub confidence: f32,
}
Prediction prompt:
Recent edits in {file}:
1. Line 42: renamed `user_name` → `username`
2. Line 67: renamed `user_name` → `username`
3. Line 83: still has `user_name` (unchanged)
Predict the next edit the developer will make. Respond ONLY with JSON:
{"line": 83, "col": 15, "replacement": "username", "confidence": 0.95}
Monaco integration: When prediction arrives, render a dimmed inline decoration at target location. Tab key handler: if prediction pending and Tab pressed, jump + accept; otherwise normal tab behavior.
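The Tab-key decision described above is small enough to express as a pure function. The sketch below is written in Rust for consistency with the backend types, although the real handler would live in the Monaco integration. `Prediction`, `TabAction`, and `on_tab` are hypothetical, and the confidence cutoff is an added tuning knob: the roadmap itself only says "if prediction pending".

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Prediction {
    pub target_line: u32,
    pub target_col: u32,
    pub suggested_text: String,
    pub confidence: f32,
}

#[derive(Debug, PartialEq)]
pub enum TabAction {
    /// Move the cursor to the predicted location and apply the text.
    JumpAndAccept { line: u32, col: u32, text: String },
    /// No usable prediction: let the editor insert a normal tab.
    InsertTab,
}

/// Decide what Tab should do. The confidence cutoff is a hypothetical
/// tuning knob; the roadmap only says "if prediction pending".
pub fn on_tab(pending: Option<&Prediction>, min_confidence: f32) -> TabAction {
    match pending {
        Some(p) if p.confidence >= min_confidence => TabAction::JumpAndAccept {
            line: p.target_line,
            col: p.target_col,
            text: p.suggested_text.clone(),
        },
        _ => TabAction::InsertTab,
    }
}
```

Falling back to a normal tab whenever no prediction qualifies keeps the key's default behavior intact.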
7.4 Checkpoint UI in VibeUI
Priority: Critical — backend done, ship the UI
The Tauri backend already has create_checkpoint, list_checkpoints, restore_checkpoint. Ship the React UI.
New file: vibeui/src/components/CheckpointPanel.tsx
interface Checkpoint {
  index: number;
  label: string;
  timestamp: number;
  oid: string;
}

// Props type assumed: workspacePath is the absolute path of the open workspace.
export function CheckpointPanel({ workspacePath }: { workspacePath: string }) {
  // Timeline view: vertical list of checkpoints with age
  // Each entry: index, label, timestamp, "Restore" button
  // "Create Checkpoint" button at top with label input
  // Before restore: confirm dialog showing which files will change
}
Add to AI panel tabs: alongside Chat / Agent / Rules / History.
Auto-create checkpoint: When agent starts a task (especially in FullAuto mode), automatically create a checkpoint named before-agent-<task-summary>.
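A label of the form `before-agent-<task-summary>` needs some rule for turning a free-form task into a summary. One possible rule is sketched here; lowercasing, hyphenating, and capping the number of words are all assumptions, since the roadmap fixes only the `before-agent-` prefix. `checkpoint_label` is a hypothetical helper.

```rust
/// Build a label of the form `before-agent-<task-summary>`.
/// Lowercasing, hyphenating, and the word cap are assumptions; the roadmap
/// only fixes the `before-agent-` prefix.
pub fn checkpoint_label(task: &str, max_words: usize) -> String {
    let slug = task
        .split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|w| !w.is_empty())
        .take(max_words)
        .map(|w| w.to_ascii_lowercase())
        .collect::<Vec<_>>()
        .join("-");
    format!("before-agent-{}", slug)
}
```

Restricting the slug to ASCII letters and digits keeps the label safe to use in branch or ref names.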
7.5 GitHub PR Integration (BugBot Equivalent)
Priority: High — Cursor BugBot is a major differentiator
A dedicated code review agent mode that analyzes diffs and produces structured reviews.
VibeCLI:
# Review uncommitted changes
vibecli review
# Review specific branch vs main
vibecli review --branch feature/auth --base main
# Post review as GitHub PR comment
vibecli review --pr 42 --post-github
New file: vibecli-cli/src/review.rs
pub struct ReviewConfig {
    pub base_ref: String, // "main" | commit SHA | branch name
    pub target_ref: String,
    pub post_to_github: bool,
    pub github_pr: Option<u32>,
    pub focus: Vec<ReviewFocus>, // Security, Performance, Correctness, Style
}

pub struct ReviewReport {
    pub summary: String,
    pub issues: Vec<ReviewIssue>,
    pub suggestions: Vec<ReviewSuggestion>,
    pub score: ReviewScore,
}

pub struct ReviewIssue {
    pub file: String,
    pub line: u32,
    pub severity: Severity, // Critical, Warning, Info
    pub category: ReviewFocus,
    pub description: String,
    pub suggested_fix: Option<String>,
}
Review prompt strategy:
- Get full diff: `git diff <base>..<target>`
- For large diffs: chunk by file, review each file separately
- Aggregate results, deduplicate, rank by severity
- Output: structured JSON + human-readable Markdown
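The aggregate, deduplicate, and rank step can be sketched with simplified types. The trimmed-down `Issue` struct, the `aggregate` function, and the choice of `(file, line, description)` as the deduplication key are assumptions for illustration; the real `ReviewIssue` carries more fields.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity { Critical, Warning, Info } // declaration order = rank

#[derive(Debug, Clone, PartialEq)]
pub struct Issue {
    pub file: String,
    pub line: u32,
    pub severity: Severity,
    pub description: String,
}

/// Merge per-file review results: drop repeats of the same
/// (file, line, description) and sort Critical first. The dedup key is an
/// assumption of this sketch.
pub fn aggregate(per_file: Vec<Vec<Issue>>) -> Vec<Issue> {
    let mut seen = HashSet::new();
    let mut all: Vec<Issue> = per_file
        .into_iter()
        .flatten()
        .filter(|i| seen.insert((i.file.clone(), i.line, i.description.clone())))
        .collect();
    // Stable sort keeps discovery order within a severity level.
    all.sort_by_key(|i| i.severity);
    all
}
```

Deriving `Ord` on the enum lets the declaration order double as the ranking, so adding a severity level only requires placing it in the right position.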
GitHub integration: Use gh CLI to post review comments. Requires GITHUB_TOKEN in environment or config.
VibeUI: Add “Review” button in GitPanel that opens a ReviewPanel showing issues with file/line links.
Phase 8 — Ecosystem Features ✅ Complete
Goal: Skills system, OpenTelemetry, Artifacts, GitHub Actions, agent configurability.
8.1 Skills System
Priority: Medium — Claude Code’s “Skills” are auto-activating capabilities
Skills are context-aware capability definitions that activate automatically when a task matches their description — no explicit invocation needed.
Directory: .vibecli/skills/ in repo root (or ~/.vibecli/skills/ for global).
rust-safety.md example:
---
name: rust-safety
description: Activated when working on Rust code safety, memory, or correctness
triggers: ["unsafe", "memory", "panic", "lifetime", "borrow"]
tools_allowed: [read_file, write_file, bash]
---
When editing Rust code, always:
1. Check for `unwrap()` calls that should be `?` or `expect()`
2. Verify all `unsafe` blocks have a `// SAFETY:` comment
3. After writing, run `cargo clippy -- -D warnings` via bash tool
4. Prefer `Arc<Mutex<T>>` over raw shared state
Skill activation: Before each agent request, scan .vibecli/skills/ directory. For each skill, check if any triggers keyword appears in the task description or recent tool outputs. Activated skills’ content is appended to the system prompt.
Implementation: Add SkillLoader to vibe-ai/src/skills.rs. Call before building system prompt in AgentLoop.
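Trigger-based activation could look roughly like this. The sketch assumes skills have already been parsed from their Markdown files into a `Skill` struct (parsing the YAML frontmatter is out of scope here), checks only the task description rather than recent tool outputs, and uses case-insensitive substring matching, which is an assumption. `active_skills` and `build_system_prompt` are illustrative names, not the `SkillLoader` API.

```rust
pub struct Skill {
    pub name: String,
    pub triggers: Vec<String>,
    pub body: String,
}

/// Skills with at least one trigger keyword in `task`.
/// Case-insensitive substring matching is an assumption of this sketch.
pub fn active_skills<'a>(skills: &'a [Skill], task: &str) -> Vec<&'a Skill> {
    let haystack = task.to_lowercase();
    skills
        .iter()
        .filter(|s| s.triggers.iter().any(|t| haystack.contains(&t.to_lowercase())))
        .collect()
}

/// Append the bodies of activated skills to the base system prompt.
pub fn build_system_prompt(base: &str, skills: &[Skill], task: &str) -> String {
    let mut prompt = base.to_string();
    for skill in active_skills(skills, task) {
        prompt.push_str("\n\n## Skill: ");
        prompt.push_str(&skill.name);
        prompt.push('\n');
        prompt.push_str(&skill.body);
    }
    prompt
}
```

Substring matching is cheap but can over-trigger (for example, "panic" inside an unrelated word), so a production matcher might prefer word boundaries.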
8.2 OpenTelemetry Integration
Priority: Medium — Enterprise/CI observability
Emit OpenTelemetry spans for agent steps, enabling Jaeger/Grafana/Datadog observability in CI pipelines.
[otel]
enabled = false
endpoint = "http://localhost:4317" # OTLP gRPC
service_name = "vibecli"
Spans emitted:
- `agent.session` — root span for entire agent run
- `agent.step` — one span per tool call (tool name, input summary, success, duration)
- `agent.hook` — one span per hook execution
- `agent.llm_call` — LLM API call with model, token counts, latency
Crate: opentelemetry, opentelemetry-otlp, opentelemetry-sdk
8.3 Artifacts System (Antigravity-Inspired)
Priority: High — genuinely novel UX
Agents produce structured, inspectable, annotatable deliverables alongside text responses.
New type in vibe-ai/src/artifacts.rs:
pub enum Artifact {
    TaskList { items: Vec<TaskItem> },
    ImplementationPlan { steps: Vec<PlanStep>, files: Vec<String> },
    FileChange { path: String, diff: String },
    CommandOutput { command: String, stdout: String, exit_code: i32 },
    TestResults { passed: usize, failed: usize, output: String },
    ReviewReport { issues: Vec<ReviewIssue> },
}

pub struct AgentArtifact {
    pub id: String,
    pub artifact: Artifact,
    pub timestamp: u64,
    pub annotations: Vec<Annotation>, // user comments
}

pub struct Annotation {
    pub text: String,
    pub timestamp: u64,
    pub applied: bool, // has the agent incorporated this feedback?
}
VibeUI: New ArtifactsPanel renders artifacts as rich cards. Users can expand, annotate, and mark artifacts as “feedback applied.” Annotations are queued and injected into the agent’s next context window as: "User feedback on artifact: <annotation>".
This enables async feedback — the user annotates while the agent continues working on the next step.
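The annotation queue can be sketched as a single pass over an artifact's annotations. The `Annotation` struct mirrors the one above; `drain_feedback` is a hypothetical helper, and flipping `applied` at injection time (rather than when the agent confirms it acted on the feedback) is a simplifying assumption.

```rust
pub struct Annotation {
    pub text: String,
    pub timestamp: u64,
    pub applied: bool,
}

/// Collect feedback the agent has not seen yet, formatted for the next
/// context window, and mark it as applied so it is injected only once.
pub fn drain_feedback(annotations: &mut [Annotation]) -> Vec<String> {
    annotations
        .iter_mut()
        .filter(|a| !a.applied)
        .map(|a| {
            a.applied = true;
            format!("User feedback on artifact: {}", a.text)
        })
        .collect()
}
```

Because the queue is drained only between agent steps, the user can keep annotating while a step is running without interrupting it.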
8.4 GitHub Actions Integration
Priority: Medium
Official GitHub Action for running VibeCLI in CI:
.github/actions/vibecli/action.yml:
name: VibeCLI Agent
description: Run a VibeCLI agent task in CI
inputs:
  task:
    description: Task for the agent to perform
    required: true
  provider:
    description: AI provider (ollama/claude/openai)
    default: claude
  approval:
    description: Approval policy (auto-edit/full-auto)
    default: auto-edit
  output-format:
    description: Report format (json/markdown)
    default: markdown
runs:
  using: composite
  steps:
    - name: Run VibeCLI agent
      shell: bash
      env:
        ANTHROPIC_API_KEY: $
      run: |
        vibecli exec "${{ inputs.task }}" \
          --provider ${{ inputs.provider }} \
          --${{ inputs.approval }} \
          --output-format ${{ inputs.output-format }} \
          --output vibecli-report.md
Use cases:
- Auto-fix failing test: task: "Fix the failing test in CI"
- Auto-refactor: task: "Add error handling to all public API functions"
- Auto-review: vibecli review --pr $PR_NUMBER --post-github
Phase 9 — Manager View & Scale ✅ Complete
Goal: Ship the high-level orchestration UI (Manager View), VS Code extension, and Agent SDK.
9.1 Manager View in VibeUI
Priority: High — Antigravity’s most unique feature
A dedicated orchestration dashboard for managing multiple parallel agents at the task level, not the file level.
New React component: vibeui/src/components/ManagerView.tsx
Layout:
┌─────────────────────────────────────────────────────┐
│ Manager View                            + New Agent │
├──────────┬──────────┬──────────┬────────────────────┤
│ Agent 1  │ Agent 2  │ Agent 3  │ Task Board         │
│ ──────── │ ──────── │ ──────── │ ────────────────── │
│ Status:  │ Status:  │ Status:  │ ☐ Task 1 → Agent 1 │
│ Running  │ Done x   │ Pending  │ x Task 2 → Agent 2 │
│          │          │          │ z Task 3 → Agent 3 │
│ Step 3/? │ 12 steps │ queued   │                    │
│ [expand] │ [review] │ [assign] │ [+ Add Task]       │
└──────────┴──────────┴──────────┴────────────────────┘
Features:
- Spawn up to 8 agents (matching Cursor), each in a git worktree
- Task board with dependency tracking (Task 3 depends on Task 2)
- Each agent card expandable to show step-by-step trace
- “Review Changes” for done agents: opens Monaco diff viewer
- “Merge Best” for parallel runs: pick winner or cherry-pick across agents
- Real-time progress via Tauri events
Tauri commands:
- start_parallel_agents(tasks: Vec<AgentTask>): spawns orchestrator
- get_orchestrator_status() → Vec<AgentInstance>
- merge_agent_branch(agent_id, strategy): merge worktree into main
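The dependency tracking and the 8-agent cap from the feature list can be illustrated with a small scheduling helper. This is a sketch under stated assumptions: `BoardTask`, `MAX_AGENTS`, and `ready_tasks` are hypothetical names, not the orchestrator's actual types.

```rust
use std::collections::HashSet;

/// Hypothetical task-board entry; `depends_on` lists ids of prerequisite tasks.
pub struct BoardTask {
    pub id: u32,
    pub depends_on: Vec<u32>,
}

/// Concurrency cap taken from the feature list above.
pub const MAX_AGENTS: usize = 8;

/// Returns the ids of tasks that may start now: not done, not running,
/// every dependency done, and no more than the number of free agent slots.
pub fn ready_tasks(
    tasks: &[BoardTask],
    done: &HashSet<u32>,
    running: &HashSet<u32>,
) -> Vec<u32> {
    let free_slots = MAX_AGENTS.saturating_sub(running.len());
    tasks
        .iter()
        .filter(|t| !done.contains(&t.id) && !running.contains(&t.id))
        .filter(|t| t.depends_on.iter().all(|d| done.contains(d)))
        .map(|t| t.id)
        .take(free_slots)
        .collect()
}
```

With the board from the layout above (Task 3 depends on Task 2), Task 3 only becomes ready once Task 2 is in the `done` set.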
9.2 VS Code Extension
Priority: Medium — critical for distribution
A VS Code extension that provides VibeCLI/VibeUI capabilities inside VS Code.
Extension capabilities:
- Chat panel: sidebar chat powered by VibeCLI’s agent
- Inline completions: register InlineCompletionItemProvider; delegate to VibeCLI’s FIM endpoint
- Agent mode: /agent <task> command runs VibeCLI agent, streams steps into output panel
- Status bar: shows current provider, branch, last agent status
Implementation approach:
- VS Code extension communicates with a local VibeCLI daemon (vibecli serve --port 7878)
- Daemon exposes REST/WebSocket API: POST /chat, POST /agent, GET /stream/<session-id>
- Extension is a thin TypeScript client over this API
New file: vibecli-cli/src/serve.rs — Axum HTTP server exposing VibeCLI capabilities
9.3 Agent SDK
Priority: Low-Medium — community/enterprise adoption
A library that lets developers build custom agents using VibeCLI’s infrastructure.
Rust crate: Publish vibe-ai as a standalone crate on crates.io.
TypeScript package: @vibecody/agent-sdk wraps the VibeCLI daemon API:
import { VibeCLIAgent } from '@vibecody/agent-sdk';

const agent = new VibeCLIAgent({
  provider: 'claude',
  approval: 'full-auto',
  tools: ['read_file', 'write_file', 'bash', 'web_search'],
  hooks: [
    { event: 'PostToolUse', tools: ['write_file'], command: 'npm run lint' }
  ]
});

for await (const event of agent.run('Add TypeScript strict mode to all files')) {
  if (event.type === 'step') console.log(`[${event.tool}] ${event.summary}`);
  if (event.type === 'complete') console.log('Done:', event.summary);
}
5. Current Feature Matrix (All Phases Complete)
| Capability | VibeCLI | Codex CLI | Claude Code | Cursor | Windsurf | Antigravity |
|---|---|---|---|---|---|---|
| Agent loop | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Parallel agents | ✅ 8-way | experimental | ✅ 7-way | ✅ 8-way | ✅ | ✅ async |
| Hooks system | ✅ | ❌ | ✅ 17 events | ❌ | ❌ | ❌ |
| Plan Mode | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ |
| Web search tool | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Session resume | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| OS sandbox | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Shell env policy | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| Code review agent | ✅ | ✅ | ✅ | BugBot | ❌ | ❌ |
| MCP support | ✅ | ✅ | ✅ 300+ | ❌ | ✅ | ❌ |
| Multimodal | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Semantic indexing | ✅ | ❌ | ❌ | ✅ | ✅ | partial |
| OTel | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| GitHub Actions | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Skills | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ |
| Ollama first-class | ✅ | ❌ | ❌ | partial | partial | ❌ |
| Open source | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Rust native | ✅ | ✅ | ❌ | ❌ | ❌ | partial |
| Provider timeout hardening | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| Capability | VibeUI | Cursor | Windsurf | Antigravity |
|---|---|---|---|---|
| Next-edit prediction | ✅ | ✅ Tab | ✅ Supercomplete | partial |
| Parallel agents + UI | ✅ Manager View | ✅ | ✅ | ✅ |
| Plan Mode | ✅ | ❌ | ✅ | ✅ |
| Checkpoints UI | ✅ | ❌ | ✅ | Artifacts |
| Flow injection | ✅ | ❌ | ✅ | ❌ |
| Artifacts | ✅ | ❌ | ❌ | ✅ |
| GitHub PR review | ✅ | BugBot | ❌ | ❌ |
| Semantic indexing | ✅ | ✅ | ✅ | partial |
| WASM extensions | ✅ | ✅ | ✅ | ❌ |
| Agent skills | ✅ | ❌ | ✅ | ❌ |
| Multi-provider (5+) | ✅ | partial | partial | ✅ |
| Rust native backend | ✅ | ❌ | ❌ | partial |
| CRDT multiplayer collab | ✅ | ❌ | ❌ | ❌ |
| Code coverage panel | ✅ | ❌ | ❌ | ❌ |
| Multi-model comparison | ✅ | ❌ | ❌ | ❌ |
| HTTP Playground | ✅ | ❌ | ❌ | ❌ |
| Cost observatory | ✅ | ❌ | ❌ | ❌ |
| AI git workflow | ✅ | ❌ | ❌ | ❌ |
| Codemod auto-fix | ✅ | ❌ | ❌ | ❌ |
| WCAG 2.1 AA accessibility | ✅ | partial | partial | partial |
| Keyboard shortcuts (8+) | ✅ | ✅ | ✅ | partial |
| Onboarding tour | ✅ | ✅ | ❌ | ❌ |
| Provider timeout hardening | ✅ | ❌ | ❌ | ❌ |
6. Architecture (All Phases Complete)
vibecli-cli
├── REPL / TUI (streaming, hooks, /agent, /plan, /multi-agent, /review)
├── CI mode (--exec, --parallel, --review)
├── Server mode (vibecli serve — API for VS Code extension + SDK)
└── src/
    ├── ci.rs, review.rs, serve.rs
    └── hooks.rs (config loading)
vibe-ai
├── provider.rs (AIProvider trait + ImageAttachment + vision)
├── agent.rs (plan→act→observe + hook integration)
├── planner.rs (PlannerAgent: plan generation + guided execution)
├── multi_agent.rs (parallel agents on git worktrees)
├── hooks.rs (HookRunner: command + llm handlers, event bus)
├── skills.rs (SkillLoader: auto-activating context snippets)
├── artifacts.rs (Artifact types, annotation queue)
├── mcp.rs (McpClient JSON-RPC 2.0)
├── tools.rs (ToolCall enum + WebSearch + FetchUrl)
└── trace.rs (JSONL audit + session resume)
vibe-core
├── index/
│ ├── mod.rs, symbol.rs, content.rs
│ └── embeddings.rs (HNSW index + Ollama/OpenAI embeddings)
├── context.rs (smart context builder: flow + semantic + git)
├── executor.rs (sandboxed execution + shell env policy)
└── git.rs (worktree: create, remove, merge)
vibe-collab
├── server.rs (CollabServer: DashMap room registry)
├── room.rs (CollabRoom: Y.Doc + peer list + broadcast)
├── protocol.rs (Yjs binary sync: SyncStep1/2/Update)
├── awareness.rs (cursor state + 8-color palette)
└── error.rs
vibe-extensions
└── loader.rs (wasmtime WASM host)
vibeui (React + Tauri)
├── AgentPanel (single-agent: steps, approval, artifacts)
├── ManagerView (multi-agent: task board, worktrees, merge)
├── CheckpointPanel (timeline, restore, auto-checkpoint)
├── ArtifactsPanel (rich cards, annotations, async feedback)
├── HooksPanel (hooks configuration UI)
├── MemoryPanel (rules editor)
├── HistoryPanel (trace viewer)
├── GitPanel (git + PR review)
└── components/
    └── ReviewPanel (code review issues with file/line links)
7. Completed Implementation Backlog
Phase 6 ✅ Complete
| # | Feature | Gap Closed vs. | Status |
|---|---|---|---|
| 1 | Hooks system (events + shell + LLM handlers) | Claude Code | ✅ Done |
| 2 | Plan Mode (PlannerAgent + approval flow) | Windsurf, Claude Code | ✅ Done |
| 3 | Web search tool | Codex CLI | ✅ Done |
| 4 | Flow context auto-injection | Windsurf | ✅ Done |
| 5 | Shell environment policy | Codex CLI | ✅ Done |
| 6 | Session resume | Codex CLI | ✅ Done |
Phase 7 ✅ Complete
| # | Feature | Gap Closed vs. | Status |
|---|---|---|---|
| 7 | Parallel multi-agent (git worktrees) | Cursor, Windsurf | ✅ Done |
| 8 | Embedding-based semantic indexing | Cursor, Windsurf | ✅ Done |
| 9 | Next-edit prediction in VibeUI | Cursor Tab, Windsurf Supercomplete | ✅ Done |
| 10 | Checkpoint UI in VibeUI | Windsurf | ✅ Done |
| 11 | GitHub PR review agent | Cursor BugBot | ✅ Done |
Phase 8 ✅ Complete
| # | Feature | Gap Closed vs. | Status |
|---|---|---|---|
| 12 | Skills system | Claude Code, Windsurf | ✅ Done |
| 13 | Artifacts panel in VibeUI | Antigravity | ✅ Done |
| 14 | OpenTelemetry spans | Codex CLI | ✅ Done |
| 15 | GitHub Actions workflow | Codex CLI, Claude Code | ✅ Done |
| 16 | Hooks config UI in VibeUI | — | ✅ Done |
| 17 | Turbo Mode (VibeUI FullAuto toggle) | Windsurf | ✅ Done |
Phase 9 ✅ Complete
| # | Feature | Gap Closed vs. | Status |
|---|---|---|---|
| 18 | Manager View (VibeUI parallel orchestration) | Antigravity | ✅ Done |
| 19 | VS Code extension | Cursor, Windsurf, all | ✅ Done |
| 20 | VibeCLI daemon (vibecli serve) | Enables SDK + extension | ✅ Done |
| 21 | Agent SDK (TypeScript) | Claude Code | ✅ Done |
| 22 | Admin policy enforcement | Codex CLI | ✅ Done |
7.10 Phase 41 — Red Team Security Testing ✅
Status: Complete
Competitor reference: Shannon (KeygraphHQ) — autonomous AI-powered pentesting framework
Comparison: docs/SHANNON-COMPARISON.md
| Item | Status | Details |
|---|---|---|
| redteam.rs: 5-stage autonomous pentest pipeline | ✅ | Recon → Analysis → Exploitation → Validation → Report; RedTeamConfig, RedTeamSession, VulnFinding, AttackVector (15 types), CvssSeverity with CVSS scoring, RedTeamManager at ~/.vibecli/redteam/ |
| Expanded CWE scanner (bugbot.rs) | ✅ | 8 new patterns: CWE-918 SSRF, CWE-611 XXE, CWE-502 deserialization, CWE-943 NoSQL injection, CWE-1336 template injection, CWE-639 IDOR, CWE-352 CSRF, CWE-319 cleartext; total: 15 CWE patterns |
| CLI flags | ✅ | --redteam <url>, --redteam-config <file>, --redteam-report <session-id> |
| REPL commands | ✅ | /redteam with sub-commands: scan, list, show, report, config; tab-completion + hints |
| Config section | ✅ | [redteam] in config.toml: max_depth, timeout_secs, parallel_agents, scope_patterns, exclude_patterns, auth_config, auto_report |
| RedTeamPanel.tsx | ✅ | Pipeline stage visualization, target URL input, findings feed with severity badges + CVSS scores, expand-to-details with PoC + remediation, report export button; 🛡️ RedTeam tab in AI panel |
| Tauri commands | ✅ | start_redteam_scan, get_redteam_sessions, get_redteam_findings, generate_redteam_report, cancel_redteam_scan |
| Shannon comparison doc | ✅ | docs/SHANNON-COMPARISON.md — full feature matrix, architectural comparison, integration opportunities |
7.11 Phase 42 — Jira Context, MCP OAuth, Custom Domains ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| @jira:PROJECT-123 context | ✅ | VibeCLI expand_at_refs() + VibeUI resolve_at_references() + ContextPicker.tsx autocomplete; Jira REST API v2 with basic auth; env vars: JIRA_BASE_URL, JIRA_EMAIL, JIRA_API_TOKEN |
| MCP OAuth install flow | ✅ | McpPanel.tsx two-step modal (configure → paste auth code); 3 Tauri commands (initiate_mcp_oauth, complete_mcp_oauth, get_mcp_token_status); tokens at ~/.vibeui/mcp-tokens.json; green 🔑 badge |
| Custom domain / publish | ✅ | DeployPanel.tsx domain input + set_custom_domain Tauri command; Vercel REST API with VERCEL_TOKEN; CNAME instructions for other targets |
7.12 Phase 43 — Test Runner & AI Commit Message ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| Test runner system | ✅ | detect_test_framework + run_tests Tauri commands; auto-detects Cargo/npm/pytest/Go; streams test:log events; parses structured output; TestPanel.tsx (🧪 Tests tab) with framework badge, live log, filter tabs, pass/fail badges; /test REPL command in VibeCLI |
| AI commit message generation | ✅ | generate_commit_message Tauri command; git diff --staged → AI prompt → imperative one-liner; “✨ AI” button in GitPanel.tsx fills commit textarea |
7.13 Phase 43 — CRDT Multiplayer Collaboration ✅
Status: Complete
Real-time collaborative editing powered by yrs (the Rust port of Yjs). Multiple users edit the same file simultaneously with automatic conflict resolution via CRDTs.
| Item | Status | Details |
|---|---|---|
| vibe-collab crate | ✅ | New shared crate: CollabServer (DashMap room registry), CollabRoom (Y.Doc per room, Y.Text per file path, broadcast fan-out), protocol.rs (Yjs binary sync: SyncStep1/SyncStep2/Update), awareness.rs (cursor state + 8-color peer palette), error.rs |
| WebSocket transport | ✅ | Axum 0.7 extract::ws handler at /ws/collab/:room_id; bearer token auth via query param; binary frames for Yjs sync, text frames for JSON session coordination; peer join/leave broadcast |
| REST room management | ✅ | POST /collab/rooms (create), GET /collab/rooms (list), GET /collab/rooms/:room_id/peers (peer list); protected by existing auth + rate-limit middleware |
| Tauri commands | ✅ | create_collab_session, join_collab_session, leave_collab_session, list_collab_peers, get_collab_status — 5 new commands registered in lib.rs |
| CollabPanel.tsx | ✅ | Create/join room UI, peer list with color indicators, copy invite link, leave session; “👥 Collab” 25th AI panel tab |
| useCollab.ts hook | ✅ | React hook managing WebSocket connection, Y.Doc lifecycle, awareness state, peer tracking, reconnection |
| NPM dependencies | ✅ | yjs ^13.6.0, y-monaco ^0.1.6, y-websocket ^2.0.0 added to vibeui/package.json |
| Tests | ✅ | 15 unit tests: room lifecycle, peer management, room full, Y.Doc sync convergence, incremental updates, message serialization, color cycling, server cleanup |
Architecture
Client A (VibeUI)                VibeCLI Daemon                 Client B (VibeUI)
┌──────────────┐                ┌──────────────┐                ┌──────────────┐
│ Monaco Editor│                │ CollabServer │                │ Monaco Editor│
│      ↕       │                │  ┌──────────┐│                │      ↕       │
│ y-monaco     │──WebSocket──→  │  │CollabRoom││  ←─WebSocket───│ y-monaco     │
│ Y.Doc (JS)   │   (binary)     │  │ Y.Doc(Rs)││    (binary)    │ Y.Doc (JS)   │
│ y-websocket  │                │  │ broadcast││                │ y-websocket  │
└──────────────┘                │  └──────────┘│                └──────────────┘
                                └──────────────┘
7.14 Phase 44 — Code Coverage, Multi-Model Comparison, HTTP Playground ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| Code coverage panel | ✅ | detect_coverage_tool (cargo-llvm-cov/nyc/coverage.py/go-cover) + run_coverage Tauri commands; LCOV and Go coverprofile parsers; CoverageResult with per-file uncovered lines and total percentage |
| Multi-model comparison | ✅ | compare_models Tauri command; parallel tokio::join! dual-provider call; build_temp_provider factory (6 providers); CompareResult with timing, tokens, errors |
| HTTP Playground | ✅ | send_http_request (method/URL/headers/body, 30s timeout, URL validation); discover_api_endpoints (regex grep for Express/Axum/FastAPI/Spring route patterns, 8 file types, max 60 results) |
| Safety hardening | ✅ | Replaced unwrap() in 9 files: bugbot.rs, gateway.rs, redteam.rs, agent.rs, chat.rs, buffer.rs, git.rs, index/mod.rs, remote.rs |
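To illustrate the LCOV parsing step in the coverage row above, here is a minimal sketch that reads only the `SF:` and `DA:` records of an LCOV tracefile. `CoverageSummary` and `parse_lcov` are hypothetical names; the real CoverageResult and parser may differ and likely handle more record types.

```rust
use std::collections::BTreeMap;

/// Hypothetical result type: uncovered lines per file plus line totals.
#[derive(Debug, Default, PartialEq)]
pub struct CoverageSummary {
    pub uncovered: BTreeMap<String, Vec<u32>>,
    pub lines_found: u32,
    pub lines_hit: u32,
}

impl CoverageSummary {
    /// Total line coverage in percent (0.0 when no lines were recorded).
    pub fn total_percent(&self) -> f64 {
        if self.lines_found == 0 {
            0.0
        } else {
            100.0 * f64::from(self.lines_hit) / f64::from(self.lines_found)
        }
    }
}

/// Parses `SF:<path>` and `DA:<line>,<hits>` records; other records are ignored.
pub fn parse_lcov(input: &str) -> CoverageSummary {
    let mut summary = CoverageSummary::default();
    let mut current: Option<String> = None;
    for line in input.lines().map(str::trim) {
        if let Some(path) = line.strip_prefix("SF:") {
            summary.uncovered.entry(path.to_string()).or_default();
            current = Some(path.to_string());
        } else if let Some(data) = line.strip_prefix("DA:") {
            // DA:<line number>,<execution count>[,<checksum>]
            let mut parts = data.split(',');
            let line_no = parts.next().and_then(|s| s.parse::<u32>().ok());
            let hits = parts.next().and_then(|s| s.parse::<u64>().ok());
            if let (Some(file), Some(n), Some(h)) = (current.as_ref(), line_no, hits) {
                summary.lines_found += 1;
                if h > 0 {
                    summary.lines_hit += 1;
                } else {
                    summary.uncovered.entry(file.clone()).or_default().push(n);
                }
            }
        } else if line == "end_of_record" {
            current = None;
        }
    }
    summary
}
```

Malformed `DA:` lines are skipped rather than treated as errors, which is one possible design choice for a UI panel that should still render partial results.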
7.15 Phase 44 — Arena Mode, Live Preview, Recursive Subagent Trees ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| Arena Mode | ✅ | ArenaPanel.tsx (🥊 Arena tab) — blind A/B model comparison: randomized provider assignment, hidden identities, vote buttons (A/B/Tie/Both bad), post-vote reveal with timing/tokens, persistent leaderboard at ~/.vibeui/arena-votes.json; save_arena_vote + get_arena_history Tauri commands; /arena REPL command with compare/stats/history sub-commands |
| Live Preview with Element Selection | ✅ | BrowserPanel gains inspect mode toggle (🔍, localhost-only); injects inspector.js into iframe; postMessage listener for vibe:element-selected; element info overlay (tag, selector, React component, parent chain, outerHTML); “Send to Chat” via vibeui:inject-context; inspector.js gains parentChain in buildInfo(); @html-selected context type in ContextPicker + resolve_at_references() |
| Recursive Subagent Trees | ✅ | AgentContext gains parent_session_id, depth, shared active_agent_counter; ToolCall::SpawnAgent gains max_depth; spawn_sub_agent() enforces depth ≤ 5, per-parent children ≤ 10, global agents ≤ 20; session_store.rs gains tree schema + get_children()/get_tree()/list_root_sessions() queries; 5 new unit tests |
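The three spawn limits in the Recursive Subagent Trees row (depth ≤ 5, per-parent children ≤ 10, global agents ≤ 20) can be expressed as a single guard. The sketch below is illustrative only: `check_spawn` and `SpawnError` are hypothetical, and it assumes root sessions sit at depth 0.

```rust
pub const MAX_DEPTH: u32 = 5;
pub const MAX_CHILDREN_PER_PARENT: usize = 10;
pub const MAX_GLOBAL_AGENTS: usize = 20;

/// Hypothetical error type naming which limit blocked the spawn.
#[derive(Debug, PartialEq)]
pub enum SpawnError {
    DepthExceeded,
    TooManyChildren,
    GlobalLimitReached,
}

/// Checks whether a parent at `parent_depth` that already has `sibling_count`
/// children may spawn one more while `active_agents` are running globally.
/// Returns the depth of the new child on success.
pub fn check_spawn(
    parent_depth: u32,
    sibling_count: usize,
    active_agents: usize,
) -> Result<u32, SpawnError> {
    // Assumption: root sessions have depth 0, so a child is one level deeper.
    let child_depth = parent_depth + 1;
    if child_depth > MAX_DEPTH {
        return Err(SpawnError::DepthExceeded);
    }
    if sibling_count >= MAX_CHILDREN_PER_PARENT {
        return Err(SpawnError::TooManyChildren);
    }
    if active_agents >= MAX_GLOBAL_AGENTS {
        return Err(SpawnError::GlobalLimitReached);
    }
    Ok(child_depth)
}
```

In a concurrent setting the global count would need to be read and incremented atomically (the table mentions a shared active_agent_counter); this sketch only shows the limit checks themselves.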
7.16 Phase 45 — Cost Observatory, AI Git Workflow, Codemod Auto-Fix ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| Cost & Performance Observatory | ✅ | record_cost_entry appends to ~/.vibeui/cost-log.jsonl (JSONL); get_cost_metrics computes per-provider aggregates + budget remaining; set_cost_limit + clear_cost_history; uses TokenUsage::estimated_cost_usd() pricing |
| AI Git Workflow | ✅ | suggest_branch_name (LLM-generated from task description); resolve_merge_conflict (AI merge resolution); generate_changelog (git log → Keep-a-Changelog format via LLM) |
| Codemod & Lint Auto-Fix | ✅ | run_autofix auto-detects clippy/eslint/ruff/gofmt/prettier, runs fix mode, returns AutofixResult with diff + file count; apply_autofix stages or reverts via git |
| Frontend: CostPanel | ✅ | CostPanel.tsx (💰 Cost tab) — per-provider cost breakdown, total spend, budget limit input, cost history table, clear history |
| Frontend: AutofixPanel | ✅ | AutofixPanel.tsx (🔧 Autofix tab) — auto-detect linter, run fix, diff preview with file count, apply/revert |
| Frontend: AI Git tools | ✅ | GitPanel.tsx — 🌿 AI Branch Name (suggest + copy), 📄 Generate Changelog (since-ref + editable result), ⚡ Resolve Merge Conflict (AI resolve + copy) |
| VibeCLI /autofix | ✅ | /autofix added to REPL COMMANDS array |
| UTF-8 safety | ✅ | Char-boundary-safe string slicing across 6 Rust files (tool_executor, tools, trace, commands, tui/mod, vim_editor); prevents panics on multi-byte characters |
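The UTF-8 safety row refers to avoiding panics when a byte index falls inside a multi-byte character. One common way to do this in Rust is shown below; `truncate_on_char_boundary` is a hypothetical helper name, not necessarily what the six files use.

```rust
/// Returns the longest prefix of `s` that is at most `max_bytes` long
/// and ends on a UTF-8 character boundary.
pub fn truncate_on_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

A plain `&s[..max_bytes]` would panic whenever `max_bytes` lands in the middle of a character such as `é`; backing up to the previous boundary trades at most three bytes of output for panic-free slicing.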
7.17 Phase 46 — Provider Hardening + WCAG 2.1 AA Accessibility ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| HTTP client timeouts (all providers) | ✅ | Every AI provider uses reqwest::Client::builder() with 90s request + 10s connect timeouts — Ollama, OpenAI, Claude, Gemini, Groq, OpenRouter, Azure OpenAI (previously only Bedrock, Copilot, BugBot had timeouts) |
| Copilot device flow hardening | ✅ | Token exchange and device flow use timeout-configured client; improved error handling (copilot.rs) |
| Gemini streaming improvements | ✅ | Improved SSE chunk parsing and error resilience (gemini.rs) |
| Agent stream buffer optimization | ✅ | Pre-allocated String::with_capacity(8192) + move instead of clone per LLM token (agent.rs) |
| WCAG 2.1 AA keyboard navigation | ✅ | 8 new keyboard shortcuts: Cmd+J AI panel, Cmd+ terminal, Cmd+Shift+P palette, Cmd+1-9 AI tabs, Cmd+Shift+E explorer, Cmd+Shift+G git; focus-visible outlines on all interactive elements |
| Command palette ARIA | ✅ | role="dialog", role="combobox", role="listbox", role="option", aria-activedescendant for screen reader navigation (CommandPalette.tsx) |
| Modal focus trap | ✅ | Tab cycles within modal; Escape closes; previous focus restored; aria-modal, aria-labelledby (Modal.tsx) |
| Agent status announcements | ✅ | aria-live="polite" region announces status changes to screen readers (AgentPanel.tsx) |
| Skip-to-content link | ✅ | Hidden link appears on Tab focus, jumps past sidebar to editor (App.css + App.tsx) |
| OnboardingTour component | ✅ | First-run guided tour (localStorage gate), dismissible (OnboardingTour.tsx, 116 lines) |
| EmptyState + LoadingSpinner | ✅ | Reusable UI primitives for consistent empty/loading states (EmptyState.tsx, LoadingSpinner.tsx) |
7.18 Test Coverage Expansion ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| provider.rs tests (22) | ✅ | TokenUsage total/add/estimated_cost_usd for all 6 pricing tiers (Claude Opus/Sonnet/Haiku, GPT-4o/4-turbo/3.5, Ollama free); ProviderConfig builder chain + serialization; base64 padding; Message/CompletionResponse serde |
| tools.rs tests (30) | ✅ | ToolCall::name/is_destructive/is_terminal/summary for all 10 tool types; ToolResult::ok/err/truncation; format_tool_result success/error/truncated; parse edge cases (defaults, unknown, multiple calls) |
| diff.rs tests (12) | ✅ | DiffEngine::generate_diff (identical/changed/added/removed/empty-to-content/content-to-empty); format_unified_diff headers/prefixes; apply_diff roundtrip; hunk line counts |
| search.rs tests (8) | ✅ | search_files matching/multi-file/case-sensitive/insensitive/no-match/hidden-files-skipped/invalid-regex/trimmed-content |
| executor.rs tests (18) | ✅ | is_safe_command blocklist (rm -rf, fork bomb, mkfs, dd, chmod 777, shred, device write) + safe commands; execute/execute_in; execute_with_approval gate; output_to_string stdout/stderr/both/empty |
| symbol.rs tests (16) | ✅ | Language::from_extension (11 exts + case-insensitive), is_source, as_str; SymbolKind::as_str (11 kinds); SymbolInfo::format_ref; extract_symbols for Rust/Python/Go/TypeScript/Unknown; deduplication |
| bedrock.rs SigV4 tests (13) | ✅ | sha256_hex known vectors; hmac_sha256 determinism/different-keys; derive_signing_key date/region variations; epoch_days_to_ymd (epoch/2000/2024/leap-day/year-end); sigv4_auth_header format/determinism/payload |
| collab error.rs tests (13) | ✅ | CollabError Display for all 8 variants; StatusCode conversion (NOT_FOUND/CONFLICT/UNAUTHORIZED/BAD_REQUEST/INTERNAL_SERVER_ERROR) |
| Total | ✅ | 508 tests passing across workspace (was 344) |
7.18b Test Coverage Expansion Round 2 ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| flow.rs tests (17) | ✅ | FlowTracker ring buffer eviction, dedup of opens/edits, context_string category filtering, limit param, unknown kind |
| syntax.rs tests (22) | ✅ | detect_language (Rust/Python/JS/Go/prose/empty), highlight with/without language, highlight_code_blocks fenced/unclosed/empty/multiple |
| diff_viewer.rs tests (9) | ✅ | colorize_diff ANSI (+green/-red/@@cyan), header lines not colored, context uncolored, mixed diff |
| memory.rs tests (6) | ✅ | combined_rules section headers, save/load roundtrip, missing file returns empty |
| chat.rs tests (14) | ✅ | Conversation role accessors, ChatEngine providers/conversations, out-of-bounds errors, serde |
| completion.rs tests (16) | ✅ | estimate_confidence (empty/short/medium/long, syntactic endings, uncertainty markers, cap at 1.0) |
| agent_executor.rs tests (10) | ✅ | truncate at/over limit, resolve paths, execute_call routing (unsupported tools, missing file) |
| mcp_server.rs tests (12) | ✅ | resolve paths, tool_defs (6 tools, required params, inputSchema), RpcOk/RpcErr serde |
| manager.rs tests (9) | ✅ | LspManager 4 default configs, client lookup, default() equivalence |
| workspace.rs tests (12) | ✅ | from_config, setting types, dedup, close_file, WorkspaceConfig serde |
| multi_agent.rs tests (10) | ✅ | AgentTask/Status/Result serde, AgentInstance clone, branch_name |
| scheduler.rs tests (16) | ✅ | format_interval (s/m/h/d), parse_duration edge cases, ScheduleExpr serde roundtrip |
| Total | ✅ | 664 tests passing across workspace (was 508; +153 new) |
7.18c Test Coverage Expansion Round 3 ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| index/mod.rs tests (30) | ✅ | score_symbol, tokenize, should_skip expanded, build/search/refresh with tempfiles, relevant_symbols ranking, serde |
| hooks.rs tests (37) | ✅ | type_name all 10 variants, tool_name, file_path, glob_match_path, segment_match, path filters, HookHandler/HookConfig/build_payload serde |
| buffer.rs tests (25) | ✅ | from_file, save/save_as, apply_edits batch, cursors, slice, line_len, Position/Range/Edit serde, undo/redo empty no-op |
| git.rs tests (19) | ✅ | list_branches, get_history, get_commit_files, get_diff, discard_changes, commit, switch_branch, pop_stash, struct serde |
| rules.rs tests (14) | ✅ | RulesLoader::load with/without frontmatter, glob_match, load_for_workspace dedup, load_steering clears path, Rule serde |
| background_agents.rs tests (14) | ✅ | cancel_run, Display/serde, AgentDef serde, AgentRun lifecycle, init, list/get runs |
| team.rs tests (10) | ✅ | context_string edge cases, TeamConfig serde, save/load, add_knowledge dedup, remove_knowledge |
| linear.rs tests (9) | ✅ | priority_label all values, LinearIssue serde, handle_linear_command subcommands |
| context.rs tests (8) | ✅ | with_index, with_open_files, token_budget, empty/missing inputs |
| config.rs tests (7) | ✅ | load_from_file, serde roundtrip, empty/invalid TOML |
| Total | ✅ | 1,898 tests passing across workspace (as of 2026-03-07) |
7.19 Phase 7.19 — Context Window Safety + Process Manager ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| estimate_tokens() | ✅ | 1 token ≈ 4 chars + 8/msg overhead; fast O(n) pass |
| prune_messages() | ✅ | Drains middle messages, preserves system+task+last-6; inserts placeholder |
| AgentLoop.with_context_limit() | ✅ | Builder method; default 80 000 tokens |
| Context pruning in agent step loop | ✅ | Called at top of each step before stream_chat |
| list_processes Tauri cmd | ✅ | ps aux (POSIX) / tasklist /FO CSV (Windows); sorted by memory |
| kill_process(pid) Tauri cmd | ✅ | kill -TERM (POSIX) / taskkill /F (Windows) |
| ProcessPanel.tsx | ✅ | Filterable table, 5s auto-refresh, mem KB/MB/GB, status emoji, Kill+confirm |
| ⚙️ Procs AI panel tab | ✅ | 32nd tab in App.tsx |
| Unit tests (5) | ✅ | estimate_empty, estimate_basic, prune_noop_under_budget, prune_removes_middle, prune_noop_too_few |
| Total tests | ✅ | 513 (508 + 5 new) |
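The token estimate and pruning rule in the table above can be sketched as follows. This is an illustration, not the shipped code: the `Message` struct, the floor rounding, the placeholder text, and the placeholder's `system` role are all assumptions.

```rust
/// Hypothetical minimal chat message.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

const KEEP_HEAD: usize = 2; // system prompt + task
const KEEP_TAIL: usize = 6; // most recent messages

/// Roughly 4 characters per token plus 8 tokens of overhead per message.
pub fn estimate_tokens(messages: &[Message]) -> usize {
    messages
        .iter()
        .map(|m| m.content.chars().count() / 4 + 8)
        .sum()
}

/// Drops the middle of the conversation when the estimate exceeds `limit`,
/// keeping the first two and last six messages and inserting a placeholder.
/// Returns true if anything was pruned.
pub fn prune_messages(messages: &mut Vec<Message>, limit: usize) -> bool {
    let too_few = messages.len() <= KEEP_HEAD + KEEP_TAIL;
    if too_few || estimate_tokens(messages.as_slice()) <= limit {
        return false;
    }
    let tail_start = messages.len() - KEEP_TAIL;
    let removed = messages.drain(KEEP_HEAD..tail_start).count();
    messages.insert(
        KEEP_HEAD,
        Message {
            role: "system".to_string(), // assumed role for the placeholder
            content: format!("[{} earlier messages pruned]", removed),
        },
    );
    true
}
```

Note that a single pruning pass does not guarantee the result fits the budget: if the preserved head and tail are themselves large, the estimate can still exceed `limit` afterwards.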
7.20 Phase 7.20 — Streaming Metrics + REPL Session Commands ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| /sessions REPL command | ✅ | Lists last 15 root sessions from SQLite with ID, status, steps, task preview, age, resume hint |
| /sessions <prefix> filter | ✅ | Filters list by session ID prefix |
| /resume SQLite fallback | ✅ | When JSONL trace has no messages sidecar, falls back to store.get_messages(id); pure SQLite lookup when no JSONL exists |
| Token streaming speed (tok/s) | ✅ | streamStartMsRef + streamCharsRef → tokensPerSec = chars/4/secs; displayed as ⚡ badge |
| Total tokens display | ✅ | Estimated total tokens shown next to tok/s during streaming |
| streamMetrics state in AgentPanel.tsx | ✅ | { tokensPerSec, ttftMs, totalTokens }; reset on each agent start |
| Metrics badge visibility | ✅ | Shown only when isRunning && streamMetrics — hides after completion |
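The tok/s figure in the table is plain arithmetic: characters streamed, divided by 4 to approximate tokens, divided by elapsed seconds. The panel itself computes this in TypeScript; the Rust restatement below uses a hypothetical function name and adds a zero-elapsed guard as an assumption.

```rust
/// Approximate streaming speed: (chars / 4) tokens over the elapsed time.
pub fn tokens_per_sec(chars_streamed: usize, elapsed_ms: u64) -> f64 {
    if elapsed_ms == 0 {
        // Assumption: report 0 until some time has elapsed, to avoid dividing by zero.
        return 0.0;
    }
    let secs = elapsed_ms as f64 / 1000.0;
    (chars_streamed as f64 / 4.0) / secs
}
```

For example, 400 characters streamed over 2,000 ms is about 100 tokens in 2 seconds, or 50 tok/s.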
7.21 Phase 7.21 — Real-time Chat Streaming ✅
Status: Complete
| Item | Status | Details |
|---|---|---|
| stream_chat_message Tauri cmd | ✅ | Spawns tokio task; emits chat:chunk/chat:complete/chat:error events; cancels prior stream |
| stop_chat_stream Tauri cmd | ✅ | Aborts background task via AbortHandle; adds partial text as final message |
| AppState.chat_abort_handle | ✅ | Arc<Mutex<Option<AbortHandle>>>; same pattern as agent_abort_handle |
| futures = "0.3" dependency | ✅ | Added to vibeui/src-tauri/Cargo.toml |
| ChatResponse Clone | ✅ | Added #[derive(Clone)] so response can be emitted via Tauri events |
| AIChat.tsx streaming mode | ✅ | invoke("stream_chat_message") kick-starts; chat:chunk listener builds text live |
| Live streaming text display | ✅ | Shows streamingText with blinking cursor while loading; replaces typing-indicator once first chunk arrives |
| Tok/s speed badge | ✅ | ⚡ N tok/s · ~M tokens line below streaming text; uses same streamStartMsRef/streamCharsRef pattern as AgentPanel |
| Stop button wired | ✅ | Calls stopMessage() which invokes stop_chat_stream + commits partial text |
| useCallback/listen imports | ✅ | Clean TypeScript, tsc --noEmit passes |
| Tests | ✅ | 513 passing (no regression) |
8. VibeCody Wins — Competitive Position
With all phases complete, VibeCody is the only developer toolchain that combines:
| # | Advantage | Details |
|---|---|---|
| 1. | Open source + fully local | inspect every line, self-host, no telemetry |
| 2. | Rust native backend | sub-100ms startup, <50MB memory vs. 300MB+ Electron |
| 3. | Hooks system depth | matches Claude Code’s 17-event architecture; no Electron IDE has this |
| 4. | Ollama first-class | best local AI experience; Cursor/Windsurf treat it as an afterthought |
| 5. | CLI + GUI unified | VibeCLI and VibeUI share the same agent, same tools, same memory |
| 6. | OS-level sandbox | genuine security isolation, not just permission dialogs |
| 7. | 5+ providers | the only tool that’s truly multi-cloud + local AI |
| 8. | Privacy by design | embeddings computed locally via Ollama, code never leaves your machine |
| 9. | Shell environment policy | production-grade CI env control matching Codex CLI |
| 10. | Artifacts + Manager View | Antigravity-style orchestration in an open-source tool |
| 11. | WCAG 2.1 AA accessible | focus traps, ARIA roles, keyboard nav, skip links — no competitor matches this |
| 12. | Provider hardening | HTTP timeouts on every provider; no silent hangs on slow/down APIs |