API Reference

Complete HTTP API reference for the VibeCLI daemon (vibecli serve).

Overview

Start the daemon:

vibecli --serve --port 7878 --provider ollama

On startup, a Bearer token is printed to stderr. All authenticated endpoints require this token.

Property	Value
Base URL	`http://localhost:7878`
Content-Type	`application/json`
Auth	`Authorization: Bearer <token>`
Max body	1 MB
CORS origins	`localhost`, `127.0.0.1`, `tauri://localhost`

Authentication

All endpoints except /health, /webhook/github, /pair, /acp/v1/capabilities, and /ws/collab/:room_id require a Bearer token.

# Token is printed on startup:
#   [serve] API token: abc123...

export VIBECLI_TOKEN="abc123..."

Unauthenticated requests receive:

{ "error": "Missing or invalid Authorization: Bearer <token>" }

Status: 401 Unauthorized

API Key Rotation

Restart the daemon to generate a new token. A fresh token is printed to stderr on each startup.

Error Handling

All errors return a consistent JSON structure:

{ "error": "Human-readable error message" }

Status Code	Meaning
`400`	Bad request (malformed JSON, missing fields)
`401`	Missing or invalid Bearer token
`404`	Resource not found (session, job, task)
`429`	Rate limit exceeded
`500`	Internal server error (provider failure)

User-supplied input in error messages is sanitized (alphanumeric + -_. only, truncated to 200 chars).

Rate Limiting

Two rate limit tiers apply:

Tier	Limit	Window	Applies to
Authenticated	60 requests	60 seconds	All authed endpoints
Public	10 requests	60 seconds	`/health`, `/webhook/github`, etc.

When the limit is exceeded:

HTTP/1.1 429 Too Many Requests
Retry-After: 5

{ "error": "Rate limit exceeded. Try again shortly." }

Endpoints

GET /health

Liveness check. No authentication required.

Response 200 OK:

{
  "status": "ok",
  "version": "0.3.3"
}

curl http://localhost:7878/health

POST /chat

Single-turn chat completion (non-streaming). Collects the full response before returning.

Request body:

Field	Type	Required	Description
`messages`	`ChatMessage[]`	Yes	Conversation history
`model`	`string`	No	Override the provider’s default model

ChatMessage:

Field	Type	Values
`role`	`string`	`"user"`, `"assistant"`, `"system"`
`content`	`string`	Message text

Response 200 OK:

{
  "content": "The AI response text..."
}

Example:

curl -X POST http://localhost:7878/chat \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Explain Rust lifetimes in 3 sentences"}
    ]
  }'

Errors:

Status	Cause
`500`	`LLM provider error: ...` or `Stream error: ...`

POST /chat/stream

Streaming chat completion via Server-Sent Events (SSE). Returns tokens as they are generated.

Request body: Same as POST /chat.

SSE event types:

Event	Data	Description
`message` (default)	Token text	Incremental content chunk
`error`	Error string	Provider or stream error
`done`	`""` (empty)	Stream finished

Keep-alive: Every 15 seconds.

Example:

curl -N -X POST http://localhost:7878/chat/stream \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are a Rust expert."},
      {"role": "user", "content": "Write a binary search function"}
    ]
  }'

Response stream:

data: fn binary_search

data: <T: Ord>(arr: &[T],

data:  target: &T) -> Option<usize>

event: done
data:

POST /agent

Start a background agent task. Returns immediately with a session ID. Subscribe to events via GET /stream/:session_id.

Request body:

Field	Type	Required	Description
`task`	`string`	Yes	Natural language task description
`approval`	`string`	No	Override approval policy: `"suggest"`, `"auto-edit"`, or `"full-auto"`

Response 200 OK:

{
  "session_id": "a1b2c3d4e5f6..."
}

The session_id is a cryptographically random 128-bit hex string.

Example:

curl -X POST http://localhost:7878/agent \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Add input validation to src/api/handler.rs",
    "approval": "full-auto"
  }'

GET /stream/:session_id

Subscribe to real-time agent events via SSE. Connect after calling POST /agent.

SSE event data (JSON):

Each event’s data field is a JSON object with these fields:

Field	Type	Present when
`type`	`string`	Always. One of: `chunk`, `step`, `complete`, `error`
`content`	`string`	`chunk`, `complete`, `error`
`step_num`	`number`	`step`
`tool_name`	`string`	`step`
`success`	`boolean`	`step`

Event types:

Type	Description
`chunk`	Incremental text from the LLM
`step`	A tool was executed (e.g., `read_file`, `bash`)
`complete`	Agent finished. `content` has the summary
`error`	Agent failed. `content` has the error message

Example:

curl -N http://localhost:7878/stream/a1b2c3d4e5f6... \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

Response stream:

data: {"type":"chunk","content":"Reading the file..."}

data: {"type":"step","step_num":1,"tool_name":"read_file","success":true}

data: {"type":"chunk","content":"Adding validation..."}

data: {"type":"step","step_num":2,"tool_name":"write_file","success":true}

data: {"type":"complete","content":"Added input validation for all 3 handler functions."}

Errors:

Status	Cause
`404`	`Session '<id>' not found`

GET /jobs

List all persisted job records, sorted by most recent first.

Response 200 OK:

[
  {
    "session_id": "a1b2c3d4...",
    "task": "Add input validation",
    "status": "complete",
    "provider": "ollama",
    "started_at": 1710700000000,
    "finished_at": 1710700060000,
    "summary": "Added input validation for all 3 handler functions."
  }
]

JobRecord fields:

Field	Type	Description
`session_id`	`string`	Unique job identifier
`task`	`string`	Original task description
`status`	`string`	`"running"`, `"complete"`, `"failed"`, `"cancelled"`
`provider`	`string`	AI provider name
`started_at`	`number`	Unix timestamp (milliseconds)
`finished_at`	`number?`	Unix timestamp (milliseconds), null if running
`summary`	`string?`	Completion summary or error message

curl http://localhost:7878/jobs \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

GET /jobs/:id

Get a single job record by session ID.

Response 200 OK: A single JobRecord object (same schema as above).

curl http://localhost:7878/jobs/a1b2c3d4... \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

Errors: 404 if not found.

POST /jobs/:id/cancel

Cancel a running job. Removes the SSE stream and marks the job as cancelled.

Response 200 OK: The updated JobRecord with status: "cancelled".

curl -X POST http://localhost:7878/jobs/a1b2c3d4.../cancel \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

Errors: 404 if not found. If the job is already finished, it returns the record unchanged.

GET /sessions

HTML page listing all agent sessions. Useful for browsing in a web browser.

curl http://localhost:7878/sessions \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

GET /sessions.json

JSON list of all sessions (machine-readable alternative to /sessions).

curl http://localhost:7878/sessions.json \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

GET /view/:id

HTML page for a specific session with full conversation history.

curl http://localhost:7878/view/a1b2c3d4... \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

GET /share/:id

Read-only shareable session view. Displays a “Shared” banner at the top.

curl http://localhost:7878/share/a1b2c3d4... \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

WS /ws/collab/:room_id

WebSocket endpoint for real-time CRDT collaboration. No Bearer token required (public).

Connect:

websocat ws://localhost:7878/ws/collab/my-room

Message format: Binary CRDT sync messages from the vibe-collab crate. Messages are broadcast to all peers in the room.

Related REST endpoints (authenticated):

Method	Path	Description
`POST`	`/collab/rooms`	Create a new collaboration room
`GET`	`/collab/rooms`	List all active rooms
`GET`	`/collab/rooms/:room_id/peers`	List peers in a room

POST /acp/v1/tasks

Create a task via the Agent Client Protocol. Runs the agent in full-auto mode.

Request body:

Field	Type	Required	Description
`task`	`string`	Yes	Task description
`context`	`object`	No	Optional context
`context.workspace_root`	`string`	No	Override workspace directory

Response 201 Created:

{
  "id": "acp-a1b2c3d4e5f6...",
  "status": "pending",
  "summary": "Task queued: Add tests for auth module",
  "files_modified": [],
  "steps_completed": 0
}

curl -X POST http://localhost:7878/acp/v1/tasks \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"task": "Add tests for auth module"}'

GET /acp/v1/tasks/:id

Get ACP task status.

Response 200 OK:

{
  "id": "acp-a1b2c3d4e5f6...",
  "status": "complete",
  "summary": "ACP task completed",
  "files_modified": [],
  "steps_completed": 0
}

curl http://localhost:7878/acp/v1/tasks/acp-a1b2c3d4e5f6... \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

GET /acp/v1/capabilities

ACP capability advertisement. No authentication required.

curl http://localhost:7878/acp/v1/capabilities

POST /webhook/github

GitHub App webhook endpoint. No Bearer token required. Uses HMAC-SHA256 signature verification via the X-Hub-Signature-256 header.

Headers:

Header	Description
`X-GitHub-Event`	Event type (e.g., `pull_request`)
`X-Hub-Signature-256`	HMAC-SHA256 signature

Response 200 OK:

{
  "status": "reviewed",
  "findings": 3,
  "summary": "Found 3 issues in the PR"
}

Unhandled event types return {"status": "ignored"}.

POST /webhook/skill/:skill_name

Trigger a skill by its webhook_trigger name. Requires authentication.

curl -X POST http://localhost:7878/webhook/skill/deploy-prod \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -d '{"ref": "main"}'

Response 200 OK:

{
  "triggered": true,
  "skill": "deploy-production",
  "body_length": 16
}

Errors: 404 if no skill has a matching webhook_trigger.

Memory Endpoints

The OpenMemory cognitive memory engine provides persistent, queryable memory across two storage layers: the cognitive store (5-sector vector graph) and the verbatim drawer store (lossless 800-char chunks).

All memory endpoints require authentication (Authorization: Bearer $VIBECLI_TOKEN).

Cognitive store

Method	Path	Description
`POST`	`/memory/add`	Add a memory entry (sector auto-classified)
`POST`	`/memory/query`	Semantic search with composite scoring
`GET`	`/memory/list`	List all memories (supports `?sector=` and `?limit=` params)
`GET`	`/memory/stats`	Counts by sector, storage size, encryption status, drawer count
`POST`	`/memory/fact`	Add a temporal fact (auto-closes previous same-key fact)
`GET`	`/memory/facts`	List active and closed facts
`POST`	`/memory/decay`	Run exponential salience decay
`POST`	`/memory/consolidate`	Sleep-cycle consolidation — merge weak memories, generate reflections
`GET`	`/memory/export`	Export all memories as JSON
`POST`	`/memory/import`	Import memories from mem0 / Zep / native JSON
`POST`	`/memory/pin`	Pin a memory by ID (exempt from decay and purge)
`POST`	`/memory/unpin`	Remove the pin flag from a memory
`POST`	`/memory/delete`	Delete a memory permanently by ID

Verbatim drawer layer (MemPalace)

Method	Path	Description
`POST`	`/memory/chunk`	Ingest text as verbatim 800-char chunks
`GET`	`/memory/drawers/stats`	Drawer count, Wing/Room distribution, dedup hit rate
`POST`	`/memory/tunnel`	Create a cross-project waypoint between two memories
`POST`	`/memory/auto-tunnel`	Auto-detect and create tunnel waypoints across stores
`GET`	`/memory/benchmark`	Run LongMemEval recall@K (supports `?k=` param, default 5)

4-layer context

Method	Path	Description
`POST`	`/memory/context`	Get the full 4-layer context block the agent would receive

# Add a cognitive memory
curl -X POST http://localhost:7878/memory/add \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"content": "The auth module uses JWT with RS256 signing"}'

# Semantic query
curl -X POST http://localhost:7878/memory/query \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "How does authentication work?", "limit": 5}'

# Ingest raw text as verbatim chunks
curl -X POST http://localhost:7878/memory/chunk \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"content": "Runbook step 3: restart payment-worker pods after migration 0047..."}'

# Get 4-layer agent context
curl -X POST http://localhost:7878/memory/context \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "deployment process", "l1_tokens": 700, "l2_limit": 8}'

# Run recall benchmark at k=5
curl "http://localhost:7878/memory/benchmark?k=5" \
  -H "Authorization: Bearer $VIBECLI_TOKEN"

# Pin a memory (survives decay and consolidation purge)
curl -X POST http://localhost:7878/memory/pin \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"id": "mem_c2a9"}'

# Remove a pin
curl -X POST http://localhost:7878/memory/unpin \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"id": "mem_c2a9"}'

# Delete a memory permanently
curl -X POST http://localhost:7878/memory/delete \
  -H "Authorization: Bearer $VIBECLI_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"id": "mem_d1f6"}'

/memory/pin, /memory/unpin, /memory/delete responses:

{ "ok": true }

All three endpoints return {"ok": false, "error": "memory not found"} when the id does not match any stored memory.

/memory/stats response:

{
  "total_memories": 47,
  "total_waypoints": 12,
  "total_facts": 9,
  "total_drawers": 132,
  "encryption": false,
  "sectors": [
    { "sector": "Semantic",   "count": 18, "avg_salience": 0.82, "pinned_count": 3 },
    { "sector": "Episodic",   "count": 14, "avg_salience": 0.61, "pinned_count": 1 },
    { "sector": "Procedural", "count": 11, "avg_salience": 0.75, "pinned_count": 2 },
    { "sector": "Reflective", "count":  3, "avg_salience": 0.90, "pinned_count": 3 },
    { "sector": "Emotional",  "count":  1, "avg_salience": 0.45, "pinned_count": 0 }
  ],
  "embedding_dim": 512,
  "embedding_compression_ratio": 10.7,
  "embedding_backend": "turboquant"
}

The embedding_* fields describe the in-process vector index. embedding_backend is currently always "turboquant" (~3 bits/dim compressed); clients should treat the field as opaque so future backends (e.g. "hnsw_f32", "candle_bert") can be added without breaking parsers.

/memory/benchmark response:

{
  "k": 5,
  "total_memories": 47,
  "total_drawers": 132,
  "probes": 20,
  "hits_cognitive": 15,
  "hits_verbatim": 18,
  "recall_cognitive": 0.75,
  "recall_verbatim": 0.90,
  "recall_combined": 0.975,
  "cases": [
    { "sector": "episodic",   "query": "What was the last project I worked on?", "found_cognitive": true,  "found_verbatim": true  },
    { "sector": "preference", "query": "What coding style does the user prefer?", "found_cognitive": false, "found_verbatim": true  }
  ]
}

Tauri Commands (VibeUI)

The following Tauri commands are available for the VibeUI frontend via invoke(). All commands are registered in vibeui/src-tauri/src/lib.rs.

Memory commands

Command	Arguments	Returns
`openmemory_stats`	—	`{ total_memories, total_waypoints, total_facts, total_drawers, sectors[] }`
`openmemory_add`	`content: string, tags?: string[]`	`{ id, sector, tags, weight, created_at }`
`openmemory_query`	`query: string, limit?: number, sector?: string`	`QueryResult[]`
`openmemory_list`	`offset?: number, limit?: number, sector?: string`	`Memory[]`
`openmemory_facts`	—	`TemporalFact[]`
`openmemory_add_fact`	`subject, predicate, object: string`	`TemporalFact`
`openmemory_decay`	—	`{ decayed: number, remaining: number }`
`openmemory_consolidate`	—	`{ merged: number, reflections_created: number }`
`openmemory_export`	—	`string` (markdown)
`openmemory_enable_encryption`	`key?: string`	`{ enabled: boolean }`
`openmemory_pin`	`id: string`	`{ ok: boolean }`
`openmemory_unpin`	`id: string`	`{ ok: boolean }`
`openmemory_delete`	`id: string`	`{ ok: boolean }`

Verbatim drawer commands

Command	Arguments	Returns
`openmemory_drawer_stats`	—	`{ total_drawers, wings[], rooms[] }`
`openmemory_layered_context`	`query: string, l1_tokens?: number, l2_limit?: number`	`{ l1_essential_story, l2_scoped[], l3_drawers[], total_drawers }`
`openmemory_benchmark`	`k?: number`	`{ k, recall_cognitive, recall_verbatim, recall_combined, cases[], … }`

// Example: run benchmark and display results
const result = await invoke<BenchmarkResult>('openmemory_benchmark', { k: 5 });
console.log(`Combined Recall@5: ${(result.recall_combined * 100).toFixed(1)}%`);

// Example: get layered context for a query
const ctx = await invoke('openmemory_layered_context', {
  query: 'deployment process',
  l1Tokens: 700,
  l2Limit: 8,
});

GET /pair

Generate a one-time device pairing URL. No authentication required.

curl http://localhost:7878/pair

Response 200 OK:

{
  "url": "http://localhost:7878/pair?token=...",
  "token": "abc123...",
  "instructions": "Open this URL in your device's browser to pair with this VibeCLI instance."
}

Goals — `/v1/goals/*`

Durable execution-intent primitive. See design/goal/README.md for the full data model + cross-client surface table.

Method	Path	Purpose
`POST`	`/v1/goals`	Create. Body: `{ title, statement?, workspace?, success_criteria?, tags?, parent_goal_id? }`. Returns 201 + `Goal`. 409 on `(workspace, title)` conflict.
`GET`	`/v1/goals`	List. Query: `status`, `workspace`, `tag`, `limit` (default 50). Returns `{ goals, count }`.
`GET`	`/v1/goals/:id`	Detail. Returns `{ goal, links }`.
`PATCH`	`/v1/goals/:id`	Partial update. `workspace` and `parent_goal_id` use double-`Option` semantics (omit / `null` / value). Editing `statement` or `success_criteria` auto-clears `current_plan`.
`DELETE`	`/v1/goals/:id`	Hard delete; links cascade.
`POST`	`/v1/goals/:id/plan`	Generate `ExecutionPlan` via `PlannerAgent`. Body: `{ provider?, model? }`. Per-request override honored when both are present and the API key resolves (env or `profile_settings.db`); otherwise falls back to the daemon’s configured provider. Response carries `plan_provider_override_applied`, `plan_provider_requested`, `plan_model_requested`.
`POST`	`/v1/goals/:id/link`	Attach a session / job / recap / note. Body: `{ kind, target_id, note? }`.
`POST`	`/v1/goals/:id/start`	Spawn a session bound to this goal. Body: `{ task?, provider?, model? }`. Returns `{ session_id, link_id, goal_id }`.
`POST`	`/v1/goals/:id/recap`	Cross-store aggregate recap. Body: `{ provider?, model? }`. When both fields are supplied and the named provider is reachable, the daemon synthesizes the headline + bullets via LLM and sets `recap_synthesizer: "llm"`. Otherwise the heuristic fold runs and `recap_synthesizer: "heuristic"` is returned. Per-target recaps are still collected via two-phase store split.
`GET`	`/v1/goals/:id/children`	One-level tree query. Returns `{ parent_goal_id, children, count }`. Walk iteratively for a full tree.
`GET`	`/v1/goals/:id/tree`	Recursive subtree walk. Query: `depth` (default 3, clamped to 1..10). Returns `{ root, depth, tree: { goal, children, [truncated, direct_child_count, cycle] } }`. Re-visited nodes set `cycle: true` so clients don’t recurse.
`GET`	`/v1/goals/current`	Look up the pinned goal. Query: `workspace?` (empty / absent = global slot). Returns `{ workspace, goal_id, pinned_at, goal }` or `{ workspace, goal_id: null }`.
`PUT`	`/v1/goals/current`	Pin or replace the current goal. Body: `{ goal_id, workspace? }`. 404 if `goal_id` is unknown.
`DELETE`	`/v1/goals/current`	Clear the pin. Query: `workspace?`. Returns `{ workspace, removed }`.

Watch (curated proxies)

The Apple Watch / Wear OS never hits /v1/* directly. Use the curated read-only /watch/goals pair instead.

Method	Path	Notes
`GET`	`/watch/goals`	Active goals only, ≤25, slim payload (`{ id, title, status, workspace_label, updated_at, pinned }`). `pinned` is `true` when the row is the workspace-specific OR global current pin (G11.2). Older daemons that lack the field decode cleanly on the watch side.
`GET`	`/watch/goals/:id`	Envelope `{ goal, links, pinned }` (G12.1 added `pinned: bool` at the envelope level so the watch detail / tile can render the ★ without a separate `/v1/goals/current` lookup; watch never hits `/v1/*`).
`POST`	`/watch/goals/:id/start`	Curated wrapper for `do_v1_exec_goal_start`. Body: `{ task? }`. Returns `{ session_id, link_id, goal_id }`.

API Reference

Overview

Authentication

API Key Rotation

Error Handling

Rate Limiting

Endpoints

GET /health

POST /chat

POST /chat/stream

POST /agent

GET /stream/:session_id

GET /jobs

GET /jobs/:id

POST /jobs/:id/cancel

GET /sessions

GET /sessions.json

GET /view/:id

GET /share/:id

WS /ws/collab/:room_id

POST /acp/v1/tasks

GET /acp/v1/tasks/:id

GET /acp/v1/capabilities

POST /webhook/github

POST /webhook/skill/:skill_name

Memory Endpoints

Cognitive store

Verbatim drawer layer (MemPalace)

4-layer context

Tauri Commands (VibeUI)

Memory commands

Verbatim drawer commands

GET /pair

Goals — /v1/goals/*

Watch (curated proxies)

Goals — `/v1/goals/*`