Complete HTTP API reference for the VibeCLI daemon (vibecli --serve).
Overview
Start the daemon:
vibecli --serve --port 7878 --provider ollama
On startup, a Bearer token is printed to stderr. All authenticated endpoints require this token.
| Property | Value |
|---|---|
| Base URL | http://localhost:7878 |
| Content-Type | application/json |
| Auth | Authorization: Bearer <token> |
| Max body | 1 MB |
| CORS origins | localhost, 127.0.0.1, tauri://localhost |
Authentication
All endpoints except /health, /webhook/github, /pair, /acp/v1/capabilities, and /ws/collab/:room_id require a Bearer token.
# Token is printed on startup:
# [serve] API token: abc123...
export VIBECLI_TOKEN="abc123..."
Unauthenticated requests receive:
{ "error": "Missing or invalid Authorization: Bearer <token>" }
Status: 401 Unauthorized
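The auth rule above can be sketched client-side as follows. This is a minimal illustration, not part of VibeCLI: `requires_auth` and `build_request` are hypothetical helper names, and the public-path list is copied from the sentence above. The request is only built, never sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:7878"

# Endpoints the reference lists as not requiring a Bearer token.
PUBLIC_PATHS = {"/health", "/webhook/github", "/pair", "/acp/v1/capabilities"}
PUBLIC_PREFIXES = ("/ws/collab/",)


def requires_auth(path: str) -> bool:
    """Return True when the path needs an Authorization header."""
    if path in PUBLIC_PATHS:
        return False
    return not path.startswith(PUBLIC_PREFIXES)


def build_request(path, token=None, body=None):
    """Build (but do not send) a urllib request for the daemon.

    Hypothetical helper: a JSON body makes the request a POST,
    otherwise urllib defaults to GET.
    """
    headers = {"Content-Type": "application/json"}
    if requires_auth(path):
        if not token:
            raise ValueError(f"{path} requires a Bearer token")
        headers["Authorization"] = f"Bearer {token}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers)
```

Pass the result to `urllib.request.urlopen` to send it. A 401 response then means the token is missing or stale, for example after a daemon restart.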
API Key Rotation
Restart the daemon to generate a new token. A fresh token is printed to stderr on each startup.
Error Handling
All errors return a consistent JSON structure:
{ "error": "Human-readable error message" }
| Status Code | Meaning |
|---|---|
| 400 | Bad request (malformed JSON, missing fields) |
| 401 | Missing or invalid Bearer token |
| 404 | Resource not found (session, job, task) |
| 429 | Rate limit exceeded |
| 500 | Internal server error (provider failure) |
User-supplied input echoed in error messages is sanitized: only alphanumerics and the characters -, _, and . are kept, and the result is truncated to 200 characters.
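The sanitization rule can be approximated as below. This is a sketch under stated assumptions, not the daemon's code: it assumes "alphanumeric" means ASCII letters and digits, that disallowed characters are dropped rather than replaced, and that truncation happens after filtering. `sanitize_for_error` is a hypothetical name.

```python
import string

# Assumption: ASCII letters and digits plus "-", "_" and ".".
ALLOWED = set(string.ascii_letters + string.digits + "-_.")
MAX_LEN = 200


def sanitize_for_error(value: str) -> str:
    """Drop disallowed characters, then cap the result at 200 characters."""
    kept = "".join(ch for ch in value if ch in ALLOWED)
    return kept[:MAX_LEN]
```

Clients should therefore not expect an identifier to come back verbatim in an error message: slashes, quotes, and spaces will be missing.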
Rate Limiting
Two rate limit tiers apply:
| Tier | Limit | Window | Applies to |
|---|---|---|---|
| Authenticated | 60 requests | 60 seconds | All authed endpoints |
| Public | 10 requests | 60 seconds | /health, /webhook/github, etc. |
When the limit is exceeded:
HTTP/1.1 429 Too Many Requests
Retry-After: 5
{ "error": "Rate limit exceeded. Try again shortly." }
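A client can honor the 429 response by waiting for the advertised Retry-After interval before retrying. The sketch below is illustrative only: `send` is a hypothetical callable returning `(status, headers, body)`, and only the delay-in-seconds form of Retry-After shown above is handled.

```python
import time


def retry_after_seconds(headers, default=1.0):
    """Read Retry-After as seconds, falling back to a default."""
    raw = headers.get("Retry-After")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        return default


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it stops returning 429 or attempts run out.

    send is a hypothetical callable returning (status, headers, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, body
        sleep(retry_after_seconds(headers))
```

Injecting `sleep` keeps the helper testable without real delays.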
Endpoints
GET /health
Liveness check. No authentication required.
Response 200 OK:
{
"status": "ok",
"version": "0.3.3"
}
curl http://localhost:7878/health
POST /chat
Single-turn chat completion (non-streaming). Collects the full response before returning.
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| messages | ChatMessage[] | Yes | Conversation history |
| model | string | No | Override the provider’s default model |
ChatMessage:
| Field | Type | Values |
|---|---|---|
| role | string | "user", "assistant", "system" |
| content | string | Message text |
Response 200 OK:
{
"content": "The AI response text..."
}
Example:
curl -X POST http://localhost:7878/chat \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "Explain Rust lifetimes in 3 sentences"}
]
}'
Errors:
| Status | Cause |
|---|---|
| 500 | LLM provider error: ... or Stream error: ... |
POST /chat/stream
Streaming chat completion via Server-Sent Events (SSE). Returns tokens as they are generated.
Request body: Same as POST /chat.
SSE event types:
| Event | Data | Description |
|---|---|---|
| message (default) | Token text | Incremental content chunk |
| error | Error string | Provider or stream error |
| done | "" (empty) | Stream finished |
Keep-alive: sent every 15 seconds while the stream is open.
Example:
curl -N -X POST http://localhost:7878/chat/stream \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "system", "content": "You are a Rust expert."},
{"role": "user", "content": "Write a binary search function"}
]
}'
Response stream:
data: fn binary_search
data: <T: Ord>(arr: &[T],
data: target: &T) -> Option<usize>
event: done
data:
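The stream above can be consumed with a small SSE line parser. This is a minimal sketch following the general SSE framing rules (a blank line ends an event, one leading space after the colon is stripped, lines starting with ":" are comments). The reference does not say what form the keep-alive takes; the sketch assumes comment lines, which is common. `parse_sse` is a hypothetical name.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line (assumed keep-alive format)
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # Strip only one leading space so token whitespace survives.
            data.append(value[1:] if value.startswith(" ") else value)
```

Concatenating the data of `message` events in order reconstructs the response; a `done` event ends the loop and an `error` event carries the failure text.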
POST /agent
Start a background agent task. Returns immediately with a session ID. Subscribe to events via GET /stream/:session_id.
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| task | string | Yes | Natural language task description |
| approval | string | No | Override approval policy: "suggest", "auto-edit", or "full-auto" |
Response 200 OK:
{
"session_id": "a1b2c3d4e5f6..."
}
The session_id is a cryptographically random 128-bit hex string.
Example:
curl -X POST http://localhost:7878/agent \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"task": "Add input validation to src/api/handler.rs",
"approval": "full-auto"
}'
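Validating the body before sending avoids a round trip that would likely end in a 400. The following sketch builds the POST /agent payload from the field table above; `agent_request_body` is a hypothetical helper, and rejecting a blank task client-side is an assumption about what the daemon would treat as a missing field.

```python
import json

# The three approval policies listed in the request table.
APPROVAL_POLICIES = ("suggest", "auto-edit", "full-auto")


def agent_request_body(task, approval=None):
    """Return the JSON body for POST /agent as a string."""
    if not task or not task.strip():
        raise ValueError("task is required")
    body = {"task": task}
    if approval is not None:
        if approval not in APPROVAL_POLICIES:
            raise ValueError(f"approval must be one of {APPROVAL_POLICIES}")
        body["approval"] = approval
    return json.dumps(body)
```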
GET /stream/:session_id
Subscribe to real-time agent events via SSE. Connect after calling POST /agent.
SSE event data (JSON):
Each event’s data field is a JSON object with these fields:
| Field | Type | Present when |
|---|---|---|
| type | string | Always. One of: chunk, step, complete, error |
| content | string | chunk, complete, error |
| step_num | number | step |
| tool_name | string | step |
| success | boolean | step |
Event types:
| Type | Description |
|---|---|
| chunk | Incremental text from the LLM |
| step | A tool was executed (e.g., read_file, bash) |
| complete | Agent finished. content has the summary |
| error | Agent failed. content has the error message |
Example:
curl -N http://localhost:7878/stream/a1b2c3d4e5f6... \
-H "Authorization: Bearer $VIBECLI_TOKEN"
Response stream:
data: {"type":"chunk","content":"Reading the file..."}
data: {"type":"step","step_num":1,"tool_name":"read_file","success":true}
data: {"type":"chunk","content":"Adding validation..."}
data: {"type":"step","step_num":2,"tool_name":"write_file","success":true}
data: {"type":"complete","content":"Added input validation for all 3 handler functions."}
Errors:
| Status | Cause |
|---|---|
| 404 | Session '<id>' not found |
GET /jobs
List all persisted job records, sorted by most recent first.
Response 200 OK:
[
{
"session_id": "a1b2c3d4...",
"task": "Add input validation",
"status": "complete",
"provider": "ollama",
"started_at": 1710700000000,
"finished_at": 1710700060000,
"summary": "Added input validation for all 3 handler functions."
}
]
JobRecord fields:
| Field | Type | Description |
|---|---|---|
| session_id | string | Unique job identifier |
| task | string | Original task description |
| status | string | "running", "complete", "failed", "cancelled" |
| provider | string | AI provider name |
| started_at | number | Unix timestamp (milliseconds) |
| finished_at | number? | Unix timestamp (milliseconds), null if running |
| summary | string? | Completion summary or error message |
curl http://localhost:7878/jobs \
-H "Authorization: Bearer $VIBECLI_TOKEN"
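The JobRecord fields map naturally onto a small typed record. The sketch below is a client-side illustration only; `JobRecord.duration_seconds` and `parse_jobs` are hypothetical additions that show how the millisecond timestamps are meant to be read.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class JobRecord:
    session_id: str
    task: str
    status: str
    provider: str
    started_at: int                    # Unix time in milliseconds
    finished_at: Optional[int] = None  # null while the job is running
    summary: Optional[str] = None

    @property
    def duration_seconds(self) -> Optional[float]:
        """Elapsed time in seconds, or None for a job still running."""
        if self.finished_at is None:
            return None
        return (self.finished_at - self.started_at) / 1000.0


def parse_jobs(payload):
    """Convert the decoded GET /jobs array into JobRecord objects."""
    known = {f.name for f in fields(JobRecord)}
    # Ignore unknown keys so extra server-side fields do not break parsing.
    return [JobRecord(**{k: v for k, v in item.items() if k in known})
            for item in payload]
```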
GET /jobs/:id
Get a single job record by session ID.
Response 200 OK: A single JobRecord object (same schema as above).
curl http://localhost:7878/jobs/a1b2c3d4... \
-H "Authorization: Bearer $VIBECLI_TOKEN"
Errors: 404 if not found.
POST /jobs/:id/cancel
Cancel a running job. Removes the SSE stream and marks the job as cancelled.
Response 200 OK: The updated JobRecord with status: "cancelled".
curl -X POST http://localhost:7878/jobs/a1b2c3d4.../cancel \
-H "Authorization: Bearer $VIBECLI_TOKEN"
Errors: 404 if not found. If the job has already finished, the endpoint returns the record unchanged.
GET /sessions
HTML page listing all agent sessions. Useful for browsing in a web browser.
curl http://localhost:7878/sessions \
-H "Authorization: Bearer $VIBECLI_TOKEN"
GET /sessions.json
JSON list of all sessions (machine-readable alternative to /sessions).
curl http://localhost:7878/sessions.json \
-H "Authorization: Bearer $VIBECLI_TOKEN"
GET /view/:id
HTML page for a specific session with full conversation history.
curl http://localhost:7878/view/a1b2c3d4... \
-H "Authorization: Bearer $VIBECLI_TOKEN"
GET /share/:id
Read-only shareable session view. Displays a “Shared” banner at the top.
curl http://localhost:7878/share/a1b2c3d4... \
-H "Authorization: Bearer $VIBECLI_TOKEN"
WS /ws/collab/:room_id
WebSocket endpoint for real-time CRDT collaboration. No Bearer token required (public).
Connect:
websocat ws://localhost:7878/ws/collab/my-room
Message format: Binary CRDT sync messages from the vibe-collab crate. Messages are broadcast to all peers in the room.
Related REST endpoints (authenticated):
| Method | Path | Description |
|---|---|---|
| POST | /collab/rooms | Create a new collaboration room |
| GET | /collab/rooms | List all active rooms |
| GET | /collab/rooms/:room_id/peers | List peers in a room |
POST /acp/v1/tasks
Create a task via the Agent Client Protocol. Runs the agent in full-auto mode.
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| task | string | Yes | Task description |
| context | object | No | Optional context |
| context.workspace_root | string | No | Override workspace directory |
Response 201 Created:
{
"id": "acp-a1b2c3d4e5f6...",
"status": "pending",
"summary": "Task queued: Add tests for auth module",
"files_modified": [],
"steps_completed": 0
}
curl -X POST http://localhost:7878/acp/v1/tasks \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{"task": "Add tests for auth module"}'
GET /acp/v1/tasks/:id
Get ACP task status.
Response 200 OK:
{
"id": "acp-a1b2c3d4e5f6...",
"status": "complete",
"summary": "ACP task completed",
"files_modified": [],
"steps_completed": 0
}
curl http://localhost:7878/acp/v1/tasks/acp-a1b2c3d4e5f6... \
-H "Authorization: Bearer $VIBECLI_TOKEN"
GET /acp/v1/capabilities
ACP capability advertisement. No authentication required.
curl http://localhost:7878/acp/v1/capabilities
POST /webhook/github
GitHub App webhook endpoint. No Bearer token required. Uses HMAC-SHA256 signature verification via the X-Hub-Signature-256 header.
Headers:
| Header | Description |
|---|---|
| X-GitHub-Event | Event type (e.g., pull_request) |
| X-Hub-Signature-256 | HMAC-SHA256 signature |
Response 200 OK:
{
"status": "reviewed",
"findings": 3,
"summary": "Found 3 issues in the PR"
}
Unhandled event types return {"status": "ignored"}.
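For local testing it helps to compute the signature header yourself. GitHub's X-Hub-Signature-256 value is the string "sha256=" followed by the hex HMAC-SHA256 of the raw request body, keyed with the webhook secret. The sketch below shows that computation and a constant-time check; how the daemon is given the secret is not covered by this reference, so the secret here is a placeholder.

```python
import hashlib
import hmac


def github_signature(secret: bytes, body: bytes) -> str:
    """Return the X-Hub-Signature-256 value for a raw request body."""
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify_github_signature(secret: bytes, body: bytes, header_value) -> bool:
    """Constant-time comparison of the expected and received signatures."""
    if not header_value:
        return False
    expected = github_signature(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"),
                               header_value.encode("utf-8"))
```

The signature covers the exact bytes sent, so re-serializing the JSON before signing or verifying will produce a mismatch.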
POST /webhook/skill/:skill_name
Trigger a skill by its webhook_trigger name. Requires authentication.
curl -X POST http://localhost:7878/webhook/skill/deploy-prod \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-d '{"ref": "main"}'
Response 200 OK:
{
"triggered": true,
"skill": "deploy-production",
"body_length": 15
}
Errors: 404 if no skill has a matching webhook_trigger.
Memory Endpoints
The OpenMemory cognitive memory engine provides persistent, queryable memory across two storage layers: the cognitive store (5-sector vector graph) and the verbatim drawer store (lossless 800-char chunks).
All memory endpoints require authentication (Authorization: Bearer $VIBECLI_TOKEN).
Cognitive store
| Method | Path | Description |
|---|---|---|
| POST | /memory/add | Add a memory entry (sector auto-classified) |
| POST | /memory/query | Semantic search with composite scoring |
| GET | /memory/list | List all memories (supports ?sector= and ?limit= params) |
| GET | /memory/stats | Counts by sector, storage size, encryption status, drawer count |
| POST | /memory/fact | Add a temporal fact (auto-closes previous same-key fact) |
| GET | /memory/facts | List active and closed facts |
| POST | /memory/decay | Run exponential salience decay |
| POST | /memory/consolidate | Sleep-cycle consolidation: merge weak memories, generate reflections |
| GET | /memory/export | Export all memories as JSON |
| POST | /memory/import | Import memories from mem0 / Zep / native JSON |
| POST | /memory/pin | Pin a memory by ID (exempt from decay and purge) |
| POST | /memory/unpin | Remove the pin flag from a memory |
| POST | /memory/delete | Delete a memory permanently by ID |
Verbatim drawer layer (MemPalace)
| Method | Path | Description |
|---|---|---|
| POST | /memory/chunk | Ingest text as verbatim 800-char chunks |
| GET | /memory/drawers/stats | Drawer count, Wing/Room distribution, dedup hit rate |
| POST | /memory/tunnel | Create a cross-project waypoint between two memories |
| POST | /memory/auto-tunnel | Auto-detect and create tunnel waypoints across stores |
| GET | /memory/benchmark | Run LongMemEval recall@K (supports ?k= param, default 5) |
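To give a feel for what "verbatim 800-char chunks" means, the sketch below splits text into fixed-size pieces that rejoin to the original. It is an illustration under assumptions, not the drawer store's algorithm: it assumes character-based, non-overlapping chunks with no awareness of sentence boundaries, and `chunk_verbatim` is a hypothetical name.

```python
CHUNK_SIZE = 800  # chunk length stated in the reference


def chunk_verbatim(text: str, size: int = CHUNK_SIZE):
    """Split text into consecutive pieces of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Assumption: no overlap, so joining the chunks restores the input.
    return [text[i:i + size] for i in range(0, len(text), size)]
```

Under these assumptions a 1,700-character runbook would occupy three drawers of 800, 800, and 100 characters.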
4-layer context
| Method | Path | Description |
|---|---|---|
| POST | /memory/context | Get the full 4-layer context block the agent would receive |
# Add a cognitive memory
curl -X POST http://localhost:7878/memory/add \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{"content": "The auth module uses JWT with RS256 signing"}'
# Semantic query
curl -X POST http://localhost:7878/memory/query \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{"query": "How does authentication work?", "limit": 5}'
# Ingest raw text as verbatim chunks
curl -X POST http://localhost:7878/memory/chunk \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{"content": "Runbook step 3: restart payment-worker pods after migration 0047..."}'
# Get 4-layer agent context
curl -X POST http://localhost:7878/memory/context \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{"query": "deployment process", "l1_tokens": 700, "l2_limit": 8}'
# Run recall benchmark at k=5
curl "http://localhost:7878/memory/benchmark?k=5" \
-H "Authorization: Bearer $VIBECLI_TOKEN"
# Pin a memory (survives decay and consolidation purge)
curl -X POST http://localhost:7878/memory/pin \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "mem_c2a9"}'
# Remove a pin
curl -X POST http://localhost:7878/memory/unpin \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "mem_c2a9"}'
# Delete a memory permanently
curl -X POST http://localhost:7878/memory/delete \
-H "Authorization: Bearer $VIBECLI_TOKEN" \
-H "Content-Type: application/json" \
-d '{"id": "mem_d1f6"}'
/memory/pin, /memory/unpin, /memory/delete responses:
{ "ok": true }
All three endpoints return {"ok": false, "error": "memory not found"} when the id does not match any stored memory.
/memory/stats response:
{
"total_memories": 47,
"total_waypoints": 12,
"total_facts": 9,
"total_drawers": 132,
"encryption": false,
"sectors": [
{ "sector": "Semantic", "count": 18, "avg_salience": 0.82, "pinned_count": 3 },
{ "sector": "Episodic", "count": 14, "avg_salience": 0.61, "pinned_count": 1 },
{ "sector": "Procedural", "count": 11, "avg_salience": 0.75, "pinned_count": 2 },
{ "sector": "Reflective", "count": 3, "avg_salience": 0.90, "pinned_count": 3 },
{ "sector": "Emotional", "count": 1, "avg_salience": 0.45, "pinned_count": 0 }
],
"embedding_dim": 512,
"embedding_compression_ratio": 10.7,
"embedding_backend": "turboquant"
}
The embedding_* fields describe the in-process vector index. embedding_backend is currently always "turboquant" (~3 bits/dim compressed); clients should treat the field as opaque so future backends (e.g. "hnsw_f32", "candle_bert") can be added without breaking parsers.
/memory/benchmark response:
{
"k": 5,
"total_memories": 47,
"total_drawers": 132,
"probes": 20,
"hits_cognitive": 15,
"hits_verbatim": 18,
"recall_cognitive": 0.75,
"recall_verbatim": 0.90,
"recall_combined": 0.975,
"cases": [
{ "sector": "episodic", "query": "What was the last project I worked on?", "found_cognitive": true, "found_verbatim": true },
{ "sector": "preference", "query": "What coding style does the user prefer?", "found_cognitive": false, "found_verbatim": true }
]
}
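The per-store recall figures in the sample are simply hits divided by probes (15/20 and 18/20). The sketch below checks that relationship. The combined figure of 0.975 happens to equal 1 - (1 - 0.75)(1 - 0.90); that is an observation about this sample only, and the reference does not state how the daemon computes recall_combined, so `combined_if_independent` is hypothetical.

```python
import math


def recall(hits: int, probes: int) -> float:
    """Fraction of probes for which the expected memory was retrieved."""
    return hits / probes if probes else 0.0


def check_benchmark(result: dict) -> bool:
    """Return True when the per-store recalls match hits / probes."""
    probes = result["probes"]
    return (
        math.isclose(result["recall_cognitive"],
                     recall(result["hits_cognitive"], probes))
        and math.isclose(result["recall_verbatim"],
                         recall(result["hits_verbatim"], probes))
    )


def combined_if_independent(a: float, b: float) -> float:
    # Hypothetical: chance that at least one store hits, if the two
    # stores missed independently. Not confirmed by the reference.
    return 1.0 - (1.0 - a) * (1.0 - b)
```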
Tauri Commands (VibeUI)
The following Tauri commands are available for the VibeUI frontend via invoke(). All commands are registered in vibeui/src-tauri/src/lib.rs.
Memory commands
| Command | Arguments | Returns |
|---|---|---|
| openmemory_stats | (none) | { total_memories, total_waypoints, total_facts, total_drawers, sectors[] } |
| openmemory_add | content: string, tags?: string[] | { id, sector, tags, weight, created_at } |
| openmemory_query | query: string, limit?: number, sector?: string | QueryResult[] |
| openmemory_list | offset?: number, limit?: number, sector?: string | Memory[] |
| openmemory_facts | (none) | TemporalFact[] |
| openmemory_add_fact | subject, predicate, object: string | TemporalFact |
| openmemory_decay | (none) | { decayed: number, remaining: number } |
| openmemory_consolidate | (none) | { merged: number, reflections_created: number } |
| openmemory_export | (none) | string (markdown) |
| openmemory_enable_encryption | key?: string | { enabled: boolean } |
| openmemory_pin | id: string | { ok: boolean } |
| openmemory_unpin | id: string | { ok: boolean } |
| openmemory_delete | id: string | { ok: boolean } |
Verbatim drawer commands
| Command | Arguments | Returns |
|---|---|---|
| openmemory_drawer_stats | (none) | { total_drawers, wings[], rooms[] } |
| openmemory_layered_context | query: string, l1_tokens?: number, l2_limit?: number | { l1_essential_story, l2_scoped[], l3_drawers[], total_drawers } |
| openmemory_benchmark | k?: number | { k, recall_cognitive, recall_verbatim, recall_combined, cases[], … } |
// Example: run benchmark and display results
const result = await invoke<BenchmarkResult>('openmemory_benchmark', { k: 5 });
console.log(`Combined Recall@5: ${(result.recall_combined * 100).toFixed(1)}%`);
// Example: get layered context for a query
const ctx = await invoke('openmemory_layered_context', {
query: 'deployment process',
l1Tokens: 700,
l2Limit: 8,
});
GET /pair
Generate a one-time device pairing URL. No authentication required.
curl http://localhost:7878/pair
Response 200 OK:
{
"url": "http://localhost:7878/pair?token=...",
"token": "abc123...",
"instructions": "Open this URL in your device's browser to pair with this VibeCLI instance."
}
Goals — /v1/goals/*
Durable execution-intent primitive. See design/goal/README.md for the full data model and the cross-client surface table.
| Method | Path | Purpose |
|---|---|---|
| POST | /v1/goals | Create. Body: { title, statement?, workspace?, success_criteria?, tags?, parent_goal_id? }. Returns 201 + Goal. 409 on (workspace, title) conflict. |
| GET | /v1/goals | List. Query: status, workspace, tag, limit (default 50). Returns { goals, count }. |
| GET | /v1/goals/:id | Detail. Returns { goal, links }. |
| PATCH | /v1/goals/:id | Partial update. workspace and parent_goal_id use double-Option semantics (omit / null / value). Editing statement or success_criteria auto-clears current_plan. |
| DELETE | /v1/goals/:id | Hard delete; links cascade. |
| POST | /v1/goals/:id/plan | Generate ExecutionPlan via PlannerAgent. Body: { provider?, model? }. Per-request override honored when both are present and the API key resolves (env or profile_settings.db); otherwise falls back to the daemon’s configured provider. Response carries plan_provider_override_applied, plan_provider_requested, plan_model_requested. |
| POST | /v1/goals/:id/link | Attach a session / job / recap / note. Body: { kind, target_id, note? }. |
| POST | /v1/goals/:id/start | Spawn a session bound to this goal. Body: { task?, provider?, model? }. Returns { session_id, link_id, goal_id }. |
| POST | /v1/goals/:id/recap | Cross-store aggregate recap. Body: { provider?, model? }. When both fields are supplied and the named provider is reachable, the daemon synthesizes the headline + bullets via LLM and sets recap_synthesizer: "llm". Otherwise the heuristic fold runs and recap_synthesizer: "heuristic" is returned. Per-target recaps are still collected via two-phase store split. |
| GET | /v1/goals/:id/children | One-level tree query. Returns { parent_goal_id, children, count }. Walk iteratively for a full tree. |
| GET | /v1/goals/:id/tree | Recursive subtree walk. Query: depth (default 3, clamped to 1..10). Returns { root, depth, tree: { goal, children, [truncated, direct_child_count, cycle] } }. Re-visited nodes set cycle: true so clients don’t recurse. |
| GET | /v1/goals/current | Look up the pinned goal. Query: workspace? (empty / absent = global slot). Returns { workspace, goal_id, pinned_at, goal } or { workspace, goal_id: null }. |
| PUT | /v1/goals/current | Pin or replace the current goal. Body: { goal_id, workspace? }. 404 if goal_id is unknown. |
| DELETE | /v1/goals/current | Clear the pin. Query: workspace?. Returns { workspace, removed }. |
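The double-Option semantics on PATCH distinguish three cases per field: omitted, explicit null, and a value. In the usual reading of that pattern, omitting a field leaves it unchanged and null clears it. The sketch below builds such a body with a sentinel; `UNSET` and `goal_patch_body` are hypothetical names, the id "g-42" is made up, and the omit/clear interpretation is an assumption, since the table only names the three cases.

```python
import json


class _Unset:
    """Sentinel type marking a field the caller did not mention."""

    def __repr__(self):
        return "UNSET"


UNSET = _Unset()


def goal_patch_body(statement=UNSET, workspace=UNSET, parent_goal_id=UNSET):
    """Build a PATCH /v1/goals/:id body.

    UNSET -> key omitted (assumed: leave the field unchanged)
    None  -> key sent as null (assumed: clear the field)
    value -> key sent with that value
    """
    candidates = {
        "statement": statement,
        "workspace": workspace,
        "parent_goal_id": parent_goal_id,
    }
    body = {k: v for k, v in candidates.items() if v is not UNSET}
    return json.dumps(body)
```

For example, `goal_patch_body(workspace=None)` sends `{"workspace": null}`, while `goal_patch_body()` sends an empty object that touches neither field.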
Watch (curated proxies)
Watch clients (Apple Watch / Wear OS) never call /v1/* directly. They use the curated /watch/goals endpoints instead: two read-only lookups plus a start wrapper.
| Method | Path | Notes |
|---|---|---|
| GET | /watch/goals | Active goals only, ≤25, slim payload ({ id, title, status, workspace_label, updated_at, pinned }). pinned is true when the row is the workspace-specific OR global current pin (G11.2). Older daemons that lack the field decode cleanly on the watch side. |
| GET | /watch/goals/:id | Envelope { goal, links, pinned } (G12.1 added pinned: bool at the envelope level so the watch detail / tile can render the ★ without a separate /v1/goals/current lookup; watch never hits /v1/*). |
| POST | /watch/goals/:id/start | Curated wrapper for do_v1_exec_goal_start. Body: { task? }. Returns { session_id, link_id, goal_id }. |
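The note about older daemons implies that clients should treat pinned as optional. The sketch below decodes one slim /watch/goals row with pinned defaulting to false. It is illustrative only: `WatchGoal` and `decode_watch_goal` are hypothetical names, and the field types (string ids, untyped updated_at) are assumptions, since the reference lists only the field names.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class WatchGoal:
    id: str
    title: str
    status: str
    workspace_label: str
    updated_at: Any       # type not specified in the reference
    pinned: bool = False  # absent on older daemons


def decode_watch_goal(row: dict) -> WatchGoal:
    """Decode one slim row, tolerating a missing pinned field."""
    return WatchGoal(
        id=row["id"],
        title=row["title"],
        status=row["status"],
        workspace_label=row.get("workspace_label", ""),
        updated_at=row.get("updated_at"),
        pinned=bool(row.get("pinned", False)),
    )
```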