Overview
VibeCody’s AutoResearch module is an autonomous, iterative research agent: it systematically explores parameter spaces, tracks every experiment, and carries lessons from one run to the next. It supports 5 search strategies and 7 research domains, scores each experiment with a composite metric that detects NaN results, and learns across runs through a persistent ResearchMemory. Safety rails stop runaway experiments before they exhaust disk, memory, or time.
Time to complete: ~15 minutes
Prerequisites
- VibeCLI 0.5.1 or later installed and on your PATH
- At least one AI provider configured
- A project directory with code to optimize (examples use a database-backed service)
- For VibeUI: the desktop app running with the AutoResearch panel visible
Research Strategies
AutoResearch offers 5 strategies for exploring the experiment space:
| Strategy | Best For | How It Works |
|---|---|---|
| Greedy | Quick wins, single-variable tuning | Moves to a new experiment only when it improves on the best known score |
| BeamSearch | Moderate exploration with pruning | Maintains top-K candidates, expands each, prunes worst |
| Genetic | Large parameter spaces | Mutation and crossover of top-performing configurations |
| Combinatorial | Exhaustive small spaces | Tries every combination of parameters |
| Bayesian | Expensive evaluations, sample-efficient | Builds a surrogate model, picks points with highest expected improvement |
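To make the difference between strategies concrete, here is a minimal Python sketch of the two simplest ones, Combinatorial and Greedy. It is an illustration under stated assumptions, not AutoResearch's implementation: `SPACE`, `toy_score`, `combinatorial`, and `greedy` are hypothetical names, and the objective is a made-up stand-in for a real experiment.

```python
from itertools import product

# Hypothetical search space; parameter names borrowed from the walkthrough.
SPACE = {
    "pool_size": [5, 10, 20, 25, 30],
    "cache_mb": [128, 256, 512, 768],
}


def toy_score(params):
    """Stand-in objective (higher is better), peaking at pool_size=25, cache_mb=512."""
    return -abs(params["pool_size"] - 25) / 25 - abs(params["cache_mb"] - 512) / 512


def combinatorial(space, score):
    """Evaluate every combination of parameter values and return the best one."""
    names = list(space)
    candidates = (dict(zip(names, values)) for values in product(*space.values()))
    return max(candidates, key=score)


def greedy(space, score, start):
    """Change one parameter at a time, keeping a change only if the score improves."""
    best, best_score = dict(start), score(start)
    improved = True
    while improved:
        improved = False
        for name, values in space.items():
            for value in values:
                trial = {**best, name: value}
                trial_score = score(trial)
                if trial_score > best_score:
                    best, best_score, improved = trial, trial_score, True
    return best


print(combinatorial(SPACE, toy_score))
print(greedy(SPACE, toy_score, {"pool_size": 5, "cache_mb": 128}))
```

On this toy objective both approaches land on the same configuration, but Combinatorial evaluates all 20 combinations while Greedy only follows improving moves. Greedy can stall in a local optimum when parameters interact, which is the usual reason to reach for BeamSearch, Genetic, or Bayesian instead.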
Step-by-Step Walkthrough
Step 1: Create a new research session
vibecli
vibecli 0.5.1 | Provider: claude | Model: claude-sonnet-4-6
Type /help for commands, /quit to exit
> /autoresearch new "optimize database queries"
Research session created.
Session ID: rs_48d2a1c7
Domain: DatabaseTuning
Strategy: Greedy (default, change with /autoresearch config strategy <name>)
Metrics: [query_time_ms, throughput_qps, memory_mb]
Status: Ready
Hypothesis generated:
H1: "Adding composite indexes on frequently joined columns will reduce
query time by 30-50%"
Confidence: Medium
Predicted impact: High
Step 2: Configure the research strategy
Switch to Bayesian for sample-efficient exploration:
> /autoresearch config strategy bayesian
Strategy updated: Bayesian
Surrogate model: Gaussian Process
Acquisition: Expected Improvement
Initial samples: 3 (random)
Max experiments: 50
Timeout per run: 300s
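The "Expected Improvement" acquisition named above has a standard closed form when the surrogate predicts a normal distribution for each candidate. The sketch below shows that textbook formula for a maximization problem. Whether AutoResearch uses exactly this variant is an assumption, and the `xi` exploration margin, the function names, and the candidate predictions are hypothetical.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best, xi=0.0):
    """E[max(f - best - xi, 0)] for f ~ N(mu, sigma^2), maximizing the score."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical surrogate predictions (mean, standard deviation) for two candidates.
candidates = {"A": (0.80, 0.02), "B": (0.75, 0.15)}
best_so_far = 0.83
pick = max(candidates, key=lambda k: expected_improvement(*candidates[k], best_so_far))
print(pick)
```

Candidate B wins here even though its predicted mean is lower, because its larger uncertainty leaves more room to beat the current best. That trade-off between exploiting good predictions and exploring uncertain regions is what makes the strategy sample-efficient.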
Step 3: Start the research run
> /autoresearch start rs_48d2a1c7
Starting research session rs_48d2a1c7...
[Experiment 1/50] Random initial sample
Parameters: { index_type: "btree", pool_size: 10, cache_mb: 256 }
Results: query_time_ms=45.2, throughput_qps=1820, memory_mb=312
Score: 0.62
[Experiment 2/50] Random initial sample
Parameters: { index_type: "hash", pool_size: 20, cache_mb: 512 }
Results: query_time_ms=38.1, throughput_qps=2140, memory_mb=548
Score: 0.71
[Experiment 3/50] Random initial sample
Parameters: { index_type: "btree", pool_size: 5, cache_mb: 128 }
Results: query_time_ms=62.8, throughput_qps=1240, memory_mb=189
Score: 0.44
[Experiment 4/50] Bayesian acquisition (EI=0.18)
Parameters: { index_type: "btree", pool_size: 25, cache_mb: 512 }
Results: query_time_ms=31.4, throughput_qps=2580, memory_mb=540
Score: 0.79 (new best)
[Experiment 5/50] Bayesian acquisition (EI=0.12)
Parameters: { index_type: "btree", pool_size: 30, cache_mb: 768 }
Results: query_time_ms=28.7, throughput_qps=2820, memory_mb=798
Score: 0.83 (new best)
...
Session paused at experiment 12/50 (Ctrl+C or /autoresearch pause).
Best score: 0.89 (experiment 9)
Best params: { index_type: "btree", pool_size: 28, cache_mb: 640 }
Improvement: 43.6% over baseline
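Each experiment above collapses three metrics into one score. The actual scoring formula is not documented in this walkthrough, so the following is only a sketch of one plausible approach: normalize each metric to [0, 1], take a weighted average, and refuse to score any run that reports NaN. The weights, the worst/best reference values, and the names `METRICS` and `composite_score` are all assumptions, and the sketch will not reproduce the scores shown in the transcript.

```python
import math

# Hypothetical weights and reference ranges: name -> (weight, worst, best).
# Lower is better for query time and memory, so "worst" is the larger number there.
METRICS = {
    "query_time_ms": (0.5, 100.0, 10.0),
    "throughput_qps": (0.3, 500.0, 4000.0),
    "memory_mb": (0.2, 1024.0, 128.0),
}


def composite_score(results):
    """Weighted average of normalized metrics, or None if any metric is unusable."""
    total = 0.0
    for name, (weight, worst, best) in METRICS.items():
        value = results.get(name)
        if value is None or not math.isfinite(value):
            return None  # NaN, infinity, or a missing metric invalidates the run
        normalized = (value - worst) / (best - worst)
        total += weight * min(max(normalized, 0.0), 1.0)  # clamp to [0, 1]
    return total


exp3 = {"query_time_ms": 62.8, "throughput_qps": 1240, "memory_mb": 189}
exp4 = {"query_time_ms": 31.4, "throughput_qps": 2580, "memory_mb": 540}
broken = {"query_time_ms": float("nan"), "throughput_qps": 2580, "memory_mb": 540}

print(composite_score(exp3), composite_score(exp4), composite_score(broken))
```

Returning `None` rather than a number keeps a failed run from silently becoming the "best" or "worst" result, since comparisons involving NaN are always false.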
Step 4: Check session status
> /autoresearch status rs_48d2a1c7
Session: rs_48d2a1c7 — "optimize database queries"
Domain: DatabaseTuning
Strategy: Bayesian
Status: Paused (experiment 12/50)
Duration: 4m 38s
Best score: 0.89
Metric trends (5 sampled experiments):
query_time_ms: 45.2 → 28.7 → 24.1 → 22.8 → 23.5
throughput_qps: 1820 → 2820 → 3100 → 3240 → 3180
memory_mb: 312 → 798 → 640 → 648 → 652
Hypotheses:
H1: "Composite indexes reduce query time" — CONFIRMED (38% reduction)
H2: "Pool size 25-30 is optimal" — CONFIRMED (plateau at 28)
H3: "Cache beyond 640MB has diminishing returns" — CONFIRMED
Safety violations: 0
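The "Safety violations" counter reflects the safety rails mentioned in the overview, which guard disk, memory, and time. A guard of that kind can be sketched as a simple limit check run after each experiment. Only the 300-second timeout comes from the configuration output in Step 2. The memory and disk limits, and the names `SafetyLimits` and `check_limits`, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyLimits:
    timeout_s: float = 300.0       # "Timeout per run" from Step 2
    max_memory_mb: float = 2048.0  # hypothetical limit
    max_disk_mb: float = 1024.0    # hypothetical limit


def check_limits(limits, elapsed_s, memory_mb, disk_mb):
    """Return a list of violation messages; an empty list means the run was safe."""
    violations = []
    if elapsed_s > limits.timeout_s:
        violations.append(f"time: {elapsed_s:.0f}s > {limits.timeout_s:.0f}s")
    if memory_mb > limits.max_memory_mb:
        violations.append(f"memory: {memory_mb:.0f}MB > {limits.max_memory_mb:.0f}MB")
    if disk_mb > limits.max_disk_mb:
        violations.append(f"disk: {disk_mb:.0f}MB > {limits.max_disk_mb:.0f}MB")
    return violations


limits = SafetyLimits()
print(check_limits(limits, elapsed_s=120, memory_mb=652, disk_mb=10))
print(check_limits(limits, elapsed_s=301, memory_mb=4096, disk_mb=10))
```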
Step 5: Analyze results
> /autoresearch analyze rs_48d2a1c7
Analysis for session rs_48d2a1c7:
Top 3 configurations:
Rank 1: score=0.89 | query=22.8ms, qps=3240, mem=648MB
{ index_type: "btree", pool_size: 28, cache_mb: 640 }
Rank 2: score=0.87 | query=24.1ms, qps=3100, mem=640MB
{ index_type: "btree", pool_size: 25, cache_mb: 640 }
Rank 3: score=0.83 | query=28.7ms, qps=2820, mem=798MB
{ index_type: "btree", pool_size: 30, cache_mb: 768 }
Parameter sensitivity:
cache_mb: HIGH (0.72 correlation with score)
pool_size: MEDIUM (0.54 correlation)
index_type: LOW (btree consistently wins)
Recommendations:
1. Use btree indexes (hash showed no advantage for this workload)
2. Set connection pool to 28 connections
3. Allocate 640MB query cache (beyond this, memory cost exceeds benefit)
4. Expected production improvement: ~50% latency reduction
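The sensitivity figures above are reported as correlations between a parameter and the score. Assuming a plain Pearson correlation (the walkthrough does not say which coefficient is used), the calculation can be sketched as follows. The data are the five experiments printed in Step 3. Because experiments 6 to 12 are elided there, the results will not match the 0.72 and 0.54 reported for the full session.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no sensitivity signal
    return cov / math.sqrt(var_x * var_y)


# Experiments 1-5 as printed in Step 3.
cache_mb = [256, 512, 128, 512, 768]
pool_size = [10, 20, 5, 25, 30]
score = [0.62, 0.71, 0.44, 0.79, 0.83]

print(f"cache_mb:  {pearson(cache_mb, score):.2f}")
print(f"pool_size: {pearson(pool_size, score):.2f}")
```

With only these five points both coefficients come out above 0.9, because the early samples move both parameters in the same direction. Correlation of this kind cannot separate parameters that were varied together, so treat sensitivity rankings from short sessions with caution.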
Step 6: View cross-run learning
> /autoresearch lessons
Research Memory (cross-run lessons):
Lesson 1: [DatabaseTuning] "btree indexes outperform hash for mixed
read/write workloads" (from rs_48d2a1c7, confidence: high)
Lesson 2: [DatabaseTuning] "connection pool sweet spot is 25-30 for
PostgreSQL on 8-core machines" (from rs_48d2a1c7, confidence: high)
Lesson 3: [DatabaseTuning] "cache memory has diminishing returns beyond
~640MB for datasets under 10GB" (from rs_48d2a1c7, confidence: medium)
Success patterns: 3 recorded
Failure patterns: 1 recorded (hash index + small pool → low throughput)
Metric baselines: query_time_ms < 30ms, throughput_qps > 3000
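Cross-run learning requires the lessons to be stored in a form that later sessions can filter by domain and confidence. The on-disk format of ResearchMemory is not described in this walkthrough, so the sketch below is purely illustrative: the `Lesson` fields mirror what the output above displays, while the JSON layout and the helper names are assumptions.

```python
import json
from dataclasses import asdict, dataclass

CONFIDENCE_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass
class Lesson:
    domain: str
    text: str
    session_id: str
    confidence: str  # "low", "medium", or "high"


def dump_lessons(lessons):
    """Serialize lessons to a JSON string (hypothetical storage format)."""
    return json.dumps([asdict(lesson) for lesson in lessons], indent=2)


def load_lessons(payload):
    """Rebuild Lesson objects from a JSON string produced by dump_lessons."""
    return [Lesson(**item) for item in json.loads(payload)]


def lessons_for(lessons, domain, min_confidence="medium"):
    """Select the lessons a new session in `domain` should start from."""
    floor = CONFIDENCE_ORDER[min_confidence]
    return [
        lesson for lesson in lessons
        if lesson.domain == domain and CONFIDENCE_ORDER[lesson.confidence] >= floor
    ]


memory = [
    Lesson("DatabaseTuning", "btree indexes outperform hash for mixed read/write workloads",
           "rs_48d2a1c7", "high"),
    Lesson("DatabaseTuning", "cache memory has diminishing returns beyond ~640MB",
           "rs_48d2a1c7", "medium"),
]
restored = load_lessons(dump_lessons(memory))
print(len(lessons_for(restored, "DatabaseTuning", min_confidence="high")))
```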
Step 7: Export results
> /autoresearch export rs_48d2a1c7
Exported to: ./research_rs_48d2a1c7.tsv
Format: TSV (compatible with autoresearch tooling)
Columns: experiment_id, timestamp, index_type, pool_size, cache_mb,
query_time_ms, throughput_qps, memory_mb, composite_score
12 rows written.
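Because the export is plain TSV, it can be post-processed with any standard tooling. The sketch below uses Python's `csv` module to find the best-scoring row. The column names come from the export output above. The `experiment_id` values, the placeholder timestamps, and the NaN score on the third row are made up for illustration, since the walkthrough does not show the file's contents.

```python
import csv
import io
import math

HEADER = ["experiment_id", "timestamp", "index_type", "pool_size", "cache_mb",
          "query_time_ms", "throughput_qps", "memory_mb", "composite_score"]
ROWS = [
    ["1", "t1", "btree", "10", "256", "45.2", "1820", "312", "0.62"],
    ["2", "t2", "hash", "20", "512", "38.1", "2140", "548", "0.71"],
    ["3", "t3", "btree", "5", "128", "62.8", "1240", "189", "NaN"],  # made-up failed run
]
SAMPLE_TSV = "\n".join("\t".join(fields) for fields in [HEADER, *ROWS])


def best_row(tsv_text, score_column="composite_score"):
    """Return the row with the highest finite score, or None if there is none."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    scored = []
    for row in reader:
        value = float(row[score_column])
        if math.isfinite(value):  # skip NaN rows so they cannot win the comparison
            scored.append((value, row))
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


best = best_row(SAMPLE_TSV)
print(best["experiment_id"], best["composite_score"])
```

To run this against a real export, read the file with `open("research_rs_48d2a1c7.tsv", newline="")` and pass its contents to `best_row`.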
Step 8: Get AI suggestions for next research
> /autoresearch suggest rs_48d2a1c7
Suggested next experiments (based on Bayesian model + lessons):
1. Test prepared statements with optimal config
Parameters: { ..., prepared_statements: true }
Expected improvement: 8-15% latency reduction
Rationale: Common optimization not yet explored
2. Test read replicas for read-heavy queries
Parameters: { ..., read_replicas: 2 }
Expected improvement: 40-60% throughput increase
Rationale: Scale-out strategy for read workloads
3. Test query plan caching
Parameters: { ..., plan_cache: true }
Expected improvement: 5-10% for repeated query patterns
VibeUI: AutoResearch Panel
The AutoResearch panel in VibeUI provides 5 tabs:
- Setup – Create sessions, select domain and strategy, define metrics
- Experiments – Live experiment tracking with progress bar and metric charts
- Analysis – Parameter sensitivity heatmaps and configuration rankings
- Memory – Cross-run lessons, success/failure patterns, baselines
- Export – Download TSV files and generate summary reports
Demo Recording
{
  "meta": {
    "title": "Autonomous Research Agent",
    "description": "Create a research session, run Bayesian optimization experiments, and analyze results with cross-run learning.",
    "duration_seconds": 240,
    "version": "1.0.0"
  },
  "steps": [
    {
      "id": 1,
      "action": "repl",
      "commands": [
        { "input": "/autoresearch new \"optimize database queries\"", "delay_ms": 3000 },
        { "input": "/autoresearch config strategy bayesian", "delay_ms": 2000 }
      ],
      "description": "Create session and configure Bayesian strategy"
    },
    {
      "id": 2,
      "action": "repl",
      "commands": [
        { "input": "/autoresearch start rs_48d2a1c7", "delay_ms": 15000 }
      ],
      "description": "Run research experiments"
    },
    {
      "id": 3,
      "action": "repl",
      "commands": [
        { "input": "/autoresearch status rs_48d2a1c7", "delay_ms": 3000 },
        { "input": "/autoresearch analyze rs_48d2a1c7", "delay_ms": 4000 }
      ],
      "description": "Check status and analyze results"
    },
    {
      "id": 4,
      "action": "repl",
      "commands": [
        { "input": "/autoresearch lessons", "delay_ms": 3000 },
        { "input": "/autoresearch export rs_48d2a1c7", "delay_ms": 2000 },
        { "input": "/quit", "delay_ms": 500 }
      ],
      "description": "View lessons, export results, and exit"
    }
  ]
}
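The recording above is ordinary JSON, so it can be checked before replay. The sketch below walks the `steps` and `commands` arrays, flags a few obvious problems, and sums the scripted delays. The field names come from the recording itself. The validation rules and the function names are assumptions, as is the reading of `delay_ms` as a per-command wait. The embedded `RECORDING` is a trimmed excerpt of the recording's first step.

```python
# Trimmed excerpt of the recording above, written as a Python literal.
RECORDING = {
    "meta": {"title": "Autonomous Research Agent", "duration_seconds": 240},
    "steps": [
        {
            "id": 1,
            "action": "repl",
            "commands": [
                {"input": '/autoresearch new "optimize database queries"', "delay_ms": 3000},
                {"input": "/autoresearch config strategy bayesian", "delay_ms": 2000},
            ],
        }
    ],
}


def iter_commands(recording):
    """Yield (step_id, input, delay_ms) for every command in the recording."""
    for step in recording["steps"]:
        for command in step["commands"]:
            yield step["id"], command["input"], command["delay_ms"]


def total_delay_ms(recording):
    """Sum of all scripted delays, in milliseconds."""
    return sum(delay for _, _, delay in iter_commands(recording))


def validate(recording):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    ids = [step["id"] for step in recording["steps"]]
    if len(ids) != len(set(ids)):
        problems.append("duplicate step ids")
    for step_id, text, delay in iter_commands(recording):
        if not text.strip():
            problems.append(f"step {step_id}: empty input")
        if delay < 0:
            problems.append(f"step {step_id}: negative delay")
    if total_delay_ms(recording) > recording["meta"]["duration_seconds"] * 1000:
        problems.append("scripted delays exceed duration_seconds")
    return problems


print(validate(RECORDING), total_delay_ms(RECORDING))
```

For the full recording the delays add up to 32,500 ms, well inside the declared 240-second duration. To check a recording stored on disk, load it with `json.load` and pass the result to `validate`.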
What’s Next
- Demo 48: OpenMemory – Persistent cognitive memory across sessions
- Demo 50: Warp-Style Features – Natural language commands and secret redaction
- Demo 52: Watch Mode & Sandbox – File watching and isolated execution