Persistence Complete Documentation

Comprehensive Provider-Based Persistence System for Orbit-RS

Table of Contents

  1. Overview
  2. Architecture
  3. Storage Backends
  4. Provider-Based System
  5. Configuration
  6. Performance Tuning
  7. Monitoring and Metrics
  8. Deployment Scenarios
  9. Migration Between Providers
  10. Security Considerations
  11. Best Practices

Overview

The Orbit server persistence system provides a comprehensive provider-based architecture that supports multiple storage backends. This allows you to choose the most appropriate storage solution for your deployment scenario, from simple in-memory storage to cloud-scale object storage systems.

Key Features


Architecture

Core Components

The persistence system is built around several key traits:

PersistenceProvider

Base trait for all providers with common functionality:
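The trait body is not reproduced here; the following is a minimal synchronous sketch of the shape such a trait might take, together with a stub in-memory implementation. The method names (`initialize`, `shutdown`, `is_healthy`) are illustrative assumptions — the real orbit-server trait is async and its API may differ:

```rust
// Illustrative sketch only: the real orbit-server trait is async and its
// exact method names may differ; shown synchronously for brevity.
pub trait PersistenceProvider: Send + Sync {
    fn initialize(&mut self) -> Result<(), String>;
    fn shutdown(&mut self) -> Result<(), String>;
    fn is_healthy(&self) -> bool;
    fn name(&self) -> &'static str;
}

// Minimal in-memory stub used to exercise the trait.
pub struct MemoryProvider {
    ready: bool,
}

impl MemoryProvider {
    pub fn new() -> Self {
        MemoryProvider { ready: false }
    }
}

impl PersistenceProvider for MemoryProvider {
    fn initialize(&mut self) -> Result<(), String> {
        self.ready = true;
        Ok(())
    }
    fn shutdown(&mut self) -> Result<(), String> {
        self.ready = false;
        Ok(())
    }
    fn is_healthy(&self) -> bool {
        self.ready
    }
    fn name(&self) -> &'static str {
        "memory"
    }
}
```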

AddressableDirectoryProvider

Specialized trait for storing addressable (actor) lease information:

ClusterNodeProvider

Specialized trait for storing cluster node information:

Provider Registry

The PersistenceProviderRegistry manages multiple providers and provides:
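A registry of this kind can be sketched as a map from provider names to instances plus a per-role default. The sketch below uses strings in place of trait objects, and its method names are invented for illustration — this is not the actual `PersistenceProviderRegistry` API:

```rust
use std::collections::HashMap;

// Hypothetical registry sketch: named provider registration plus a default
// provider for the addressable role. The real registry stores trait objects;
// backend-kind strings stand in here.
pub struct ProviderRegistry {
    providers: HashMap<String, String>, // name -> backend kind
    default_addressable: Option<String>,
}

impl ProviderRegistry {
    pub fn new() -> Self {
        ProviderRegistry { providers: HashMap::new(), default_addressable: None }
    }

    pub fn register(&mut self, name: &str, backend: &str) {
        self.providers.insert(name.to_string(), backend.to_string());
    }

    // Selecting an unregistered provider as the default is rejected.
    pub fn set_default_addressable(&mut self, name: &str) -> Result<(), String> {
        if self.providers.contains_key(name) {
            self.default_addressable = Some(name.to_string());
            Ok(())
        } else {
            Err(format!("unknown provider: {name}"))
        }
    }

    pub fn default_addressable(&self) -> Option<&str> {
        self.default_addressable.as_deref()
    }
}
```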

Architecture Diagram

┌─────────────────────────────────────────────────────────────┐
│                    Application Layer                        │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│              Persistence Provider Registry                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │  Addressable │  │   Cluster    │  │   Default    │       │
│  │   Provider   │  │   Provider   │  │   Provider   │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
└─────────────────────────────────────────────────────────────┘
                               │
            ┌──────────────────┼──────────────────┐
            │                  │                  │
            ▼                  ▼                  ▼
    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
    │   Memory     │  │   RocksDB    │  │     S3       │
    │   Provider   │  │   Provider   │  │   Provider   │
    └──────────────┘  └──────────────┘  └──────────────┘
            │                  │                  │
            ▼                  ▼                  ▼
    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
    │  In-Memory   │  │   RocksDB    │  │   AWS S3     │
    │   Storage    │  │   Storage    │  │   Storage    │
    └──────────────┘  └──────────────┘  └──────────────┘

Storage Backends

Local Storage Backends

1. Memory Provider (MemoryConfig)

Best for: Development, testing, high-performance scenarios with limited data

Features:

Configuration:

[providers.memory]
type = "Memory"
max_entries = 100000
disk_backup = { path = "/data/orbit-backup.json", sync_interval = 300, compression = "Gzip" }

Performance:

2. COW B+Tree Provider (CowBTreeConfig)

Best for: Read-heavy workloads, snapshot requirements, version control needs

Features:

Configuration:

[providers.cow_btree]
type = "CowBTree"
data_dir = "/var/lib/orbit"
max_keys_per_node = 64
wal_buffer_size = 1048576
enable_snapshots = true
snapshot_interval = 1000

Performance:

3. LSM-Tree Provider (LsmTreeConfig)

Best for: Write-heavy workloads, high-throughput ingestion

Features:

Configuration:

[providers.lsm_tree]
type = "LsmTree"
data_dir = "/var/lib/orbit"
memtable_size_mb = 64
max_levels = 7
level_size_multiplier = 10
compaction_threshold = 4
enable_bloom_filters = true
bloom_filter_bits_per_key = 10
enable_compression = true

Performance:

4. RocksDB Provider (RocksDbConfig)

Best for: Production deployments, ACID guarantees needed, high reliability requirements

Features:

Configuration:

[providers.rocksdb]
type = "RocksDB"
data_dir = "/var/lib/orbit"
compression = true
block_cache_mb = 256
write_buffer_mb = 64
max_background_jobs = 4

Performance:

Protocol-Specific Persistence

All Orbit-RS protocols use RocksDB for durable storage, with each protocol having its own isolated storage directory:

Protocol Storage Locations

Features

Storage Structure

data/
├── postgresql/rocksdb/     # PostgreSQL tables, schemas, indexes
├── mysql/rocksdb/          # MySQL tables and data
├── cql/rocksdb/            # CQL wide-column data
├── redis/rocksdb/          # Redis key-value data with TTL
├── cypher/rocksdb/         # Graph nodes and relationships
├── aql/rocksdb/            # Documents, collections, edges, graphs
└── graphrag/rocksdb/       # GraphRAG entities, relationships, embeddings
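The layout above can be derived with simple path joins — one isolated RocksDB directory per protocol under the shared data dir. The helper below is a sketch; its name is hypothetical, not the actual orbit-server API:

```rust
use std::path::{Path, PathBuf};

// Derives the per-protocol RocksDB directory, mirroring the layout above.
// Path joining only; no database is opened here.
pub fn protocol_storage_dir(data_dir: &Path, protocol: &str) -> PathBuf {
    data_dir.join(protocol).join("rocksdb")
}
```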

Configuration

Protocol persistence is automatically configured when starting the server. The data directory can be specified via:

orbit-server --data-dir /var/lib/orbit

Or in the configuration file:

[server]
data_dir = "/var/lib/orbit"

For more details, see Protocol Persistence Status.

Cloud Storage Backends

5. S3-Compatible Providers (S3Config)

Best for: AWS environments, MinIO deployments, high durability requirements

Features:

Configuration:

[providers.s3]
type = "S3"
endpoint = "https://s3.amazonaws.com"
region = "us-west-2"
bucket = "orbit-data"
access_key_id = "YOUR_ACCESS_KEY"
secret_access_key = "YOUR_SECRET_KEY"
prefix = "orbit"
enable_ssl = true
connection_timeout = 30
retry_count = 3

6. Azure Blob Storage (AzureConfig)

Best for: Microsoft Azure environments

Features:

Configuration:

[providers.azure]
type = "Azure"
account_name = "orbitdata"
account_key = "YOUR_ACCOUNT_KEY"
container_name = "orbit"
prefix = "orbit"
connection_timeout = 30
retry_count = 3

7. Google Cloud Storage (GoogleCloudConfig)

Best for: Google Cloud Platform environments

Features:

Configuration:

[providers.gcp]
type = "GoogleCloud"
project_id = "your-project-id"
bucket_name = "orbit-data"
credentials_path = "/path/to/service-account.json"
prefix = "orbit"
connection_timeout = 30
retry_count = 3

Distributed Storage Backends

8. etcd Provider (EtcdConfig)

Best for: Kubernetes environments, distributed systems requiring consistency

Features:

Configuration:

[providers.etcd]
type = "Etcd"
endpoints = ["http://localhost:2379", "http://localhost:22379", "http://localhost:32379"]
prefix = "orbit"
lease_ttl = 300
username = "orbit"
password = "secret"
ca_cert = "/path/to/ca.crt"
client_cert = "/path/to/client.crt"
client_key = "/path/to/client.key"
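The `lease_ttl` setting implies a simple expiry rule: a registration that has not been renewed within the TTL window is considered expired. A minimal sketch, seconds-based, with saturating arithmetic so a clock reading earlier than the acquisition time does not panic:

```rust
// Expiry rule implied by `lease_ttl`: expired once the time since the last
// renewal reaches the TTL. All times are in seconds.
pub fn lease_expired(acquired_at: u64, now: u64, lease_ttl: u64) -> bool {
    now.saturating_sub(acquired_at) >= lease_ttl
}
```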

9. Redis Provider (RedisConfig)

Best for: High-performance caching, pub/sub scenarios

Features:

Configuration:

[providers.redis]
type = "Redis"
url = "redis://localhost:6379"
cluster_mode = false
database = 0
password = "secret"
prefix = "orbit"
pool_size = 10
retry_count = 3

10. Kubernetes Provider (KubernetesConfig)

Best for: Cloud-native Kubernetes deployments

Features:

Configuration:

[providers.kubernetes]
type = "Kubernetes"
namespace = "orbit"
config_map_name = "orbit-data"
secret_name = "orbit-secrets"
in_cluster = true

Specialized Backends

11. MinIO Provider (MinIOConfig)

Best for: Self-hosted object storage, hybrid cloud scenarios

Features:

12. Flash-Optimized Provider (FlashConfig)

Best for: High-performance scenarios with NVMe storage, multipathing requirements

Features:

Configuration:

[providers.flash]
type = "Flash"
data_dir = "/nvme/orbit"
enable_multipathing = true
io_depth = 32
block_size = 4096
cache_size = 1073741824  # 1GB
compression = "Lz4"
paths = ["/nvme0/orbit", "/nvme1/orbit", "/nvme2/orbit"]

13. Composite Provider (CompositeConfig)

Best for: High availability scenarios requiring failover

Features:

Configuration:

[providers.composite]
type = "Composite"
sync_interval = 60
health_check_interval = 30
failover_threshold = 3

[providers.composite.primary]
type = "Memory"
max_entries = 50000

[providers.composite.backup]
type = "S3"
endpoint = "http://localhost:9000"
region = "us-east-1"
bucket = "orbit-backup"
access_key_id = "minioadmin"
secret_access_key = "minioadmin"
enable_ssl = false
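The `failover_threshold` setting implies the following rule: after that many consecutive primary health-check failures the composite provider switches to the backup, and a successful check switches back. A sketch of that state machine, with illustrative names rather than the actual implementation:

```rust
// Failover state implied by `failover_threshold`: consecutive primary
// failures are counted; reaching the threshold routes traffic to the backup,
// and a successful primary check restores the primary.
pub struct CompositeState {
    failover_threshold: u32,
    consecutive_failures: u32,
    using_backup: bool,
}

impl CompositeState {
    pub fn new(failover_threshold: u32) -> Self {
        CompositeState { failover_threshold, consecutive_failures: 0, using_backup: false }
    }

    pub fn record_health_check(&mut self, primary_ok: bool) {
        if primary_ok {
            self.consecutive_failures = 0;
            self.using_backup = false;
        } else {
            self.consecutive_failures += 1;
            if self.consecutive_failures >= self.failover_threshold {
                self.using_backup = true;
            }
        }
    }

    pub fn using_backup(&self) -> bool {
        self.using_backup
    }
}
```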

Provider-Based System

Storage Trait (TableStorage)

The storage abstraction layer provides a unified interface for all storage operations:

pub trait TableStorage: Send + Sync {
    // Schema management
    async fn create_table(&self, schema: &TableSchema) -> Result<()>;
    async fn get_table_schema(&self, table_name: &str) -> Result<Option<TableSchema>>;
    
    // Data operations
    async fn insert(&self, table: &str, row: &Row) -> Result<()>;
    async fn select(&self, query: &SelectQuery) -> Result<Vec<Row>>;
    async fn update(&self, table: &str, updates: &[Update], filter: &Filter) -> Result<u64>;
    async fn delete(&self, table: &str, filter: &Filter) -> Result<u64>;
    
    // Transaction support
    async fn begin_transaction(&self) -> Result<Transaction>;
    async fn commit_transaction(&self, tx: Transaction) -> Result<()>;
    async fn rollback_transaction(&self, tx: Transaction) -> Result<()>;
    
    // Maintenance
    async fn checkpoint(&self) -> Result<()>;
    async fn compact(&self) -> Result<()>;
    async fn metrics(&self) -> Result<StorageMetrics>;
}

Factory Pattern

StorageBackendFactory provides a unified way to create storage backend instances:

use orbit_protocols::postgres_wire::storage::{StorageBackendConfig, StorageBackendFactory};

// Create with default LSM storage
let executor = SqlExecutor::new().await?;

// Create with custom storage configuration
let config = StorageBackendConfig::Memory;
let executor = SqlExecutor::new_with_storage_config(config).await?;

// Use a specific storage backend
let storage = StorageBackendFactory::create_backend(&config).await?;
let executor = SqlExecutor::with_storage(storage);

Configuration

Configuration Methods

1. Programmatic Configuration

use orbit_server::persistence::*;

let config = PersistenceProviderConfig::builder()
    .with_memory("memory", MemoryConfig::default(), false)
    .with_s3("s3", s3_config, true)  // Default provider
    .build()?;

2. Configuration Files

TOML Format:

[defaults]
addressable = "s3"
cluster = "etcd"

[providers.memory]
type = "Memory"
max_entries = 100000

[providers.s3]
type = "S3"
endpoint = "https://s3.amazonaws.com"
region = "us-west-2"
bucket = "orbit-data"

JSON Format:

{
  "defaults": {
    "addressable": "s3",
    "cluster": "etcd"
  },
  "providers": {
    "memory": {
      "type": "Memory",
      "max_entries": 100000
    },
    "s3": {
      "type": "S3",
      "endpoint": "https://s3.amazonaws.com",
      "region": "us-west-2",
      "bucket": "orbit-data"
    }
  }
}

3. Environment Variables

# S3 configuration from environment
ORBIT_S3_ENDPOINT=https://s3.amazonaws.com
ORBIT_S3_REGION=us-west-2
ORBIT_S3_BUCKET=orbit-data
ORBIT_S3_ACCESS_KEY_ID=your_access_key
ORBIT_S3_SECRET_ACCESS_KEY=your_secret_key

Backend Selection

Set the persistence backend using one of these methods:

Supported values:


Performance Tuning

Backend Comparison

| Feature          | COW B+ Tree | LSM-Tree           | RocksDB          | Memory    |
|------------------|-------------|--------------------|------------------|-----------|
| Write Latency    | ~41μs       | ~38μs              | ~53μs            | <1μs      |
| Read Latency     | <1μs        | ~0.3μs             | ~19μs            | <1μs      |
| Memory Usage     | Low         | Medium             | High             | Low       |
| Disk Usage       | Low         | Medium (write amp) | High (write amp) | None      |
| CPU Usage        | Low         | Medium (compaction)| High             | Low       |
| Crash Recovery   | WAL Replay  | WAL + SSTable      | Built-in         | None      |
| Production Ready | ⚠️ Beta     | ⚠️ Beta            | ✅ Stable        | ✅ Stable |

Tuning Guidelines

COW B+ Tree Tuning

# For read-heavy workloads
max_keys_per_node = 256        # Larger nodes, fewer levels
wal_buffer_size = 512000       # Smaller buffer, frequent flushes
enable_snapshots = true
snapshot_interval = 500        # More frequent snapshots

# For memory-constrained environments
max_keys_per_node = 32         # Smaller nodes
wal_buffer_size = 65536        # Small buffer
enable_snapshots = false       # Disable snapshots

LSM-Tree Tuning

# For write-heavy workloads
memtable_size_mb = 128         # Large memtables
max_levels = 8                 # More levels
level_size_multiplier = 8      # Smaller multiplier
compaction_threshold = 10      # Less frequent compaction

# For read-heavy workloads
memtable_size_mb = 32          # Smaller memtables
enable_bloom_filters = true    # Enable bloom filters
bloom_filter_bits_per_key = 15 # More accurate filters
block_cache_mb = 512           # Large block cache

RocksDB Tuning

# For production workloads
write_buffer_mb = 256          # 256MB write buffers
block_cache_mb = 1024          # 1GB block cache for reads
max_background_jobs = 8        # More compaction threads
compression = true             # Enable compression

Monitoring and Metrics

Key Metrics

All providers expose standard metrics:

Health Checks

Health checks provide three states:
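The state names below are an assumption (Healthy / Degraded / Unhealthy — the actual enum in orbit-server may differ). The sketch also shows one reasonable aggregation rule, where overall health is the worst state any provider reports:

```rust
// Assumed three-state health model; variant names are illustrative.
// Deriving PartialOrd on a fieldless enum orders variants by declaration,
// so Healthy < Degraded < Unhealthy.
#[derive(Debug, Clone, Copy, PartialEq, PartialOrd)]
pub enum ProviderHealth {
    Healthy,
    Degraded,
    Unhealthy,
}

// Overall health is the worst state reported by any registered provider;
// an empty set is treated as Healthy.
pub fn overall_health(states: &[ProviderHealth]) -> ProviderHealth {
    states
        .iter()
        .copied()
        .fold(ProviderHealth::Healthy, |worst, s| if s > worst { s } else { worst })
}
```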

Alerting Thresholds

# Recommended alert thresholds
write_latency_p95: 100ms      # 95th percentile write latency
read_latency_p95: 10ms        # 95th percentile read latency
error_rate: 0.1%              # Error rate threshold
memory_usage: 80%             # Memory usage threshold
disk_usage: 85%               # Disk usage threshold
compaction_backlog: 10        # LSM compaction queue size
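The p95 figures above can be computed with a nearest-rank percentile over a window of latency samples, for example:

```rust
// Nearest-rank p95 over a window of latency samples (in ms):
// sort the window and take the value at 1-based rank ceil(0.95 * n).
pub fn p95_ms(samples: &[u64]) -> u64 {
    assert!(!samples.is_empty());
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let rank = ((sorted.len() as f64) * 0.95).ceil() as usize;
    sorted[rank - 1]
}
```

An alert would then fire when `p95_ms(&window)` exceeds the configured threshold (100 ms for writes, 10 ms for reads above).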

Deployment Scenarios

Scenario 1: Development Environment

Configuration: Memory provider with disk backup

Scenario 2: Production on AWS

Configuration: S3 for persistence, ElastiCache Redis for caching

Scenario 3: Kubernetes Deployment

Configuration: etcd for cluster state, ConfigMaps for configuration

Scenario 4: Hybrid Cloud

Configuration: Composite provider with on-premises primary and cloud backup

Scenario 5: High-Performance Computing

Configuration: Flash-optimized storage with multipathing


Migration Between Providers

The system includes migration utilities for moving data between providers:

use orbit_server::persistence::migration::*;

let migration = DataMigration::new()
    .from_provider(old_provider)
    .to_provider(new_provider)
    .with_batch_size(1000)
    .with_validation(true);

let result = migration.execute().await?;
println!("Migrated {} leases and {} nodes", result.leases_migrated, result.nodes_migrated);

Migration Process

# Export from COW B+ Tree
orbit-cli export --backend=cow --data-dir=/var/lib/orbit --output=export.json

# Import to LSM-Tree
orbit-cli import --backend=lsm --data-dir=/var/lib/orbit-new --input=export.json

# Validate migration
orbit-cli validate --backend=lsm --data-dir=/var/lib/orbit-new

Security Considerations

Authentication

Encryption

Access Control


Best Practices

Development

  1. Always use transactions for multi-operation consistency
  2. Implement proper error handling for all persistence operations
  3. Monitor metrics in development to catch performance regressions
  4. Test with realistic data sizes to validate performance assumptions
  5. Use builder pattern for configuration to ensure type safety

Production

  1. Enable monitoring for all key metrics
  2. Set up automated backups of WAL and snapshots
  3. Test disaster recovery procedures regularly
  4. Use appropriate backend for your workload characteristics
  5. Configure resource limits to prevent resource exhaustion
  6. Implement graceful degradation for persistence failures

DevOps

  1. Automate deployment with configuration validation
  2. Use infrastructure as code for reproducible environments
  3. Implement blue-green deployments for zero-downtime updates
  4. Set up comprehensive monitoring and alerting
  5. Document runbooks for common operational procedures
  6. Review capacity planning regularly based on growth trends

Historical Issues and Fixes

RocksDB Persistence Creation Issues (Resolved)

Root Causes Identified

  1. TieredTableStorage was Completely In-Memory
    • Problem: The TieredTableStorage implementation used HybridStorageManager which stored all data in-memory using RowBasedStore (a Vec<Row>). There was no disk persistence mechanism.
    • Fix: Integrated RocksDB into TieredTableStorage to provide durable storage.
  2. No Protocol-Specific Persistence Folders
    • Problem: Only generic directories were created. Each protocol should have its own persistence folder.
    • Fix: Created protocol-specific subdirectories (data/postgresql/rocksdb/, data/mysql/rocksdb/, etc.).
  3. Hot/Warm/Cold Directories Were Not Used
    • Problem: The data/hot, data/warm, and data/cold directories were created but never written to.
    • Fix: RocksDB now handles persistence directly, and tiered storage uses RocksDB as the backing store.
  4. RocksDB Was Initialized But Not Used by TieredTableStorage
    • Problem: A global RocksDbTableStorage was initialized but TieredTableStorage didn’t use it.
    • Fix: Modified TieredTableStorage to use RocksDB for schema and data persistence.
  5. Redis Path Inconsistency
    • Problem: Redis data files were being created at both root level and rocksdb level.
    • Fix: Standardized Redis to use data/redis/rocksdb/ consistently.

Persistence Verification

All persistence implementations have been verified to:

Verification Tests: See orbit/server/tests/persistence_verification.rs for comprehensive tests.


Conclusion

The provider-based persistence architecture transforms Orbit server from a simple in-memory system to a production-ready platform that can adapt to any deployment scenario. Whether you’re running a single development instance or a global distributed system, there’s a provider configuration that meets your needs.

The system’s flexibility, combined with comprehensive monitoring, security features, and migration capabilities, ensures that your Orbit deployment can evolve with your requirements without architectural changes.


Status: Production Ready