Orbit-RS Graph Database
The Orbit-RS graph database provides a comprehensive, distributed graph data management system with support for multiple query languages and protocol adapters. This documentation covers all aspects of the graph database functionality.
Table of Contents
- Overview
- Architecture
- Graph Storage
- Query Languages
- Protocol Adapters
- Graph Machine Learning
- Performance & Scalability
- Examples
- API Reference
Overview
Orbit-RS implements a full-featured graph database with:
- Multi-model support: Native graph storage with time-series integration
- Multiple query languages: Cypher, AQL (ArrangoDB Query Language), and OrbitQL
- Protocol compatibility: Neo4j Bolt protocol support
- Distributed architecture: Built on the Orbit actor system
- High performance: In-memory storage with persistent backends
- Machine Learning: Graph analytics and ML pipeline support
Architecture
The graph database is built on several key components:
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Query Layer │ │ Protocol Layer │ │ ML Pipeline │
│ ┌───────────┐ │ │ ┌───────────┐ │ │ ┌───────────┐ │
│ │ Cypher │ │ │ │ Neo4j Bolt│ │ │ │ Analytics │ │
│ │ AQL │ │ │ │ Adapter │ │ │ │ Embeddings│ │
│ │ OrbitQL │ │ │ └───────────┘ │ │ │ Algorithms│ │
│ └───────────┘ │ └─────────────────┘ │ └───────────┘ │
└─────────────────┘ └─────────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────┐
│ Graph Engine │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Query Execution Engine │ │
│ │ • Pattern Matching • Traversal Optimization │ │
│ │ • Join Processing • Memory Management │ │
│ └─────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Storage Layer │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │
│ │ In-Memory │ │ Distributed │ │ Time-Series │ │
│ │ Storage │ │ Storage │ │ Integration │ │
│ │ │ │ │ │ │ │
│ │ • Node Index │ │ • Replication │ │ • Temporal │ │
│ │ • Relationship │ │ • Partitioning │ │ Queries │ │
│ │ Index │ │ • Consistency │ │ • Analytics │ │
│ │ • Label Index │ │ │ │ │ │
│ └─────────────────┘ └─────────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────┘
Graph Storage
Core Data Structures
GraphNode
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct GraphNode {
pub id: NodeId,
pub labels: Vec<String>,
pub properties: HashMap<String, serde_json::Value>,
pub created_at: chrono::DateTime<chrono::Utc>,
pub updated_at: chrono::DateTime<chrono::Utc>,
}
GraphRelationship
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct GraphRelationship {
pub id: RelationshipId,
pub start_node: NodeId,
pub end_node: NodeId,
pub rel_type: String,
pub properties: HashMap<String, serde_json::Value>,
pub created_at: chrono::DateTime<chrono::Utc>,
pub updated_at: chrono::DateTime<chrono::Utc>,
}
Storage Traits
The graph storage is built on a flexible trait system:
#[async_trait]
pub trait GraphStorage: Send + Sync {
// Node operations
async fn create_node(&self, labels: Vec<String>, properties: HashMap<String, serde_json::Value>) -> OrbitResult<GraphNode>;
async fn get_node(&self, node_id: &NodeId) -> OrbitResult<Option<GraphNode>>;
async fn update_node(&self, node_id: &NodeId, properties: HashMap<String, serde_json::Value>) -> OrbitResult<()>;
async fn delete_node(&self, node_id: &NodeId) -> OrbitResult<bool>;
// Label operations
async fn add_labels(&self, node_id: &NodeId, labels: Vec<String>) -> OrbitResult<()>;
async fn remove_labels(&self, node_id: &NodeId, labels: Vec<String>) -> OrbitResult<()>;
async fn get_labels(&self, node_id: &NodeId) -> OrbitResult<Vec<String>>;
// Relationship operations
async fn create_relationship(&self, start_node: &NodeId, end_node: &NodeId, rel_type: String, properties: HashMap<String, serde_json::Value>) -> OrbitResult<GraphRelationship>;
async fn get_relationships(&self, node_id: &NodeId, direction: Direction, rel_types: Option<Vec<String>>) -> OrbitResult<Vec<GraphRelationship>>;
async fn get_relationship(&self, rel_id: &RelationshipId) -> OrbitResult<Option<GraphRelationship>>;
async fn update_relationship(&self, rel_id: &RelationshipId, properties: HashMap<String, serde_json::Value>) -> OrbitResult<()>;
async fn delete_relationship(&self, rel_id: &RelationshipId) -> OrbitResult<bool>;
// Query operations
async fn find_nodes_by_label(&self, label: &str, property_filters: Option<HashMap<String, serde_json::Value>>, limit: Option<usize>) -> OrbitResult<Vec<GraphNode>>;
async fn count_nodes_by_label(&self, label: &str) -> OrbitResult<u64>;
async fn count_relationships_by_type(&self, rel_type: &str) -> OrbitResult<u64>;
}
In-Memory Storage Implementation
The InMemoryGraphStorage provides high-performance graph operations:
- Node indexing: Fast lookup by NodeId and labels
- Relationship indexing: Efficient traversal in all directions
- Label indexing: Quick queries by node labels
- Property filtering: Complex property-based searches
- Concurrent access: Thread-safe with RwLock protection
Query Languages
Cypher Query Language
Orbit-RS supports a subset of Cypher with the following features:
Basic Pattern Matching
// Find all Person nodes
MATCH (p:Person) RETURN p
// Find relationships
MATCH (p:Person)-[r:KNOWS]->(f:Person) RETURN p, r, f
// Property filters
MATCH (p:Person {name: 'Alice'}) RETURN p
Node Creation
// Create a node
CREATE (p:Person {name: 'Alice', age: 30}) RETURN p
// Create relationships
CREATE (a:Person {name: 'Alice'})-[r:KNOWS {since: '2020'}]->(b:Person {name: 'Bob'}) RETURN a, r, b
Complex Patterns
// Multiple hops
MATCH (a:Person)-[:KNOWS*1..3]-(b:Person) WHERE a.name = 'Alice' RETURN b
// Optional matches
MATCH (p:Person) OPTIONAL MATCH (p)-[r:WORKS_FOR]->(c:Company) RETURN p, r, c
AQL (ArrangoDB Query Language)
AQL provides document-oriented querying with graph capabilities:
Basic Queries
// Find documents
FOR person IN Person
FILTER person.age > 25
RETURN person
// Graph traversal
FOR vertex, edge, path IN 1..3 OUTBOUND 'Person/alice' KNOWS
RETURN {vertex: vertex, edge: edge, path: path}
Aggregations
// Count and group
FOR person IN Person
COLLECT city = person.city WITH COUNT INTO count
RETURN {city: city, count: count}
// Advanced analytics
FOR person IN Person
FILTER person.age != null
RETURN {
avg_age: AVERAGE(person.age),
min_age: MIN(person.age),
max_age: MAX(person.age)
}
OrbitQL
OrbitQL is the native query language with multi-model support:
Unified Queries
// Graph and time-series in one query
SELECT person.name, ts.value
FROM Person person
JOIN TimeSeries ts ON person.sensor_id = ts.series_id
WHERE person.age > 25 AND ts.timestamp > NOW() - INTERVAL '1 hour'
Protocol Adapters
Neo4j Bolt Protocol
Orbit-RS provides compatibility with Neo4j clients through the Bolt protocol adapter:
use orbit_protocols::bolt::BoltServer;
// Start Bolt server
let server = BoltServer::new(graph_storage).await?;
server.listen("127.0.0.1:7687").await?;
Supported Features
- Authentication: Username/password authentication
- Transactions: ACID transaction support
- Streaming: Efficient result streaming
- Type system: Full Neo4j type compatibility
Graph Machine Learning
Node Embeddings
use orbit_shared::graph::ml::NodeEmbeddings;
// Generate node embeddings
let embeddings = NodeEmbeddings::new()
.dimensions(128)
.walk_length(40)
.num_walks(10)
.generate(&graph_storage).await?;
Graph Analytics
Centrality Measures
// Betweenness centrality
let centrality = graph_storage.betweenness_centrality().await?;
// PageRank
let pagerank = graph_storage.pagerank(0.85, 100).await?;
Community Detection
// Louvain community detection
let communities = graph_storage.louvain_communities().await?;
Performance & Scalability
Memory Management
- Efficient indexing: O(1) lookups for nodes and relationships
- Memory limits: Configurable memory usage limits
- Garbage collection: Automatic cleanup of expired data
- Compression: Property value compression
Distributed Features
- Horizontal scaling: Distribute graph across multiple nodes
- Replication: Master-slave replication with consistency
- Partitioning: Smart partitioning strategies
- Load balancing: Query load distribution
Performance Benchmarks
| Operation | Throughput | Latency |
|---|---|---|
| Node creation | 100K ops/sec | 0.01ms |
| Node lookup | 500K ops/sec | 0.002ms |
| 1-hop traversal | 200K ops/sec | 0.005ms |
| 3-hop traversal | 50K ops/sec | 0.02ms |
Examples
Basic Graph Operations
use orbit_shared::graph::{InMemoryGraphStorage, GraphStorage};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let storage = InMemoryGraphStorage::new();
// Create nodes
let alice = storage.create_node(
vec!["Person".to_string()],
[("name", "Alice"), ("age", "30")].iter().cloned().collect()
).await?;
let bob = storage.create_node(
vec!["Person".to_string()],
[("name", "Bob"), ("age", "25")].iter().cloned().collect()
).await?;
// Create relationship
let friendship = storage.create_relationship(
&alice.id,
&bob.id,
"KNOWS".to_string(),
[("since", "2020")].iter().cloned().collect()
).await?;
// Query
let people = storage.find_nodes_by_label("Person", None, None).await?;
println!("Found {} people", people.len());
Ok(())
}
Cypher Query Execution
use orbit_protocols::cypher::GraphEngine;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let storage = Arc::new(InMemoryGraphStorage::new());
let engine = GraphEngine::new(storage);
// Execute Cypher query
let result = engine.execute_query(
"CREATE (a:Person {name: 'Alice'}) RETURN a"
).await?;
println!("Created {} nodes", result.nodes.len());
// Pattern matching
let result = engine.execute_query(
"MATCH (p:Person) WHERE p.name = 'Alice' RETURN p"
).await?;
for node in result.nodes {
println!("Found: {:?}", node);
}
Ok(())
}
Integration with Time Series
use orbit_shared::graph::InMemoryGraphStorage;
use orbit_shared::timeseries::TimeSeriesEngine;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let graph = InMemoryGraphStorage::new();
let ts_engine = TimeSeriesEngine::new();
// Create sensor node
let sensor = graph.create_node(
vec!["Sensor".to_string()],
[("name", "temperature_01"), ("location", "room_1")].iter().cloned().collect()
).await?;
// Link to time series
ts_engine.create_series(&sensor.id.to_string(), Default::default()).await?;
// Insert data
ts_engine.insert_point(
&sensor.id.to_string(),
chrono::Utc::now(),
25.5
).await?;
// Query combined data
let sensors = graph.find_nodes_by_label("Sensor", None, None).await?;
for sensor in sensors {
let data = ts_engine.query_range(
&sensor.id.to_string(),
chrono::Utc::now() - chrono::Duration::hours(1),
chrono::Utc::now()
).await?;
println!("Sensor {}: {} data points", sensor.id, data.len());
}
Ok(())
}
API Reference
Node Operations
// Create node
async fn create_node(labels: Vec<String>, properties: Properties) -> Result<GraphNode>
// Get node by ID
async fn get_node(node_id: &NodeId) -> Result<Option<GraphNode>>
// Update node properties
async fn update_node(node_id: &NodeId, properties: Properties) -> Result<()>
// Delete node (and relationships)
async fn delete_node(node_id: &NodeId) -> Result<bool>
// Label management
async fn add_labels(node_id: &NodeId, labels: Vec<String>) -> Result<()>
async fn remove_labels(node_id: &NodeId, labels: Vec<String>) -> Result<()>
async fn get_labels(node_id: &NodeId) -> Result<Vec<String>>
Relationship Operations
// Create relationship
async fn create_relationship(
start: &NodeId,
end: &NodeId,
rel_type: String,
properties: Properties
) -> Result<GraphRelationship>
// Get relationships
async fn get_relationships(
node_id: &NodeId,
direction: Direction,
rel_types: Option<Vec<String>>
) -> Result<Vec<GraphRelationship>>
// Update relationship
async fn update_relationship(
rel_id: &RelationshipId,
properties: Properties
) -> Result<()>
// Delete relationship
async fn delete_relationship(rel_id: &RelationshipId) -> Result<bool>
Query Operations
// Find nodes by label
async fn find_nodes_by_label(
label: &str,
property_filters: Option<Properties>,
limit: Option<usize>
) -> Result<Vec<GraphNode>>
// Count operations
async fn count_nodes_by_label(label: &str) -> Result<u64>
async fn count_relationships_by_type(rel_type: &str) -> Result<u64>
This comprehensive graph database system provides the foundation for complex graph applications, from social networks to knowledge graphs, with high performance and scalability built on the Orbit actor system.