Storage System Documentation

Overview

The ArthaChain blockchain employs a sophisticated multi-layered storage system designed for high performance, durability, and scalability. The storage system is responsible for efficiently persisting blockchain data and state, and for storing the data that supports AI models.

Architecture

The storage module is organized into several specialized components:

blockchain_node/src/storage/
├── blockchain_storage.rs # Core blockchain data storage
├── hybrid_storage.rs # Combined memory and disk storage
├── memmap_storage.rs # Memory-mapped file storage
├── mod.rs # Module definition
├── rocksdb_storage.rs # RocksDB-based persistent storage
├── svdb_storage.rs # Specialized vector database storage
└── transaction.rs # Transaction handling for storage

Core Components

1. Blockchain Storage (blockchain_storage.rs)

The BlockchainStorage is the high-level abstraction for storing blockchain data:

pub struct BlockchainStorage {
    // Block storage
    block_store: Box<dyn BlockStore>,

    // Transaction storage
    tx_store: Box<dyn TransactionStore>,

    // State storage
    state_store: Box<dyn StateStore>,

    // Receipt storage
    receipt_store: Box<dyn ReceiptStore>,

    // Storage configuration
    config: StorageConfig,
}

Key functionality:

  • Storage and retrieval of blocks, transactions, state, and receipts
  • Support for different storage backends
  • Optimized access patterns for blockchain data
  • Transaction batch support for atomicity

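The sketch below shows a round trip through this abstraction. The method names (put_block, get_block) and the Block::hash helper are illustrative assumptions rather than the confirmed API, which is defined by the underlying store traits.

async fn store_and_fetch(storage: &BlockchainStorage, block: Block) -> anyhow::Result<()> {
    let hash = block.hash();                      // content hash of the block
    storage.put_block(&block).await?;             // delegates to block_store
    let loaded = storage.get_block(&hash).await?; // read back through the same path
    assert_eq!(loaded.map(|b| b.hash()), Some(hash));
    Ok(())
}
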
2. RocksDB Storage (rocksdb_storage.rs)

The RocksDbStorage provides persistent storage using the RocksDB key-value store:

pub struct RocksDbStorage {
    /// RocksDB database instance
    db: Arc<RwLock<Option<DB>>>,
    /// Path to database for reopening
    db_path: Arc<RwLock<Option<std::path::PathBuf>>>,
}

Key functionality:

  • Persistent storage of blockchain data
  • Blake3 hashing for data integrity
  • Thread-safe access via Arc<RwLock> wrapper
  • Automatic reopening of the database if the handle has been closed
  • Efficient key-value operations

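The Option<DB> behind the lock is what makes automatic reopening possible. A minimal sketch, assuming tokio's RwLock, the rocksdb crate's DB::open_default, and an anyhow-style Result; ensure_open is a hypothetical method name:

impl RocksDbStorage {
    async fn ensure_open(&self) -> Result<()> {
        if self.db.read().await.is_some() {
            return Ok(()); // already open, nothing to do
        }
        let path = self
            .db_path
            .read()
            .await
            .clone()
            .ok_or_else(|| anyhow::anyhow!("no database path recorded"))?;
        let db = DB::open_default(&path)?; // reopen from the remembered path
        *self.db.write().await = Some(db);
        Ok(())
    }
}
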
3. Memory-Mapped Storage (memmap_storage.rs)

The MemMapStorage provides high-performance storage using memory-mapped files:

pub struct MemMapStorage {
    // Index file for fast lookups
    index_map: Arc<RwLock<Option<MmapMut>>>,
    // Data file for actual data storage
    data_map: Arc<RwLock<Option<MmapMut>>>,
    // Index file handle
    index_file: Arc<Mutex<Option<File>>>,
    // Data file handle
    data_file: Arc<Mutex<Option<File>>>,
    // In-memory index for faster lookups (hash -> offset)
    index: Arc<DashMap<Hash, (u64, u32, u8)>>, // Hash -> (offset, size, compression)
    // Secondary index for key lookups (key hash -> data hash)
    key_index: Arc<DashMap<u64, Hash>>,
    // Pending writes batch
    pending_batch: Arc<Mutex<Batch>>,
    // Semaphore for controlling concurrent operations
    semaphore: Arc<Semaphore>,
    // Configuration
    options: MemMapOptions,
    // Current data file size
    data_size: Arc<RwLock<u64>>,
    // Statistics
    stats: Arc<RwLock<StorageStats>>,
}

Key functionality:

  • Zero-copy data access via memory mapping
  • Adaptive compression with multiple algorithms (LZ4, Zstd, Brotli)
  • Inline storage of small values directly in the index
  • Concurrent access control with fine-grained synchronization
  • Performance statistics tracking
  • Parallel data processing capabilities

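The read path implied by this layout can be sketched as follows. This is a simplification: decompress is a hypothetical helper, bounds and inline-storage handling are elided, and tokio's async RwLock is assumed.

impl MemMapStorage {
    async fn read_raw(&self, hash: &Hash) -> Option<Vec<u8>> {
        // In-memory index: Hash -> (offset, size, compression)
        let (offset, size, compression) = *self.index.get(hash)?;
        let guard = self.data_map.read().await;
        let map = guard.as_ref()?;
        // Zero-copy view into the memory-mapped data file
        let bytes = &map[offset as usize..offset as usize + size as usize];
        Some(match compression {
            0 => bytes.to_vec(),                 // stored uncompressed
            _ => decompress(compression, bytes), // hypothetical decompression dispatch
        })
    }
}
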
4. SVDB Storage (svdb_storage.rs)

The SvdbStorage provides specialized storage for vector data, particularly useful for AI models:

pub struct SvdbStorage {
    /// HTTP client for SVDB API requests
    _client: Client,

    /// Base URL for SVDB API
    _base_url: String,

    /// Database instance
    db: Arc<RwLock<Option<DB>>>,

    /// Path to database for reopening
    db_path: Arc<RwLock<Option<std::path::PathBuf>>>,

    /// In-memory data map (underscore prefix marks it as currently unused)
    _data: HashMap<String, Vec<u8>>,
}

Key functionality:

  • Specialized storage for vector data used by AI models
  • HTTP client for remote SVDB API integration
  • Local RocksDB fallback for offline operation
  • Concurrent access support
  • Blake3 hashing for data integrity

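A hypothetical sketch of the remote-first write with local fallback is shown below. The endpoint shape (PUT /blobs/<key>) and the method name are assumptions, and note that the underscore-prefixed fields suggest the HTTP path is not exercised in the current code.

impl SvdbStorage {
    async fn store_with_fallback(&self, key: &str, data: Vec<u8>) -> Result<()> {
        let url = format!("{}/blobs/{}", self._base_url, key);
        match self._client.put(&url).body(data.clone()).send().await {
            Ok(resp) if resp.status().is_success() => Ok(()),
            _ => {
                // Remote SVDB unavailable: persist locally so the node keeps working.
                let guard = self.db.read().await;
                let db = guard.as_ref().ok_or_else(|| anyhow::anyhow!("db closed"))?;
                db.put(key.as_bytes(), &data)?;
                Ok(())
            }
        }
    }
}
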
5. Hybrid Storage (hybrid_storage.rs)

The HybridStorage combines multiple storage backends for optimized performance:

pub struct HybridStorage {
    /// RocksDB storage for on-chain data
    rocksdb: Box<dyn Storage>,

    /// SVDB storage for off-chain data
    svdb: Box<dyn Storage>,

    /// Size threshold (in bytes) for deciding between RocksDB and SVDB
    size_threshold: usize,
}

Key functionality:

  • Intelligent data routing based on size thresholds
  • Combined query capability across both backends
  • Seamless fallback between storage types
  • Optimized for different data access patterns
  • Size-based optimization for storage efficiency

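The routing decision itself is simple. A minimal sketch, where the method name and the <= tie-break are assumptions:

impl HybridStorage {
    async fn store_routed(&self, data: &[u8]) -> Result<Hash> {
        if data.len() <= self.size_threshold {
            // Small, frequently-read on-chain data stays in RocksDB.
            self.rocksdb.store(data).await
        } else {
            // Large blobs (e.g. AI model vectors) go to SVDB.
            self.svdb.store(data).await
        }
    }
}
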
Storage Interfaces

The storage system defines several key interfaces:

Storage Trait

#[async_trait]
pub trait Storage: Send + Sync + 'static {
    async fn store(&self, data: &[u8]) -> Result<Hash>;
    async fn retrieve(&self, hash: &Hash) -> Result<Option<Vec<u8>>>;
    async fn exists(&self, hash: &Hash) -> Result<bool>;
    async fn delete(&self, hash: &Hash) -> Result<()>;
    async fn verify(&self, hash: &Hash, data: &[u8]) -> Result<bool>;
    async fn close(&self) -> Result<()>;
    fn as_any(&self) -> &dyn Any;
    fn as_any_mut(&mut self) -> &mut dyn Any;
}

StorageInit Trait

#[async_trait]
pub trait StorageInit: Storage {
    async fn init(&mut self, path: Box<dyn AsRef<Path> + Send + Sync>) -> Result<()>;
}

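The two traits compose as follows in a hypothetical round trip; RocksDbStorage::default() is assumed here purely for illustration:

async fn round_trip(path: &std::path::Path) -> Result<()> {
    let mut storage = RocksDbStorage::default();
    storage.init(Box::new(path.to_path_buf())).await?; // StorageInit
    let data = b"hello, chain".to_vec();
    let hash = storage.store(&data).await?;            // content-addressed write
    assert!(storage.exists(&hash).await?);
    assert!(storage.verify(&hash, &data).await?);      // Blake3 integrity check
    storage.close().await
}
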
Compression Features

The storage system supports multiple compression algorithms to optimize storage efficiency:

pub enum CompressionAlgorithm {
    None,
    LZ4,
    Zstd,
    Brotli,
    Adaptive,
}

Key compression features:

  • LZ4: Fast compression with good ratio
  • Zstd: Balanced compression with excellent ratio
  • Brotli: High compression ratio for cold data
  • Adaptive: Dynamically selects the best algorithm based on data characteristics
  • Threshold-based compression: Only compresses data above a certain size

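As an illustration, the Adaptive variant might resolve to a concrete algorithm along these lines; the size cutoffs and the cold-data flag are invented for this sketch:

const COMPRESSION_THRESHOLD: usize = 128; // skip compression for tiny values

fn resolve_adaptive(data: &[u8], cold: bool) -> CompressionAlgorithm {
    if data.len() < COMPRESSION_THRESHOLD {
        CompressionAlgorithm::None   // header overhead would outweigh savings
    } else if cold {
        CompressionAlgorithm::Brotli // best ratio for rarely-read data
    } else if data.len() < 4096 {
        CompressionAlgorithm::LZ4    // fastest for small, hot values
    } else {
        CompressionAlgorithm::Zstd   // balanced ratio and speed for larger payloads
    }
}
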
Performance Characteristics

The storage system is designed for high performance:

  • Memory-mapped access: Sub-microsecond access times for cached data
  • Concurrent operations: Fine-grained locking for maximum parallelism
  • Batched operations: Efficient handling of multiple operations
  • Zero-copy access: Direct memory access without intermediate copying
  • Adaptive compression: Optimizes for both space and speed
  • Inline storage: Small values stored directly in index for faster access

Configuration

The storage system can be configured through multiple options:

MemMapOptions

pub struct MemMapOptions {
    pub compression_algorithm: CompressionAlgorithm,
    pub preload_data: bool,
    pub sync_writes: bool,
    pub index_cache_size: usize,
    pub enable_stats: bool,
}

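An illustrative configuration for a read-heavy node follows; the values are examples, not recommended defaults:

let options = MemMapOptions {
    compression_algorithm: CompressionAlgorithm::Adaptive,
    preload_data: true,                 // fault pages in up front for predictable latency
    sync_writes: false,                 // rely on periodic flushes instead of per-write sync
    index_cache_size: 64 * 1024 * 1024, // 64 MiB of hot index entries
    enable_stats: true,                 // feed the StorageStats counters described below
};
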
Storage Configuration Examples

Production Configuration

let storage_config = StorageConfig {
    path: "/data/blockchain/production",
    cache_size_mb: 1024,
    compression: CompressionAlgorithm::Adaptive,
    sync_writes: true,
    max_open_files: 10000,
    keep_log_file_num: 10,
    paranoid_checks: true,
    background_compaction: true,
    use_direct_io: true,
};

Testing Configuration

let testing_config = StorageConfig {
    path: "/tmp/blockchain/test",
    cache_size_mb: 128,
    compression: CompressionAlgorithm::None,
    sync_writes: false,
    max_open_files: 100,
    keep_log_file_num: 2,
    paranoid_checks: false,
    background_compaction: false,
    use_direct_io: false,
};

Monitoring and Instrumentation

The storage system provides comprehensive monitoring capabilities:

StorageStats

struct StorageStats {
    reads: u64,
    writes: u64,
    deletes: u64,
    cache_hits: u64,
    compression_saved: u64,
    read_time_ns: u64,
    write_time_ns: u64,
    compressed_blocks: u64,
    uncompressed_blocks: u64,
}

Metrics Collection

  • Operation Counts: Tracking of reads, writes, and deletes
  • Timing Information: Nanosecond-precision timing of operations
  • Cache Performance: Hit rates and efficiency metrics
  • Compression Efficiency: Space saved and compression ratio
  • Throughput Calculation: Operations per second

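Two useful metrics fall directly out of these counters. A sketch, assuming reads counts cache hits and misses together:

impl StorageStats {
    /// Fraction of reads served from cache (0.0 before any reads).
    fn cache_hit_rate(&self) -> f64 {
        if self.reads == 0 {
            0.0
        } else {
            self.cache_hits as f64 / self.reads as f64
        }
    }

    /// Average read throughput in operations per second.
    fn read_ops_per_sec(&self) -> f64 {
        if self.read_time_ns == 0 {
            0.0
        } else {
            self.reads as f64 * 1e9 / self.read_time_ns as f64
        }
    }
}
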
Error Handling Patterns

The storage system employs a comprehensive error handling strategy:

pub enum StorageError {
    NotFound,
    AlreadyExists,
    Corrupted(String),
    IO(String),
    Serialization(String),
    Compression(String),
    InvalidArgument(String),
    DatabaseError(String),
    LockError(String),
    Other(String),
}

Error handling patterns:

  • Specific Error Types: Detailed error categorization
  • Error Context: Additional information about the error source
  • Graceful Degradation: Fallback mechanisms for partial system failures
  • Retry Logic: Automatic retry for transient failures
  • Propagation Control: Appropriate error bubbling

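A minimal retry sketch for transient failures follows. Treating IO and LockError as the transient categories is an assumption, as is the backoff schedule:

async fn with_retry<T, F, Fut>(mut op: F, max_attempts: u32) -> Result<T, StorageError>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, StorageError>>,
{
    let mut attempt = 0;
    loop {
        match op().await {
            Ok(value) => return Ok(value),
            Err(e) => {
                let transient = matches!(e, StorageError::IO(_) | StorageError::LockError(_));
                if !transient || attempt + 1 >= max_attempts {
                    return Err(e); // non-transient, or attempts exhausted
                }
                // Exponential backoff: 10ms, 20ms, 40ms, ...
                tokio::time::sleep(std::time::Duration::from_millis(10u64 << attempt)).await;
                attempt += 1;
            }
        }
    }
}
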
Security Considerations

The storage system implements several security features:

Data Integrity

  • Cryptographic Hashing: Blake3 hashing for all stored data
  • Content Verification: Hash verification on retrieval
  • Atomic Operations: Transactional guarantees for multi-part operations

Data Protection

  • Encryption Options: Support for encrypted storage
  • Access Controls: Fine-grained permissions for data access
  • Isolation: Strong separation between different data types
  • Secure Deletion: Proper data wiping on deletion requests

// Example of using encrypted storage
let encrypted_storage = EncryptedStorageWrapper::new(
    Box::new(RocksDbStorage::new()),
    EncryptionKey::from_secure_source(),
    EncryptionAlgorithm::AES256GCM,
);

Best Practices

Storage Access Patterns

  • Batch Operations: Group related operations to minimize overhead
  • Prefix Scanning: Use key prefixes for efficient range queries
  • Caching Hot Data: Cache frequently accessed data in memory
  • Write Amplification Awareness: Structure writes to minimize amplification

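As an example of prefix scanning, the sketch below walks all blocks from a given height with one RocksDB iterator. The block/<height> key schema is hypothetical, and a recent rocksdb crate (where the iterator yields Result items) is assumed:

use rocksdb::{Direction, IteratorMode, DB};

fn blocks_from(db: &DB, height: u64) -> Vec<Vec<u8>> {
    let start = format!("block/{:020}", height); // zero-padded so keys sort numerically
    db.iterator(IteratorMode::From(start.as_bytes(), Direction::Forward))
        .take_while(|item| matches!(item, Ok((k, _)) if k.starts_with(b"block/")))
        .filter_map(|item| item.ok().map(|(_, v)| v.to_vec()))
        .collect()
}
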
Storage Management

  • Regular Compaction: Schedule compaction during low-usage periods
  • Backup Strategies: Implement regular backup procedures
  • Monitoring: Set up alerts for storage-related metrics
  • Capacity Planning: Proactive management of storage growth

Future Enhancements

The storage system roadmap includes:

  • Sharded Storage: Horizontal scaling across multiple storage nodes
  • Tiered Storage: Automatic data migration between hot and cold storage
  • Enhanced Encryption: Additional encryption algorithms and performance optimizations
  • Remote Storage Integration: Support for cloud storage backends
  • Quantum-Resistant Cryptography: Preparation for post-quantum era