# Tiered Storage

A lightweight, pluggable tiered storage library that orchestrates caching across hot (memory), warm (disk/database), and cold (S3/object storage) tiers.

## Features

- **Cascading Containment Model**: Hot ⊆ Warm ⊆ Cold (lower tiers contain all data from upper tiers)
- **Pluggable Backends**: Bring your own Redis, Postgres, SQLite, or use built-in implementations
- **Automatic Promotion**: Configurable eager/lazy promotion strategies for cache warming
- **TTL Management**: Per-key TTL with automatic expiration and renewal
- **Prefix Invalidation**: Efficiently delete groups of keys by prefix
- **Bootstrap Support**: Warm up caches from lower tiers on startup
- **Compression**: Optional transparent gzip compression
- **TypeScript First**: Full type safety with comprehensive TSDoc comments
- **Zero Forced Dependencies**: Only require what you use

## Installation

```bash
npm install tiered-storage
# or
bun add tiered-storage
```

## Quick Start

```typescript
import { TieredStorage, MemoryStorageTier, DiskStorageTier, S3StorageTier } from 'tiered-storage';

const storage = new TieredStorage({
  tiers: {
    hot: new MemoryStorageTier({ maxSizeBytes: 100 * 1024 * 1024 }), // 100MB
    warm: new DiskStorageTier({ directory: './cache' }),
    cold: new S3StorageTier({
      bucket: 'my-bucket',
      region: 'us-east-1',
      credentials: {
        accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
      },
    }),
  },
  compression: true,
  defaultTTL: 14 * 24 * 60 * 60 * 1000, // 14 days
  promotionStrategy: 'lazy',
});

// Store data (cascades to all tiers)
await storage.set('user:123', { name: 'Alice', email: 'alice@example.com' });

// Retrieve data (bubbles up from cold → warm → hot)
const user = await storage.get('user:123');

// Get data with metadata and source tier
const result = await storage.getWithMetadata('user:123');
console.log(`Served from ${result.source}`); // 'hot', 'warm', or 'cold'

// Invalidate all keys with prefix
await storage.invalidate('user:');

// Renew TTL
await storage.touch('user:123');
```

## Core Concepts

### Cascading Containment Model

```
┌──────────────────────────────────────────────────────┐
│ Cold Storage (S3/Object Storage)                     │
│ • Contains ALL objects (source of truth)             │
│ • Slowest access, unlimited capacity                 │
├──────────────────────────────────────────────────────┤
│ Warm Storage (Disk/Database)                         │
│ • Contains ALL hot objects + additional warm objects │
│ • Medium access speed, large capacity                │
├──────────────────────────────────────────────────────┤
│ Hot Storage (Memory)                                 │
│ • Contains only the hottest objects                  │
│ • Fastest access, limited capacity                   │
└──────────────────────────────────────────────────────┘
```

**Write Strategy (Cascading Down):**

- Write to **hot** → also writes to **warm** and **cold**
- Write to **warm** → also writes to **cold**
- Write to **cold** → only writes to **cold**

**Read Strategy (Bubbling Up):**

- Check **hot** first → if miss, check **warm** → if miss, check **cold**
- On cache miss, optionally promote data up through tiers

### Selective Tier Placement

For use cases like static site hosting, you can control which files go into which tiers:

```typescript
// Small, critical file (index.html) - store in all tiers for instant serving
await storage.set('site:abc/index.html', htmlContent);

// Large file (video) - skip hot tier to avoid memory bloat
await storage.set('site:abc/video.mp4', videoData, { skipTiers: ['hot'] });

// Medium files (images, CSS) - skip hot, use warm + cold
await storage.set('site:abc/style.css', cssData, { skipTiers: ['hot'] });
```

This pattern ensures:

- Hot tier stays small and fast (only critical files)
- Warm tier caches everything (all site files on disk)
- Cold tier is source of truth (all data)

## API Reference

### `TieredStorage`

Main orchestrator class for tiered storage.
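To make the write and read strategies from Core Concepts concrete, here is a minimal in-memory sketch of what the orchestrator does conceptually. It is not the library's implementation: `CascadeSketch`, its plain `Map` tiers, and the boolean `promote` flag are assumptions made for illustration, and real tiers are asynchronous.

```typescript
// Illustrative model only: plain Maps stand in for real storage tiers.
type TierName = 'hot' | 'warm' | 'cold';

// Ordered from fastest to slowest.
const ORDER: TierName[] = ['hot', 'warm', 'cold'];

class CascadeSketch {
  private tiers: Record<TierName, Map<string, string>> = {
    hot: new Map<string, string>(),
    warm: new Map<string, string>(),
    cold: new Map<string, string>(),
  };

  // Writes cascade down: every tier that is not skipped receives the value.
  set(key: string, value: string, skipTiers: TierName[] = []): void {
    for (const name of ORDER) {
      if (skipTiers.indexOf(name) === -1) this.tiers[name].set(key, value);
    }
  }

  // Reads bubble up: check hot, then warm, then cold. On a hit in a lower
  // tier, optionally copy the value into every tier that missed above it.
  get(key: string, promote = true): { value: string; source: TierName } | null {
    const missed: TierName[] = [];
    for (const name of ORDER) {
      const value = this.tiers[name].get(key);
      if (value === undefined) {
        missed.push(name);
        continue;
      }
      if (promote) {
        for (const upper of missed) this.tiers[upper].set(key, value);
      }
      return { value, source: name };
    }
    return null; // not present in any tier
  }
}

const sketch = new CascadeSketch();
sketch.set('site:abc/style.css', 'body{}', ['hot']); // lands in warm + cold only

const fromWarm = sketch.get('site:abc/style.css', false); // hit in warm, no promotion
const promoted = sketch.get('site:abc/style.css', true);  // hit in warm, copied into hot
const fromHot = sketch.get('site:abc/style.css');         // now a hit in hot
console.log(fromWarm?.source, promoted?.source, fromHot?.source);
```

The `source` field mirrors what `getWithMetadata` reports; whether the real library promotes on a given read depends on `promotionStrategy`.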
#### Constructor

```typescript
new TieredStorage(config: TieredStorageConfig)
```

**Config Options:**

```typescript
interface TieredStorageConfig {
  tiers: {
    hot?: StorageTier;  // Optional: fastest tier (memory/Redis)
    warm?: StorageTier; // Optional: medium tier (disk/SQLite/Postgres)
    cold: StorageTier;  // Required: slowest tier (S3/object storage)
  };
  compression?: boolean;                // Auto-compress before storing (default: false)
  defaultTTL?: number;                  // Default TTL in milliseconds
  promotionStrategy?: 'eager' | 'lazy'; // When to promote to upper tiers (default: 'lazy')
  serialization?: {                     // Custom serialization (default: JSON)
    serialize: (data: unknown) => Promise<Uint8Array>;
    deserialize: (data: Uint8Array) => Promise<unknown>;
  };
}
```

#### Methods

**`get<T>(key: string): Promise<T | null>`**

Retrieve data for a key. Returns `null` if not found or expired.

**`getWithMetadata(key: string)`**

Retrieve data with metadata and source tier information. Resolves to `null` if the key is not found.

```typescript
const result = await storage.getWithMetadata('user:123');
if (result) {
  console.log(result.data);     // The actual data
  console.log(result.source);   // 'hot' | 'warm' | 'cold'
  console.log(result.metadata); // Metadata (size, timestamps, TTL, etc.)
}
```

**`set<T>(key: string, data: T, options?: SetOptions): Promise<void>`**

Store data with optional configuration.

```typescript
await storage.set('key', data, {
  ttl: 24 * 60 * 60 * 1000,                     // Custom TTL (24 hours)
  metadata: { contentType: 'application/json' }, // Custom metadata
  skipTiers: ['hot'],                            // Skip specific tiers
});
```

**`delete(key: string): Promise<void>`**

Delete data from all tiers.

**`exists(key: string): Promise<boolean>`**

Check if a key exists (and hasn't expired).

**`touch(key: string, ttlMs?: number): Promise<void>`**

Renew TTL for a key. Useful for "keep alive" behavior.

**`invalidate(prefix: string): Promise<number>`**

Delete all keys matching a prefix. Returns the number of keys deleted.
```typescript
await storage.invalidate('user:');     // Delete all user keys
await storage.invalidate('site:abc/'); // Delete all files for site 'abc'
await storage.invalidate('');          // Delete everything
```

**`listKeys(prefix?: string): AsyncIterableIterator<string>`**

List all keys, optionally filtered by prefix.

```typescript
for await (const key of storage.listKeys('user:')) {
  console.log(key); // 'user:123', 'user:456', etc.
}
```

**`getStats()`**

Get aggregated statistics across all tiers.

```typescript
const stats = await storage.getStats();
console.log(stats.hot);     // Hot tier stats (size, items, hits, misses)
console.log(stats.hitRate); // Overall hit rate (0-1)
```

**`bootstrapHot(limit?: number): Promise<number>`**

Load the most frequently accessed items from warm into hot. Returns the number of items loaded.

```typescript
// On server startup: warm up hot tier
const loaded = await storage.bootstrapHot(1000); // Load top 1000 items
console.log(`Loaded ${loaded} items into hot tier`);
```

**`bootstrapWarm(options?: { limit?: number; sinceDate?: Date }): Promise<number>`**

Load recent items from cold into warm. Returns the number of items loaded.

```typescript
// Load items accessed in last 7 days
const loaded = await storage.bootstrapWarm({
  sinceDate: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000),
  limit: 10000,
});
```

**`export(): Promise<StorageSnapshot>`**

Export metadata snapshot for backup or migration.

**`import(snapshot: StorageSnapshot): Promise<void>`**

Import metadata snapshot.

**`clear(): Promise<void>`**

Clear all data from all tiers. ⚠️ Use with extreme caution!

**`clearTier(tier: 'hot' | 'warm' | 'cold'): Promise<void>`**

Clear a specific tier.

### Built-in Storage Tiers

#### `MemoryStorageTier`

In-memory storage using TinyLRU for efficient LRU eviction.
```typescript
import { MemoryStorageTier } from 'tiered-storage';

const tier = new MemoryStorageTier({
  maxSizeBytes: 100 * 1024 * 1024, // 100MB
  maxItems: 1000,                  // Optional: max number of items
});
```

**Features:**

- Battle-tested TinyLRU library
- Automatic LRU eviction
- Size-based and count-based limits
- Single process only (not distributed)

#### `DiskStorageTier`

Filesystem-based storage with `.meta` files.

```typescript
import { DiskStorageTier } from 'tiered-storage';

const tier = new DiskStorageTier({
  directory: './cache',
  maxSizeBytes: 10 * 1024 * 1024 * 1024, // 10GB (optional)
  evictionPolicy: 'lru',                 // 'lru' | 'fifo' | 'size'
});
```

**Features:**

- Human-readable file structure
- Optional size-based eviction
- Three eviction policies: LRU, FIFO, size-based
- Atomic writes with `.meta` files
- Zero external dependencies

**File structure:**

```
cache/
├── user%3A123                    # Data file (encoded key)
├── user%3A123.meta               # Metadata JSON
├── site%3Aabc%2Findex.html
└── site%3Aabc%2Findex.html.meta
```

#### `S3StorageTier`

AWS S3 or S3-compatible object storage.

```typescript
import { S3StorageTier } from 'tiered-storage';

// AWS S3 with separate metadata bucket (RECOMMENDED!)
const tier = new S3StorageTier({
  bucket: 'my-data-bucket',
  metadataBucket: 'my-metadata-bucket', // Stores metadata separately for fast updates
  region: 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
  },
  prefix: 'cache/', // Optional key prefix
});

// Cloudflare R2 with metadata bucket
const r2Tier = new S3StorageTier({
  bucket: 'my-r2-data-bucket',
  metadataBucket: 'my-r2-metadata-bucket',
  region: 'auto',
  endpoint: 'https://account-id.r2.cloudflarestorage.com',
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

// Without metadata bucket (legacy mode - slower, more expensive)
const legacyTier = new S3StorageTier({
  bucket: 'my-bucket',
  region: 'us-east-1',
  // No metadataBucket - metadata stored in S3 object metadata fields
});
```

**Features:**

- Compatible with AWS S3, Cloudflare R2, MinIO, and other S3-compatible services
- **Separate metadata bucket support (RECOMMENDED)** - stores metadata as JSON objects for fast, cheap updates
- Legacy mode: metadata in S3 object metadata fields (requires object copying for updates)
- Efficient batch deletions (up to 1000 keys per request)
- Optional key prefixing for multi-tenant scenarios
- Typically used as cold tier (source of truth)

**⚠️ Important:** Without `metadataBucket`, updating metadata (e.g., access counts) requires copying the entire object, which is slow and expensive for large files. Use a separate metadata bucket in production!
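The 1000-key figure in the feature list matches the per-request limit of S3's `DeleteObjects` operation, so a custom S3-compatible tier that implements `deleteMany` needs to split larger key lists into batches. How `S3StorageTier` does this internally is not shown in this README; the following is a minimal sketch of the chunking step only, and `chunkKeys` is a hypothetical helper name.

```typescript
// Hypothetical helper: split a key list into batches of at most `size` keys.
// 1000 is the maximum number of keys S3 accepts in one DeleteObjects request.
function chunkKeys(keys: string[], size = 1000): string[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: string[][] = [];
  for (let i = 0; i < keys.length; i += size) {
    batches.push(keys.slice(i, i + size));
  }
  return batches;
}

// 2500 keys become three batches: 1000, 1000, and 500 keys.
const exampleKeys = Array.from({ length: 2500 }, (_, i) => `cache/user:${i}`);
const exampleBatches = chunkKeys(exampleKeys);
console.log(exampleBatches.map((batch) => batch.length)); // [1000, 1000, 500]
```

Each batch would then be sent as one delete request; issuing the requests sequentially or with bounded concurrency is a design choice left to the tier.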
## Usage Patterns

### Pattern 1: Simple Single-Server Setup

```typescript
import { TieredStorage, MemoryStorageTier, DiskStorageTier } from 'tiered-storage';

const storage = new TieredStorage({
  tiers: {
    hot: new MemoryStorageTier({ maxSizeBytes: 100 * 1024 * 1024 }),
    warm: new DiskStorageTier({ directory: './cache' }),
    cold: new DiskStorageTier({ directory: './storage' }),
  },
  compression: true,
  defaultTTL: 14 * 24 * 60 * 60 * 1000, // 14 days
});

await storage.set('user:123', { name: 'Alice', email: 'alice@example.com' });
const user = await storage.get('user:123');
```

### Pattern 2: Static Site Hosting (wisp.place-style)

```typescript
import { TieredStorage, MemoryStorageTier, DiskStorageTier } from 'tiered-storage';

const storage = new TieredStorage({
  tiers: {
    hot: new MemoryStorageTier({
      maxSizeBytes: 100 * 1024 * 1024, // 100MB
      maxItems: 500,
    }),
    warm: new DiskStorageTier({
      directory: './cache/sites',
      maxSizeBytes: 10 * 1024 * 1024 * 1024, // 10GB
    }),
    // Cold tier is PDS (fetched on demand via custom tier implementation).
    // `cold` is required by TieredStorageConfig; it is omitted from this excerpt.
  },
  compression: true,
  defaultTTL: 14 * 24 * 60 * 60 * 1000,
  promotionStrategy: 'lazy', // Don't auto-promote large files to hot
});

// Store index.html in all tiers (fast access)
await storage.set(`${did}/${rkey}/index.html`, htmlBuffer, {
  metadata: { mimeType: 'text/html', encoding: 'gzip' },
});

// Store large files only in warm + cold (skip hot)
await storage.set(`${did}/${rkey}/video.mp4`, videoBuffer, {
  skipTiers: ['hot'],
  metadata: { mimeType: 'video/mp4' },
});

// Get file with source tracking
const result = await storage.getWithMetadata(`${did}/${rkey}/index.html`);
console.log(`Served from ${result.source}`); // Likely 'hot' for index.html

// Invalidate entire site
await storage.invalidate(`${did}/${rkey}/`);

// Renew TTL when site is accessed
await storage.touch(`${did}/${rkey}/index.html`);
```

### Pattern 3: Custom Backend (SQLite)

Implement the `StorageTier` interface to use any backend:

```typescript
import {
  TieredStorage,
  DiskStorageTier,
  StorageTier,
  StorageMetadata,
  TierStats,
} from 'tiered-storage';
import Database from 'better-sqlite3';

class SQLiteStorageTier implements StorageTier {
  private db: Database.Database;

  constructor(dbPath: string) {
    this.db = new Database(dbPath);
    this.db.exec(`
      CREATE TABLE IF NOT EXISTS cache (
        key TEXT PRIMARY KEY,
        data BLOB NOT NULL,
        metadata TEXT NOT NULL
      )
    `);
  }

  async get(key: string): Promise<Uint8Array | null> {
    const row = this.db.prepare('SELECT data FROM cache WHERE key = ?').get(key) as
      | { data: Buffer }
      | undefined;
    return row ? new Uint8Array(row.data) : null;
  }

  async set(key: string, data: Uint8Array, metadata: StorageMetadata): Promise<void> {
    this.db.prepare('INSERT OR REPLACE INTO cache (key, data, metadata) VALUES (?, ?, ?)')
      .run(key, Buffer.from(data), JSON.stringify(metadata));
  }

  async delete(key: string): Promise<void> {
    this.db.prepare('DELETE FROM cache WHERE key = ?').run(key);
  }

  async exists(key: string): Promise<boolean> {
    const row = this.db.prepare('SELECT 1 FROM cache WHERE key = ?').get(key);
    return !!row;
  }

  async *listKeys(prefix?: string): AsyncIterableIterator<string> {
    // Caveat: LIKE treats `%` and `_` in the prefix as wildcards and is
    // case-insensitive for ASCII by default; escape or use GLOB if that matters.
    const query = prefix
      ? this.db.prepare('SELECT key FROM cache WHERE key LIKE ?')
      : this.db.prepare('SELECT key FROM cache');
    const rows = (prefix ? query.all(`${prefix}%`) : query.all()) as { key: string }[];
    for (const row of rows) {
      yield row.key;
    }
  }

  async deleteMany(keys: string[]): Promise<void> {
    const placeholders = keys.map(() => '?').join(',');
    this.db.prepare(`DELETE FROM cache WHERE key IN (${placeholders})`).run(...keys);
  }

  async getMetadata(key: string): Promise<StorageMetadata | null> {
    const row = this.db.prepare('SELECT metadata FROM cache WHERE key = ?').get(key) as
      | { metadata: string }
      | undefined;
    return row ? JSON.parse(row.metadata) : null;
  }

  async setMetadata(key: string, metadata: StorageMetadata): Promise<void> {
    this.db.prepare('UPDATE cache SET metadata = ? WHERE key = ?')
      .run(JSON.stringify(metadata), key);
  }

  async getStats(): Promise<TierStats> {
    const row = this.db
      .prepare('SELECT COUNT(*) as count, SUM(LENGTH(data)) as bytes FROM cache')
      .get() as { count: number; bytes: number | null };
    return { items: row.count, bytes: row.bytes || 0 };
  }

  async clear(): Promise<void> {
    this.db.prepare('DELETE FROM cache').run();
  }
}

// Use it
const storage = new TieredStorage({
  tiers: {
    warm: new SQLiteStorageTier('./cache.db'),
    cold: new DiskStorageTier({ directory: './storage' }),
  },
});
```

## Running Examples

### Interactive Demo Server

Run a **real HTTP server** that serves the example site using tiered storage:

```bash
# Configure S3 credentials first (copy .env.example to .env and fill in)
cp .env.example .env

# Start the demo server
bun run serve
```

Then visit:

- **http://localhost:3000/** - The demo site served from tiered storage
- **http://localhost:3000/admin/stats** - Live cache statistics dashboard

Watch the console to see which tier serves each request:

- 🔥 **Hot tier (memory)** - index.html served instantly
- 💾 **Warm tier (disk)** - Other pages served from disk cache
- ☁️ **Cold tier (S3)** - First access fetches from S3, then cached

### Command-Line Examples

Or run the non-interactive examples:

```bash
bun run example
```

The examples include:

- **Basic CRUD operations** with statistics tracking
- **Static site hosting** using the real site in `example-site/` directory
- **Bootstrap demonstrations** (warming caches from lower tiers)
- **Promotion strategy comparisons** (eager vs lazy)

The `example-site/` directory contains a complete static website with:

- `index.html` - Stored in hot + warm + cold (instant serving)
- `about.html`, `docs.html` - Stored in warm + cold (skips hot)
- `style.css`, `script.js` - Stored in warm + cold (skips hot)

This demonstrates the exact pattern you'd use for wisp.place: critical files in memory, everything else on disk/S3.
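The placement used by the example site (entry page in every tier, everything else skipping hot) can be expressed as a small policy function. This is a sketch under stated assumptions, not code from the examples: `skipTiersFor`, the "only `index.html` is critical" rule, and the 64 KiB cutoff are all hypothetical.

```typescript
// Hypothetical policy: decide which tiers a file should skip.
// Assumption: only small entry pages (index.html) are worth keeping in memory.
const HOT_MAX_BYTES = 64 * 1024; // illustrative cutoff, not a library default

function skipTiersFor(path: string, sizeBytes: number): Array<'hot' | 'warm' | 'cold'> {
  const isEntryPage = path === 'index.html' || path.endsWith('/index.html');
  // Entry pages under the cutoff go to all tiers; everything else skips hot.
  return isEntryPage && sizeBytes <= HOT_MAX_BYTES ? [] : ['hot'];
}

console.log(skipTiersFor('site:abc/index.html', 2048));     // no tiers skipped
console.log(skipTiersFor('site:abc/style.css', 2048));      // skips hot
console.log(skipTiersFor('site:abc/video.mp4', 50000000));  // skips hot
```

The returned array can be passed as the `skipTiers` option of `storage.set`, which keeps the placement rule in one place instead of repeating it at each call site.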
## Testing

```bash
bun test
```

## Development

```bash
# Install dependencies
bun install

# Type check
bun run check

# Build
bun run build

# Run tests
bun test
```

## License

MIT