WIP library that stores cold objects in S3, warm objects on disk, and hot objects in memory
nodejs typescript
# Tiered Storage

A lightweight, pluggable tiered storage library that orchestrates caching across hot (memory), warm (disk/database), and cold (S3/object storage) tiers.

## Features

- **Cascading Containment Model**: Hot ⊆ Warm ⊆ Cold (lower tiers contain all data from upper tiers)
- **Pluggable Backends**: Bring your own Redis, Postgres, SQLite, or use built-in implementations
- **Automatic Promotion**: Configurable eager/lazy promotion strategies for cache warming
- **TTL Management**: Per-key TTL with automatic expiration and renewal
- **Prefix Invalidation**: Efficiently delete groups of keys by prefix
- **Bootstrap Support**: Warm up caches from lower tiers on startup
- **Compression**: Optional transparent gzip compression
- **TypeScript First**: Full type safety with comprehensive TSDoc comments
- **Zero Forced Dependencies**: Only require what you use

## Installation

```bash
npm install tiered-storage
# or
bun add tiered-storage
```

## Quick Start

```typescript
import { TieredStorage, MemoryStorageTier, DiskStorageTier, S3StorageTier } from 'tiered-storage';

const storage = new TieredStorage({
  tiers: {
    hot: new MemoryStorageTier({ maxSizeBytes: 100 * 1024 * 1024 }), // 100MB
    warm: new DiskStorageTier({ directory: './cache' }),
    cold: new S3StorageTier({
      bucket: 'my-bucket',
      region: 'us-east-1',
      credentials: {
        accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
      },
    }),
  },
  compression: true,
  defaultTTL: 14 * 24 * 60 * 60 * 1000, // 14 days
  promotionStrategy: 'lazy',
});

// Store data (cascades to all tiers)
await storage.set('user:123', { name: 'Alice', email: 'alice@example.com' });

// Retrieve data (checks hot → warm → cold; a hit in a lower tier can bubble up)
const user = await storage.get('user:123');

// Get data with metadata and source tier
const result = await storage.getWithMetadata('user:123');
console.log(`Served from ${result?.source}`); // 'hot', 'warm', or 'cold'

// Invalidate all keys with prefix
await storage.invalidate('user:');

// Renew TTL
await storage.touch('user:123');
```

## Core Concepts

### Cascading Containment Model

```
┌──────────────────────────────────────────────────────┐
│ Cold Storage (S3/Object Storage)                     │
│ • Contains ALL objects (source of truth)             │
│ • Slowest access, unlimited capacity                 │
├──────────────────────────────────────────────────────┤
│ Warm Storage (Disk/Database)                         │
│ • Contains ALL hot objects + additional warm objects │
│ • Medium access speed, large capacity                │
├──────────────────────────────────────────────────────┤
│ Hot Storage (Memory)                                 │
│ • Contains only the hottest objects                  │
│ • Fastest access, limited capacity                   │
└──────────────────────────────────────────────────────┘
```

**Write Strategy (Cascading Down):**
- Write to **hot** → also writes to **warm** and **cold**
- Write to **warm** → also writes to **cold**
- Write to **cold** → only writes to **cold**

**Read Strategy (Bubbling Up):**
- Check **hot** first → if miss, check **warm** → if miss, check **cold**
- On cache miss, optionally promote data up through tiers

### Selective Tier Placement

For use cases like static site hosting, you can control which files go into which tiers:

```typescript
// Small, critical file (index.html) - store in all tiers for instant serving
await storage.set('site:abc/index.html', htmlContent);

// Large file (video) - skip hot tier to avoid memory bloat
await storage.set('site:abc/video.mp4', videoData, { skipTiers: ['hot'] });

// Medium files (images, CSS) - skip hot, use warm + cold
await storage.set('site:abc/style.css', cssData, { skipTiers: ['hot'] });
```

This pattern ensures:
- Hot tier stays small and fast (only critical files)
- Warm tier caches everything (all site files on disk)
- Cold tier is source of truth (all data)

## API Reference

### `TieredStorage`

Main orchestrator class for tiered storage.

#### Constructor

```typescript
new TieredStorage<T>(config: TieredStorageConfig)
```

**Config Options:**

```typescript
interface TieredStorageConfig {
  tiers: {
    hot?: StorageTier;   // Optional: fastest tier (memory/Redis)
    warm?: StorageTier;  // Optional: medium tier (disk/SQLite/Postgres)
    cold: StorageTier;   // Required: slowest tier (S3/object storage)
  };
  compression?: boolean;                 // Auto-compress before storing (default: false)
  defaultTTL?: number;                   // Default TTL in milliseconds
  promotionStrategy?: 'eager' | 'lazy';  // When to promote to upper tiers (default: 'lazy')
  serialization?: {                      // Custom serialization (default: JSON)
    serialize: (data: unknown) => Promise<Uint8Array>;
    deserialize: (data: Uint8Array) => Promise<unknown>;
  };
}
```

#### Methods

**`get(key: string): Promise<T | null>`**

Retrieve data for a key. Returns null if not found or expired.

**`getWithMetadata(key: string): Promise<StorageResult<T> | null>`**

Retrieve data with metadata and source tier information. Returns null if the key is not found, so check the result before reading its fields.

```typescript
const result = await storage.getWithMetadata('user:123');
console.log(result?.data);     // The actual data
console.log(result?.source);   // 'hot' | 'warm' | 'cold'
console.log(result?.metadata); // Metadata (size, timestamps, TTL, etc.)
```

**`set(key: string, data: T, options?: SetOptions): Promise<SetResult>`**

Store data with optional configuration.

```typescript
await storage.set('key', data, {
  ttl: 24 * 60 * 60 * 1000,                      // Custom TTL (24 hours)
  metadata: { contentType: 'application/json' }, // Custom metadata
  skipTiers: ['hot'],                            // Skip specific tiers
});
```

**`delete(key: string): Promise<void>`**

Delete data from all tiers.
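
The cascading writes and bubbling reads behind `set`, `get`, and `delete` can be sketched with plain in-memory maps. This is a hypothetical, synchronous model for illustration only, not the library's implementation: the `CascadeSketch` class and its `promoteOnRead` flag are assumptions, and the way the real `promotionStrategy` values map onto promotion is not covered by it.

```typescript
// Hypothetical model of the cascade; names here are NOT part of tiered-storage.
type TierName = 'hot' | 'warm' | 'cold';

// Fastest tier first: reads walk this order, writes touch every tier.
const TIER_ORDER: TierName[] = ['hot', 'warm', 'cold'];

class CascadeSketch<T> {
  private tiers: Record<TierName, Map<string, T>> = {
    hot: new Map<string, T>(),
    warm: new Map<string, T>(),
    cold: new Map<string, T>(),
  };
  private promoteOnRead: boolean; // assumption: a simple on/off promotion switch

  constructor(promoteOnRead: boolean = true) {
    this.promoteOnRead = promoteOnRead;
  }

  // Write to every tier except the ones listed in skipTiers.
  set(key: string, value: T, skipTiers: TierName[] = []): void {
    for (const name of TIER_ORDER) {
      if (skipTiers.indexOf(name) === -1) this.tiers[name].set(key, value);
    }
  }

  // Check hot, then warm, then cold. On a hit, optionally copy the value
  // into every faster tier that was checked and missed.
  get(key: string): { data: T; source: TierName } | null {
    const missed: TierName[] = [];
    for (const name of TIER_ORDER) {
      const tier = this.tiers[name];
      if (tier.has(key)) {
        const data = tier.get(key) as T;
        if (this.promoteOnRead) {
          for (const upper of missed) this.tiers[upper].set(key, data);
        }
        return { data, source: name };
      }
      missed.push(name);
    }
    return null;
  }

  // Remove the key from every tier.
  delete(key: string): void {
    for (const name of TIER_ORDER) this.tiers[name].delete(key);
  }
}

const store = new CascadeSketch<string>(true);
store.set('site:abc/video.mp4', 'bytes', ['hot']); // lands in warm + cold only
const first = store.get('site:abc/video.mp4');     // source: 'warm', then copied to hot
const second = store.get('site:abc/video.mp4');    // source: 'hot'
```

In this model the tier order is the only thing that distinguishes tiers, which is why `skipTiers` and promotion are enough to reproduce the placement rules described under Selective Tier Placement.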

**`exists(key: string): Promise<boolean>`**

Check if a key exists (and hasn't expired).

**`touch(key: string, ttlMs?: number): Promise<void>`**

Renew TTL for a key. Useful for "keep alive" behavior.

**`invalidate(prefix: string): Promise<number>`**

Delete all keys matching a prefix. Returns number of keys deleted.

```typescript
await storage.invalidate('user:');     // Delete all user keys
await storage.invalidate('site:abc/'); // Delete all files for site 'abc'
await storage.invalidate('');          // Delete everything
```

**`listKeys(prefix?: string): AsyncIterableIterator<string>`**

List all keys, optionally filtered by prefix.

```typescript
for await (const key of storage.listKeys('user:')) {
  console.log(key); // 'user:123', 'user:456', etc.
}
```

**`getStats(): Promise<AllTierStats>`**

Get aggregated statistics across all tiers.

```typescript
const stats = await storage.getStats();
console.log(stats.hot);     // Hot tier stats (size, items, hits, misses)
console.log(stats.hitRate); // Overall hit rate (0-1)
```

**`bootstrapHot(limit?: number): Promise<number>`**

Load most frequently accessed items from warm into hot. Returns number of items loaded.

```typescript
// On server startup: warm up hot tier
const loaded = await storage.bootstrapHot(1000); // Load top 1000 items
console.log(`Loaded ${loaded} items into hot tier`);
```

**`bootstrapWarm(options?: { limit?: number; sinceDate?: Date }): Promise<number>`**

Load recent items from cold into warm. Returns number of items loaded.

```typescript
// Load items accessed in last 7 days
const loaded = await storage.bootstrapWarm({
  sinceDate: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000),
  limit: 10000,
});
```

**`export(): Promise<StorageSnapshot>`**

Export metadata snapshot for backup or migration.
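
Several of the methods above (`get`, `exists`, `touch`) depend on per-key TTL. As a rough illustration of those semantics, here is a minimal sketch that assumes expiry is checked lazily on access. `TtlSketch` and its injectable clock are hypothetical and exist only to make the behavior easy to follow; the library's actual expiry mechanism may differ.

```typescript
// Hypothetical TTL bookkeeping; NOT the library's implementation.
interface TtlEntry<T> {
  value: T;
  expiresAt: number; // absolute time in milliseconds
}

class TtlSketch<T> {
  private entries = new Map<string, TtlEntry<T>>();
  private defaultTtlMs: number;
  private now: () => number; // injected clock so expiry can be simulated

  constructor(defaultTtlMs: number, now: () => number = () => Date.now()) {
    this.defaultTtlMs = defaultTtlMs;
    this.now = now;
  }

  set(key: string, value: T, ttlMs: number = this.defaultTtlMs): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  // Returns null when the key is missing or its TTL has elapsed.
  get(key: string): T | null {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expire lazily, on access
      return null;
    }
    return entry.value;
  }

  exists(key: string): boolean {
    return this.get(key) !== null;
  }

  // Restart the TTL window from the current time; ignores missing or expired keys.
  touch(key: string, ttlMs: number = this.defaultTtlMs): void {
    const entry = this.entries.get(key);
    if (entry && entry.expiresAt > this.now()) {
      entry.expiresAt = this.now() + ttlMs;
    }
  }
}
```

Calling `touch` on each access gives the "keep alive" behavior: a key that keeps being read never reaches its expiry time, while an idle key drops out after one TTL window.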

**`import(snapshot: StorageSnapshot): Promise<void>`**

Import metadata snapshot.

**`clear(): Promise<void>`**

Clear all data from all tiers. ⚠️ Use with extreme caution!

**`clearTier(tier: 'hot' | 'warm' | 'cold'): Promise<void>`**

Clear a specific tier.

### Built-in Storage Tiers

#### `MemoryStorageTier`

In-memory storage using TinyLRU for efficient LRU eviction.

```typescript
import { MemoryStorageTier } from 'tiered-storage';

const tier = new MemoryStorageTier({
  maxSizeBytes: 100 * 1024 * 1024, // 100MB
  maxItems: 1000,                  // Optional: max number of items
});
```

**Features:**
- Battle-tested TinyLRU library
- Automatic LRU eviction
- Size-based and count-based limits
- Single process only (not distributed)

#### `DiskStorageTier`

Filesystem-based storage with `.meta` files.

```typescript
import { DiskStorageTier } from 'tiered-storage';

const tier = new DiskStorageTier({
  directory: './cache',
  maxSizeBytes: 10 * 1024 * 1024 * 1024, // 10GB (optional)
  evictionPolicy: 'lru',                 // 'lru' | 'fifo' | 'size'
});
```

**Features:**
- Human-readable file structure
- Optional size-based eviction
- Three eviction policies: LRU, FIFO, size-based
- Atomic writes with `.meta` files
- Zero external dependencies

**File structure:**
```
cache/
├── user%3A123               # Data file (encoded key)
├── user%3A123.meta          # Metadata JSON
├── site%3Aabc%2Findex.html
└── site%3Aabc%2Findex.html.meta
```

#### `S3StorageTier`

AWS S3 or S3-compatible object storage.

```typescript
import { S3StorageTier } from 'tiered-storage';

// AWS S3 with separate metadata bucket (RECOMMENDED!)
const tier = new S3StorageTier({
  bucket: 'my-data-bucket',
  metadataBucket: 'my-metadata-bucket', // Stores metadata separately for fast updates
  region: 'us-east-1',
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
  },
  prefix: 'cache/', // Optional key prefix
});

// Cloudflare R2 with metadata bucket
const r2Tier = new S3StorageTier({
  bucket: 'my-r2-data-bucket',
  metadataBucket: 'my-r2-metadata-bucket',
  region: 'auto',
  endpoint: 'https://account-id.r2.cloudflarestorage.com',
  credentials: {
    accessKeyId: process.env.R2_ACCESS_KEY_ID!,
    secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
  },
});

// Without metadata bucket (legacy mode - slower, more expensive)
const legacyTier = new S3StorageTier({
  bucket: 'my-bucket',
  region: 'us-east-1',
  // No metadataBucket - metadata stored in S3 object metadata fields
});
```

**Features:**
- Compatible with AWS S3, Cloudflare R2, MinIO, and other S3-compatible services
- **Separate metadata bucket support (RECOMMENDED)** - stores metadata as JSON objects for fast, cheap updates
- Legacy mode: metadata in S3 object metadata fields (requires object copying for updates)
- Efficient batch deletions (up to 1000 keys per request)
- Optional key prefixing for multi-tenant scenarios
- Typically used as cold tier (source of truth)

**⚠️ Important:** Without `metadataBucket`, updating metadata (e.g., access counts) requires copying the entire object, which is slow and expensive for large files. Use a separate metadata bucket in production!
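
The batch-deletion limit in the features list comes from S3 itself: a single DeleteObjects request accepts at most 1000 keys. A custom tier that talks to an S3-compatible service would therefore need to split large key lists before deleting. The sketch below shows one way to do that; `chunkKeys`, `deleteInBatches`, and the `deleteBatch` callback are hypothetical helpers for illustration, not part of this library or the AWS SDK.

```typescript
// S3's DeleteObjects call accepts at most 1000 keys per request.
const MAX_KEYS_PER_DELETE = 1000;

// Split a key list into consecutive batches of at most `size` keys.
function chunkKeys(keys: string[], size: number = MAX_KEYS_PER_DELETE): string[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: string[][] = [];
  for (let i = 0; i < keys.length; i += size) {
    batches.push(keys.slice(i, i + size));
  }
  return batches;
}

// Hypothetical driver: `deleteBatch` stands in for whatever issues one
// delete request to the backend. Batches run one after another.
async function deleteInBatches(
  keys: string[],
  deleteBatch: (batch: string[]) => Promise<void>,
): Promise<number> {
  let deleted = 0;
  for (const batch of chunkKeys(keys)) {
    await deleteBatch(batch);
    deleted += batch.length;
  }
  return deleted; // number of keys submitted for deletion
}
```

Running batches sequentially keeps the sketch simple; a real tier might issue a few requests in parallel, at the cost of handling partial failures.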

## Usage Patterns

### Pattern 1: Simple Single-Server Setup

```typescript
import { TieredStorage, MemoryStorageTier, DiskStorageTier } from 'tiered-storage';

const storage = new TieredStorage({
  tiers: {
    hot: new MemoryStorageTier({ maxSizeBytes: 100 * 1024 * 1024 }),
    warm: new DiskStorageTier({ directory: './cache' }),
    cold: new DiskStorageTier({ directory: './storage' }),
  },
  compression: true,
  defaultTTL: 14 * 24 * 60 * 60 * 1000, // 14 days
});

await storage.set('user:123', { name: 'Alice', email: 'alice@example.com' });
const user = await storage.get('user:123');
```

### Pattern 2: Static Site Hosting (wisp.place-style)

```typescript
import { TieredStorage, MemoryStorageTier, DiskStorageTier } from 'tiered-storage';

const storage = new TieredStorage({
  tiers: {
    hot: new MemoryStorageTier({
      maxSizeBytes: 100 * 1024 * 1024, // 100MB
      maxItems: 500,
    }),
    warm: new DiskStorageTier({
      directory: './cache/sites',
      maxSizeBytes: 10 * 1024 * 1024 * 1024, // 10GB
    }),
    // cold: required by TieredStorageConfig but omitted from this snippet.
    // Here the cold tier is the PDS, fetched on demand via a custom tier implementation.
  },
  compression: true,
  defaultTTL: 14 * 24 * 60 * 60 * 1000,
  promotionStrategy: 'lazy', // Don't auto-promote large files to hot
});

// `did` and `rkey` identify the site and come from the surrounding application code.

// Store index.html in all tiers (fast access)
await storage.set(`${did}/${rkey}/index.html`, htmlBuffer, {
  metadata: { mimeType: 'text/html', encoding: 'gzip' },
});

// Store large files only in warm + cold (skip hot)
await storage.set(`${did}/${rkey}/video.mp4`, videoBuffer, {
  skipTiers: ['hot'],
  metadata: { mimeType: 'video/mp4' },
});

// Get file with source tracking
const result = await storage.getWithMetadata(`${did}/${rkey}/index.html`);
console.log(`Served from ${result?.source}`); // Likely 'hot' for index.html

// Invalidate entire site
await storage.invalidate(`${did}/${rkey}/`);

// Renew TTL when site is accessed
await storage.touch(`${did}/${rkey}/index.html`);
```

### Pattern 3: Custom Backend (SQLite)

Implement the `StorageTier` interface to use any backend:

```typescript
import { TieredStorage, DiskStorageTier, StorageTier, StorageMetadata, TierStats } from 'tiered-storage';
import Database from 'better-sqlite3';

class SQLiteStorageTier implements StorageTier {
  private db: Database.Database;

  constructor(dbPath: string) {
    this.db = new Database(dbPath);
    this.db.exec(`
      CREATE TABLE IF NOT EXISTS cache (
        key TEXT PRIMARY KEY,
        data BLOB NOT NULL,
        metadata TEXT NOT NULL
      )
    `);
  }

  async get(key: string): Promise<Uint8Array | null> {
    const row = this.db.prepare('SELECT data FROM cache WHERE key = ?').get(key) as
      | { data: Buffer }
      | undefined;
    return row ? new Uint8Array(row.data) : null;
  }

  async set(key: string, data: Uint8Array, metadata: StorageMetadata): Promise<void> {
    this.db.prepare('INSERT OR REPLACE INTO cache (key, data, metadata) VALUES (?, ?, ?)')
      .run(key, Buffer.from(data), JSON.stringify(metadata));
  }

  async delete(key: string): Promise<void> {
    this.db.prepare('DELETE FROM cache WHERE key = ?').run(key);
  }

  async exists(key: string): Promise<boolean> {
    const row = this.db.prepare('SELECT 1 FROM cache WHERE key = ?').get(key);
    return !!row;
  }

  async *listKeys(prefix?: string): AsyncIterableIterator<string> {
    // Compare with substr instead of LIKE: LIKE treats '%' and '_' in the prefix
    // as wildcards and is case-insensitive for ASCII by default.
    const rows = (prefix
      ? this.db.prepare('SELECT key FROM cache WHERE substr(key, 1, length(?)) = ?').all(prefix, prefix)
      : this.db.prepare('SELECT key FROM cache').all()) as { key: string }[];

    for (const row of rows) {
      yield row.key;
    }
  }

  async deleteMany(keys: string[]): Promise<void> {
    const placeholders = keys.map(() => '?').join(',');
    this.db.prepare(`DELETE FROM cache WHERE key IN (${placeholders})`).run(...keys);
  }

  async getMetadata(key: string): Promise<StorageMetadata | null> {
    const row = this.db.prepare('SELECT metadata FROM cache WHERE key = ?').get(key) as
      | { metadata: string }
      | undefined;
    return row ? JSON.parse(row.metadata) : null;
  }

  async setMetadata(key: string, metadata: StorageMetadata): Promise<void> {
    this.db.prepare('UPDATE cache SET metadata = ? WHERE key = ?')
      .run(JSON.stringify(metadata), key);
  }

  async getStats(): Promise<TierStats> {
    // SUM() returns NULL on an empty table, hence the fallback to 0
    const row = this.db.prepare('SELECT COUNT(*) as count, SUM(LENGTH(data)) as bytes FROM cache').get() as
      { count: number; bytes: number | null };
    return { items: row.count, bytes: row.bytes || 0 };
  }

  async clear(): Promise<void> {
    this.db.prepare('DELETE FROM cache').run();
  }
}

// Use it
const storage = new TieredStorage({
  tiers: {
    warm: new SQLiteStorageTier('./cache.db'),
    cold: new DiskStorageTier({ directory: './storage' }),
  },
});
```

## Running Examples

### Interactive Demo Server

Run a **real HTTP server** that serves the example site using tiered storage:

```bash
# Configure S3 credentials first (copy .env.example to .env and fill in)
cp .env.example .env

# Start the demo server
bun run serve
```

Then visit:
- **http://localhost:3000/** - The demo site served from tiered storage
- **http://localhost:3000/admin/stats** - Live cache statistics dashboard

Watch the console to see which tier serves each request:
- 🔥 **Hot tier (memory)** - index.html served instantly
- 💾 **Warm tier (disk)** - Other pages served from disk cache
- ☁️ **Cold tier (S3)** - First access fetches from S3, then cached

### Command-Line Examples

Or run the non-interactive examples:

```bash
bun run example
```

The examples include:
- **Basic CRUD operations** with statistics tracking
- **Static site hosting** using the real site in `example-site/` directory
- **Bootstrap demonstrations** (warming caches from lower tiers)
- **Promotion strategy comparisons** (eager vs lazy)

The `example-site/` directory contains a complete static website with:
- `index.html` - Stored in hot + warm + cold (instant serving)
- `about.html`, `docs.html` - Stored in warm + cold (skips hot)
- `style.css`, `script.js` - Stored in warm + cold (skips hot)

This demonstrates the exact pattern you'd use for wisp.place: critical files in memory, everything else on disk/S3.

## Testing

```bash
bun test
```

## Development

```bash
# Install dependencies
bun install

# Type check
bun run check

# Build
bun run build

# Run tests
bun test
```

## License

MIT