WIP library to store cold objects in S3, warm objects on disk, and hot objects in memory
nodejs typescript

humans write the best readmes, not bots

nekomimi.pet b4de0bdf 34ec0d34

verified
Changed files
+96 -483
README.md
···
-
# Tiered Storage
-
A lightweight, pluggable tiered storage library that orchestrates caching across hot (memory), warm (disk/database), and cold (S3/object storage) tiers.
## Features
-
- **Cascading Containment Model**: Hot ⊆ Warm ⊆ Cold (lower tiers contain all data from upper tiers)
-
- **Pluggable Backends**: Bring your own Redis, Postgres, SQLite, or use built-in implementations
-
- **Automatic Promotion**: Configurable eager/lazy promotion strategies for cache warming
-
- **TTL Management**: Per-key TTL with automatic expiration and renewal
-
- **Prefix Invalidation**: Efficiently delete groups of keys by prefix
-
- **Bootstrap Support**: Warm up caches from lower tiers on startup
-
- **Compression**: Optional transparent gzip compression
-
- **TypeScript First**: Full type safety with comprehensive TSDoc comments
-
- **Zero Forced Dependencies**: Only require what you use
-
## Installation
```bash
npm install tiered-storage
-
# or
-
bun add tiered-storage
```
-
## Quick Start
```typescript
-
import { TieredStorage, MemoryStorageTier, DiskStorageTier, S3StorageTier } from 'tiered-storage';
const storage = new TieredStorage({
tiers: {
-
hot: new MemoryStorageTier({ maxSizeBytes: 100 * 1024 * 1024 }), // 100MB
warm: new DiskStorageTier({ directory: './cache' }),
-
cold: new S3StorageTier({
-
bucket: 'my-bucket',
-
region: 'us-east-1',
-
credentials: {
-
accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
-
secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
-
},
-
}),
},
compression: true,
-
defaultTTL: 14 * 24 * 60 * 60 * 1000, // 14 days
-
promotionStrategy: 'lazy',
-
});
-
-
// Store data (cascades to all tiers)
-
await storage.set('user:123', { name: 'Alice', email: 'alice@example.com' });
-
-
// Retrieve data (checks hot → warm → cold; a hit in a lower tier can be promoted upward)
-
const user = await storage.get('user:123');
-
-
// Get data with metadata and source tier
-
const result = await storage.getWithMetadata('user:123');
-
console.log(`Served from ${result.source}`); // 'hot', 'warm', or 'cold'
-
-
// Invalidate all keys with prefix
-
await storage.invalidate('user:');
-
// Renew TTL
-
await storage.touch('user:123');
-
```
-
## Core Concepts
-
### Cascading Containment Model
```
-
┌──────────────────────────────────────────────────────┐
-
│ Cold Storage (S3/Object Storage) │
-
│ • Contains ALL objects (source of truth) │
-
│ • Slowest access, unlimited capacity │
-
├──────────────────────────────────────────────────────┤
-
│ Warm Storage (Disk/Database) │
-
│ • Contains ALL hot objects + additional warm objects │
-
│ • Medium access speed, large capacity │
-
├──────────────────────────────────────────────────────┤
-
│ Hot Storage (Memory) │
-
│ • Contains only the hottest objects │
-
│ • Fastest access, limited capacity │
-
└──────────────────────────────────────────────────────┘
-
```
-
-
**Write Strategy (Cascading Down):**
-
- Write to **hot** → also writes to **warm** and **cold**
-
- Write to **warm** → also writes to **cold**
-
- Write to **cold** → only writes to **cold**
-
-
**Read Strategy (Bubbling Up):**
-
- Check **hot** first → if miss, check **warm** → if miss, check **cold**
-
- On cache miss, optionally promote data up through tiers
-
-
### Selective Tier Placement
-
-
For use cases like static site hosting, you can control which files go into which tiers:
-
-
```typescript
-
// Small, critical file (index.html) - store in all tiers for instant serving
-
await storage.set('site:abc/index.html', htmlContent);
-
-
// Large file (video) - skip hot tier to avoid memory bloat
-
await storage.set('site:abc/video.mp4', videoData, { skipTiers: ['hot'] });
-
-
// Medium files (images, CSS) - skip hot, use warm + cold
-
await storage.set('site:abc/style.css', cssData, { skipTiers: ['hot'] });
-
```
-
-
This pattern ensures:
-
- Hot tier stays small and fast (only critical files)
-
- Warm tier caches everything (all site files on disk)
-
- Cold tier is source of truth (all data)
-
-
## API Reference
-
-
### `TieredStorage`
-
-
Main orchestrator class for tiered storage.
-
-
#### Constructor
-
-
```typescript
-
new TieredStorage<T>(config: TieredStorageConfig)
-
```
-
-
**Config Options:**
-
-
```typescript
-
interface TieredStorageConfig {
-
tiers: {
-
hot?: StorageTier; // Optional: fastest tier (memory/Redis)
-
warm?: StorageTier; // Optional: medium tier (disk/SQLite/Postgres)
-
cold: StorageTier; // Required: slowest tier (S3/object storage)
-
};
-
compression?: boolean; // Auto-compress before storing (default: false)
-
defaultTTL?: number; // Default TTL in milliseconds
-
promotionStrategy?: 'eager' | 'lazy'; // When to promote to upper tiers (default: 'lazy')
-
serialization?: { // Custom serialization (default: JSON)
-
serialize: (data: unknown) => Promise<Uint8Array>;
-
deserialize: (data: Uint8Array) => Promise<unknown>;
-
};
-
}
-
```
-
-
#### Methods
-
**`get(key: string): Promise<T | null>`**
-
-
Retrieve data for a key. Returns null if not found or expired.
-
-
**`getWithMetadata(key: string): Promise<StorageResult<T> | null>`**
-
-
Retrieve data with metadata and source tier information.
-
```typescript
-
const result = await storage.getWithMetadata('user:123');
-
console.log(result.data); // The actual data
-
console.log(result.source); // 'hot' | 'warm' | 'cold'
-
console.log(result.metadata); // Metadata (size, timestamps, TTL, etc.)
```
-
-
**`set(key: string, data: T, options?: SetOptions): Promise<SetResult>`**
-
-
Store data with optional configuration.
-
-
```typescript
-
await storage.set('key', data, {
-
ttl: 24 * 60 * 60 * 1000, // Custom TTL (24 hours)
-
metadata: { contentType: 'application/json' }, // Custom metadata
-
skipTiers: ['hot'], // Skip specific tiers
-
});
```
-
**`delete(key: string): Promise<void>`**
-
Delete data from all tiers.
-
**`exists(key: string): Promise<boolean>`**
-
Check if a key exists (and hasn't expired).
-
**`touch(key: string, ttlMs?: number): Promise<void>`**
-
-
Renew TTL for a key. Useful for "keep alive" behavior.
-
-
**`invalidate(prefix: string): Promise<number>`**
-
-
Delete all keys matching a prefix. Returns number of keys deleted.
-
```typescript
-
await storage.invalidate('user:'); // Delete all user keys
-
await storage.invalidate('site:abc/'); // Delete all files for site 'abc'
-
await storage.invalidate(''); // Delete everything
-
```
-
**`listKeys(prefix?: string): AsyncIterableIterator<string>`**
-
List all keys, optionally filtered by prefix.
```typescript
-
for await (const key of storage.listKeys('user:')) {
-
console.log(key); // 'user:123', 'user:456', etc.
}
```
-
**`getStats(): Promise<AllTierStats>`**
-
Get aggregated statistics across all tiers.
-
```typescript
-
const stats = await storage.getStats();
-
console.log(stats.hot); // Hot tier stats (size, items, hits, misses)
-
console.log(stats.hitRate); // Overall hit rate (0-1)
-
```
-
**`bootstrapHot(limit?: number): Promise<number>`**
-
-
Load most frequently accessed items from warm into hot. Returns number of items loaded.
-
-
```typescript
-
// On server startup: warm up hot tier
-
const loaded = await storage.bootstrapHot(1000); // Load top 1000 items
-
console.log(`Loaded ${loaded} items into hot tier`);
-
```
-
-
**`bootstrapWarm(options?: { limit?: number; sinceDate?: Date }): Promise<number>`**
-
-
Load recent items from cold into warm. Returns number of items loaded.
-
```typescript
-
// Load items accessed in last 7 days
-
const loaded = await storage.bootstrapWarm({
-
sinceDate: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000),
-
limit: 10000,
-
});
-
```
-
**`export(): Promise<StorageSnapshot>`**
-
Export metadata snapshot for backup or migration.
-
**`import(snapshot: StorageSnapshot): Promise<void>`**
-
Import metadata snapshot.
-
**`clear(): Promise<void>`**
-
Clear all data from all tiers. ⚠️ Use with extreme caution!
-
**`clearTier(tier: 'hot' | 'warm' | 'cold'): Promise<void>`**
-
Clear a specific tier.
-
### Built-in Storage Tiers
-
#### `MemoryStorageTier`
-
In-memory storage using TinyLRU for efficient LRU eviction.
```typescript
-
import { MemoryStorageTier } from 'tiered-storage';
-
-
const tier = new MemoryStorageTier({
-
maxSizeBytes: 100 * 1024 * 1024, // 100MB
-
maxItems: 1000, // Optional: max number of items
-
});
```
-
**Features:**
-
- Battle-tested TinyLRU library
-
- Automatic LRU eviction
-
- Size-based and count-based limits
-
- Single process only (not distributed)
-
#### `DiskStorageTier`
-
-
Filesystem-based storage with `.meta` files.
```typescript
-
import { DiskStorageTier } from 'tiered-storage';
-
-
const tier = new DiskStorageTier({
directory: './cache',
-
maxSizeBytes: 10 * 1024 * 1024 * 1024, // 10GB (optional)
-
evictionPolicy: 'lru', // 'lru' | 'fifo' | 'size'
-
});
```
-
**Features:**
-
- Human-readable file structure
-
- Optional size-based eviction
-
- Three eviction policies: LRU, FIFO, size-based
-
- Atomic writes with `.meta` files
-
- Zero external dependencies
-
**File structure:**
-
```
-
cache/
-
├── user%3A123 # Data file (encoded key)
-
├── user%3A123.meta # Metadata JSON
-
├── site%3Aabc%2Findex.html
-
└── site%3Aabc%2Findex.html.meta
-
```
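
The encoded names in this listing match what `encodeURIComponent` produces for the example keys. The sketch below reproduces that mapping. Whether `DiskStorageTier` calls `encodeURIComponent` internally is an assumption, and `keyToFileNames` / `fileNameToKey` are hypothetical helpers written for this illustration.

```typescript
// Map a storage key to a flat, filesystem-safe data file name plus its sidecar.
function keyToFileNames(key: string): { data: string; meta: string } {
  const data = encodeURIComponent(key); // ':' becomes %3A, '/' becomes %2F
  return { data, meta: `${data}.meta` };
}

// Reverse mapping, e.g. when recovering keys from a directory listing.
function fileNameToKey(fileName: string): string {
  return decodeURIComponent(fileName);
}

const names = keyToFileNames('site:abc/index.html');
console.log(names.data); // site%3Aabc%2Findex.html
console.log(names.meta); // site%3Aabc%2Findex.html.meta
console.log(fileNameToKey(names.data)); // site:abc/index.html
```

Encoding the whole key keeps every object in a single directory level, which is why `site:abc/index.html` shows up as one file rather than a nested path.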
-
-
#### `S3StorageTier`
-
-
AWS S3 or S3-compatible object storage.
```typescript
-
import { S3StorageTier } from 'tiered-storage';
-
-
// AWS S3 with separate metadata bucket (RECOMMENDED!)
-
const tier = new S3StorageTier({
-
bucket: 'my-data-bucket',
-
metadataBucket: 'my-metadata-bucket', // Stores metadata separately for fast updates
region: 'us-east-1',
-
credentials: {
-
accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
-
secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
-
},
-
prefix: 'cache/', // Optional key prefix
-
});
-
-
// Cloudflare R2 with metadata bucket
-
const r2Tier = new S3StorageTier({
-
bucket: 'my-r2-data-bucket',
-
metadataBucket: 'my-r2-metadata-bucket',
-
region: 'auto',
-
endpoint: 'https://account-id.r2.cloudflarestorage.com',
-
credentials: {
-
accessKeyId: process.env.R2_ACCESS_KEY_ID!,
-
secretAccessKey: process.env.R2_SECRET_ACCESS_KEY!,
-
},
-
});
-
-
// Without metadata bucket (legacy mode - slower, more expensive)
-
const legacyTier = new S3StorageTier({
-
bucket: 'my-bucket',
-
region: 'us-east-1',
-
// No metadataBucket - metadata stored in S3 object metadata fields
-
});
```
-
**Features:**
-
- Compatible with AWS S3, Cloudflare R2, MinIO, and other S3-compatible services
-
- **Separate metadata bucket support (RECOMMENDED)** - stores metadata as JSON objects for fast, cheap updates
-
- Legacy mode: metadata in S3 object metadata fields (requires object copying for updates)
-
- Efficient batch deletions (up to 1000 keys per request)
-
- Optional key prefixing for multi-tenant scenarios
-
- Typically used as cold tier (source of truth)
-
**⚠️ Important:** Without `metadataBucket`, updating metadata (e.g., access counts) requires copying the entire object, which is slow and expensive for large files. Use a separate metadata bucket in production!
-
## Usage Patterns
-
-
### Pattern 1: Simple Single-Server Setup
```typescript
-
import { TieredStorage, MemoryStorageTier, DiskStorageTier } from 'tiered-storage';
-
-
const storage = new TieredStorage({
-
tiers: {
-
hot: new MemoryStorageTier({ maxSizeBytes: 100 * 1024 * 1024 }),
-
warm: new DiskStorageTier({ directory: './cache' }),
-
cold: new DiskStorageTier({ directory: './storage' }),
-
},
-
compression: true,
-
defaultTTL: 14 * 24 * 60 * 60 * 1000, // 14 days
-
});
-
-
await storage.set('user:123', { name: 'Alice', email: 'alice@example.com' });
-
const user = await storage.get('user:123');
-
```
-
-
### Pattern 2: Static Site Hosting (wisp.place-style)
-
-
```typescript
-
import { TieredStorage, MemoryStorageTier, DiskStorageTier } from 'tiered-storage';
-
-
const storage = new TieredStorage({
-
tiers: {
-
hot: new MemoryStorageTier({
-
maxSizeBytes: 100 * 1024 * 1024, // 100MB
-
maxItems: 500,
-
}),
-
warm: new DiskStorageTier({
-
directory: './cache/sites',
-
maxSizeBytes: 10 * 1024 * 1024 * 1024, // 10GB
-
}),
-
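// Cold tier is the PDS, fetched on demand via a custom StorageTier implementation
cold: pdsTier, // hypothetical custom tier instance, defined elsewhere (cold is required by the config)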
-
},
-
compression: true,
-
defaultTTL: 14 * 24 * 60 * 60 * 1000,
-
promotionStrategy: 'lazy', // Don't auto-promote large files to hot
-
});
-
-
// Store index.html in all tiers (fast access)
-
await storage.set(`${did}/${rkey}/index.html`, htmlBuffer, {
-
metadata: { mimeType: 'text/html', encoding: 'gzip' },
-
});
-
-
// Store large files only in warm + cold (skip hot)
-
await storage.set(`${did}/${rkey}/video.mp4`, videoBuffer, {
-
skipTiers: ['hot'],
-
metadata: { mimeType: 'video/mp4' },
-
});
-
-
// Get file with source tracking
-
const result = await storage.getWithMetadata(`${did}/${rkey}/index.html`);
-
console.log(`Served from ${result.source}`); // Likely 'hot' for index.html
-
-
// Invalidate entire site
-
await storage.invalidate(`${did}/${rkey}/`);
-
-
// Renew TTL when site is accessed
-
await storage.touch(`${did}/${rkey}/index.html`);
```
-
### Pattern 3: Custom Backend (SQLite)
-
Implement the `StorageTier` interface to use any backend:
```typescript
-
import { TieredStorage, DiskStorageTier, StorageTier, StorageMetadata, TierStats } from 'tiered-storage';
-
import Database from 'better-sqlite3';
-
-
class SQLiteStorageTier implements StorageTier {
-
private db: Database.Database;
-
-
constructor(dbPath: string) {
-
this.db = new Database(dbPath);
-
this.db.exec(`
-
CREATE TABLE IF NOT EXISTS cache (
-
key TEXT PRIMARY KEY,
-
data BLOB NOT NULL,
-
metadata TEXT NOT NULL
-
)
-
`);
-
}
-
-
async get(key: string): Promise<Uint8Array | null> {
-
const row = this.db.prepare('SELECT data FROM cache WHERE key = ?').get(key);
-
return row ? new Uint8Array(row.data) : null;
-
}
-
-
async set(key: string, data: Uint8Array, metadata: StorageMetadata): Promise<void> {
-
this.db.prepare('INSERT OR REPLACE INTO cache (key, data, metadata) VALUES (?, ?, ?)')
-
.run(key, Buffer.from(data), JSON.stringify(metadata));
-
}
-
-
async delete(key: string): Promise<void> {
-
this.db.prepare('DELETE FROM cache WHERE key = ?').run(key);
-
}
-
-
async exists(key: string): Promise<boolean> {
-
const row = this.db.prepare('SELECT 1 FROM cache WHERE key = ?').get(key);
-
return !!row;
-
}
-
-
async *listKeys(prefix?: string): AsyncIterableIterator<string> {
-
const query = prefix
-
? this.db.prepare('SELECT key FROM cache WHERE instr(key, ?) = 1')
-
: this.db.prepare('SELECT key FROM cache');
-
-
const rows = prefix ? query.all(prefix) : query.all(); // instr() matches the prefix literally and case-sensitively; LIKE treats % and _ as wildcards
-
-
for (const row of rows) {
-
yield row.key;
-
}
-
}
-
-
async deleteMany(keys: string[]): Promise<void> {
-
const placeholders = keys.map(() => '?').join(',');
-
this.db.prepare(`DELETE FROM cache WHERE key IN (${placeholders})`).run(...keys);
-
}
-
-
async getMetadata(key: string): Promise<StorageMetadata | null> {
-
const row = this.db.prepare('SELECT metadata FROM cache WHERE key = ?').get(key);
-
return row ? JSON.parse(row.metadata) : null;
-
}
-
-
async setMetadata(key: string, metadata: StorageMetadata): Promise<void> {
-
this.db.prepare('UPDATE cache SET metadata = ? WHERE key = ?')
-
.run(JSON.stringify(metadata), key);
-
}
-
-
async getStats(): Promise<TierStats> {
-
const row = this.db.prepare('SELECT COUNT(*) as count, SUM(LENGTH(data)) as bytes FROM cache').get();
-
return { items: row.count, bytes: row.bytes || 0 };
-
}
-
-
async clear(): Promise<void> {
-
this.db.prepare('DELETE FROM cache').run();
-
}
-
}
-
-
// Use it
-
const storage = new TieredStorage({
-
tiers: {
-
warm: new SQLiteStorageTier('./cache.db'),
-
cold: new DiskStorageTier({ directory: './storage' }),
-
},
-
});
```
-
## Running Examples
-
-
### Interactive Demo Server
-
-
Run a **real HTTP server** that serves the example site using tiered storage:
```bash
-
# Configure S3 credentials first (copy .env.example to .env and fill in)
-
cp .env.example .env
-
-
# Start the demo server
bun run serve
```
-
Then visit:
-
- **http://localhost:3000/** - The demo site served from tiered storage
-
- **http://localhost:3000/admin/stats** - Live cache statistics dashboard
-
-
Watch the console to see which tier serves each request:
-
- 🔥 **Hot tier (memory)** - index.html served instantly
-
- 💾 **Warm tier (disk)** - Other pages served from disk cache
-
- ☁️ **Cold tier (S3)** - First access fetches from S3, then cached
-
-
### Command-Line Examples
-
-
Or run the non-interactive examples:
-
-
```bash
-
bun run example
-
```
-
-
The examples include:
-
- **Basic CRUD operations** with statistics tracking
-
- **Static site hosting** using the real site in `example-site/` directory
-
- **Bootstrap demonstrations** (warming caches from lower tiers)
-
- **Promotion strategy comparisons** (eager vs lazy)
-
-
The `example-site/` directory contains a complete static website with:
-
- `index.html` - Stored in hot + warm + cold (instant serving)
-
- `about.html`, `docs.html` - Stored in warm + cold (skips hot)
-
- `style.css`, `script.js` - Stored in warm + cold (skips hot)
-
-
This demonstrates the exact pattern you'd use for wisp.place: critical files in memory, everything else on disk/S3.
-
-
## Testing
-
-
```bash
-
bun test
-
```
-
-
## Development
-
-
```bash
-
# Install dependencies
-
bun install
-
-
# Type check
-
bun run check
-
-
# Build
-
bun run build
-
# Run tests
-
bun test
-
-
```
## License
MIT
···
+
# tiered-storage
+
Cascading cache that flows hot → warm → cold. Memory, disk, S3—or bring your own.
## Features
+
- **Cascading writes** - data flows down through all tiers
+
- **Bubbling reads** - check hot first, fall back to warm, then cold
+
- **Pluggable backends** - memory, disk, S3, or implement your own
+
- **Selective placement** - skip tiers for big files that don't need memory caching
+
- **Prefix invalidation** - `invalidate('user:')` nukes all user keys
+
- **Optional compression** - transparent gzip
+
## Install
```bash
npm install tiered-storage
```
+
## Example
```typescript
+
import { TieredStorage, MemoryStorageTier, DiskStorageTier, S3StorageTier } from 'tiered-storage'
const storage = new TieredStorage({
  tiers: {
+
    hot: new MemoryStorageTier({ maxSizeBytes: 100 * 1024 * 1024 }),
    warm: new DiskStorageTier({ directory: './cache' }),
+
    cold: new S3StorageTier({ bucket: 'my-bucket', region: 'us-east-1' }),
  },
  compression: true,
+
})
+
// write cascades down
+
await storage.set('user:123', { name: 'Alice' })
+
// read bubbles up
+
const user = await storage.get('user:123')
+
// see where it came from
+
const result = await storage.getWithMetadata('user:123')
+
console.log(result?.source) // 'hot', 'warm', or 'cold' (result is null if the key is missing)
+
// nuke by prefix
+
await storage.invalidate('user:')
```
+
## How it works
```
+
┌─────────────────────────────────────────────┐
+
│ Cold (S3) - source of truth, all data │
+
│ ↑ │
+
│ Warm (disk) - everything hot has + more │
+
│ ↑ │
+
│ Hot (memory) - just the hottest stuff │
+
└─────────────────────────────────────────────┘
```
+
Writes cascade **down**. Reads bubble **up**.
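
As a rough illustration of the two directions, here is a minimal sketch that uses plain `Map`s in place of real tiers. `TierName`, `ORDER`, `write`, and `read` are names made up for this example and are not the library's API. The sketch is synchronous for brevity and always promotes on a hit in a slower tier, which is only one possible promotion strategy.

```typescript
type TierName = 'hot' | 'warm' | 'cold'

// Fastest tier first; cold is the source of truth.
const ORDER: TierName[] = ['hot', 'warm', 'cold']

const tiers: Record<TierName, Map<string, string>> = {
  hot: new Map<string, string>(),
  warm: new Map<string, string>(),
  cold: new Map<string, string>(),
}

// Write cascades down: every tier gets a copy.
function write(key: string, value: string): void {
  for (const name of ORDER) tiers[name].set(key, value)
}

// Read bubbles up: check hot, then warm, then cold. On a hit in a slower
// tier, copy the value into every faster tier (promotion).
function read(key: string): { value: string; source: TierName } | null {
  for (let i = 0; i < ORDER.length; i++) {
    const value = tiers[ORDER[i]].get(key)
    if (value !== undefined) {
      for (let j = 0; j < i; j++) tiers[ORDER[j]].set(key, value)
      return { value, source: ORDER[i] }
    }
  }
  return null
}

write('user:123', 'Alice')
tiers.hot.clear() // simulate a restart: memory is gone, disk and S3 are not
const first = read('user:123') // served from warm, then promoted to hot
const second = read('user:123') // served from hot
console.log(first?.source, second?.source)
```

Because every write lands in all tiers, losing the hot tier never loses data. It only costs one slower read before the key is back in memory.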
+
## API
+
### `storage.get(key)`
+
Get data. Returns `null` if missing or expired.
+
### `storage.getWithMetadata(key)`
+
Get data plus which tier served it.
+
### `storage.set(key, data, options?)`
+
Store data. Options:
```typescript
+
{
+
  ttl: 24 * 60 * 60 * 1000, // custom TTL in milliseconds (24 hours)
+
  skipTiers: ['hot'], // skip specific tiers
+
  metadata: { ... }, // custom metadata
}
```
+
### `storage.delete(key)`
+
Delete from all tiers.
+
### `storage.invalidate(prefix)`
+
Delete all keys that start with `prefix`. Returns the number of keys deleted.
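
A minimal sketch of what prefix invalidation amounts to for a single tier, with a `Map` standing in for the tier. `invalidatePrefix` is a hypothetical helper, not part of the library. The real method presumably repeats something like this across every tier.

```typescript
// Delete every key that starts with `prefix` and report how many were removed.
function invalidatePrefix(store: Map<string, unknown>, prefix: string): number {
  // Collect first so the map is not mutated while it is being iterated.
  const doomed = Array.from(store.keys()).filter((key) => key.startsWith(prefix))
  for (const key of doomed) store.delete(key)
  return doomed.length
}

const store = new Map<string, unknown>([
  ['user:123', { name: 'Alice' }],
  ['user:456', { name: 'Bob' }],
  ['site:abc/index.html', '<html></html>'],
])

const removed = invalidatePrefix(store, 'user:')
console.log(removed, Array.from(store.keys()))
```

Every string starts with the empty string, so in this sketch an empty prefix removes every key.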
+
### `storage.touch(key, ttl?)`
+
Renew TTL.
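
To make the expiry behaviour concrete, here is a small sketch of TTL bookkeeping. `Entry`, `makeEntry`, `isExpired`, and `touch` are illustrative names, and storing an absolute `expiresAt` timestamp is an assumption about how a tier might track TTLs, not a description of the library's internals.

```typescript
interface Entry<T> {
  value: T
  ttlMs: number // lifetime granted on each write or touch
  expiresAt: number // absolute time in ms
}

function makeEntry<T>(value: T, ttlMs: number, now: number): Entry<T> {
  return { value, ttlMs, expiresAt: now + ttlMs }
}

function isExpired<T>(entry: Entry<T>, now: number): boolean {
  return now >= entry.expiresAt
}

// Renew the TTL: reuse the entry's own TTL unless a new one is given.
function touch<T>(entry: Entry<T>, now: number, ttlMs?: number): Entry<T> {
  const ttl = ttlMs ?? entry.ttlMs
  return { ...entry, ttlMs: ttl, expiresAt: now + ttl }
}

const DAY = 24 * 60 * 60 * 1000
const t0 = 0 // fixed clock so the example is deterministic
const entry = makeEntry('cached page', DAY, t0)
const renewed = touch(entry, t0 + DAY - 1) // touched just before expiry
console.log(isExpired(entry, t0 + DAY), isExpired(renewed, t0 + DAY))
```

Passing `now` in explicitly, rather than calling `Date.now()` inside the helpers, keeps the logic easy to test.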
+
### `storage.listKeys(prefix?)`
+
Async iterator over keys, optionally filtered by prefix.
+
### `storage.getStats()`
+
Stats across all tiers.
+
### `storage.bootstrapHot(limit?)`
+
Warm up hot tier from warm tier. Run on startup.
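
The selection step behind a bootstrap like this can be sketched as "take the most frequently accessed keys, up to the limit". The `accessCount` field and the `pickHottest` helper are assumptions made for this example. How the library actually ranks items is not shown here.

```typescript
interface WarmItem {
  key: string
  accessCount: number // assumed metadata field
}

// Pick the `limit` most frequently accessed keys; a bootstrap would then
// copy those items from warm into hot.
function pickHottest(items: WarmItem[], limit: number): string[] {
  return items
    .slice() // copy first: sort() mutates the array it is called on
    .sort((a, b) => b.accessCount - a.accessCount)
    .slice(0, limit)
    .map((item) => item.key)
}

const warm: WarmItem[] = [
  { key: 'site:abc/about.html', accessCount: 3 },
  { key: 'site:abc/index.html', accessCount: 42 },
  { key: 'site:abc/style.css', accessCount: 17 },
]
console.log(pickHottest(warm, 2))
```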
+
### `storage.bootstrapWarm(options?)`
+
Warm up warm tier from cold tier.
+
## Built-in tiers
+
### MemoryStorageTier
```typescript
+
new MemoryStorageTier({
+
  maxSizeBytes: 100 * 1024 * 1024,
+
  maxItems: 1000,
+
})
```
+
LRU eviction. Fast. Single process only.
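
For readers unfamiliar with LRU eviction, here is a minimal count-bounded sketch built on `Map` insertion order. It is not the tier's actual implementation, and `LruSketch` is a made-up name. It only shows the policy that a `maxItems` limit implies and ignores byte-size limits entirely.

```typescript
class LruSketch<V> {
  private items = new Map<string, V>()
  private readonly maxItems: number

  constructor(maxItems: number) {
    this.maxItems = maxItems
  }

  get(key: string): V | undefined {
    const value = this.items.get(key)
    if (value === undefined) return undefined
    // Re-insert so this key becomes the most recently used.
    this.items.delete(key)
    this.items.set(key, value)
    return value
  }

  set(key: string, value: V): void {
    this.items.delete(key)
    this.items.set(key, value)
    if (this.items.size > this.maxItems) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.items.keys().next().value
      if (oldest !== undefined) this.items.delete(oldest)
    }
  }

  keys(): string[] {
    return Array.from(this.items.keys())
  }
}

const lru = new LruSketch<number>(2)
lru.set('a', 1)
lru.set('b', 2)
lru.get('a') // 'a' is now fresher than 'b'
lru.set('c', 3) // over capacity: evicts 'b'
console.log(lru.keys())
```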
+
### DiskStorageTier
```typescript
+
new DiskStorageTier({
  directory: './cache',
+
  maxSizeBytes: 10 * 1024 * 1024 * 1024,
+
  evictionPolicy: 'lru', // or 'fifo', 'size'
+
})
```
+
Files on disk with `.meta` sidecars.
+
### S3StorageTier
```typescript
+
new S3StorageTier({
+
  bucket: 'data',
+
  metadataBucket: 'metadata', // recommended!
  region: 'us-east-1',
+
})
```
+
Works with AWS S3, Cloudflare R2, MinIO. Use a separate metadata bucket—otherwise updating access counts requires copying entire objects.
+
## Custom tiers
+
Implement `StorageTier`:
```typescript
+
interface StorageTier {
+
  get(key: string): Promise<Uint8Array | null>
+
  set(key: string, data: Uint8Array, metadata: StorageMetadata): Promise<void>
+
  delete(key: string): Promise<void>
+
  exists(key: string): Promise<boolean>
+
  listKeys(prefix?: string): AsyncIterableIterator<string>
+
  deleteMany(keys: string[]): Promise<void>
+
  getMetadata(key: string): Promise<StorageMetadata | null>
+
  setMetadata(key: string, metadata: StorageMetadata): Promise<void>
+
  getStats(): Promise<TierStats>
+
  clear(): Promise<void>
+
}
```
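
As a starting point, here is a sketch of a `Map`-backed tier with the same method shapes as the interface above. The `StorageMetadata` and `TierStats` types are placeholders declared locally so the block compiles on its own (their real definitions live in the library and may differ), and `MapStorageTier` is a hypothetical name. In real code you would import the types from `tiered-storage` and declare the class as `implements StorageTier`.

```typescript
// Placeholder types, assumed for this sketch only.
type StorageMetadata = Record<string, unknown>
interface TierStats { items: number; bytes: number }

class MapStorageTier {
  private entries = new Map<string, { data: Uint8Array; metadata: StorageMetadata }>()

  async get(key: string): Promise<Uint8Array | null> {
    return this.entries.get(key)?.data ?? null
  }
  async set(key: string, data: Uint8Array, metadata: StorageMetadata): Promise<void> {
    this.entries.set(key, { data, metadata })
  }
  async delete(key: string): Promise<void> {
    this.entries.delete(key)
  }
  async exists(key: string): Promise<boolean> {
    return this.entries.has(key)
  }
  async *listKeys(prefix?: string): AsyncIterableIterator<string> {
    // Snapshot the keys so deletes during iteration cannot disturb the loop.
    for (const key of Array.from(this.entries.keys())) {
      if (prefix === undefined || key.startsWith(prefix)) yield key
    }
  }
  async deleteMany(keys: string[]): Promise<void> {
    for (const key of keys) this.entries.delete(key)
  }
  async getMetadata(key: string): Promise<StorageMetadata | null> {
    return this.entries.get(key)?.metadata ?? null
  }
  async setMetadata(key: string, metadata: StorageMetadata): Promise<void> {
    const entry = this.entries.get(key)
    if (entry) entry.metadata = metadata
  }
  async getStats(): Promise<TierStats> {
    let bytes = 0
    this.entries.forEach((entry) => { bytes += entry.data.byteLength })
    return { items: this.entries.size, bytes }
  }
  async clear(): Promise<void> {
    this.entries.clear()
  }
}

async function demo(): Promise<TierStats> {
  const tier = new MapStorageTier()
  await tier.set('user:123', new Uint8Array([1, 2, 3]), { contentType: 'application/octet-stream' })
  await tier.set('site:abc/index.html', new Uint8Array([4, 5]), {})
  await tier.deleteMany(['site:abc/index.html'])
  return tier.getStats()
}

demo().then((stats) => console.log(stats))
```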
+
## Skipping tiers
+
Don't want big videos in memory? Skip hot:
```typescript
+
await storage.set('video.mp4', data, { skipTiers: ['hot'] })
```
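
One way to picture `skipTiers` is as a filter over the tiers a write would otherwise cascade through. `tiersToWrite` and `ALL_TIERS` are hypothetical names written for this sketch, not part of the library.

```typescript
type TierName = 'hot' | 'warm' | 'cold'

const ALL_TIERS: TierName[] = ['hot', 'warm', 'cold']

// Return the tiers a write should reach, in cascade order.
function tiersToWrite(skipTiers: TierName[] = []): TierName[] {
  return ALL_TIERS.filter((tier) => skipTiers.indexOf(tier) === -1)
}

console.log(tiersToWrite()) // all three tiers
console.log(tiersToWrite(['hot'])) // warm and cold only
```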
+
## Running the demo
```bash
+
cp .env.example .env # add S3 creds
bun run serve
```
+
Visit http://localhost:3000 to see it work. Check http://localhost:3000/admin/stats for live cache stats.
## License
MIT