🪻 distributed transcription service thistle.dunkirk.sh
# Thistle - Project Guidelines

This is a Bun-based transcription service using the [Bun fullstack pattern](https://bun.com/docs/bundler/fullstack) for routing and bundled HTML.

## Workflow

**IMPORTANT**: Do NOT commit changes until the user explicitly asks you to commit. Always wait for user verification that changes are working correctly before making commits.

## Project Info

- Name: Thistle
- Purpose: Transcription service
- Runtime: Bun (NOT Node.js)
- Language: TypeScript with strict mode
- Frontend: Vanilla HTML/CSS/JS with lightweight helpers on top of web components

## Design System

ALWAYS use the project's CSS variables for colors:

```css
:root {
  /* Color palette */
  --gunmetal: #2d3142ff; /* dark blue-gray */
  --paynes-gray: #4f5d75ff; /* medium blue-gray */
  --silver: #bfc0c0ff; /* light gray */
  --white: #ffffffff; /* white */
  --coral: #ef8354ff; /* warm orange */

  /* Semantic color assignments */
  --text: var(--gunmetal);
  --background: var(--white);
  --primary: var(--paynes-gray);
  --secondary: var(--silver);
  --accent: var(--coral);
}
```

**Color usage:**
- NEVER hardcode colors like `#4f46e5`, `white`, `red`, etc.
- Always use semantic variables (`var(--primary)`, `var(--background)`, `var(--accent)`, etc.) or named color variables (`var(--gunmetal)`, `var(--coral)`, etc.)

**Dimensions:**
- Use `rem` for all sizes, spacing, and widths (not `px`)
- Base font size is 16px (1rem = 16px)
- Common values: `0.5rem` (8px), `1rem` (16px), `2rem` (32px), `3rem` (48px)
- Max widths: `48rem` (768px) for content, `56rem` (896px) for forms/data
- Spacing scale: `0.25rem`, `0.5rem`, `0.75rem`, `1rem`, `1.5rem`, `2rem`, `3rem`

## NO FRAMEWORKS

NEVER use React, Vue, Svelte, or any heavy framework.

This project prioritizes:
- Speed: Minimal JavaScript, fast load times
- Small bundle sizes: Keep bundles tiny
- Native web platform: Use web standards (Web Components, native DOM APIs)
- Simplicity: Vanilla HTML, CSS, and JavaScript

Allowed lightweight helpers:
- Lit (~8-10KB gzipped) for reactive web components
- Native Web Components
- Plain JavaScript/TypeScript

Explicitly forbidden:
- React, React DOM
- Vue
- Svelte
- Angular
- Any framework with a virtual DOM or large runtime

## Commands

```bash
# Install dependencies
bun install

# Development server with hot reload
bun dev

# Run tests
bun test

# Build files
bun build <file.html|file.ts|file.css>

# Make a user an admin
bun scripts/make-admin.ts <email>
```

Development workflow: `bun dev` runs the server with hot module reloading. Changes to TypeScript, HTML, or CSS files automatically reload.

**IMPORTANT**: NEVER run `bun dev` yourself - the user always has it running already.

## Bun Usage

Default to using Bun instead of Node.js.

- Use `bun <file>` instead of `node <file>` or `ts-node <file>`
- Use `bun test` instead of `jest` or `vitest`
- Use `bun build <file>` instead of `webpack` or `esbuild`
- Use `bun install` instead of `npm install` or `yarn install` or `pnpm install`
- Use `bun run <script>` instead of `npm run <script>` or `yarn run <script>`
- Bun automatically loads .env, so don't use dotenv

## Bun APIs

Use Bun's built-in APIs instead of npm packages:

- `Bun.serve()` supports WebSockets, HTTPS, and routes. Don't use `express`.
- `bun:sqlite` for SQLite. Don't use `better-sqlite3`.
- `Bun.redis` for Redis. Don't use `ioredis`.
- `Bun.sql` for Postgres. Don't use `pg` or `postgres.js`.
- `WebSocket` is built-in. Don't use `ws`.
- Prefer `Bun.file` over `node:fs`'s readFile/writeFile
- `` Bun.$`ls` `` instead of `execa`

## Server Setup

Use `Bun.serve()` with the routes pattern:

```ts
import index from "./index.html"

Bun.serve({
  routes: {
    "/": index,
    "/api/users/:id": {
      GET: (req) => {
        return new Response(JSON.stringify({ id: req.params.id }));
      },
    },
  },
  // optional websocket support
  websocket: {
    open: (ws) => {
      ws.send("Hello, world!");
    },
    message: (ws, message) => {
      ws.send(message);
    },
    close: (ws) => {
      // handle close
    }
  },
  development: {
    hmr: true,
    console: true,
  }
})
```

## Frontend Pattern

Don't use Vite or any build tools. Use HTML imports with `Bun.serve()`.

HTML files can directly import `.ts` or `.js` files:

```html
<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Page Title - Thistle</title>
  <link rel="icon"
    href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='0.9em' font-size='90'>🪻</text></svg>">
  <link rel="stylesheet" href="../styles/main.css">
</head>

<body>
  <auth-component></auth-component>

  <main>
    <h1>Page Title</h1>
    <my-component></my-component>
  </main>

  <script type="module" src="../components/auth.ts"></script>
  <script type="module" src="../components/my-component.ts"></script>
</body>

</html>
```

**Standard HTML template:**
- Always include the `<auth-component>` element for consistent login/logout UI
- Always include the thistle emoji favicon
- Always include proper meta tags (charset, viewport)
- Structure: auth component, then main content, then scripts
- Import `auth.ts` on every page for authentication UI

Bun's bundler will transpile and bundle
automatically. `<link>` tags pointing to stylesheets work with Bun's CSS bundler.

Frontend TypeScript (vanilla or with Lit web components):

```ts
import { LitElement, html, css } from 'lit';
import { customElement, property } from 'lit/decorators.js';

// Define a Lit web component
@customElement('my-component')
export class MyComponent extends LitElement {
  @property({ type: String }) name = 'World';

  // Scoped styles using css tagged template
  static styles = css`
    :host {
      display: block;
      padding: 1rem;
    }
    .greeting {
      /* Use a design-system variable, never a hardcoded color */
      color: var(--primary);
    }
  `;

  // Render using html tagged template
  render() {
    return html`
      <div class="greeting">
        Hello, ${this.name}!
      </div>
    `;
  }
}

// Or use plain DOM manipulation for simple interactions
document.querySelector('h1')?.addEventListener('click', () => {
  console.log('Clicked!');
});
```

**When to use Lit:**
- Components with reactive properties (auto-updates when data changes)
- Complex components needing scoped styles
- Form controls with internal state
- Components with lifecycle needs

**When to skip Lit:**
- Static content (use plain HTML)
- Simple one-off interactions (use vanilla JS)
- Anything without reactive state

Lit provides:
- `@customElement` decorator to register components
- `@property` decorator for reactive properties
- `html` tagged template for declarative rendering
- `css` tagged template for scoped styles
- Automatic re-rendering when properties change
- Size: ~8-10KB minified+gzipped

## Testing

Use `bun test` to run tests.

### Basic Test Structure

```ts
import { test, expect } from "bun:test";

test("hello world", () => {
  expect(1).toBe(1);
});
```

### Test File Naming

- Place tests next to the code they test: `foo.ts` → `foo.test.ts`
- This keeps tests close to implementation for easy maintenance
- Bun automatically discovers `*.test.ts` files

### Writing Good Tests

**Test security-critical code:**
- File path operations (directory traversal, injection)
- User input validation
- Authentication/authorization
- API endpoint security

**Test edge cases:**
- Empty strings, null, undefined
- Very large inputs (size limits)
- Invalid formats
- Boundary conditions

**Test async operations:**
```ts
test("async function", async () => {
  const result = await someAsyncFunction();
  expect(result).toBe("expected value");
});
```

**Test error conditions:**
```ts
test("rejects invalid input", async () => {
  await expect(dangerousFunction("../../../etc/passwd")).rejects.toThrow();
  await expect(dangerousFunction("invalid")).rejects.toThrow("Invalid format");
});
```

**Example: Security-focused tests**
```ts
test("prevents directory traversal", async () => {
  const maliciousIds = [
    "../../../etc/passwd",
    "../../secret.txt",
    "test/../../../config",
  ];

  for (const id of maliciousIds) {
    await expect(loadFile(id)).rejects.toThrow();
  }
});

test("validates input format", async () => {
  const invalidInputs = [
    "test; rm -rf /",
    "test`whoami`",
    "test\x00null",
  ];

  for (const input of invalidInputs) {
    await expect(processInput(input)).rejects.toThrow("Invalid format");
  }
});
```

### Running Tests

```bash
# Run all tests
bun test

# Run specific test file
bun test src/lib/auth.test.ts

# Watch mode (re-run on changes)
bun test --watch
```

### What to Test

**Always test:**
- Security-critical functions (file I/O, user input)
- Complex business logic
- Edge cases and error handling
- Public API functions

**Don't need to test:**
- Simple getters/setters
- Framework/library code
- UI components (unless complex logic)
- One-line utility functions

## TypeScript Configuration

Strict mode is enabled with these settings:

```json
{
  "strict": true,
  "noFallthroughCasesInSwitch": true,
  "noUncheckedIndexedAccess": true,
  "noImplicitOverride": true
}
```

Deliberately disabled:
- `noUnusedLocals`: false
- `noUnusedParameters`: false
- `noPropertyAccessFromIndexSignature`: false

Module system:
- `moduleResolution`: "bundler"
- `module`: "Preserve"
- JSX: `preserve` (NOT react-jsx - we don't use React)
- Allows importing `.ts` extensions directly

## Frontend Technologies

Core (always use):
- Vanilla HTML, CSS, JavaScript/TypeScript
- Native Web Components API
- Native DOM APIs (querySelector, addEventListener, etc.)

Lightweight helpers:
- Lit (~8-10KB gzipped): For reactive web components with state management

Bundle size philosophy:
- Start with vanilla JS
- Add helpers only when they significantly reduce complexity
- Measure bundle size impact before adding any library
- Target: Keep total JS bundle under 50KB

## Project Structure

Based on Bun fullstack pattern:
- `src/index.ts`: Server imports HTML files as modules
- `src/pages/`: HTML files (route entry points)
- `src/components/`: Lit web components
- `src/styles/`: CSS files
- `public/`: Static assets (images, fonts, etc.)

**File flow:**
1. Server imports HTML: `import indexHTML from "./pages/index.html"`
2. HTML imports components: `<script type="module" src="../components/counter.ts"></script>`
3. HTML links styles: `<link rel="stylesheet" href="../styles/main.css">`
4. Components self-register as custom elements
5. Bun bundles everything automatically

## Database Schema & Migrations

Database migrations are managed in `src/db/schema.ts` using a versioned migration system.

**Migration structure:**
```typescript
const migrations = [
  {
    version: 1,
    name: "Description of migration",
    sql: `
      CREATE TABLE IF NOT EXISTS ...;
      CREATE INDEX IF NOT EXISTS ...;
    `,
  },
];
```

**Important migration rules:**
1. **Never modify existing migrations** - they may have already run in production
2. **Always add new migrations** with incrementing version numbers
3. **Drop indexes before dropping columns** - SQLite will error if you try to drop a column with an index still attached
4. **Use IF NOT EXISTS** for CREATE statements to be idempotent
5. **Test migrations** on a copy of production data before deploying

**Example: Dropping a column**
```sql
-- ❌ WRONG: Will error if idx_users_old_column exists
ALTER TABLE users DROP COLUMN old_column;

-- ✅ CORRECT: Drop index first, then column
DROP INDEX IF EXISTS idx_users_old_column;
ALTER TABLE users DROP COLUMN old_column;
```

**Migration workflow:**
1. Add migration to `migrations` array with next version number
2. Migrations auto-apply on server start
3. Check `schema_migrations` table to see applied versions
4. Migrations are transactional and show timing in console

## File Organization

- `src/index.ts`: Main server entry point with `Bun.serve()` routes
- `src/pages/*.html`: Route entry points (imported as modules)
- `src/components/*.ts`: Lit web components
- `src/styles/*.css`: Stylesheets (linked from HTML)
- `public/`: Static assets directory
- Tests: `*.test.ts` files

**Current structure example:**
```
src/
  index.ts           # Imports HTML, defines routes
  pages/
    index.html       # Imports components via <script type="module">
  components/
    counter.ts       # Lit component with @customElement
  styles/
    main.css         # Linked from HTML with <link>
```

## Naming Conventions

Follow TypeScript conventions:
- PascalCase for components and classes
- camelCase for functions and variables
- kebab-case for file names

## Development Workflow

1. Make changes to `.ts`, `.html`, or `.css` files
2. Bun's HMR automatically reloads changes
3. Write tests in `*.test.ts` files
4. Run `bun test` to verify

## IDE Setup

Biome LSP is configured in `crush.json` for linting and formatting support.

## Common Tasks

### Adding a new route
Add to the `routes` object in `Bun.serve()` configuration

### Adding a new page
Create an HTML file, import it in the server, add to routes

### Adding frontend functionality
Import TS/JS files directly from HTML using `<script type="module" src="../components/my-component.ts"></script>`. Use Lit for reactive components or vanilla JS for simple interactions. Never React.

### Adding WebSocket support
Add `websocket` configuration to `Bun.serve()`

## Important Notes

1. No npm scripts needed: Bun is fast enough to run commands directly
2. Private package: `package.json` has `"private": true`
3. No build step for development: Hot reload handles everything
4. Module type: Package uses `"type": "module"` (ESM)
5. Bun types: Available via `@types/bun` (check `node_modules/bun-types/docs/**.md` for API docs)

## Gotchas

1. Don't use Node.js commands: Use `bun` instead of `node`, `npm`, `npx`, etc.
2. Don't install Express/Vite/other tools: Bun has built-in equivalents
3. NEVER EVER use React: This project is vanilla JS/TS with web components only. React is explicitly forbidden.
4. Import .ts extensions: Bun allows importing `.ts` files directly
5. No dotenv needed: Bun loads `.env` automatically
6. HTML imports are special: They trigger Bun's bundler, don't treat them as static files
7. Bundle size matters: Always consider the size impact before adding any library

## Documentation Lookup

Use Context7 MCP for looking up official documentation for libraries and frameworks.

## Resources

- [Bun Fullstack Documentation](https://bun.com/docs/bundler/fullstack)
- [Lit Documentation](https://lit.dev/)
- [Web Components MDN](https://developer.mozilla.org/en-US/docs/Web/Web_Components)
- Bun API docs in `node_modules/bun-types/docs/**.md`

## Admin System

The application includes a role-based admin system for managing users and transcriptions.
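
The role check at the heart of such a system might look roughly like the sketch below. It is not the project's `requireAdmin()` middleware: `SessionUser` and `adminGateStatus` are hypothetical names, and answering unauthenticated requests with 401 is an assumption. The 403 for non-admin users matches the implementation notes in this section.

```typescript
// Hypothetical sketch: these names are illustrative, not taken from the Thistle codebase.
type Role = "user" | "admin";

interface SessionUser {
  id: number;
  role: Role;
}

// Returns the HTTP status an admin-only route would answer with.
// 401 for a missing session is an assumption; 403 for non-admins is documented.
function adminGateStatus(user: SessionUser | null): 200 | 401 | 403 {
  if (user === null) return 401; // not logged in
  if (user.role !== "admin") return 403; // logged in, but not an admin
  return 200; // admin: let the request through
}
```

Keeping the decision in a pure function like this makes it straightforward to cover with `bun test` without starting a server.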

**User roles:**
- `user` - Default role, can create and manage their own transcriptions
- `admin` - Full administrative access to all data and users

**Admin privileges:**
- View all transcriptions (with user info, status, errors)
- Delete transcriptions
- View all users (with emails, join dates, roles)
- Change user roles (user ↔ admin)
- Delete user accounts
- Access admin dashboard at `/admin`

**Making users admin:**
Use the provided script to grant admin access:
```bash
bun scripts/make-admin.ts user@example.com
```

**Admin routes:**
- `/admin` - Admin dashboard (protected by `requireAdmin` middleware)
- `/api/admin/transcriptions` - Get all transcriptions with user info
- `/api/admin/transcriptions/:id` - Delete a transcription (DELETE)
- `/api/admin/users` - Get all users
- `/api/admin/users/:id` - Delete a user account (DELETE)
- `/api/admin/users/:id/role` - Update a user's role (PUT)

**Admin UI features:**
- Statistics cards (total users, total/failed transcriptions)
- Tabbed interface (Pending Recordings / Transcriptions / Users / Classes)
- Status badges for transcription states
- Delete buttons for transcriptions with confirmation
- Role dropdown for changing user roles
- Delete buttons for user accounts with confirmation
- User avatars and info display
- Timestamp formatting
- Admin badge on user listings
- Query parameter support for direct tab navigation (`?tab=<tabname>`)

**Admin tab navigation:**
- `/admin` - Opens to default "pending" tab
- `/admin?tab=pending` - Pending recordings tab
- `/admin?tab=transcriptions` - All transcriptions tab
- `/admin?tab=users` - Users management tab
- `/admin?tab=classes` - Classes management tab
- URL updates when switching tabs (browser history support)

**Implementation notes:**
- `role` column in users table ('user' or 'admin', default 'user')
- `requireAdmin()` middleware checks authentication + admin role
- Returns 403 if non-admin tries to access admin routes
- Admin link shows in auth menu only for admin users
- Redirects to home page if non-admin accesses admin page

## Subscription System

The application uses Polar for subscription management to gate access to transcription features.

**Subscription requirement:**
- Users must have an active subscription to upload and transcribe audio files
- Users can join classes and request classes without a subscription
- Admins bypass subscription requirements

**Protected routes:**
- `POST /api/transcriptions` - Upload audio file (requires subscription or admin)
- `GET /api/transcriptions` - List user's transcriptions (requires subscription or admin)
- `GET /api/transcriptions/:id` - Get transcription details (requires subscription or admin)
- `GET /api/transcriptions/:id/audio` - Download audio file (requires subscription or admin)
- `GET /api/transcriptions/:id/stream` - Real-time transcription updates (requires subscription or admin)

**Open routes (no subscription required):**
- All authentication endpoints (`/api/auth/*`)
- Class search and joining (`/api/classes/search`, `/api/classes/join`)
- Waitlist requests (`/api/classes/waitlist`)
- Billing/subscription management (`/api/billing/*`)

**Subscription statuses:**
- `active` - Full access to transcription features
- `trialing` - Trial period, full access
- `past_due` - Payment failed but still has access (grace period)
- `canceled` - No access to transcription features
- `expired` - No access to transcription features

**Implementation:**
- `subscriptions` table tracks user subscriptions from Polar
- `hasActiveSubscription(userId)` checks for active/trialing/past_due status
- `requireSubscription()` middleware enforces subscription requirement
- `/api/auth/me` returns `has_subscription`
  boolean
- Webhook at `/api/webhooks/polar` receives subscription updates from Polar
- Frontend components check `has_subscription` and show subscribe prompt

**User settings with query parameters:**
- Settings page supports `?tab=<tabname>` query parameter to open specific tabs
- Valid tabs: `account`, `sessions`, `passkeys`, `billing`, `danger`
- Example: `/settings?tab=billing` opens the billing tab directly
- Subscribe prompts link to `/settings?tab=billing` for direct access
- URL updates when switching tabs (browser history support)

**Testing subscriptions:**
Manually add a test subscription to the database:
```sql
INSERT INTO subscriptions (id, user_id, customer_id, status)
VALUES ('test-sub', <user_id>, 'test-customer', 'active');
```

## Transcription Service Integration (Murmur)

The application uses [Murmur](https://github.com/taciturnaxolotl/murmur) as the transcription backend.

**Murmur API endpoints:**
- `POST /transcribe` - Upload audio file and create transcription job
- `GET /transcribe/:job_id` - Get job status and transcript (supports `?format=json|vtt`)
- `GET /transcribe/:job_id/stream` - Stream real-time progress via Server-Sent Events
- `GET /jobs` - List all jobs (newest first)
- `DELETE /transcribe/:job_id` - Delete a job from Murmur's database

**Job synchronization:**
The `TranscriptionService` runs periodic syncs to reconcile state between our database and Murmur:
- Reconnects to active jobs on server restart
- Syncs status updates for processing/transcribing jobs
- Handles completed jobs (fetches VTT, cleans transcript, saves to storage)
- **Cleans up finished jobs** - After successful completion or failure, jobs are deleted from Murmur
- **Cleans up orphaned jobs** - Jobs found in Murmur but not in our database are automatically deleted

**Job cleanup:**
- **Completed jobs**: After fetching transcript and saving to
  storage, the job is deleted from Murmur
- **Failed jobs**: After recording the error in our database, the job is deleted from Murmur
- **Orphaned jobs**: Jobs in Murmur but not in our database are deleted on discovery
- All deletions use `DELETE /transcribe/:job_id`
- This prevents Murmur's database from accumulating stale jobs (Murmur doesn't have automatic cleanup)
- Logs success/failure of deletion attempts for monitoring

**Job lifecycle:**
1. User uploads audio → creates transcription in our DB with `status='uploading'`
2. Audio uploaded to Murmur → get `whisper_job_id`, update to `status='processing'`
3. Murmur transcribes → stream progress updates, update to `status='transcribing'`
4. Job completes → fetch VTT, clean with LLM, save transcript, update to `status='completed'`, **delete from Murmur**
5. If job fails in Murmur → update to `status='failed'` with error message, **delete from Murmur**

**Configuration:**
Set `WHISPER_SERVICE_URL` in `.env` (default: `http://localhost:8000`)

## Future Additions

As the codebase grows, document:
- Database schema and migrations
- API endpoint patterns
- Authentication/authorization approach
- Deployment process
- Environment variables needed
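
Returning to the Murmur job lifecycle described earlier, the status transitions and the cleanup rule can be sketched as a small state table. This is only an illustration, not the `TranscriptionService` implementation. The function and constant names are hypothetical, and allowing `uploading` to move straight to `failed` is an assumption. The status values, the `DELETE /transcribe/:job_id` path, and the default base URL come from the sections above.

```typescript
// Hypothetical sketch: helper names are illustrative, not taken from the Thistle codebase.
type TranscriptionStatus =
  | "uploading"
  | "processing"
  | "transcribing"
  | "completed"
  | "failed";

// Forward transitions from the documented lifecycle.
// Assumption: an upload can also fail before Murmur ever sees the job.
const nextStatuses: Record<TranscriptionStatus, TranscriptionStatus[]> = {
  uploading: ["processing", "failed"],
  processing: ["transcribing", "failed"],
  transcribing: ["completed", "failed"],
  completed: [],
  failed: [],
};

function canTransition(from: TranscriptionStatus, to: TranscriptionStatus): boolean {
  return nextStatuses[from].includes(to);
}

// Finished jobs (completed or failed) are the ones deleted from Murmur.
function shouldDeleteFromMurmur(status: TranscriptionStatus): boolean {
  return status === "completed" || status === "failed";
}

// Builds the cleanup URL; the default mirrors the documented WHISPER_SERVICE_URL default.
function murmurDeleteUrl(jobId: string, baseUrl = "http://localhost:8000"): string {
  return new URL(`/transcribe/${encodeURIComponent(jobId)}`, baseUrl).toString();
}
```

A sync loop could call `shouldDeleteFromMurmur` after persisting the final status and then issue a `DELETE` request to `murmurDeleteUrl(jobId)`, logging whether the deletion succeeded.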