# Thistle - Project Guidelines

This is a Bun-based transcription service using the [Bun fullstack pattern](https://bun.com/docs/bundler/fullstack) for routing and bundled HTML.

## Workflow

**IMPORTANT**: Do NOT commit changes until the user explicitly asks you to commit. Always wait for user verification that changes are working correctly before making commits.

## Project Info

- Name: Thistle
- Purpose: Transcription service
- Runtime: Bun (NOT Node.js)
- Language: TypeScript with strict mode
- Frontend: Vanilla HTML/CSS/JS with lightweight helpers on top of web components

## Design System

ALWAYS use the project's CSS variables for colors:

```css
:root {
  /* Color palette */
  --gunmetal: #2d3142ff; /* dark blue-gray */
  --paynes-gray: #4f5d75ff; /* medium blue-gray */
  --silver: #bfc0c0ff; /* light gray */
  --white: #ffffffff; /* white */
  --coral: #ef8354ff; /* warm orange */

  /* Semantic color assignments */
  --text: var(--gunmetal);
  --background: var(--white);
  --primary: var(--paynes-gray);
  --secondary: var(--silver);
  --accent: var(--coral);
}
```

**Color usage:**
- NEVER hardcode colors like `#4f46e5`, `white`, `red`, etc.
- Always use semantic variables (`var(--primary)`, `var(--background)`, `var(--accent)`, etc.) or named color variables (`var(--gunmetal)`, `var(--coral)`, etc.)

**Dimensions:**
- Use `rem` for all sizes, spacing, and widths (not `px`)
- Base font size is 16px (1rem = 16px)
- Common values: `0.5rem` (8px), `1rem` (16px), `2rem` (32px), `3rem` (48px)
- Max widths: `48rem` (768px) for content, `56rem` (896px) for forms/data
- Spacing scale: `0.25rem`, `0.5rem`, `0.75rem`, `1rem`, `1.5rem`, `2rem`, `3rem`

## NO FRAMEWORKS

NEVER use React, Vue, Svelte, or any heavy framework.

This project prioritizes:
- Speed: Minimal JavaScript, fast load times
- Small bundle sizes: Keep bundles tiny
- Native web platform: Use web standards (Web Components, native DOM APIs)
- Simplicity: Vanilla HTML, CSS, and JavaScript

Allowed lightweight helpers:
- Lit (~8-10KB gzipped) for reactive web components
- Native Web Components
- Plain JavaScript/TypeScript

Explicitly forbidden:
- React, React DOM
- Vue
- Svelte
- Angular
- Any framework with a virtual DOM or large runtime

## Commands

```bash
# Install dependencies
bun install

# Development server with hot reload
bun dev

# Run tests
bun test

# Build files
bun build <file.html|file.ts|file.css>

# Make a user an admin
bun scripts/make-admin.ts <email>
```

Development workflow: `bun dev` runs the server with hot module reloading. Changes to TypeScript, HTML, or CSS files automatically reload.

**IMPORTANT**: NEVER run `bun dev` yourself - the user always has it running already.

## Bun Usage

Default to using Bun instead of Node.js.

- Use `bun <file>` instead of `node <file>` or `ts-node <file>`
- Use `bun test` instead of `jest` or `vitest`
- Use `bun build <file>` instead of `webpack` or `esbuild`
- Use `bun install` instead of `npm install` or `yarn install` or `pnpm install`
- Use `bun run <script>` instead of `npm run <script>` or `yarn run <script>`
- Bun automatically loads `.env`, so don't use `dotenv`

## Bun APIs

Use Bun's built-in APIs instead of npm packages:

- `Bun.serve()` supports WebSockets, HTTPS, and routes. Don't use `express`.
- `bun:sqlite` for SQLite. Don't use `better-sqlite3`.
- `Bun.redis` for Redis. Don't use `ioredis`.
- `Bun.sql` for Postgres. Don't use `pg` or `postgres.js`.
- `WebSocket` is built-in. Don't use `ws`.
- Prefer `Bun.file` over `node:fs`'s `readFile`/`writeFile`
- `` Bun.$`ls` `` instead of `execa`

## Server Setup

Use `Bun.serve()` with the routes pattern:

```ts
import index from "./index.html";

Bun.serve({
  routes: {
    "/": index,
    "/api/users/:id": {
      GET: (req) => {
        // Response.json sets the JSON Content-Type header
        return Response.json({ id: req.params.id });
      },
    },
  },
  // optional websocket support
  websocket: {
    open: (ws) => {
      ws.send("Hello, world!");
    },
    message: (ws, message) => {
      ws.send(message);
    },
    close: (ws) => {
      // handle close
    },
  },
  development: {
    hmr: true,
    console: true,
  },
});
```

## Frontend Pattern

Don't use Vite or any build tools. Use HTML imports with `Bun.serve()`.

HTML files can directly import `.ts` or `.js` files:

```html
<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Page Title - Thistle</title>
  <link rel="icon"
    href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='0.9em' font-size='90'>🪻</text></svg>">
  <link rel="stylesheet" href="../styles/main.css">
</head>

<body>
  <auth-component></auth-component>

  <main>
    <h1>Page Title</h1>
    <my-component></my-component>
  </main>

  <script type="module" src="../components/auth.ts"></script>
  <script type="module" src="../components/my-component.ts"></script>
</body>

</html>
```

**Standard HTML template:**
- Always include the `<auth-component>` element for consistent login/logout UI
- Always include the thistle emoji favicon
- Always include proper meta tags (charset, viewport)
- Structure: auth component, then main content, then scripts
- Import `auth.ts` on every page for authentication UI

Bun's bundler will transpile and bundle automatically. `<link>` tags pointing to stylesheets work with Bun's CSS bundler.

Frontend TypeScript (vanilla or with Lit web components):

```ts
import { LitElement, html, css } from 'lit';
import { customElement, property } from 'lit/decorators.js';

// Define a Lit web component
@customElement('my-component')
export class MyComponent extends LitElement {
  @property({ type: String }) name = 'World';

  // Scoped styles using css tagged template
  static styles = css`
    :host {
      display: block;
      padding: 1rem;
    }
    .greeting {
      /* Use project color variables, never hardcoded colors */
      color: var(--primary);
    }
  `;

  // Render using html tagged template
  render() {
    return html`
      <div class="greeting">
        Hello, ${this.name}!
      </div>
    `;
  }
}

// Or use plain DOM manipulation for simple interactions
document.querySelector('h1')?.addEventListener('click', () => {
  console.log('Clicked!');
});
```

**When to use Lit:**
- Components with reactive properties (auto-updates when data changes)
- Complex components needing scoped styles
- Form controls with internal state
- Components with lifecycle needs

**When to skip Lit:**
- Static content (use plain HTML)
- Simple one-off interactions (use vanilla JS)
- Anything without reactive state

Lit provides:
- `@customElement` decorator to register components
- `@property` decorator for reactive properties
- `html` tagged template for declarative rendering
- `css` tagged template for scoped styles
- Automatic re-rendering when properties change
- Size: ~8-10KB minified+gzipped

## Testing

Use `bun test` to run tests.

### Basic Test Structure

```ts
import { test, expect } from "bun:test";

test("hello world", () => {
  expect(1).toBe(1);
});
```

### Test File Naming

- Place tests next to the code they test: `foo.ts` → `foo.test.ts`
- This keeps tests close to implementation for easy maintenance
- Bun automatically discovers `*.test.ts` files

### Writing Good Tests

**Test security-critical code:**
- File path operations (directory traversal, injection)
- User input validation
- Authentication/authorization
- API endpoint security

**Test edge cases:**
- Empty strings, null, undefined
- Very large inputs (size limits)
- Invalid formats
- Boundary conditions

**Test async operations:**
```ts
test("async function", async () => {
  const result = await someAsyncFunction();
  expect(result).toBe("expected value");
});
```

**Test error conditions:**
```ts
test("rejects invalid input", async () => {
  await expect(dangerousFunction("../../../etc/passwd")).rejects.toThrow();
  await expect(dangerousFunction("invalid")).rejects.toThrow("Invalid format");
});
```

**Example: Security-focused tests**
```ts
test("prevents directory traversal", async () => {
  const maliciousIds = [
    "../../../etc/passwd",
    "../../secret.txt",
    "test/../../../config",
  ];

  for (const id of maliciousIds) {
    await expect(loadFile(id)).rejects.toThrow();
  }
});

test("validates input format", async () => {
  const invalidInputs = [
    "test; rm -rf /",
    "test`whoami`",
    "test\x00null",
  ];

  for (const input of invalidInputs) {
    await expect(processInput(input)).rejects.toThrow("Invalid format");
  }
});
```

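The security tests above call `loadFile` and `processInput`, which are not defined in this document. As a minimal sketch of the kind of check such functions might run first, the hypothetical `assertSafeId` helper below rejects anything outside a strict allow-list. The pattern and the 64-character cap are assumptions for illustration, not the project's actual rules.

```ts
// Hypothetical helper: accepts only IDs made of letters, digits, "-" and "_".
// The allow-list pattern and the length cap are assumptions for illustration.
const SAFE_ID = /^[A-Za-z0-9_-]{1,64}$/;

// Returns the ID unchanged when it is safe, otherwise throws "Invalid format".
function assertSafeId(id: string): string {
  if (!SAFE_ID.test(id)) {
    throw new Error("Invalid format");
  }
  return id;
}
```

An allow-list is used here rather than a block-list because it rejects path separators, shell metacharacters, and NUL bytes without having to enumerate them.
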
### Running Tests

```bash
# Run all tests
bun test

# Run specific test file
bun test src/lib/auth.test.ts

# Watch mode (re-run on changes)
bun test --watch
```

### What to Test

**Always test:**
- Security-critical functions (file I/O, user input)
- Complex business logic
- Edge cases and error handling
- Public API functions

**Don't need to test:**
- Simple getters/setters
- Framework/library code
- UI components (unless complex logic)
- One-line utility functions

## TypeScript Configuration

Strict mode is enabled with these settings:

```json
{
  "strict": true,
  "noFallthroughCasesInSwitch": true,
  "noUncheckedIndexedAccess": true,
  "noImplicitOverride": true
}
```

Deliberately disabled:
- `noUnusedLocals`: false
- `noUnusedParameters`: false
- `noPropertyAccessFromIndexSignature`: false

Module system:
- `moduleResolution`: "bundler"
- `module`: "Preserve"
- JSX: `preserve` (NOT react-jsx - we don't use React)
- Allows importing `.ts` extensions directly

## Frontend Technologies

Core (always use):
- Vanilla HTML, CSS, JavaScript/TypeScript
- Native Web Components API
- Native DOM APIs (querySelector, addEventListener, etc.)

Lightweight helpers:
- Lit (~8-10KB gzipped): For reactive web components with state management

Bundle size philosophy:
- Start with vanilla JS
- Add helpers only when they significantly reduce complexity
- Measure bundle size impact before adding any library
- Target: Keep total JS bundle under 50KB

## Project Structure

Based on Bun fullstack pattern:
- `src/index.ts`: Server imports HTML files as modules
- `src/pages/`: HTML files (route entry points)
- `src/components/`: Lit web components
- `src/styles/`: CSS files
- `public/`: Static assets (images, fonts, etc.)

**File flow:**
1. Server imports HTML: `import indexHTML from "./pages/index.html"`
2. HTML imports components: `<script type="module" src="../components/counter.ts"></script>`
3. HTML links styles: `<link rel="stylesheet" href="../styles/main.css">`
4. Components self-register as custom elements
5. Bun bundles everything automatically

## Database Schema & Migrations

Database migrations are managed in `src/db/schema.ts` using a versioned migration system.

**Migration structure:**
```typescript
const migrations = [
  {
    version: 1,
    name: "Description of migration",
    sql: `
      CREATE TABLE IF NOT EXISTS ...;
      CREATE INDEX IF NOT EXISTS ...;
    `,
  },
];
```

**Important migration rules:**
1. **Never modify existing migrations** - they may have already run in production
2. **Always add new migrations** with incrementing version numbers
3. **Drop indexes before dropping columns** - SQLite will error if you try to drop a column with an index still attached
4. **Use IF NOT EXISTS** for CREATE statements to be idempotent
5. **Test migrations** on a copy of production data before deploying

**Example: Dropping a column**
```sql
-- ❌ WRONG: Will error if idx_users_old_column exists
ALTER TABLE users DROP COLUMN old_column;

-- ✅ CORRECT: Drop index first, then column
DROP INDEX IF EXISTS idx_users_old_column;
ALTER TABLE users DROP COLUMN old_column;
```

**Migration workflow:**
1. Add migration to `migrations` array with next version number
2. Migrations auto-apply on server start
3. Check `schema_migrations` table to see applied versions
4. Migrations are transactional and show timing in console

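The actual runner lives in `src/db/schema.ts` and is not reproduced here. As a rough sketch of the selection step only, the hypothetical `pendingMigrations` function below picks the migrations whose versions have not been recorded yet. The `appliedVersions` argument stands in for whatever is read from `schema_migrations`; how that table is queried is left out.

```ts
// Shape taken from the migration structure shown above.
interface Migration {
  version: number;
  name: string;
  sql: string;
}

// Hypothetical helper: returns migrations not yet applied, in ascending
// version order. `appliedVersions` stands in for the versions stored in
// the `schema_migrations` table.
function pendingMigrations(
  migrations: Migration[],
  appliedVersions: Iterable<number>,
): Migration[] {
  const applied = new Set(appliedVersions);
  return migrations
    .filter((m) => !applied.has(m.version)) // filter() copies, so the input is not mutated
    .sort((a, b) => a.version - b.version);
}
```

Sorting by version means the order of entries in the array does not affect the order in which migrations would run.
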
## File Organization

- `src/index.ts`: Main server entry point with `Bun.serve()` routes
- `src/pages/*.html`: Route entry points (imported as modules)
- `src/components/*.ts`: Lit web components
- `src/styles/*.css`: Stylesheets (linked from HTML)
- `public/`: Static assets directory
- Tests: `*.test.ts` files

**Current structure example:**
```
src/
  index.ts           # Imports HTML, defines routes
  pages/
    index.html       # Imports components via <script type="module">
  components/
    counter.ts       # Lit component with @customElement
  styles/
    main.css         # Linked from HTML with <link>
```

## Naming Conventions

Follow TypeScript conventions:
- PascalCase for components and classes
- camelCase for functions and variables
- kebab-case for file names

## Development Workflow

1. Make changes to `.ts`, `.html`, or `.css` files
2. Bun's HMR automatically reloads changes
3. Write tests in `*.test.ts` files
4. Run `bun test` to verify

## IDE Setup

Biome LSP is configured in `crush.json` for linting and formatting support.

## Common Tasks

### Adding a new route
Add it to the `routes` object in the `Bun.serve()` configuration.

### Adding a new page
Create an HTML file, import it in the server, and add it to `routes`.

### Adding frontend functionality
Import TS/JS files directly from HTML using `<script type="module" src="../components/my-component.ts"></script>`. Use Lit for reactive components or vanilla JS for simple interactions. Never use React.

### Adding WebSocket support
Add a `websocket` configuration to `Bun.serve()`.

## Important Notes

1. No npm scripts needed: Bun is fast enough to run commands directly
2. Private package: `package.json` has `"private": true`
3. No build step for development: Hot reload handles everything
4. Module type: Package uses `"type": "module"` (ESM)
5. Bun types: Available via `@types/bun` (check `node_modules/bun-types/docs/**.md` for API docs)

## Gotchas

1. Don't use Node.js commands: Use `bun` instead of `node`, `npm`, `npx`, etc.
2. Don't install Express/Vite/other tools: Bun has built-in equivalents
3. NEVER EVER use React: This project is vanilla JS/TS with web components only. React is explicitly forbidden.
4. Import .ts extensions: Bun allows importing `.ts` files directly
5. No dotenv needed: Bun loads `.env` automatically
6. HTML imports are special: They trigger Bun's bundler, so don't treat them as static files
7. Bundle size matters: Always consider the size impact before adding any library

## Documentation Lookup

Use Context7 MCP for looking up official documentation for libraries and frameworks.

## Resources

- [Bun Fullstack Documentation](https://bun.com/docs/bundler/fullstack)
- [Lit Documentation](https://lit.dev/)
- [Web Components MDN](https://developer.mozilla.org/en-US/docs/Web/Web_Components)
- Bun API docs in `node_modules/bun-types/docs/**.md`

## Admin System

The application includes a role-based admin system for managing users and transcriptions.

**User roles:**
- `user` - Default role, can create and manage their own transcriptions
- `admin` - Full administrative access to all data and users

**Admin privileges:**
- View all transcriptions (with user info, status, errors)
- Delete transcriptions
- View all users (with emails, join dates, roles)
- Change user roles (user ↔ admin)
- Delete user accounts
- Access admin dashboard at `/admin`

**Making users admin:**
Use the provided script to grant admin access:
```bash
bun scripts/make-admin.ts user@example.com
```

**Admin routes:**
- `/admin` - Admin dashboard (protected by `requireAdmin` middleware)
- `/api/admin/transcriptions` - Get all transcriptions with user info
- `/api/admin/transcriptions/:id` - Delete a transcription (DELETE)
- `/api/admin/users` - Get all users
- `/api/admin/users/:id` - Delete a user account (DELETE)
- `/api/admin/users/:id/role` - Update a user's role (PUT)

**Admin UI features:**
- Statistics cards (total users, total/failed transcriptions)
- Tabbed interface (Pending Recordings / Transcriptions / Users / Classes)
- Status badges for transcription states
- Delete buttons for transcriptions with confirmation
- Role dropdown for changing user roles
- Delete buttons for user accounts with confirmation
- User avatars and info display
- Timestamp formatting
- Admin badge on user listings
- Query parameter support for direct tab navigation (`?tab=<tabname>`)

**Admin tab navigation:**
- `/admin` - Opens to default "pending" tab
- `/admin?tab=pending` - Pending recordings tab
- `/admin?tab=transcriptions` - All transcriptions tab
- `/admin?tab=users` - Users management tab
- `/admin?tab=classes` - Classes management tab
- URL updates when switching tabs (browser history support)

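The tab selection described above could be sketched roughly as follows. The tab names come from the list above, but `resolveTab` is a hypothetical helper, not the project's actual code. In the browser it would be called with `location.search`; how the page then updates the URL is not shown here.

```ts
// Tab names from the admin tab list above; the helper itself is hypothetical.
const ADMIN_TABS = ["pending", "transcriptions", "users", "classes"];

// Returns the tab named by `?tab=...` when it is valid, otherwise the fallback.
function resolveTab(search: string, validTabs: string[], fallback: string): string {
  const requested = new URLSearchParams(search).get("tab");
  return requested !== null && validTabs.includes(requested) ? requested : fallback;
}

// "/admin?tab=users" selects the users tab; a missing or unknown tab
// falls back to the default "pending" tab.
const selected = resolveTab("?tab=users", ADMIN_TABS, "pending");
```
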
**Implementation notes:**
- `role` column in users table ('user' or 'admin', default 'user')
- `requireAdmin()` middleware checks authentication + admin role
- Returns 403 if a non-admin tries to access admin routes
- Admin link shows in auth menu only for admin users
- Redirects to the home page if a non-admin opens the admin page

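The real `requireAdmin()` is not shown in this document. The sketch below only illustrates the decision it is described as making. `checkAdmin`, `SessionUser`, and the numeric `id` are hypothetical names and shapes, and answering unauthenticated requests with 401 is an assumption (only the 403 for non-admins is stated above).

```ts
type Role = "user" | "admin";

// Hypothetical shape of the user attached to a session.
interface SessionUser {
  id: number;
  role: Role;
}

// Either the admin user, or the HTTP status the route should answer with.
type AdminCheck =
  | { ok: true; user: SessionUser }
  | { ok: false; status: 401 | 403 };

// `user` is whatever the (unshown) session lookup produced; null means not logged in.
function checkAdmin(user: SessionUser | null): AdminCheck {
  if (user === null) return { ok: false, status: 401 }; // assumption: 401 when unauthenticated
  if (user.role !== "admin") return { ok: false, status: 403 }; // stated above: 403 for non-admins
  return { ok: true, user };
}
```

Keeping the decision separate from the HTTP layer makes it easy to cover in `*.test.ts` files without starting a server.
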
## Subscription System

The application uses Polar for subscription management to gate access to transcription features.

**Subscription requirement:**
- Users must have an active subscription to upload and transcribe audio files
- Users can join classes and request classes without a subscription
- Admins bypass subscription requirements

**Protected routes:**
- `POST /api/transcriptions` - Upload audio file (requires subscription or admin)
- `GET /api/transcriptions` - List user's transcriptions (requires subscription or admin)
- `GET /api/transcriptions/:id` - Get transcription details (requires subscription or admin)
- `GET /api/transcriptions/:id/audio` - Download audio file (requires subscription or admin)
- `GET /api/transcriptions/:id/stream` - Real-time transcription updates (requires subscription or admin)

**Open routes (no subscription required):**
- All authentication endpoints (`/api/auth/*`)
- Class search and joining (`/api/classes/search`, `/api/classes/join`)
- Waitlist requests (`/api/classes/waitlist`)
- Billing/subscription management (`/api/billing/*`)

**Subscription statuses:**
- `active` - Full access to transcription features
- `trialing` - Trial period, full access
- `past_due` - Payment failed but still has access (grace period)
- `canceled` - No access to transcription features
- `expired` - No access to transcription features

**Implementation:**
- `subscriptions` table tracks user subscriptions from Polar
- `hasActiveSubscription(userId)` checks for active/trialing/past_due status
- `requireSubscription()` middleware enforces subscription requirement
- `/api/auth/me` returns `has_subscription` boolean
- Webhook at `/api/webhooks/polar` receives subscription updates from Polar
- Frontend components check `has_subscription` and show subscribe prompt

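The access rule above (active, trialing, or past_due grants access; admins bypass the check) can be sketched as two pure functions. `statusGrantsAccess` and `canTranscribe` are hypothetical names; the real `hasActiveSubscription(userId)` reads from the `subscriptions` table, which is left out here.

```ts
// Statuses listed above.
type SubscriptionStatus = "active" | "trialing" | "past_due" | "canceled" | "expired";

// Statuses that still grant access to transcription features.
const ACCESS_STATUSES: ReadonlySet<SubscriptionStatus> = new Set<SubscriptionStatus>([
  "active",
  "trialing",
  "past_due",
]);

// Pure stand-in for the status check; null means the user has no subscription row.
function statusGrantsAccess(status: SubscriptionStatus | null): boolean {
  return status !== null && ACCESS_STATUSES.has(status);
}

// Admins bypass the subscription requirement.
function canTranscribe(role: "user" | "admin", status: SubscriptionStatus | null): boolean {
  return role === "admin" || statusGrantsAccess(status);
}
```
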
**User settings with query parameters:**
- Settings page supports `?tab=<tabname>` query parameter to open specific tabs
- Valid tabs: `account`, `sessions`, `passkeys`, `billing`, `danger`
- Example: `/settings?tab=billing` opens the billing tab directly
- Subscribe prompts link to `/settings?tab=billing` for direct access
- URL updates when switching tabs (browser history support)

**Testing subscriptions:**
Manually add a test subscription to the database:
```sql
INSERT INTO subscriptions (id, user_id, customer_id, status)
VALUES ('test-sub', <user_id>, 'test-customer', 'active');
```

## Transcription Service Integration (Murmur)

The application uses [Murmur](https://github.com/taciturnaxolotl/murmur) as the transcription backend.

**Murmur API endpoints:**
- `POST /transcribe` - Upload audio file and create transcription job
- `GET /transcribe/:job_id` - Get job status and transcript (supports `?format=json|vtt`)
- `GET /transcribe/:job_id/stream` - Stream real-time progress via Server-Sent Events
- `GET /jobs` - List all jobs (newest first)
- `DELETE /transcribe/:job_id` - Delete a job from Murmur's database

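As an illustration of how a status URL for these endpoints might be assembled, here is a small hypothetical helper. `jobStatusUrl` is not part of the codebase as documented, defaulting `format` to `json` is an assumption, and the base URL is assumed to be an origin without a path prefix. In the real service the base would come from `WHISPER_SERVICE_URL`.

```ts
// Default base URL, matching the documented default for WHISPER_SERVICE_URL.
const DEFAULT_WHISPER_URL = "http://localhost:8000";

// Builds the URL for GET /transcribe/:job_id with the ?format=json|vtt option.
function jobStatusUrl(
  jobId: string,
  format: "json" | "vtt" = "json", // assumption: json when unspecified
  baseUrl: string = DEFAULT_WHISPER_URL,
): string {
  // encodeURIComponent keeps an unexpected job ID from altering the path
  const url = new URL(`/transcribe/${encodeURIComponent(jobId)}`, baseUrl);
  url.searchParams.set("format", format);
  return url.toString();
}
```
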
**Job synchronization:**
The `TranscriptionService` runs periodic syncs to reconcile state between our database and Murmur:
- Reconnects to active jobs on server restart
- Syncs status updates for processing/transcribing jobs
- Handles completed jobs (fetches VTT, cleans transcript, saves to storage)
- **Cleans up finished jobs** - After successful completion or failure, jobs are deleted from Murmur
- **Cleans up orphaned jobs** - Jobs found in Murmur but not in our database are automatically deleted

**Job cleanup:**
- **Completed jobs**: After fetching transcript and saving to storage, the job is deleted from Murmur
- **Failed jobs**: After recording the error in our database, the job is deleted from Murmur
- **Orphaned jobs**: Jobs in Murmur but not in our database are deleted on discovery
- All deletions use `DELETE /transcribe/:job_id`
- This prevents Murmur's database from accumulating stale jobs (Murmur doesn't have automatic cleanup)
- Logs success/failure of deletion attempts for monitoring

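The orphan cleanup described above amounts to a set difference followed by one delete per orphan. The sketch below is not the `TranscriptionService` implementation: `findOrphanedJobs` and `cleanupOrphans` are hypothetical, job IDs are treated as plain strings, and the delete call is injected so the actual `DELETE /transcribe/:job_id` request stays out of the example.

```ts
// Returns the IDs that Murmur reports but our database does not know about.
function findOrphanedJobs(murmurJobIds: string[], knownJobIds: Iterable<string>): string[] {
  const known = new Set(knownJobIds);
  return murmurJobIds.filter((id) => !known.has(id));
}

// Deletes each orphan through the injected `deleteJob` and collects the
// per-job outcome, so the caller can log successes and failures.
async function cleanupOrphans(
  orphanIds: string[],
  deleteJob: (id: string) => Promise<void>,
): Promise<{ deleted: string[]; failed: string[] }> {
  const deleted: string[] = [];
  const failed: string[] = [];
  for (const id of orphanIds) {
    try {
      await deleteJob(id);
      deleted.push(id);
    } catch {
      failed.push(id); // one failed delete does not stop the remaining ones
    }
  }
  return { deleted, failed };
}
```
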
**Job lifecycle:**
1. User uploads audio → creates transcription in our DB with `status='uploading'`
2. Audio uploaded to Murmur → get `whisper_job_id`, update to `status='processing'`
3. Murmur transcribes → stream progress updates, update to `status='transcribing'`
4. Job completes → fetch VTT, clean with LLM, save transcript, update to `status='completed'`, **delete from Murmur**
5. If job fails in Murmur → update to `status='failed'` with error message, **delete from Murmur**

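The lifecycle above can be read as a small state machine. The transition table below is a sketch built from the five listed statuses, not the project's code. Allowing every non-terminal status to move to `failed` is an assumption (the lifecycle only mentions failures inside Murmur), and both function names are hypothetical.

```ts
type JobStatus = "uploading" | "processing" | "transcribing" | "completed" | "failed";

// Allowed next statuses. Letting any non-terminal status move to "failed"
// is an assumption made for this sketch.
const TRANSITIONS: Record<JobStatus, readonly JobStatus[]> = {
  uploading: ["processing", "failed"],
  processing: ["transcribing", "failed"],
  transcribing: ["completed", "failed"],
  completed: [],
  failed: [],
};

function canTransition(from: JobStatus, to: JobStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// Per steps 4 and 5 above, completed and failed jobs are deleted from Murmur.
function shouldDeleteFromMurmur(status: JobStatus): boolean {
  return status === "completed" || status === "failed";
}
```
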
**Configuration:**
Set `WHISPER_SERVICE_URL` in `.env` (default: `http://localhost:8000`)

## Future Additions

As the codebase grows, document:
- Database schema and migrations
- API endpoint patterns
- Authentication/authorization approach
- Deployment process
- Environment variables needed