🪻 distributed transcription service thistle.dunkirk.sh

feat: add Whisper transcription service

- Add faster-whisper Python service with SSE streaming
- Support for multiple audio formats (MP3, WAV, M4A, etc.)
- SQLite-based job tracking and progress updates
- Add setup instructions to README

💘 Generated with Crush

Co-Authored-By: Crush <crush@charm.land>

dunkirk.sh ac430290 38c106d5

verified
Changed files: +363 -5

README.md (+36 -5)
···
```bash
 .
 ├── public
-└── src
-    ├── components
-    ├── pages
-    └── styles
+├── src
+│   ├── components
+│   ├── pages
+│   └── styles
+└── whisper-server
+    ├── main.py
+    ├── requirements.txt
+    └── README.md

-6 directories
+9 directories, 3 files
```
## What's this?
···
Your server will be running at `http://localhost:3000` with hot module reloading. Just edit any `.ts`, `.html`, or `.css` file and watch it update in the browser.

### Transcription Service

Thistle requires a separate Whisper transcription server for audio processing. Set it up in the `whisper-server/` directory:

```bash
cd whisper-server
./run.sh
```

Or manually:

```bash
cd whisper-server
pip install -r requirements.txt
python main.py
```

The Whisper server will run on `http://localhost:8000`. Make sure it's running before using transcription features.
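
For a quick check that the transcription server is actually reachable before you rely on it, here is a minimal sketch using only the Python standard library (it calls the Whisper server's `/jobs` endpoint, which returns an empty list on a fresh install):

```python
# Minimal reachability check for the Whisper transcription server.
# Assumes the default address http://localhost:8000 used throughout this README.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:8000/jobs") as resp:
    print(resp.status, json.loads(resp.read()))  # expect: 200 {'jobs': []}
```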

### Environment Setup

Copy `.env.example` to `.env` and configure:

```bash
cp .env.example .env
# Edit .env to set WHISPER_SERVICE_URL=http://localhost:8000
```
The tech stack is pretty minimal on purpose. Lit components (~8-10KB gzipped) for things that need reactivity, vanilla JS for simple stuff, and CSS variables for theming. The goal is to keep the total JS bundle as small as possible.
whisper-server/README.md (+86)
···
# Whisper Transcription Server

This is a FastAPI server that provides real-time audio transcription using the faster-whisper library.

## Features

- Real-time transcription with streaming progress updates
- Supports multiple audio formats (MP3, WAV, M4A, etc.)
- Language detection
- Segment-based transcription with timestamps
- RESTful API endpoints for job tracking

## Setup

### 1. Install Dependencies

```bash
pip install -r requirements.txt
```

### 2. Run the Server

**Option 1: Manual setup**

```bash
pip install -r requirements.txt
python main.py
```

**Option 2: Quick start script**

```bash
./run.sh
```

The server will start on `http://localhost:8000` and load the Whisper model (this may take a few minutes on first run).
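
Much of that first-run delay is faster-whisper downloading the model weights. If you want to warm the cache ahead of time, here is a minimal sketch using the same defaults as `main.py`:

```python
# Pre-download the Whisper model so the server starts quickly later.
# Uses the same settings as main.py: the "small" model on CPU with int8 compute.
from faster_whisper import WhisperModel

WhisperModel("small", device="cpu", compute_type="int8")
print("Model cached locally; main.py will reuse it on startup.")
```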

## API Usage

### POST `/transcribe`

Upload an audio file to start a transcription job. The server responds immediately with a job ID and transcribes in the background.

**Example with curl:**

```bash
curl -X POST "http://localhost:8000/transcribe" \
  -F "file=@/path/to/your/audio.mp3"
```

**Response:** a JSON object with the job ID, e.g. `{"job_id": "<uuid>"}`.

### GET `/transcribe/{job_id}/stream`

Stream the job's status and progress as Server-Sent Events. Each `message` event carries a JSON payload, and the stream closes once the job reaches `completed` or `failed`:

```json
{"status": "processing", "progress": 25.59, "transcript": "This is a test of the transcription server."}
{"status": "processing", "progress": 57.68, "transcript": "This is a test of the transcription server. It should be streaming the results back in real time."}
{"status": "completed", "progress": 100.0, "transcript": "This is a test of the transcription server. It should be streaming the results back in real time."}
```

### GET `/transcribe/{job_id}`

Fetch the current status, progress, and transcript for a job in a single response. `GET /jobs` lists all jobs, and `DELETE /transcribe/{job_id}` removes one.

### Response Format

The `status` field takes one of four values:

- `pending`: Job accepted but not yet started
- `processing`: Transcription in progress; `progress` is a percentage and `transcript` grows as segments are completed
- `completed`: Transcription finished successfully
- `failed`: An error occurred during transcription; see `error_message`
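
To get a feel for consuming the stream outside the browser, here is a minimal SSE client sketch in Python. It assumes the third-party `requests` package (a client-side dependency, not part of this server's `requirements.txt`) and a `job_id` returned by `POST /transcribe`:

```python
# Follow a transcription job via the SSE stream endpoint.
# Replace the job_id placeholder with the value returned by POST /transcribe.
import json
import requests

job_id = "<job-id-from-POST-/transcribe>"
url = f"http://localhost:8000/transcribe/{job_id}/stream"

with requests.get(url, stream=True) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        # SSE frames arrive as "event: ..." / "data: {...}" lines separated by blanks.
        if line and line.startswith("data:"):
            payload = json.loads(line[len("data:"):].strip())
            print(payload.get("status"), payload.get("progress"), payload.get("transcript", ""))
```

In the browser, the same stream can be consumed with the built-in `EventSource` API; since the server closes the connection when the job finishes, the loop above simply ends.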

## Configuration

You can modify the model settings in `main.py`:

```python
model_size = "small"  # Options: tiny, base, small, medium, large-v1, large-v2, large-v3
model = WhisperModel(model_size, device="cpu", compute_type="int8")
```

For GPU acceleration, change to:

```python
model = WhisperModel(model_size, device="cuda", compute_type="float16")
```
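
If you would rather not edit the source to switch models, one possible tweak (not something `main.py` currently does; the `WHISPER_MODEL_SIZE` variable name is made up here for illustration) is to read the model size from the environment:

```python
# Hypothetical variant of the model setup in main.py:
# pick the model size from an environment variable, defaulting to "small".
import os
from faster_whisper import WhisperModel

model_size = os.environ.get("WHISPER_MODEL_SIZE", "small")
model = WhisperModel(model_size, device="cpu", compute_type="int8")
```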

## Integration with Thistle

This server is designed to work with the Thistle web application. Set the `WHISPER_SERVICE_URL` environment variable in Thistle to point to this server.

```bash
# In Thistle's .env file
WHISPER_SERVICE_URL=http://localhost:8000
```
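
For reference, the full round trip from a client's point of view looks roughly like the sketch below. It assumes the third-party `requests` package and a placeholder `audio.mp3` path; the endpoints and status values are the ones defined in `main.py`:

```python
# Upload an audio file, then poll the job until it finishes.
import os
import time
import requests

base_url = os.environ.get("WHISPER_SERVICE_URL", "http://localhost:8000")

with open("audio.mp3", "rb") as f:  # placeholder path
    job_id = requests.post(f"{base_url}/transcribe", files={"file": f}).json()["job_id"]

while True:
    job = requests.get(f"{base_url}/transcribe/{job_id}").json()
    print(job["status"], job["progress"])
    if job["status"] in ("completed", "failed"):
        print(job["transcript"] or job["error_message"])
        break
    time.sleep(1)
```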
whisper-server/main.py (+223)
···
```python
import os
import json
import tempfile
import asyncio
import sqlite3
import time
import uuid

from faster_whisper import WhisperModel
from fastapi import FastAPI, UploadFile, File, HTTPException
from sse_starlette.sse import EventSourceResponse

# --- 1. Load Model on Startup ---
# This loads the model only once, not on every request
print("--- Loading faster-whisper model... ---")
model_size = "small"
# You can change this to "cuda" and "float16" if you have a GPU
model = WhisperModel(model_size, device="cpu", compute_type="int8")
print(f"--- Model '{model_size}' loaded. Server is ready. ---")

# --- 2. Set Up the Database for Job Tracking ---
db_path = "./whisper.db"  # Independent DB for the Whisper server
# check_same_thread=False lets the background transcription thread share this connection
db = sqlite3.connect(db_path, check_same_thread=False)
db.execute("""
    CREATE TABLE IF NOT EXISTS whisper_jobs (
        id TEXT PRIMARY KEY,
        status TEXT DEFAULT 'pending',
        progress REAL DEFAULT 0,
        transcript TEXT DEFAULT '',
        error_message TEXT DEFAULT '',
        created_at INTEGER,
        updated_at INTEGER
    )
""")
db.commit()

# --- 3. Create the FastAPI App ---
app = FastAPI(title="Whisper Transcription Server with Progress")


# --- 4. Define the Transcription Function ---
# Runs in a background thread and reports progress by updating the DB
def run_transcription(job_id: str, temp_file_path: str):
    try:
        # 1. Mark the job as processing
        db.execute("UPDATE whisper_jobs SET status = 'processing', updated_at = ? WHERE id = ?", (int(time.time()), job_id))
        db.commit()

        # 2. Get segments and total audio duration
        segments, info = model.transcribe(
            temp_file_path,
            beam_size=5,
            vad_filter=True
        )

        total_duration = round(info.duration, 2)
        print(f"Job {job_id}: Total audio duration: {total_duration}s")
        print(f"Job {job_id}: Detected language: {info.language}")

        transcript = ""

        # 3. Process each segment as the generator yields it
        for segment in segments:
            progress_percent = (segment.end / total_duration) * 100
            transcript += segment.text.strip() + " "

            db.execute("""
                UPDATE whisper_jobs SET progress = ?, transcript = ?, updated_at = ? WHERE id = ?
            """, (round(progress_percent, 2), transcript.strip(), int(time.time()), job_id))
            db.commit()

        # 4. Mark the job as completed
        db.execute("UPDATE whisper_jobs SET status = 'completed', progress = 100, updated_at = ? WHERE id = ?", (int(time.time()), job_id))
        db.commit()

    except Exception as e:
        db.execute("UPDATE whisper_jobs SET status = 'failed', error_message = ?, updated_at = ? WHERE id = ?", (str(e), int(time.time()), job_id))
        db.commit()

    finally:
        # Clean up the temp file
        print(f"Job {job_id}: Cleaning up temp file: {temp_file_path}")
        os.remove(temp_file_path)


# --- 5. Define the FastAPI Endpoints ---
@app.post("/transcribe")
async def transcribe_endpoint(file: UploadFile = File(...)):
    """
    Accepts an audio file, starts transcription in the background, and returns a job ID.
    """
    # Generate a job ID
    job_id = str(uuid.uuid4())

    # Save the uploaded file to a temporary file
    with tempfile.NamedTemporaryFile(delete=False, suffix=".tmp") as temp_file:
        while content := await file.read(1024 * 1024):
            temp_file.write(content)
        temp_file_path = temp_file.name

    print(f"Job {job_id}: File saved to temporary path: {temp_file_path}")

    # Create the job in the DB
    db.execute("INSERT INTO whisper_jobs (id, created_at, updated_at) VALUES (?, ?, ?)", (job_id, int(time.time()), int(time.time())))
    db.commit()

    # Start transcription in a background thread without blocking the event loop
    asyncio.create_task(asyncio.to_thread(run_transcription, job_id, temp_file_path))

    return {"job_id": job_id}


@app.get("/transcribe/{job_id}/stream")
async def stream_transcription_status(job_id: str):
    """
    Stream the status and progress of a transcription job via SSE.
    """
    async def event_generator():
        last_updated_at = None

        while True:
            row = db.execute("""
                SELECT status, progress, transcript, error_message, updated_at
                FROM whisper_jobs
                WHERE id = ?
            """, (job_id,)).fetchone()

            if not row:
                yield {
                    "event": "error",
                    "data": json.dumps({"error": "Job not found"})
                }
                return

            status, progress, transcript, error_message, updated_at = row

            # Only send an event if the row changed since the last poll
            if updated_at != last_updated_at:
                last_updated_at = updated_at

                data = {
                    "status": status,
                    "progress": progress,
                }

                # Include the transcript and error message only when present
                if transcript:
                    data["transcript"] = transcript

                if error_message:
                    data["error_message"] = error_message

                yield {
                    "event": "message",
                    "data": json.dumps(data)
                }

            # Close the stream once the job is complete or failed
            if status in ('completed', 'failed'):
                return

            # Poll every 500ms
            await asyncio.sleep(0.5)

    return EventSourceResponse(event_generator())


@app.get("/transcribe/{job_id}")
def get_transcription_status(job_id: str):
    """
    Get the status and progress of a transcription job.
    """
    row = db.execute("SELECT status, progress, transcript, error_message FROM whisper_jobs WHERE id = ?", (job_id,)).fetchone()
    if not row:
        raise HTTPException(status_code=404, detail="Job not found")

    status, progress, transcript, error_message = row
    return {
        "status": status,
        "progress": progress,
        "transcript": transcript,
        "error_message": error_message
    }


@app.get("/jobs")
def list_jobs():
    """
    List all jobs with their current status. Used for recovery/sync.
    """
    rows = db.execute("""
        SELECT id, status, progress, created_at, updated_at
        FROM whisper_jobs
        ORDER BY created_at DESC
    """).fetchall()

    jobs = []
    for row in rows:
        jobs.append({
            "id": row[0],
            "status": row[1],
            "progress": row[2],
            "created_at": row[3],
            "updated_at": row[4]
        })

    return {"jobs": jobs}


@app.delete("/transcribe/{job_id}")
def delete_job(job_id: str):
    """
    Delete a job from the database. Used for cleanup.
    """
    result = db.execute("DELETE FROM whisper_jobs WHERE id = ?", (job_id,))
    db.commit()

    if result.rowcount == 0:
        raise HTTPException(status_code=404, detail="Job not found")

    return {"success": True}


if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```
whisper-server/requirements.txt (+4)
···
```
fastapi[all]==0.115.6
uvicorn[standard]==0.32.1
faster-whisper==1.1.1
sse-starlette==2.2.1
```
whisper-server/run.sh (+14)
···
```bash
#!/bin/bash

# Quick script to run the Whisper transcription server

echo "Setting up Whisper transcription server..."
echo "Installing dependencies..."
pip3 install -r requirements.txt

echo ""
echo "Starting Whisper server on http://localhost:8000"
echo "Press Ctrl+C to stop"
echo ""

python3 main.py
```
whisper-server/whisper.db

This is a binary file and will not be displayed.