# Backlog PRD: Platform Improvements & Technical Debt

**Owner:** Platform Team
**Last Updated:** 2025-10-11

Miscellaneous platform improvements, bug fixes, and technical debt that don't fit into feature-specific PRDs.

## 🔴 P0: Critical Security

### did:web Domain Verification
**Added:** 2025-10-11 | **Effort:** 2-3 days | **Severity:** Medium

**Problem:** Self-hosters can set `INSTANCE_DID=did:web:nintendo.com` without owning the domain, which enables domain impersonation attacks (e.g., `mario.communities.nintendo.com` served from a malicious instance).

**Solution:** Implement did:web verification per the [atProto spec](https://atproto.com/specs/did-web). On startup, fetch `https://<domain>/.well-known/did.json` and verify that the document's `id` matches the claimed DID. Add `SKIP_DID_WEB_VERIFICATION=true` for dev mode.

- ✅ Default changed from `coves.local` → `coves.social` (fixes `.local` TLD bug)
- ✅ TODO comment in [cmd/server/main.go:126-131](../cmd/server/main.go#L126-L131)
- ⚠️ Verification not implemented

### Token Refresh Logic for Community Credentials
**Added:** 2025-10-11 | **Effort:** 1-2 days

**Problem:** Community PDS access tokens expire after roughly 2 hours. Once a token expires, community updates fail until someone intervenes manually.

**Solution:** Refresh tokens automatically before PDS operations: parse the JWT `exp` claim, use the refresh token once the access token has expired, and store the new tokens in the DB.

## 🟢 P2: Nice-to-Have

### Improve .local TLD Error Messages
**Added:** 2025-10-11 | **Effort:** 1 hour

**Problem:** The generic error "TLD .local is not allowed" confuses developers because it does not say where the `.local` domain came from.

**Solution:** Enhance `InvalidHandleError` to explain the root cause and suggest fixing `INSTANCE_DID`.

### Self-Hosting Security Guide
**Added:** 2025-10-11 | **Effort:** 1 day

**Needed:** Document did:web setup, DNS config, secrets management, rate limiting, PostgreSQL hardening, monitoring.

### OAuth Session Cleanup Race Condition
**Added:** 2025-10-11 | **Effort:** 2 hours

**Problem:** The cleanup goroutine doesn't handle graceful shutdown and may orphan DB connections.

**Solution:** Pass a cancellable context, handle SIGTERM, and add a cleanup timeout.

## 🔵 P3: Technical Debt

### Consolidate Environment Variable Validation
**Added:** 2025-10-11 | **Effort:** 2-3 hours

Create an `internal/config` package with structured config validation. Fail fast at startup with clear errors.

### Add Connection Pooling for PDS HTTP Clients
**Added:** 2025-10-11 | **Effort:** 2 hours

Create a shared `http.Client` with connection pooling instead of constructing a new client per request.

### Architecture Decision Records (ADRs)
**Added:** 2025-10-11 | **Effort:** Ongoing

Document: did:plc choice, pgcrypto encryption, Jetstream vs firehose, write-forward pattern, single handle field.

## Recent Completions

### ✅ Fix .local TLD Bug (2025-10-11)
Changed the default `INSTANCE_DID` from `did:web:coves.local` to `did:web:coves.social`, which fixed the community creation failure caused by the disallowed `.local` TLD.

- **P0:** Security vulns, data loss, prod blockers
- **P1:** Major UX/reliability issues
- **P2:** QOL improvements, minor bugs, docs
- **P3:** Refactoring, code quality