A community based topic aggregation platform built on atproto
# Backlog PRD: Platform Improvements & Technical Debt

**Status:** Ongoing
**Owner:** Platform Team
**Last Updated:** 2025-10-17

## Overview

Miscellaneous platform improvements, bug fixes, and technical debt that don't fit into feature-specific PRDs.

---

## 🔴 P0: Critical (Alpha Blockers)

### OAuth DPoP Token Architecture - Voting Write-Forward
**Added:** 2025-11-02 | **Completed:** 2025-11-02 | **Effort:** 2 hours | **Priority:** ALPHA BLOCKER
**Status:** ✅ COMPLETE

**Problem:**
Our backend is attempting to use DPoP-bound OAuth tokens to write votes to users' PDSs, causing "Malformed token" errors. This violates atProto architecture patterns.

**Current (Incorrect) Flow:**
```
Mobile Client (OAuth + DPoP) → Coves Backend → User's PDS ❌
                                                   ↓
                                        "Malformed token" error
```

**Root Cause:**
- Mobile app uses OAuth with DPoP (Demonstrating Proof of Possession)
- DPoP tokens are cryptographically bound to client's private key via `cnf.jkt` claim
- Each PDS request requires **both**:
  - `Authorization: DPoP <token>` (DPoP-bound tokens use the `DPoP` scheme, not `Bearer`)
  - `DPoP: <signed-proof-jwt>` (signature proves client has private key)
- Backend cannot create DPoP proofs (doesn't have client's private key)
- **DPoP tokens are intentionally non-transferable** (security feature to prevent token theft)

**Evidence:**
```json
// Token decoded from mobile app session
{
  "sub": "did:plc:txrork7rurdueix27ulzi7ke",
  "cnf": {
    "jkt": "LSWROJhTkPn4yT18xUjiIz2Z7z7l_gozKfjjQTYgW9o" // ← DPoP binding
  },
  "client_id": "https://lingering-darkness-50a6.brettmay0212.workers.dev/client-metadata.json",
  "iss": "http://localhost:3001"
}
```

**atProto Best Practice (from Bluesky social-app analysis):**
- ✅ Clients write **directly to their own PDS** (no backend proxy)
- ✅ AppView **only indexes** from Jetstream (eventual consistency)
- ✅ PDS = User's personal data store (user controls writes)
- ✅ AppView = Read-only aggregator/indexer
- ❌ Backend should NOT proxy user write operations

**Correct Architecture:**
```
Mobile Client → User's PDS (direct write with DPoP proof) ✓
                     ↓
            Jetstream (firehose)
                     ↓
   Coves AppView (indexes votes from firehose)
```

**Affected Endpoints:**
1. **Vote Creation** - [create_vote.go:76](../internal/api/handlers/vote/create_vote.go#L76)
   - Currently: Backend writes to PDS using user's token
   - Should: Return error directing client to write directly

2. **Vote Service** - [service.go:126](../internal/core/votes/service.go#L126)
   - Currently: `createRecordOnPDSAs()` attempts write-forward
   - Should: Remove write-forward, rely on Jetstream indexing only

**Solution Options:**

**Option A: Client Direct Write (RECOMMENDED - Follows Bluesky)**
```typescript
// Mobile client writes directly (like Bluesky social-app)
const agent = new Agent(oauthSession)
await agent.call('com.atproto.repo.createRecord', {
  repo: userDid,
  collection: 'social.coves.interaction.vote',
  record: {
    $type: 'social.coves.interaction.vote',
    subject: { uri: postUri, cid: postCid },
    direction: 'up',
    createdAt: new Date().toISOString()
  }
})
```

Backend changes:
- Remove write-forward code from vote service
- Return error from XRPC endpoint: "Votes must be created directly at your PDS"
- Index votes from Jetstream consumer (already implemented)

**Option B: Backend App Passwords (NOT RECOMMENDED)**
- User creates app-specific password
- Backend uses password auth (gets regular JWTs, not DPoP)
- Security downgrade, poor UX

**Option C: Service Auth Token (Complex)**
- Backend gets its own service credentials
- Requires PDS to trust our AppView as delegated writer
- Non-standard atProto pattern

**Recommendation:** Option A (Client Direct Write)
- Matches atProto architecture
- Follows Bluesky social-app pattern
- Best security (user controls their data)
- Simplest implementation

**Implementation Tasks:**
1. Update Flutter OAuth package to expose `agent.call()` for custom lexicons
2. Update mobile vote UI to write directly to PDS
3. Remove write-forward code from backend vote service
4. Update vote XRPC handler to return helpful error message
5. Verify Jetstream consumer correctly indexes votes
6. Update integration tests to match new flow

**References:**
- Bluesky social-app: Direct PDS writes via agent
- atProto OAuth spec: DPoP binding prevents token reuse
- atProto architecture: AppView = read-only indexer

---

### OAuth DPoP Token Architecture - Community Subscriptions
**Added:** 2025-11-02 | **Effort:** 1-2 hours | **Priority:** ALPHA BLOCKER
**Status:** 📋 TODO (Waiting for frontend implementation)

**Problem:**
Same DPoP token issue as voting - backend cannot use user's DPoP-bound OAuth tokens to write subscription records to user's PDS.

**Affected Operations:**
- `SubscribeToCommunity()` - [service.go:564-624](../internal/core/communities/service.go#L564-L624)
- `UnsubscribeFromCommunity()` - [service.go:626-660](../internal/core/communities/service.go#L626-L660)

**Collection:** `social.coves.community.subscription`

**Solution:**
Client writes directly using `com.atproto.repo.createRecord`:
```typescript
await agent.call('com.atproto.repo.createRecord', {
  repo: userDid,
  collection: 'social.coves.community.subscription',
  record: {
    $type: 'social.coves.community.subscription',
    subject: communityDid,
    contentVisibility: 3,
    createdAt: new Date().toISOString()
  }
})
```

**Backend Changes Needed:**
1. Remove write-forward from `SubscribeToCommunity()` and `UnsubscribeFromCommunity()`
2. Update handlers to return errors directing to client-direct pattern
3. Verify Jetstream consumer indexes subscriptions (already working)

**Files to Modify:**
- `internal/core/communities/service.go`
- `internal/api/handlers/community/subscribe.go`

---

### OAuth DPoP Token Architecture - Community Blocking
**Added:** 2025-11-02 | **Effort:** 1-2 hours | **Priority:** ALPHA BLOCKER
**Status:** 📋 TODO (Waiting for frontend implementation)

**Problem:**
Same DPoP token issue - backend cannot use user's DPoP-bound OAuth tokens to write block records to user's PDS.

**Affected Operations:**
- `BlockCommunity()` - [service.go:709-781](../internal/core/communities/service.go#L709-L781)
- `UnblockCommunity()` - [service.go:783-816](../internal/core/communities/service.go#L783-L816)

**Collection:** `social.coves.community.block`

**Solution:**
Client writes directly using `com.atproto.repo.createRecord`:
```typescript
await agent.call('com.atproto.repo.createRecord', {
  repo: userDid,
  collection: 'social.coves.community.block',
  record: {
    $type: 'social.coves.community.block',
    subject: communityDid,
    createdAt: new Date().toISOString()
  }
})
```

**Backend Changes Needed:**
1. Remove write-forward from `BlockCommunity()` and `UnblockCommunity()`
2. Update handlers to return errors directing to client-direct pattern
3. Verify Jetstream consumer indexes blocks (already working)

**Files to Modify:**
- `internal/core/communities/service.go`
- `internal/api/handlers/community/block.go`

---

## 🟡 P1: Important (Alpha Blockers)

### at-identifier Handle Resolution in Endpoints
**Added:** 2025-10-18 | **Effort:** 2-3 hours | **Priority:** ALPHA BLOCKER

**Problem:**
Current implementation rejects handles in endpoints that declare `"format": "at-identifier"` in their lexicon schemas, violating atProto best practices and breaking legitimate client usage.
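
To make the contract concrete: an `at-identifier` field may carry either a DID or a handle, so a check for the `did:` prefix alone rejects half of the valid inputs. The Go sketch below only illustrates that distinction. `classifyIdentifier` is a hypothetical helper, not code from the Coves repo, and stripping a leading `!` assumes the community-handle form used in this document's own example. It checks shape only; actual resolution and verification would still go through the service layer.

```go
package main

import (
	"fmt"
	"strings"
)

// classifyIdentifier reports whether an at-identifier looks like a DID or a
// handle. Hypothetical helper for illustration: it inspects the shape only and
// performs no resolution or bidirectional verification.
func classifyIdentifier(id string) (string, error) {
	id = strings.TrimSpace(id)
	switch {
	case id == "":
		return "", fmt.Errorf("empty identifier")
	case strings.HasPrefix(id, "did:"):
		return "did", nil
	// Assumption: community handles may carry a leading "!" and must
	// contain at least one dot, like a domain name.
	case strings.Contains(strings.TrimPrefix(id, "!"), "."):
		return "handle", nil
	default:
		return "", fmt.Errorf("not a DID or handle: %q", id)
	}
}

func main() {
	inputs := []string{
		"did:plc:txrork7rurdueix27ulzi7ke",
		"!gardening.communities.coves.social",
		"nonsense",
	}
	for _, id := range inputs {
		kind, err := classifyIdentifier(id)
		fmt.Println(id, kind, err)
	}
}
```

A handler that only accepts the first case is the bug described above; both the `did` and `handle` cases should proceed to resolution.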

**Impact:**
- ❌ Post creation fails when client sends community handle (e.g., `!gardening.communities.coves.social`)
- ❌ Subscribe/unsubscribe endpoints reject handles despite lexicon declaring `at-identifier`
- ❌ Block endpoints use `"format": "did"` but should use `at-identifier` for consistency
- 🔴 **P0 Issue:** API contract violation - clients following the schema are rejected

**Root Cause:**
Handlers and services validate `strings.HasPrefix(req.Community, "did:")` instead of calling `ResolveCommunityIdentifier()`.

**Affected Endpoints:**
1. **Post Creation** - [create.go:54](../internal/api/handlers/post/create.go#L54), [service.go:51](../internal/core/posts/service.go#L51)
   - Lexicon declares `at-identifier`: [post/create.json:16](../internal/atproto/lexicon/social/coves/post/create.json#L16)

2. **Subscribe** - [subscribe.go:52](../internal/api/handlers/community/subscribe.go#L52)
   - Lexicon declares `at-identifier`: [subscribe.json:16](../internal/atproto/lexicon/social/coves/community/subscribe.json#L16)

3. **Unsubscribe** - [subscribe.go:120](../internal/api/handlers/community/subscribe.go#L120)
   - Lexicon declares `at-identifier`: [unsubscribe.json:16](../internal/atproto/lexicon/social/coves/community/unsubscribe.json#L16)

4. **Block/Unblock** - [block.go:58](../internal/api/handlers/community/block.go#L58), [block.go:132](../internal/api/handlers/community/block.go#L132)
   - Lexicon declares `"format": "did"`: [block.json:15](../internal/atproto/lexicon/social/coves/community/block.json#L15)
   - Should be changed to `at-identifier` for consistency and best practice

**atProto Best Practice (from docs):**
- ✅ API endpoints should accept both DIDs and handles via `at-identifier` format
- ✅ Resolve handles to DIDs immediately at API boundary
- ✅ Use DIDs internally for all business logic and storage
- ✅ Handles are weak refs (changeable), DIDs are strong refs (permanent)
- ⚠️ Bidirectional verification required (already handled by `identity.CachingResolver`)

**Solution:**
Replace direct DID validation with handle resolution using existing `ResolveCommunityIdentifier()`:

```go
// BEFORE (wrong) ❌
if !strings.HasPrefix(req.Community, "did:") {
    return // error response: every handle is rejected here
}

// AFTER (correct) ✅
communityDID, err := h.communityService.ResolveCommunityIdentifier(ctx, req.Community)
if err != nil {
    if communities.IsNotFound(err) {
        writeError(w, http.StatusNotFound, "CommunityNotFound", "Community not found")
        return
    }
    writeError(w, http.StatusBadRequest, "InvalidRequest", err.Error())
    return
}
// Now use communityDID (guaranteed to be a DID)
```

**Implementation Plan:**
1. **Phase 1 (Alpha Blocker):** Fix post creation endpoint
   - Update handler validation in `internal/api/handlers/post/create.go`
   - Update service validation in `internal/core/posts/service.go`
   - Add integration tests for handle resolution in post creation

2. 📋 **Phase 2 (Beta):** Fix subscription endpoints
   - Update subscribe/unsubscribe handlers
   - Add tests for handle resolution in subscriptions

3. 📋 **Phase 3 (Beta):** Fix block endpoints
   - Update lexicon from `"format": "did"` → `"format": "at-identifier"`
   - Update block/unblock handlers
   - Add tests for handle resolution in blocking

**Files to Modify (Phase 1 - Post Creation):**
- `internal/api/handlers/post/create.go` - Remove DID validation, add handle resolution
- `internal/core/posts/service.go` - Remove DID validation, add handle resolution
- `internal/core/posts/interfaces.go` - Add `CommunityService` dependency
- `cmd/server/main.go` - Pass community service to post service constructor
- `tests/integration/post_creation_test.go` - Add handle resolution test cases

**Existing Infrastructure:**
- ✅ `ResolveCommunityIdentifier()` already implemented at [service.go:843](../internal/core/communities/service.go#L843)
- ✅ `identity.CachingResolver` handles bidirectional verification and caching
- ✅ Supports both handle (`!name.communities.instance.com`) and DID formats

**Current Status:**
- ⚠️ **BLOCKING POST CREATION PR**: Identified as P0 issue in code review
- 📋 Phase 1 (post creation) - To be implemented immediately
- 📋 Phase 2-3 (other endpoints) - Deferred to Beta

---

### did:web Domain Verification & hostedByDID Auto-Population
**Added:** 2025-10-11 | **Updated:** 2025-10-16 | **Effort:** 2-3 days | **Priority:** ALPHA BLOCKER

**Problem:**
1. **Domain Impersonation**: Self-hosters can set `INSTANCE_DID=did:web:nintendo.com` without owning the domain, enabling attacks where communities appear hosted by trusted domains
2. **hostedByDID Spoofing**: Malicious instance operators can modify source code to claim communities are hosted by domains they don't own, enabling reputation hijacking and phishing

**Attack Scenarios:**
- Malicious instance sets `instanceDID="did:web:coves.social"` → communities show as hosted by official Coves
- Federation partners can't verify instance authenticity
- AppView pollution with fake hosting claims

**Solution:**
1. **Basic Validation (Phase 1)**: Verify `did:web:` domain matches configured `instanceDomain`
2. **Cryptographic Verification (Phase 2)**: Fetch `https://domain/.well-known/did.json` and verify:
   - DID document exists and is valid
   - Domain ownership proven via HTTPS hosting
   - DID document matches claimed `instanceDID`
3. **Auto-populate hostedByDID**: Remove from client API, derive from instance configuration in service layer

**Current Status:**
- ✅ Default changed from `coves.local` → `coves.social` (fixes `.local` TLD bug)
- ✅ TODO comment in [cmd/server/main.go:126-131](../cmd/server/main.go#L126-L131)
- ✅ hostedByDID removed from client requests (2025-10-16)
- ✅ Service layer auto-populates `hostedByDID` from `instanceDID` (2025-10-16)
- ✅ Handler rejects client-provided `hostedByDID` (2025-10-16)
- ✅ Basic validation: Logs warning if `did:web:` domain ≠ `instanceDomain` (2025-10-16)
- ⚠️ **REMAINING**: Full DID document verification (cryptographic proof of ownership)

**Implementation Notes:**
- Phase 1 complete: Basic validation catches config errors, logs warnings
- Phase 2 needed: Fetch `https://domain/.well-known/did.json` and verify ownership
- Add `SKIP_DID_WEB_VERIFICATION=true` for dev mode
- Full verification blocks startup if domain ownership cannot be proven

---

### ✅ Token Refresh Logic for Community Credentials - COMPLETE
**Added:** 2025-10-11 | **Completed:** 2025-10-17 | **Effort:** 1.5 days | **Status:** ✅ DONE

**Problem:** Community PDS access tokens expire (~2hrs). Updates fail until manual intervention.

**Solution Implemented:**
- ✅ Automatic token refresh before PDS operations (5-minute buffer before expiration)
- ✅ JWT expiration parsing without signature verification (`parseJWTExpiration`, `needsRefresh`)
- ✅ Token refresh using Indigo SDK (`atproto.ServerRefreshSession`)
- ✅ Password fallback when refresh tokens expire (~2 months) via `atproto.ServerCreateSession`
- ✅ Atomic credential updates (`UpdateCredentials` repository method)
- ✅ Concurrency-safe with per-community mutex locking
- ✅ Structured logging for monitoring (`[TOKEN-REFRESH]` events)
- ✅ Integration tests for token expiration detection and credential updates

**Files Created:**
- [internal/core/communities/token_utils.go](../internal/core/communities/token_utils.go) - JWT parsing utilities
- [internal/core/communities/token_refresh.go](../internal/core/communities/token_refresh.go) - Refresh and re-auth logic
- [tests/integration/token_refresh_test.go](../tests/integration/token_refresh_test.go) - Integration tests

**Files Modified:**
- [internal/core/communities/service.go](../internal/core/communities/service.go) - Added `ensureFreshToken` + concurrency control
- [internal/core/communities/interfaces.go](../internal/core/communities/interfaces.go) - Added `UpdateCredentials` interface
- [internal/db/postgres/community_repo.go](../internal/db/postgres/community_repo.go) - Implemented `UpdateCredentials`

**Documentation:** See [IMPLEMENTATION_TOKEN_REFRESH.md](../docs/IMPLEMENTATION_TOKEN_REFRESH.md) for full details

**Impact:** ✅ Communities can now be updated 24+ hours after creation without manual intervention

---

### ✅ Subscription Visibility Level (Feed Slider 1-5 Scale) - COMPLETE
**Added:** 2025-10-15 | **Completed:** 2025-10-16 | **Effort:** 1 day | **Status:** ✅ DONE

**Problem:** Users couldn't control how much content they see from each community. Lexicon had `contentVisibility` (1-5 scale) but code didn't use it.

**Solution Implemented:**
- ✅ Updated subscribe handler to accept `contentVisibility` parameter (1-5, default 3)
- ✅ Store in subscription record on PDS (`social.coves.community.subscription`)
- ✅ Migration 008 adds `content_visibility` column to database with CHECK constraint
- ✅ Clamping at all layers (handler, service, consumer) for defense in depth
- ✅ Atomic subscriber count updates (SubscribeWithCount/UnsubscribeWithCount)
- ✅ Idempotent operations (safe for Jetstream event replays)
- ✅ Fixed critical collection name bug (was using wrong namespace)
- ✅ Production Jetstream consumer now running
- ✅ 13 comprehensive integration tests - all passing

**Files Modified:**
- Lexicon: [subscription.json](../internal/atproto/lexicon/social/coves/community/subscription.json) ✅ Updated to atProto conventions
- Handler: [community/subscribe.go](../internal/api/handlers/community/subscribe.go) ✅ Accepts contentVisibility
- Service: [communities/service.go](../internal/core/communities/service.go) ✅ Clamps and passes to PDS
- Consumer: [community_consumer.go](../internal/atproto/jetstream/community_consumer.go) ✅ Extracts and indexes
- Repository: [community_repo_subscriptions.go](../internal/db/postgres/community_repo_subscriptions.go) ✅ All queries updated
- Migration: [008_add_content_visibility_to_subscriptions.sql](../internal/db/migrations/008_add_content_visibility_to_subscriptions.sql) ✅ Schema changes
- Tests: [subscription_indexing_test.go](../tests/integration/subscription_indexing_test.go) ✅ Comprehensive coverage

**Documentation:** See [IMPLEMENTATION_SUBSCRIPTION_INDEXING.md](../docs/IMPLEMENTATION_SUBSCRIPTION_INDEXING.md) for full details

**Impact:** ✅ Users can now adjust feed volume per community (key feature from DOMAIN_KNOWLEDGE.md enabled)

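
The clamping rule listed above (1-5, default 3) is small enough to sketch. This is an illustration rather than the repo's actual helper: `clampVisibility` and the constants are assumed names, and treating an omitted value as "use the default" while pulling out-of-range values to the nearest bound is one plausible reading of "clamping". Sharing a single function across handler, service, and consumer would keep all three layers consistent.

```go
package main

import "fmt"

// Bounds and default taken from the 1-5 scale described above.
const (
	minVisibility     = 1
	maxVisibility     = 5
	defaultVisibility = 3
)

// clampVisibility normalizes a client-supplied contentVisibility value.
// Assumption: nil means the field was omitted and the default applies;
// out-of-range values are pulled to the nearest bound instead of rejected.
func clampVisibility(v *int) int {
	if v == nil {
		return defaultVisibility
	}
	switch {
	case *v < minVisibility:
		return minVisibility
	case *v > maxVisibility:
		return maxVisibility
	default:
		return *v
	}
}

func main() {
	zero, nine, four := 0, 9, 4
	// Prints: 3 1 5 4
	fmt.Println(clampVisibility(nil), clampVisibility(&zero), clampVisibility(&nine), clampVisibility(&four))
}
```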

---

### Community Blocking
**Added:** 2025-10-15 | **Effort:** 1 day | **Priority:** ALPHA BLOCKER

**Problem:** Users have no way to block unwanted communities from their feeds.

**Solution:**
1. **Lexicon:** Extend `social.coves.actor.block` to support community DIDs (currently user-only)
2. **Service:** Implement `BlockCommunity(userDID, communityDID)` and `UnblockCommunity()`
3. **Handlers:** Add XRPC endpoints `social.coves.community.block` and `unblock`
4. **Repository:** Add methods to track blocked communities
5. **Feed:** Filter blocked communities from feed queries (beta work)

**Code:**
- Lexicon: [actor/block.json](../internal/atproto/lexicon/social/coves/actor/block.json) - Currently only supports user DIDs
- Service: New methods needed
- Handlers: New files needed

**Impact:** Users can't avoid unwanted content without blocking

---

## 🔴 P1.5: Federation Blockers (Beta Launch)

### Cross-PDS Write-Forward Support for Community Service
**Added:** 2025-10-17 | **Updated:** 2025-11-02 | **Effort:** 3-4 hours | **Priority:** FEDERATION BLOCKER (Beta)

**Problem:** Community service write-forward methods assume all users are on the same PDS as the Coves instance. This breaks federation when users from external PDSs try to subscribe/block communities.
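
As background for why the write target has to be per-user: in atproto, each account's DID document advertises its PDS in a `service` entry whose id ends in `#atproto_pds` and whose type is `AtprotoPersonalDataServer`. The sketch below extracts that endpoint from an already-fetched DID document. `pdsEndpoint` and the struct types are illustrative names and assume a string-valued `serviceEndpoint`; in the Coves codebase this lookup presumably sits behind the identity resolver.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// didService and didDocument model only the DID document fields needed here.
type didService struct {
	ID              string `json:"id"`
	Type            string `json:"type"`
	ServiceEndpoint string `json:"serviceEndpoint"`
}

type didDocument struct {
	ID      string       `json:"id"`
	Service []didService `json:"service"`
}

// pdsEndpoint returns the PDS base URL advertised in a raw DID document.
// Illustrative helper: it trusts the document it is given and does no fetching.
func pdsEndpoint(raw []byte) (string, error) {
	var doc didDocument
	if err := json.Unmarshal(raw, &doc); err != nil {
		return "", fmt.Errorf("parse DID document: %w", err)
	}
	for _, svc := range doc.Service {
		if strings.HasSuffix(svc.ID, "#atproto_pds") && svc.Type == "AtprotoPersonalDataServer" {
			return strings.TrimSuffix(svc.ServiceEndpoint, "/"), nil
		}
	}
	return "", fmt.Errorf("no atproto PDS service in DID document %s", doc.ID)
}

func main() {
	// Made-up sample document, for illustration only.
	doc := []byte(`{
	  "id": "did:plc:example",
	  "service": [
	    {"id": "#atproto_pds", "type": "AtprotoPersonalDataServer", "serviceEndpoint": "https://pds.example.com/"}
	  ]
	}`)
	url, err := pdsEndpoint(doc)
	fmt.Println(url, err)
}
```

Two users subscribing to the same community can therefore resolve to two different endpoints, which is why a single instance-wide PDS URL cannot be correct once federation is enabled.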

**Current Behavior:**
- User on `pds.bsky.social` subscribes to community on `coves.social`
- Coves calls `s.pdsURL` (instance default: `http://localhost:3001`)
- Write goes to WRONG PDS → fails with `{"error":"InvalidToken","message":"Malformed token"}`

**Impact:**
- **Alpha**: Works fine (single PDS deployment, no federation)
- **Beta**: Breaks federation (users on different PDSs can't subscribe/block)

**Root Cause:**
- [service.go:1033](../internal/core/communities/service.go#L1033): `createRecordOnPDSAs` hardcodes `s.pdsURL`
- [service.go:1050](../internal/core/communities/service.go#L1050): `putRecordOnPDSAs` hardcodes `s.pdsURL`
- [service.go:1063](../internal/core/communities/service.go#L1063): `deleteRecordOnPDSAs` hardcodes `s.pdsURL`

**Affected Operations:**
- `SubscribeToCommunity` ([service.go:608](../internal/core/communities/service.go#L608))
- `UnsubscribeFromCommunity` (calls `deleteRecordOnPDSAs`)
- `BlockCommunity` ([service.go:739](../internal/core/communities/service.go#L739))
- `UnblockCommunity` (calls `deleteRecordOnPDSAs`)

**Solution:**
1. Add `identityResolver identity.Resolver` to `communityService` struct
2. Before write-forward, resolve user's DID → extract PDS URL
3. Call user's actual PDS instead of hardcoded `s.pdsURL`

**Implementation Pattern (from Vote Service):**
```go
// Add helper method to resolve user's PDS
func (s *communityService) resolveUserPDS(ctx context.Context, userDID string) (string, error) {
    // Named ident so the local variable does not shadow the identity package
    ident, err := s.identityResolver.Resolve(ctx, userDID)
    if err != nil {
        return "", fmt.Errorf("failed to resolve user PDS: %w", err)
    }
    if ident.PDSURL == "" {
        log.Printf("[COMMUNITY-PDS] WARNING: No PDS URL found for %s, using fallback: %s", userDID, s.pdsURL)
        return s.pdsURL, nil
    }
    return ident.PDSURL, nil
}

// Update write-forward methods:
func (s *communityService) createRecordOnPDSAs(ctx context.Context, repoDID, collection, rkey string, record map[string]interface{}, accessToken string) (string, string, error) {
    // Resolve user's actual PDS (critical for federation)
    pdsURL, err := s.resolveUserPDS(ctx, repoDID)
    if err != nil {
        return "", "", fmt.Errorf("failed to resolve user PDS: %w", err)
    }
    endpoint := fmt.Sprintf("%s/xrpc/com.atproto.repo.createRecord", strings.TrimSuffix(pdsURL, "/"))
    // ... rest of method
}
```

**Files to Modify:**
- `internal/core/communities/service.go` - Add resolver field + `resolveUserPDS` helper
- `internal/core/communities/service.go` - Update `createRecordOnPDSAs`, `putRecordOnPDSAs`, `deleteRecordOnPDSAs`
- `cmd/server/main.go` - Pass identity resolver to community service constructor
- Tests - Add cross-PDS subscription/block scenarios

**Testing:**
- User on external PDS subscribes to community → writes to their PDS
- User on external PDS blocks community → writes to their PDS
- Community profile updates still work (writes to community's own PDS)

**Related:**
- **Vote Service**: Fixed in Alpha (2025-11-02) - users can vote from any PDS
- 🔴 **Community Service**: Deferred to Beta (no federation in Alpha)

---

## 🟢 P2: Nice-to-Have

### Remove Categories from Community Lexicon
**Added:** 2025-10-15 | **Effort:** 30 minutes | **Priority:** Cleanup

**Problem:** Categories field exists in create/update lexicon but not in profile record. Adds complexity without clear value.

**Solution:**
- Remove `categories` from [create.json](../internal/atproto/lexicon/social/coves/community/create.json#L46-L54)
- Remove `categories` from [update.json](../internal/atproto/lexicon/social/coves/community/update.json#L51-L59)
- Remove from [community.go:91](../internal/core/communities/community.go#L91)
- Remove from service layer ([service.go:109-110](../internal/core/communities/service.go#L109-L110))

**Impact:** Simplifies lexicon, removes unused feature

---

### Improve .local TLD Error Messages
**Added:** 2025-10-11 | **Effort:** 1 hour

**Problem:** Generic error "TLD .local is not allowed" confuses developers.

**Solution:** Enhance `InvalidHandleError` to explain root cause and suggest fixing `INSTANCE_DID`.
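
One possible shape for the richer error, offered as a sketch only. The fields on `InvalidHandleError` and the `checkHandleTLD` helper are assumptions made for illustration, since the existing type's definition is not shown in this document. The idea is to keep the original reason and append an actionable hint that names `INSTANCE_DID`.

```go
package main

import (
	"fmt"
	"strings"
)

// InvalidHandleError is a hypothetical shape for the enhanced error:
// the original reason plus an optional, actionable hint.
type InvalidHandleError struct {
	Handle string
	Reason string
	Hint   string
}

func (e *InvalidHandleError) Error() string {
	msg := fmt.Sprintf("invalid handle %q: %s", e.Handle, e.Reason)
	if e.Hint != "" {
		msg += " (hint: " + e.Hint + ")"
	}
	return msg
}

// checkHandleTLD rejects .local handles and explains the likely root cause.
// Illustrative helper, not the repo's validation code.
func checkHandleTLD(handle string) error {
	if strings.HasSuffix(strings.ToLower(handle), ".local") {
		return &InvalidHandleError{
			Handle: handle,
			Reason: "TLD .local is not allowed",
			Hint:   "check INSTANCE_DID; it should be a did:web on a public domain such as did:web:coves.social",
		}
	}
	return nil
}

func main() {
	fmt.Println(checkHandleTLD("gardening.communities.coves.local"))
	fmt.Println(checkHandleTLD("gardening.communities.coves.social"))
}
```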

---

### Self-Hosting Security Guide
**Added:** 2025-10-11 | **Effort:** 1 day

**Needed:** Document did:web setup, DNS config, secrets management, rate limiting, PostgreSQL hardening, monitoring.

---

### OAuth Session Cleanup Race Condition
**Added:** 2025-10-11 | **Effort:** 2 hours

**Problem:** Cleanup goroutine doesn't handle graceful shutdown, may orphan DB connections.

**Solution:** Pass cancellable context, handle SIGTERM, add cleanup timeout.

---

### Jetstream Consumer Race Condition
**Added:** 2025-10-11 | **Effort:** 1 hour

**Problem:** Multiple goroutines can call `close(done)` concurrently in consumer shutdown.

**Solution:** Use `sync.Once` for channel close or atomic flag for shutdown state.

**Code:** TODO in [jetstream/user_consumer.go:114](../internal/atproto/jetstream/user_consumer.go#L114)

---

## 🔵 P3: Technical Debt

### Consolidate Environment Variable Validation
**Added:** 2025-10-11 | **Effort:** 2-3 hours

Create `internal/config` package with structured config validation. Fail fast with clear errors.

---

### Add Connection Pooling for PDS HTTP Clients
**Added:** 2025-10-11 | **Effort:** 2 hours

Create shared `http.Client` with connection pooling instead of new client per request.

---

### Architecture Decision Records (ADRs)
**Added:** 2025-10-11 | **Effort:** Ongoing

Document: did:plc choice, pgcrypto encryption, Jetstream vs firehose, write-forward pattern, single handle field.

---

### Replace log Package with Structured Logger
**Added:** 2025-10-11 | **Effort:** 1 day

**Problem:** Using standard `log` package. Need structured logging (JSON) with levels.

**Solution:** Switch to `slog`, `zap`, or `zerolog`. Add request IDs, context fields.

**Code:** TODO in [community/errors.go:46](../internal/api/handlers/community/errors.go#L46)

---

### PDS URL Resolution from DID
**Added:** 2025-10-11 | **Effort:** 2-3 hours

**Problem:** User consumer doesn't resolve PDS URL from DID document when missing.

**Solution:** Query PLC directory for DID document, extract `serviceEndpoint`.

**Code:** TODO in [jetstream/user_consumer.go:203](../internal/atproto/jetstream/user_consumer.go#L203)

---

## Recent Completions

### ✅ Token Refresh for Community Credentials (2025-10-17)
**Completed:** Automatic token refresh prevents communities from breaking after 2 hours

**Implementation:**
- ✅ JWT expiration parsing and refresh detection (5-minute buffer)
- ✅ Token refresh using Indigo SDK (`atproto.ServerRefreshSession`)
- ✅ Password fallback when refresh tokens expire (`atproto.ServerCreateSession`)
- ✅ Atomic credential updates in database (`UpdateCredentials`)
- ✅ Concurrency-safe with per-community mutex locking
- ✅ Structured logging for monitoring (`[TOKEN-REFRESH]` events)
- ✅ Integration tests for expiration detection and credential updates

**Files Created:**
- [internal/core/communities/token_utils.go](../internal/core/communities/token_utils.go)
- [internal/core/communities/token_refresh.go](../internal/core/communities/token_refresh.go)
- [tests/integration/token_refresh_test.go](../tests/integration/token_refresh_test.go)

**Files Modified:**
- [internal/core/communities/service.go](../internal/core/communities/service.go) - Added `ensureFreshToken` method
- [internal/core/communities/interfaces.go](../internal/core/communities/interfaces.go) - Added `UpdateCredentials` interface
- [internal/db/postgres/community_repo.go](../internal/db/postgres/community_repo.go) - Implemented `UpdateCredentials`

**Documentation:** [IMPLEMENTATION_TOKEN_REFRESH.md](../docs/IMPLEMENTATION_TOKEN_REFRESH.md)

**Impact:** Communities now work indefinitely without manual token management

---

### ✅ OAuth Authentication for Community Actions (2025-10-16)
**Completed:** Full OAuth JWT authentication flow for protected endpoints

**Implementation:**
- ✅ JWT parser compatible with atProto PDS tokens (aud/iss handling)
- ✅ Auth middleware protecting create/update/subscribe/unsubscribe endpoints
- ✅ Handler-level DID extraction from JWT tokens via `middleware.GetUserDID(r)`
- ✅ Removed all X-User-DID header placeholders
- ✅ E2E tests validate complete OAuth flow with real PDS tokens
- ✅ Security: Issuer validation supports both HTTPS URLs and DIDs

**Files Modified:**
- [internal/atproto/auth/jwt.go](../internal/atproto/auth/jwt.go) - JWT parsing with atProto compatibility
- [internal/api/middleware/auth.go](../internal/api/middleware/auth.go) - Auth middleware
- [internal/api/handlers/community/](../internal/api/handlers/community/) - All handlers updated
- [tests/integration/community_e2e_test.go](../tests/integration/community_e2e_test.go) - OAuth E2E tests

**Related:** Also implemented `hostedByDID` auto-population for security (see P1 item above)

---

### ✅ Fix .local TLD Bug (2025-10-11)
Changed default `INSTANCE_DID` from `did:web:coves.local` → `did:web:coves.social`. Fixed community creation failure due to disallowed `.local` TLD.

---

## Prioritization

- **P0:** Security vulns, data loss, prod blockers
- **P1:** Major UX/reliability issues
- **P2:** QOL improvements, minor bugs, docs
- **P3:** Refactoring, code quality