Backlog PRD: Platform Improvements & Technical Debt
Status: Ongoing | Owner: Platform Team | Last Updated: 2025-10-16
Overview
Miscellaneous platform improvements, bug fixes, and technical debt that don't fit into feature-specific PRDs.
🟡 P1: Important (Alpha Blockers)
did:web Domain Verification & hostedByDID Auto-Population
Added: 2025-10-11 | Updated: 2025-10-16 | Effort: 2-3 days | Priority: ALPHA BLOCKER
Problem:
- Domain Impersonation: Self-hosters can set INSTANCE_DID=did:web:nintendo.com without owning the domain, enabling attacks where communities appear hosted by trusted domains
- hostedByDID Spoofing: Malicious instance operators can modify source code to claim communities are hosted by domains they don't own, enabling reputation hijacking and phishing
Attack Scenarios:
- Malicious instance sets instanceDID="did:web:coves.social" → communities show as hosted by official Coves
- Federation partners can't verify instance authenticity
- AppView pollution with fake hosting claims
Solution:
- Basic Validation (Phase 1): Verify that the did:web: domain matches the configured instanceDomain
- Cryptographic Verification (Phase 2): Fetch https://domain/.well-known/did.json and verify:
  - DID document exists and is valid
  - Domain ownership proven via HTTPS hosting
  - DID document matches claimed instanceDID
- Auto-populate hostedByDID: Remove from client API, derive from instance configuration in service layer
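The Phase 1 check can be sketched as a pure function. This is a minimal illustration, not the code in cmd/server/main.go: the function names domainFromDIDWeb and checkInstanceDID are hypothetical, and only bare-domain identifiers such as did:web:coves.social are handled (path-based and port-encoded forms are rejected).

```go
package main

import (
	"fmt"
	"strings"
)

// domainFromDIDWeb extracts the host from a bare did:web identifier such as
// "did:web:coves.social". Path-based and port-encoded forms are rejected to
// keep this sketch small.
func domainFromDIDWeb(did string) (string, error) {
	const prefix = "did:web:"
	if !strings.HasPrefix(did, prefix) {
		return "", fmt.Errorf("not a did:web identifier: %q", did)
	}
	domain := strings.TrimPrefix(did, prefix)
	if domain == "" || strings.ContainsAny(domain, ":/%") {
		return "", fmt.Errorf("unsupported did:web form: %q", did)
	}
	return strings.ToLower(domain), nil
}

// checkInstanceDID returns an error when the did:web domain differs from the
// configured instance domain. The caller decides whether to log or abort.
func checkInstanceDID(instanceDID, instanceDomain string) error {
	domain, err := domainFromDIDWeb(instanceDID)
	if err != nil {
		return err
	}
	if domain != strings.ToLower(instanceDomain) {
		return fmt.Errorf("did:web domain %q does not match instance domain %q",
			domain, instanceDomain)
	}
	return nil
}
```

A mismatch only shows that the configuration is inconsistent. It does not prove domain ownership, which is why Phase 2 is still needed.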
Current Status:
- ✅ Default changed from coves.local → coves.social (fixes .local TLD bug)
- ✅ TODO comment in cmd/server/main.go:126-131
- ✅ hostedByDID removed from client requests (2025-10-16)
- ✅ Service layer auto-populates hostedByDID from instanceDID (2025-10-16)
- ✅ Handler rejects client-provided hostedByDID (2025-10-16)
- ✅ Basic validation: Logs warning if did:web: domain ≠ instanceDomain (2025-10-16)
- ⚠️ REMAINING: Full DID document verification (cryptographic proof of ownership)
Implementation Notes:
- Phase 1 complete: Basic validation catches config errors, logs warnings
- Phase 2 needed: Fetch https://domain/.well-known/did.json and verify ownership
- Add SKIP_DID_WEB_VERIFICATION=true for dev mode
- Full verification blocks startup if domain ownership cannot be proven
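One possible shape for Phase 2 is sketched below. The names fetchFunc, httpsFetch, and verifyDIDWeb are hypothetical, the fetcher is injected so the check can be tested without network access, and the sketch only compares the document's id field against the configured DID for bare-domain identifiers. A fuller implementation would likely validate more of the document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// fetchFunc retrieves the body at url. It is injectable so tests can avoid
// the network.
type fetchFunc func(url string) ([]byte, error)

// httpsFetch is a fetchFunc backed by net/http with a timeout and a 1 MiB
// cap on the response body.
func httpsFetch(url string) ([]byte, error) {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d from %s", resp.StatusCode, url)
	}
	return io.ReadAll(io.LimitReader(resp.Body, 1<<20))
}

// verifyDIDWeb fetches https://<domain>/.well-known/did.json and checks that
// the document's "id" equals instanceDID. skip mirrors
// SKIP_DID_WEB_VERIFICATION=true for dev mode.
func verifyDIDWeb(instanceDID string, skip bool, fetch fetchFunc) error {
	if skip {
		return nil
	}
	const prefix = "did:web:"
	domain := strings.TrimPrefix(instanceDID, prefix)
	if !strings.HasPrefix(instanceDID, prefix) || domain == "" ||
		strings.ContainsAny(domain, ":/%") {
		return fmt.Errorf("unsupported did:web identifier %q", instanceDID)
	}
	body, err := fetch("https://" + domain + "/.well-known/did.json")
	if err != nil {
		return fmt.Errorf("fetching DID document: %w", err)
	}
	var doc struct {
		ID string `json:"id"`
	}
	if err := json.Unmarshal(body, &doc); err != nil {
		return fmt.Errorf("invalid DID document: %w", err)
	}
	if doc.ID != instanceDID {
		return fmt.Errorf("DID document id %q does not match %q", doc.ID, instanceDID)
	}
	return nil
}
```

At startup the server would call verifyDIDWeb(instanceDID, skip, httpsFetch) and refuse to start on a non-nil error, which matches the "blocks startup" note above.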
Token Refresh Logic for Community Credentials
Added: 2025-10-11 | Effort: 1-2 days | Priority: ALPHA BLOCKER
Problem: Community PDS access tokens expire after about 2 hours. Once a token expires, community updates fail until someone intervenes manually.
Solution: Auto-refresh tokens before PDS operations. Parse JWT exp claim, use refresh token when expired, update DB.
Code: TODO in communities/service.go:123
✅ Subscription Visibility Level (Feed Slider 1-5 Scale) - COMPLETE
Added: 2025-10-15 | Completed: 2025-10-16 | Effort: 1 day | Status: ✅ DONE
Problem: Users couldn't control how much content they see from each community. Lexicon had contentVisibility (1-5 scale) but code didn't use it.
Solution Implemented:
- ✅ Updated subscribe handler to accept contentVisibility parameter (1-5, default 3)
- ✅ Store in subscription record on PDS (social.coves.community.subscription)
- ✅ Migration 008 adds content_visibility column to database with CHECK constraint
- ✅ Clamping at all layers (handler, service, consumer) for defense in depth
- ✅ Atomic subscriber count updates (SubscribeWithCount/UnsubscribeWithCount)
- ✅ Idempotent operations (safe for Jetstream event replays)
- ✅ Fixed critical collection name bug (was using wrong namespace)
- ✅ Production Jetstream consumer now running
- ✅ 13 comprehensive integration tests - all passing
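For illustration, the clamping rule described above (1-5, default 3) can be written as a small helper. This is a sketch with a hypothetical name, clampContentVisibility, and it assumes a missing value is represented as a nil pointer. The shipped code may differ.

```go
package main

const (
	minVisibility     = 1
	maxVisibility     = 5
	defaultVisibility = 3
)

// clampContentVisibility returns the default when v is nil and otherwise
// forces the value into the 1-5 range.
func clampContentVisibility(v *int) int {
	if v == nil {
		return defaultVisibility
	}
	switch {
	case *v < minVisibility:
		return minVisibility
	case *v > maxVisibility:
		return maxVisibility
	default:
		return *v
	}
}
```

Calling the same helper in the handler, the service, and the consumer keeps the three layers consistent, and the CHECK constraint in the database remains the last line of defense.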
Files Modified:
- Lexicon: subscription.json ✅ Updated to atProto conventions
- Handler: community/subscribe.go ✅ Accepts contentVisibility
- Service: communities/service.go ✅ Clamps and passes to PDS
- Consumer: community_consumer.go ✅ Extracts and indexes
- Repository: community_repo_subscriptions.go ✅ All queries updated
- Migration: 008_add_content_visibility_to_subscriptions.sql ✅ Schema changes
- Tests: subscription_indexing_test.go ✅ Comprehensive coverage
Documentation: See IMPLEMENTATION_SUBSCRIPTION_INDEXING.md for full details
Impact: ✅ Users can now adjust feed volume per community (key feature from DOMAIN_KNOWLEDGE.md enabled)
Community Blocking
Added: 2025-10-15 | Effort: 1 day | Priority: ALPHA BLOCKER
Problem: Users have no way to block unwanted communities from their feeds.
Solution:
- Lexicon: Extend social.coves.actor.block to support community DIDs (currently user-only)
- Service: Implement BlockCommunity(userDID, communityDID) and UnblockCommunity()
- Handlers: Add XRPC endpoints social.coves.community.block and unblock
- Repository: Add methods to track blocked communities
- Feed: Filter blocked communities from feed queries (beta work)
Code:
- Lexicon: actor/block.json - Currently only supports user DIDs
- Service: New methods needed
- Handlers: New files needed
Impact: Until blocking ships, users cannot keep unwanted communities out of their feeds.
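To make the proposed service surface concrete, here is a rough in-memory sketch. Only the BlockCommunity and UnblockCommunity names come from the plan above. The blockStore type, the post type, IsBlocked, and FilterFeed are hypothetical stand-ins for the repository and feed layers, which would be backed by PostgreSQL and the PDS in practice.

```go
package main

import "sync"

// blockStore is a hypothetical in-memory stand-in for the repository layer.
type blockStore struct {
	mu     sync.RWMutex
	blocks map[string]map[string]struct{} // userDID -> set of community DIDs
}

func newBlockStore() *blockStore {
	return &blockStore{blocks: make(map[string]map[string]struct{})}
}

// BlockCommunity records the block. Repeating it is a no-op, so replayed
// events are safe.
func (s *blockStore) BlockCommunity(userDID, communityDID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.blocks[userDID] == nil {
		s.blocks[userDID] = make(map[string]struct{})
	}
	s.blocks[userDID][communityDID] = struct{}{}
}

// UnblockCommunity removes the block if present.
func (s *blockStore) UnblockCommunity(userDID, communityDID string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.blocks[userDID], communityDID)
}

// IsBlocked reports whether userDID has blocked communityDID.
func (s *blockStore) IsBlocked(userDID, communityDID string) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, ok := s.blocks[userDID][communityDID]
	return ok
}

// post is a minimal feed item used only for this illustration.
type post struct {
	URI          string
	CommunityDID string
}

// FilterFeed returns the posts whose community the user has not blocked.
func (s *blockStore) FilterFeed(userDID string, posts []post) []post {
	kept := make([]post, 0, len(posts))
	for _, p := range posts {
		if !s.IsBlocked(userDID, p.CommunityDID) {
			kept = append(kept, p)
		}
	}
	return kept
}
```

In a real feed query the filter would more likely be a SQL anti-join than a post-fetch loop. The loop is used here only to keep the example self-contained.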
🟢 P2: Nice-to-Have
Remove Categories from Community Lexicon
Added: 2025-10-15 | Effort: 30 minutes | Priority: Cleanup
Problem: Categories field exists in create/update lexicon but not in profile record. Adds complexity without clear value.
Solution:
- Remove categories from create.json
- Remove categories from update.json
- Remove from community.go:91
- Remove from service layer (service.go:109-110)
Impact: Simplifies lexicon, removes unused feature
Improve .local TLD Error Messages
Added: 2025-10-11 | Effort: 1 hour
Problem: Generic error "TLD .local is not allowed" confuses developers.
Solution: Enhance InvalidHandleError to explain root cause and suggest fixing INSTANCE_DID.
Self-Hosting Security Guide
Added: 2025-10-11 | Effort: 1 day
Needed: Document did:web setup, DNS config, secrets management, rate limiting, PostgreSQL hardening, monitoring.
OAuth Session Cleanup Race Condition
Added: 2025-10-11 | Effort: 2 hours
Problem: Cleanup goroutine doesn't handle graceful shutdown, may orphan DB connections.
Solution: Pass cancellable context, handle SIGTERM, add cleanup timeout.
Jetstream Consumer Race Condition
Added: 2025-10-11 | Effort: 1 hour
Problem: Multiple goroutines can call close(done) concurrently in consumer shutdown.
Solution: Use sync.Once for channel close or atomic flag for shutdown state.
Code: TODO in jetstream/user_consumer.go:114
🔵 P3: Technical Debt
Consolidate Environment Variable Validation
Added: 2025-10-11 | Effort: 2-3 hours
Create internal/config package with structured config validation. Fail fast with clear errors.
Add Connection Pooling for PDS HTTP Clients
Added: 2025-10-11 | Effort: 2 hours
Create shared http.Client with connection pooling instead of new client per request.
Architecture Decision Records (ADRs)
Added: 2025-10-11 | Effort: Ongoing
Document: did:plc choice, pgcrypto encryption, Jetstream vs firehose, write-forward pattern, single handle field.
Replace log Package with Structured Logger
Added: 2025-10-11 | Effort: 1 day
Problem: Using standard log package. Need structured logging (JSON) with levels.
Solution: Switch to slog, zap, or zerolog. Add request IDs, context fields.
Code: TODO in community/errors.go:46
PDS URL Resolution from DID
Added: 2025-10-11 | Effort: 2-3 hours
Problem: User consumer doesn't resolve PDS URL from DID document when missing.
Solution: Query PLC directory for DID document, extract serviceEndpoint.
Code: TODO in jetstream/user_consumer.go:203
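The extraction step could be sketched as below. Fetching the document from the PLC directory is left out. The sketch assumes the PDS entry is identified by an id ending in #atproto_pds and the type AtprotoPersonalDataServer, which is how atproto DID documents are commonly structured and should be checked against the atproto DID specification. The names didDocument and pdsEndpoint are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"strings"
)

// didDocument models only the fields this lookup needs.
type didDocument struct {
	ID      string `json:"id"`
	Service []struct {
		ID              string `json:"id"`
		Type            string `json:"type"`
		ServiceEndpoint string `json:"serviceEndpoint"`
	} `json:"service"`
}

// pdsEndpoint returns the serviceEndpoint of the PDS service entry in a raw
// DID document, or an error if no such entry exists.
func pdsEndpoint(rawDoc []byte) (string, error) {
	var doc didDocument
	if err := json.Unmarshal(rawDoc, &doc); err != nil {
		return "", err
	}
	for _, svc := range doc.Service {
		// Assumed identifiers for the PDS entry; see the lead-in.
		if strings.HasSuffix(svc.ID, "#atproto_pds") &&
			svc.Type == "AtprotoPersonalDataServer" &&
			svc.ServiceEndpoint != "" {
			return svc.ServiceEndpoint, nil
		}
	}
	return "", errors.New("DID document has no atproto PDS service entry")
}
```

The consumer would call this only when the PDS URL is missing from the event, and cache the result to avoid repeated directory lookups.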
PLC Directory Registration (Production)
Added: 2025-10-11 | Effort: 1 day
Problem: DID generator creates did:plc but doesn't register in prod mode.
Solution: Implement PLC registration API call when isDevEnv=false.
Code: TODO in did/generator.go:46
Recent Completions
✅ OAuth Authentication for Community Actions (2025-10-16)
Completed: Full OAuth JWT authentication flow for protected endpoints
Implementation:
- ✅ JWT parser compatible with atProto PDS tokens (aud/iss handling)
- ✅ Auth middleware protecting create/update/subscribe/unsubscribe endpoints
- ✅ Handler-level DID extraction from JWT tokens via middleware.GetUserDID(r)
- ✅ Removed all X-User-DID header placeholders
- ✅ E2E tests validate complete OAuth flow with real PDS tokens
- ✅ Security: Issuer validation supports both HTTPS URLs and DIDs
Files Modified:
- internal/atproto/auth/jwt.go - JWT parsing with atProto compatibility
- internal/api/middleware/auth.go - Auth middleware
- internal/api/handlers/community/ - All handlers updated
- tests/integration/community_e2e_test.go - OAuth E2E tests
Related: Also implemented hostedByDID auto-population for security (see P1 item above)
✅ Fix .local TLD Bug (2025-10-11)
Changed default INSTANCE_DID from did:web:coves.local → did:web:coves.social. Fixed community creation failure due to disallowed .local TLD.
Prioritization
- P0: Security vulns, data loss, prod blockers
- P1: Major UX/reliability issues
- P2: QOL improvements, minor bugs, docs
- P3: Refactoring, code quality