A community based topic aggregation platform built on atproto
1# Comment System Implementation
2
3## Overview
4
5This document details the complete implementation of the comment system for Coves, a forum-like atProto social media platform. The comment system follows the established vote system pattern, with comments living in user repositories and being indexed by the AppView via Jetstream firehose.
6
7**Implementation Date:** November 4-6, 2025
8**Status:** ✅ Phase 1 & 2A Complete - Production-Ready with All PR Fixes
9**Test Coverage:** 29 integration tests (18 indexing + 11 query), all passing
10**Last Updated:** November 6, 2025 (Final PR review fixes complete - lexicon compliance, data integrity, SQL correctness)
11
12---
13
14## Development Phases
15
16This implementation follows a phased approach for maintainability and proper scoping:
17
18### ✅ Phase 1: Indexing Infrastructure (Current - COMPLETE)
19**What was built:**
20- Jetstream consumer for indexing comment CREATE/UPDATE/DELETE events
21- PostgreSQL schema with proper indexes and denormalized counts
22- Repository layer with comprehensive query methods
23- Atomic parent count updates (posts.comment_count, comments.reply_count)
24- Out-of-order event handling with reconciliation
25- Soft delete support preserving thread structure
26- Full integration test coverage (20 tests)
27
28**What works:**
29- Comments are indexed from Jetstream firehose as users create them
30- Threading relationships tracked (root + parent references)
31- Parent counts automatically maintained
32- Comment updates and deletes processed correctly
33- Out-of-order events reconciled automatically
34
35**What's NOT in this phase:**
36- ❌ No HTTP API endpoints for querying comments
37- ❌ No service layer (repository is sufficient for indexing)
38- ❌ No rate limiting or auth middleware
39- ❌ No API documentation
40
41### ✅ Phase 2A: Query API - COMPLETE (November 5, 2025)
42
43**What was built:**
44- Lexicon definitions: `social.coves.community.comment.defs` and `getComments`
45- Database query methods with Lemmy hot ranking algorithm
46- Service layer with iterative loading strategy for nested replies
47- XRPC HTTP handler with optional authentication
48- Comprehensive integration test suite (11 test scenarios)
49
50**What works:**
51- Fetch comments on any post with sorting (hot/top/new)
52- Nested replies up to configurable depth (default 10, max 100)
53- Lemmy hot ranking: `log(greatest(2, score + 2)) / power(time_decay, 1.8)`
54- Cursor-based pagination for stable scrolling
55- Optional authentication for viewer state (stubbed for Phase 2B)
56- Timeframe filtering for "top" sort (hour/day/week/month/year/all)
57
58**Endpoints:**
59- `GET /xrpc/social.coves.community.comment.getComments`
60 - Required: `post` (AT-URI)
61 - Optional: `sort` (hot/top/new), `depth` (0-100), `limit` (1-100), `cursor`, `timeframe`
62 - Returns: Array of `threadViewComment` with nested replies + post context
63 - Supports Bearer token for authenticated requests (viewer state)
64
65**Files created (9):**
661. `internal/atproto/lexicon/social/coves/community/comment/defs.json` - View definitions
672. `internal/atproto/lexicon/social/coves/community/comment/getComments.json` - Query endpoint
683. `internal/core/comments/comment_service.go` - Business logic layer
694. `internal/core/comments/view_models.go` - API response types
705. `internal/api/handlers/comments/get_comments.go` - HTTP handler
716. `internal/api/handlers/comments/errors.go` - Error handling utilities
727. `internal/api/handlers/comments/middleware.go` - Auth middleware
738. `internal/api/handlers/comments/service_adapter.go` - Service layer adapter
749. `tests/integration/comment_query_test.go` - Integration tests
75
76**Files modified (7):**
771. `internal/db/postgres/comment_repo.go` - Added query methods (~450 lines), fixed INNER→LEFT JOIN, fixed window function SQL
782. `internal/core/comments/interfaces.go` - Added service interface
793. `internal/core/comments/comment.go` - Added CommenterHandle field
804. `internal/core/comments/errors.go` - Added IsValidationError helper
815. `cmd/server/main.go` - Wired up routes and service with all repositories
826. `tests/integration/comment_query_test.go` - Updated test helpers for new service signature
837. `docs/COMMENT_SYSTEM_IMPLEMENTATION.md` - This document
84
85**Total new code:** ~2,400 lines
86
87**Test coverage:**
88- 11 integration test scenarios covering:
89 - Basic fetch, nested replies, depth limits
90 - Hot/top/new sorting algorithms
91 - Pagination with cursor stability
92 - Empty threads, deleted comments
93 - Invalid input handling
94 - HTTP handler end-to-end
95- Repository layer tested (hot ranking formula, pagination)
96- Service layer tested (threading, depth limits)
97- Handler tested (input validation, error cases)
98- All tests passing ✅
99
100### 🔒 Production Hardening (PR Review Fixes - November 5, 2025)
101
102After initial implementation, a thorough PR review identified several critical issues that were addressed before production deployment:
103
104#### Critical Issues Fixed
105
106**1. N+1 Query Problem (99.7% reduction in queries)**
107- **Problem:** Nested reply loading made separate DB queries for each comment's children
108- **Impact:** Could execute 1,551 queries for a post with 50 comments at depth 3
109- **Solution:** Implemented batch loading with PostgreSQL window functions
110 - Added `ListByParentsBatch()` method using `ROW_NUMBER() OVER (PARTITION BY parent_uri)`
111 - Refactored `buildThreadViews()` to collect parent URIs per level and fetch in one query
112 - **Result:** Reduced from 1,551 queries → 4 queries (1 per depth level)
113- **Files:** `internal/core/comments/interfaces.go`, `internal/db/postgres/comment_repo.go`, `internal/core/comments/comment_service.go`
114
115**2. Post Not Found Returns 500 Instead of 404**
116- **Problem:** When fetching comments for non-existent post, service returned wrapped `posts.ErrNotFound` which handler didn't recognize
117- **Impact:** Clients got HTTP 500 instead of proper HTTP 404
118- **Solution:** Added error translation in service layer
119 ```go
120 if posts.IsNotFound(err) {
121 return nil, ErrRootNotFound // Recognized by comments.IsNotFound()
122 }
123 ```
124- **File:** `internal/core/comments/comment_service.go:68-72`
125
126#### Important Issues Fixed
127
128**3. Missing Endpoint-Specific Rate Limiting**
129- **Problem:** Comment queries with deep nesting expensive but only protected by global 100 req/min limit
130- **Solution:** Added dedicated rate limiter at 20 req/min for comment endpoint
131- **File:** `cmd/server/main.go:429-439`
132
133**4. Unbounded Cursor Size (DoS Vector)**
134- **Problem:** No validation before base64 decoding - attacker could send massive cursor string
135- **Solution:** Added 1024-byte max size check before decoding
136- **File:** `internal/db/postgres/comment_repo.go:547-551`
137
138**5. Missing Query Timeout**
139- **Problem:** Deep nested queries could run indefinitely
140- **Solution:** Added 10-second context timeout to `GetComments()`
141- **File:** `internal/core/comments/comment_service.go:62-64`
142
143**6. Post View Not Populated (P0 Blocker)**
144- **Problem:** Lexicon marked `post` field as required but response always returned `null`
145- **Impact:** Violated schema contract, would break client deserialization
146- **Solution:**
147 - Updated service to accept `posts.Repository` instead of `interface{}`
148 - Added `buildPostView()` method to construct post views with author/community/stats
149 - Fetch post before returning response
150- **Files:** `internal/core/comments/comment_service.go:33-36`, `:66-73`, `:224-274`
151
152**7. Missing Record Fields (P0 Blocker)**
153- **Problem:** Both `postView.record` and `commentView.record` fields were null despite lexicon marking them as required
154- **Impact:** Violated lexicon contract, would break strict client deserialization
155- **Solution:**
156 - Added `buildPostRecord()` method to construct minimal PostRecord from Post entity
157 - Added `buildCommentRecord()` method to construct minimal CommentRecord from Comment entity
158 - Both methods populate required fields (type, reply refs, content, timestamps)
159 - Added TODOs for Phase 2C to unmarshal JSON fields (embed, facets, labels)
160- **Files:** `internal/core/comments/comment_service.go:260-288`, `:366-386`
161
162**8. Handle/Name Format Violations (P0 & Important)**
163- **Problem:**
164 - `postView.author.handle` contained DID instead of proper handle (violates `format:"handle"`)
165 - `postView.community.name` contained DID instead of community name
166- **Impact:** Lexicon format constraints violated, poor UX showing DIDs instead of readable names
167- **Solution:**
168 - Added `users.UserRepository` to service for author handle hydration
169 - Added `communities.Repository` to service for community name hydration
170 - Updated `buildPostView()` to fetch user and community records with DID fallback
171 - Log warnings for missing records but don't fail entire request
172- **Files:** `internal/core/comments/comment_service.go:34-37`, `:292-325`, `cmd/server/main.go:297`
173
174**9. Data Loss from INNER JOIN (P1 Critical)**
175- **Problem:** Three query methods used `INNER JOIN users` which dropped comments when user not indexed yet
176- **Impact:** New user's first comments would disappear until user consumer caught up (violates out-of-order design)
177- **Solution:**
178 - Changed `INNER JOIN users` → `LEFT JOIN users` in all three methods
179 - Added `COALESCE(u.handle, c.commenter_did)` to gracefully fall back to DID
180 - Preserves all comments while still hydrating handles when available
181- **Files:** `internal/db/postgres/comment_repo.go:396`, `:407`, `:415`, `:694-706`, `:761-836`
182
183**10. Window Function SQL Bug (P0 Critical)**
184- **Problem:** `ListByParentsBatch` used `ORDER BY hot_rank DESC` in window function, but PostgreSQL doesn't allow SELECT aliases in window ORDER BY
185- **Impact:** SQL error "column hot_rank does not exist" caused silent failure, dropping ALL nested replies in hot sort mode
186- **Solution:**
187 - Created separate `windowOrderBy` variable that inlines full hot_rank formula
188 - PostgreSQL evaluates window ORDER BY before SELECT, so must use full expression
189 - Hot sort now works correctly with nested replies
190- **Files:** `internal/db/postgres/comment_repo.go:776`, `:808`
191- **Critical Note:** This affected default sorting mode (hot) and would have broken production UX
192
193#### Documentation Added
194
195**11. Hot Rank Caching Strategy**
196- Documented when and how to implement cached hot rank column
197- Specified observability metrics to monitor (p95 latency, CPU usage)
198- Documented trade-offs between cached vs on-demand computation
199
200**Test Coverage:**
201- All fixes verified with existing integration test suite
202- Added test cases for error handling scenarios
203- All integration tests passing (comment_query_test.go: 11 tests)
204
205**Rationale for phased approach:**
2061. **Separation of concerns**: Indexing and querying are distinct responsibilities
2072. **Testability**: Phase 1 can be fully tested without API layer
2083. **Incremental delivery**: Indexing can run in production while API is developed
2094. **Scope management**: Prevents feature creep and allows focused code review
210
211---
212
213## Hot Ranking Algorithm (Lemmy-Based)
214
215### Formula
216
217```sql
218log(greatest(2, score + 2)) /
219 power(((EXTRACT(EPOCH FROM (NOW() - created_at)) / 3600) + 2), 1.8)
220```
221
222### Explanation
223
224**Components:**
225- `greatest(2, score + 2)`: Ensures log input never goes below 2
226 - Prevents negative log values for heavily downvoted comments
227 - Score of -5 → log(2), same as score of 0
228 - Prevents brigading from creating "anti-viral" comments
229
230- `power(..., 1.8)`: Time decay exponent
231 - Higher than posts (1.5) for faster comment aging
232 - Comments should be fresher than posts
233
234- `+ 2` offsets: Prevent divide-by-zero for very new comments
235
236**Behavior:**
237- High score + old = lower rank (content ages naturally)
238- Low score + new = higher rank (fresh content gets visibility)
239- Negative scores don't break the formula (bounded at log(2))
240
241### Sort Modes
242
243**Hot (default):**
244```sql
245ORDER BY hot_rank DESC, score DESC, created_at DESC
246```
247
248**Top (with timeframe):**
249```sql
250WHERE created_at >= NOW() - INTERVAL '1 day'
251ORDER BY score DESC, created_at DESC
252```
253
254**New (chronological):**
255```sql
256ORDER BY created_at DESC
257```
258
259### Path-Based Ordering
260
261Comments are ordered within their tree level:
262```sql
263ORDER BY
264 path ASC, -- Maintains parent-child structure
265 hot_rank DESC, -- Sorts siblings by rank
266 score DESC, -- Tiebreaker
267 created_at DESC -- Final tiebreaker
268```
269
270**Result:** Siblings compete with siblings, but children never outrank their parent.
271
272---
273
274## Architecture
275
276### Data Flow
277
278```
279Client → User's PDS → Jetstream Firehose → Comment Consumer → PostgreSQL AppView
280 ↓
281 Atomic updates to parent counts
282 (posts.comment_count OR comments.reply_count)
283```
284
285### Key Design Principles
286
2871. **User-Owned Records**: Comments live in user repositories (like votes), not community repositories (like posts)
2882. **atProto Native**: Uses `com.atproto.repo.createRecord/updateRecord/deleteRecord`
2893. **Threading via Strong References**: Root + parent system allows unlimited nesting depth
2904. **Out-of-Order Indexing**: No foreign key constraints to allow Jetstream events to arrive in any order
2915. **Idempotent Operations**: Safe for Jetstream replays and duplicate events
2926. **Atomic Count Updates**: Database transactions ensure consistency
2937. **Soft Deletes**: Preserves thread structure when comments are deleted
294
295---
296
297## Implementation Details
298
299### 1. Lexicon Definition
300
301**Location:** `internal/atproto/lexicon/social/coves/feed/comment.json`
302
303The lexicon was already defined and follows atProto best practices:
304
305```json
306{
307 "lexicon": 1,
308 "id": "social.coves.feed.comment",
309 "defs": {
310 "main": {
311 "type": "record",
312 "key": "tid",
313 "required": ["reply", "content", "createdAt"],
314 "properties": {
315 "reply": {
316 "type": "ref",
317 "ref": "#replyRef",
318 "description": "Reference to the post and parent being replied to"
319 },
320 "content": {
321 "type": "string",
322 "maxGraphemes": 3000,
323 "maxLength": 30000
324 },
325 "facets": { /* Rich text annotations */ },
326 "embed": { /* Images, quoted posts */ },
327 "langs": { /* ISO 639-1 language codes */ },
328 "labels": { /* Self-applied content labels */ },
329 "createdAt": { /* RFC3339 timestamp */ }
330 }
331 },
332 "replyRef": {
333 "required": ["root", "parent"],
334 "properties": {
335 "root": {
336 "type": "ref",
337 "ref": "com.atproto.repo.strongRef",
338 "description": "Strong reference to the original post"
339 },
340 "parent": {
341 "type": "ref",
342 "ref": "com.atproto.repo.strongRef",
343 "description": "Strong reference to immediate parent (post or comment)"
344 }
345 }
346 }
347 }
348}
349```
350
351**Threading Model:**
352- `root`: Always points to the original post that started the thread
353- `parent`: Points to the immediate parent (can be a post or another comment)
354- This enables unlimited nested threading while maintaining the root reference
355
356---
357
358### 2. Database Schema
359
360**Migration:** `internal/db/migrations/016_create_comments_table.sql`
361
362```sql
363CREATE TABLE comments (
364 id BIGSERIAL PRIMARY KEY,
365 uri TEXT UNIQUE NOT NULL, -- AT-URI (at://commenter_did/social.coves.feed.comment/rkey)
366 cid TEXT NOT NULL, -- Content ID
367 rkey TEXT NOT NULL, -- Record key (TID)
368 commenter_did TEXT NOT NULL, -- User who commented (from AT-URI repo field)
369
370 -- Threading structure (reply references)
371 root_uri TEXT NOT NULL, -- Strong reference to original post
372 root_cid TEXT NOT NULL, -- CID of root post (version pinning)
373 parent_uri TEXT NOT NULL, -- Strong reference to immediate parent
374 parent_cid TEXT NOT NULL, -- CID of parent (version pinning)
375
376 -- Content
377 content TEXT NOT NULL, -- Comment text (max 3000 graphemes, 30000 bytes)
378 content_facets JSONB, -- Rich text facets
379 embed JSONB, -- Embedded content (images, quoted posts)
380 content_labels JSONB, -- Self-applied labels (com.atproto.label.defs#selfLabels)
381 langs TEXT[], -- Languages (ISO 639-1, max 3)
382
383 -- Timestamps
384 created_at TIMESTAMPTZ NOT NULL, -- Commenter's timestamp from record
385 indexed_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
386 deleted_at TIMESTAMPTZ, -- Soft delete
387
388 -- Stats (denormalized for performance)
389 upvote_count INT NOT NULL DEFAULT 0, -- Comments CAN be voted on
390 downvote_count INT NOT NULL DEFAULT 0,
391 score INT NOT NULL DEFAULT 0, -- upvote_count - downvote_count
392 reply_count INT NOT NULL DEFAULT 0 -- Number of direct replies
393);
394```
395
396**Key Indexes:**
397```sql
398-- Threading queries (most important for UX)
399CREATE INDEX idx_comments_root ON comments(root_uri, created_at DESC)
400 WHERE deleted_at IS NULL;
401CREATE INDEX idx_comments_parent ON comments(parent_uri, created_at DESC)
402 WHERE deleted_at IS NULL;
403CREATE INDEX idx_comments_parent_score ON comments(parent_uri, score DESC, created_at DESC)
404 WHERE deleted_at IS NULL;
405
406-- User queries
407CREATE INDEX idx_comments_commenter ON comments(commenter_did, created_at DESC);
408
409-- Vote targeting
410CREATE INDEX idx_comments_uri_active ON comments(uri)
411 WHERE deleted_at IS NULL;
412```
413
414**Design Decisions:**
415- **No FK on `commenter_did`**: Allows out-of-order Jetstream indexing (comment events may arrive before user events)
416- **Soft delete pattern**: `deleted_at IS NULL` in indexes for performance
417- **Vote counts included**: The vote lexicon explicitly allows voting on comments (not just posts)
418- **StrongRef with CID**: Version pinning prevents confusion when parent content changes
419
420---
421
422### 3. Domain Layer
423
424#### Comment Entity
425**File:** `internal/core/comments/comment.go`
426
427```go
428type Comment struct {
429 ID int64
430 URI string
431 CID string
432 RKey string
433 CommenterDID string
434
435 // Threading
436 RootURI string
437 RootCID string
438 ParentURI string
439 ParentCID string
440
441 // Content
442 Content string
443 ContentFacets *string
444 Embed *string
445 ContentLabels *string
446 Langs []string
447
448 // Timestamps
449 CreatedAt time.Time
450 IndexedAt time.Time
451 DeletedAt *time.Time
452
453 // Stats
454 UpvoteCount int
455 DownvoteCount int
456 Score int
457 ReplyCount int
458}
459```
460
461#### Repository Interface
462**File:** `internal/core/comments/interfaces.go`
463
464```go
465type Repository interface {
466 Create(ctx context.Context, comment *Comment) error
467 Update(ctx context.Context, comment *Comment) error
468 GetByURI(ctx context.Context, uri string) (*Comment, error)
469 Delete(ctx context.Context, uri string) error
470
471 // Threading queries
472 ListByRoot(ctx context.Context, rootURI string, limit, offset int) ([]*Comment, error)
473 ListByParent(ctx context.Context, parentURI string, limit, offset int) ([]*Comment, error)
474 CountByParent(ctx context.Context, parentURI string) (int, error)
475
476 // User queries
477 ListByCommenter(ctx context.Context, commenterDID string, limit, offset int) ([]*Comment, error)
478}
479```
480
481#### Error Types
482**File:** `internal/core/comments/errors.go`
483
484Standard error types following the vote system pattern, with helper functions `IsNotFound()` and `IsConflict()`.
485
486---
487
488### 4. Repository Implementation
489
490**File:** `internal/db/postgres/comment_repo.go`
491
492#### Idempotent Create Pattern
493```go
494func (r *postgresCommentRepo) Create(ctx context.Context, comment *Comment) error {
495 query := `
496 INSERT INTO comments (...)
497 VALUES (...)
498 ON CONFLICT (uri) DO NOTHING
499 RETURNING id, indexed_at
500 `
501
502 err := r.db.QueryRowContext(ctx, query, ...).Scan(&comment.ID, &comment.IndexedAt)
503
504 // ON CONFLICT DO NOTHING returns no rows if duplicate
505 if err == sql.ErrNoRows {
506 return nil // Already exists - OK for Jetstream replays
507 }
508
509 return err
510}
511```
512
513#### Update Preserving Vote Counts
514```go
515func (r *postgresCommentRepo) Update(ctx context.Context, comment *Comment) error {
516 query := `
517 UPDATE comments
518 SET cid = $1, content = $2, content_facets = $3,
519 embed = $4, content_labels = $5, langs = $6
520 WHERE uri = $7 AND deleted_at IS NULL
521 RETURNING id, indexed_at, created_at,
522 upvote_count, downvote_count, score, reply_count
523 `
524
525 // Vote counts and created_at are preserved (not in SET clause)
526 err := r.db.QueryRowContext(ctx, query, ...).Scan(...)
527 return err
528}
529```
530
531#### Soft Delete
532```go
533func (r *postgresCommentRepo) Delete(ctx context.Context, uri string) error {
534 query := `
535 UPDATE comments
536 SET deleted_at = NOW()
537 WHERE uri = $1 AND deleted_at IS NULL
538 `
539
540 result, err := r.db.ExecContext(ctx, query, uri)
541 // Idempotent: Returns success even if already deleted
542 return err
543}
544```
545
546---
547
548### 5. Jetstream Consumer
549
550**File:** `internal/atproto/jetstream/comment_consumer.go`
551
552#### Event Handler
553```go
554func (c *CommentEventConsumer) HandleEvent(ctx context.Context, event *JetstreamEvent) error {
555 if event.Kind != "commit" || event.Commit == nil {
556 return nil
557 }
558
559 if event.Commit.Collection == "social.coves.feed.comment" {
560 switch event.Commit.Operation {
561 case "create":
562 return c.createComment(ctx, event.Did, commit)
563 case "update":
564 return c.updateComment(ctx, event.Did, commit)
565 case "delete":
566 return c.deleteComment(ctx, event.Did, commit)
567 }
568 }
569
570 return nil
571}
572```
573
574#### Atomic Count Updates
575```go
576func (c *CommentEventConsumer) indexCommentAndUpdateCounts(ctx, comment *Comment) error {
577 tx, _ := c.db.BeginTx(ctx, nil)
578 defer tx.Rollback()
579
580 // 1. Insert comment (idempotent)
581 err = tx.QueryRowContext(ctx, `
582 INSERT INTO comments (...) VALUES (...)
583 ON CONFLICT (uri) DO NOTHING
584 RETURNING id
585 `).Scan(&commentID)
586
587 if err == sql.ErrNoRows {
588 tx.Commit()
589 return nil // Already indexed
590 }
591
592 // 2. Update parent counts atomically
593 // Try posts table first
594 tx.ExecContext(ctx, `
595 UPDATE posts
596 SET comment_count = comment_count + 1
597 WHERE uri = $1 AND deleted_at IS NULL
598 `, comment.ParentURI)
599
600 // If no post updated, parent is probably a comment
601 tx.ExecContext(ctx, `
602 UPDATE comments
603 SET reply_count = reply_count + 1
604 WHERE uri = $1 AND deleted_at IS NULL
605 `, comment.ParentURI)
606
607 return tx.Commit()
608}
609```
610
611#### Security Validation
612```go
613func (c *CommentEventConsumer) validateCommentEvent(ctx, repoDID string, comment *CommentRecord) error {
614 // Comments MUST come from user repositories (repo owner = commenter DID)
615 if !strings.HasPrefix(repoDID, "did:") {
616 return fmt.Errorf("invalid commenter DID format: %s", repoDID)
617 }
618
619 // Content is required
620 if comment.Content == "" {
621 return fmt.Errorf("comment content is required")
622 }
623
624 // Reply references must have both URI and CID
625 if comment.Reply.Root.URI == "" || comment.Reply.Root.CID == "" {
626 return fmt.Errorf("invalid root reference: must have both URI and CID")
627 }
628
629 if comment.Reply.Parent.URI == "" || comment.Reply.Parent.CID == "" {
630 return fmt.Errorf("invalid parent reference: must have both URI and CID")
631 }
632
633 return nil
634}
635```
636
637**Security Note:** We do NOT verify that the user exists in the AppView because:
6381. Comment events may arrive before user events in Jetstream (race condition)
6392. The comment came from the user's PDS repository (authenticated by PDS)
6403. No database FK constraint allows out-of-order indexing
6414. Orphaned comments (from never-indexed users) are harmless
642
643---
644
645### 6. WebSocket Connector
646
647**File:** `internal/atproto/jetstream/comment_jetstream_connector.go`
648
649Follows the standard Jetstream connector pattern with:
650- Auto-reconnect on errors (5-second retry)
651- Ping/pong keepalive (30-second ping, 60-second read deadline)
652- Graceful shutdown via context cancellation
653- Subscribes to: `wantedCollections=social.coves.feed.comment`
654
655---
656
657### 7. Server Integration
658
659**File:** `cmd/server/main.go` (lines 289-396)
660
661```go
662// Initialize comment repository
663commentRepo := postgresRepo.NewCommentRepository(db)
664log.Println("✅ Comment repository initialized (Jetstream indexing only)")
665
666// Start Jetstream consumer for comments
667commentJetstreamURL := os.Getenv("COMMENT_JETSTREAM_URL")
668if commentJetstreamURL == "" {
669 commentJetstreamURL = "ws://localhost:6008/subscribe?wantedCollections=social.coves.feed.comment"
670}
671
672commentEventConsumer := jetstream.NewCommentEventConsumer(commentRepo, db)
673commentJetstreamConnector := jetstream.NewCommentJetstreamConnector(commentEventConsumer, commentJetstreamURL)
674
675go func() {
676 if startErr := commentJetstreamConnector.Start(ctx); startErr != nil {
677 log.Printf("Comment Jetstream consumer stopped: %v", startErr)
678 }
679}()
680
681log.Printf("Started Jetstream comment consumer: %s", commentJetstreamURL)
682log.Println(" - Indexing: social.coves.feed.comment CREATE/UPDATE/DELETE operations")
683log.Println(" - Updating: Post comment counts and comment reply counts atomically")
684```
685
686---
687
688## Testing
689
690### Test Suite
691
692**File:** `tests/integration/comment_consumer_test.go`
693
694**Test Coverage:** 6 test suites, 18 test cases, **100% passing**
695
696#### 1. TestCommentConsumer_CreateComment
697- ✅ Create comment on post
698- ✅ Verify comment is indexed correctly
699- ✅ Verify post comment count is incremented
700- ✅ Idempotent create - duplicate events don't double-count
701
702#### 2. TestCommentConsumer_Threading
703- ✅ Create first-level comment (reply to post)
704- ✅ Create second-level comment (reply to comment)
705- ✅ Verify both comments have same root (original post)
706- ✅ Verify parent relationships are correct
707- ✅ Verify reply counts are updated
708- ✅ Query all comments by root (flat list)
709- ✅ Query direct replies to post
710- ✅ Query direct replies to comment
711
712#### 3. TestCommentConsumer_UpdateComment
713- ✅ Create comment with initial content
714- ✅ Manually set vote counts to simulate votes
715- ✅ Update comment content
716- ✅ Verify content is updated
717- ✅ Verify CID is updated
718- ✅ **Verify vote counts are preserved**
719- ✅ **Verify created_at is preserved**
720
721#### 4. TestCommentConsumer_DeleteComment
722- ✅ Create comment
723- ✅ Delete comment (soft delete)
724- ✅ Verify deleted_at is set
725- ✅ Verify post comment count is decremented
726- ✅ Idempotent delete - duplicate deletes don't double-decrement
727
728#### 5. TestCommentConsumer_SecurityValidation
729- ✅ Reject comment with empty content
730- ✅ Reject comment with invalid root reference (missing URI)
731- ✅ Reject comment with invalid parent reference (missing CID)
732- ✅ Reject comment with invalid DID format
733
734#### 6. TestCommentRepository_Queries
735- ✅ ListByRoot returns all comments in thread (4 comments)
736- ✅ ListByParent returns direct replies to post (2 comments)
737- ✅ ListByParent returns direct replies to comment (2 comments)
738- ✅ CountByParent returns correct counts
739- ✅ ListByCommenter returns all user's comments
740
741### Test Results
742
743```
744=== Test Summary ===
745PASS: TestCommentConsumer_CreateComment (0.02s)
746PASS: TestCommentConsumer_Threading (0.02s)
747PASS: TestCommentConsumer_UpdateComment (0.02s)
748PASS: TestCommentConsumer_DeleteComment (0.02s)
749PASS: TestCommentConsumer_SecurityValidation (0.01s)
750PASS: TestCommentRepository_Queries (0.02s)
751
752✅ ALL 18 TESTS PASS
753Total time: 0.115s
754```
755
756---
757
758## Key Features
759
760### ✅ Comments ARE Votable
761The vote lexicon explicitly states: *"Record declaring a vote (upvote or downvote) on a **post or comment**"*
762
763Comments include full vote tracking:
764- `upvote_count`
765- `downvote_count`
766- `score` (calculated as upvote_count - downvote_count)
767
768### ✅ Comments ARE Editable
769Unlike votes (which are immutable), comments support UPDATE operations:
770- Content, facets, embed, and labels can be updated
771- Vote counts and created_at are preserved
772- CID is updated to reflect new version
773
774### ✅ Threading Support
775Unlimited nesting depth via root + parent system:
776- Every comment knows its root post
777- Every comment knows its immediate parent
778- Easy to query entire threads or direct replies
779- Soft deletes preserve thread structure
780
781### ✅ Out-of-Order Indexing
782No foreign key constraints allow events to arrive in any order:
783- Comment events may arrive before user events
784- Comment events may arrive before post events
785- All operations are idempotent
786- Safe for Jetstream replays
787
788### ✅ Atomic Consistency
789Database transactions ensure counts are always accurate:
790- Comment creation increments parent count
791- Comment deletion decrements parent count
792- No race conditions
793- No orphaned counts
794
795---
796
797## Implementation Statistics
798
799### Phase 1 - Indexing Infrastructure
800
801**Files Created: 8**
8021. `internal/db/migrations/016_create_comments_table.sql` - 60 lines
8032. `internal/core/comments/comment.go` - 80 lines
8043. `internal/core/comments/interfaces.go` - 45 lines
8054. `internal/core/comments/errors.go` - 40 lines
8065. `internal/db/postgres/comment_repo.go` - 340 lines
8076. `internal/atproto/jetstream/comment_consumer.go` - 530 lines
8087. `internal/atproto/jetstream/comment_jetstream_connector.go` - 130 lines
8098. `tests/integration/comment_consumer_test.go` - 930 lines
810
811**Files Modified: 1**
8121. `cmd/server/main.go` - Added 20 lines for Jetstream consumer
813
814**Phase 1 Total:** ~2,175 lines
815
816### Phase 2A - Query API
817
818**Files Created: 9** (listed above in Phase 2A section)
819
820**Files Modified: 6** (listed above in Phase 2A section)
821
822**Phase 2A Total:** ~2,400 lines
823
824### Combined Total: ~4,575 lines
825
826---
827
828## Reference Pattern: Vote System
829
830The comment implementation closely follows the vote system pattern:
831
832| Aspect | Votes | Comments |
833|--------|-------|----------|
834| **Location** | User repositories | User repositories |
835| **Lexicon** | `social.coves.feed.vote` | `social.coves.feed.comment` |
836| **Operations** | CREATE, DELETE | CREATE, UPDATE, DELETE |
837| **Mutability** | Immutable | Editable |
838| **Foreign Keys** | None (out-of-order indexing) | None (out-of-order indexing) |
839| **Delete Pattern** | Soft delete | Soft delete |
840| **Idempotency** | ON CONFLICT DO NOTHING | ON CONFLICT DO NOTHING |
841| **Count Updates** | Atomic transaction | Atomic transaction |
842| **Security** | PDS authentication | PDS authentication |
843
844---
845
846## Future Phases
847
848### 📋 Phase 2B: Vote Integration (Planned)
849
850**Scope:**
851- Update vote consumer to handle comment votes
852- Integrate `GetVoteStateForComments()` in service layer
853- Populate viewer.vote and viewer.voteUri in commentView
854- Test vote creation on comments end-to-end
855- Atomic updates to comments.upvote_count, downvote_count, score
856
857**Dependencies:**
858- Phase 1 indexing (✅ Complete)
859- Phase 2A query API (✅ Complete)
860- Vote consumer (already exists for posts)
861
862**Estimated effort:** 2-3 hours
863
864---
865
866### 📋 Phase 2C: Post/User Integration (Partially Complete)
867
868**Completed (PR Review):**
869- ✅ Integrated post repository in comment service
870- ✅ Return postView in getComments response with basic fields
871- ✅ Populate post author DID, community DID, stats (upvotes, downvotes, score, comment count)
872
873**Remaining Work:**
874- ❌ Integrate user repository for full AuthorView
875- ❌ Add display name and avatar to comment/post authors (currently returns DID as handle)
876- ❌ Add community name and avatar (currently returns DID as name)
877- ❌ Parse and include original record in commentView
878
879**Dependencies:**
880- Phase 2A query API (✅ Complete)
881- Post repository integration (✅ Complete)
882- User repository integration (⏳ Pending)
883
884**Estimated effort for remaining work:** 1-2 hours
885
886---
887
888### 📋 Phase 3: Advanced Features (Future)
889
890#### 3A: Distinguished Comments
891- Moderator/admin comment flags
892- Priority sorting for distinguished comments
893- Visual indicators in UI
894
895#### 3B: Comment Search & Filtering
896- Full-text search within threads
897- Filter by author, time range, score
898- Search across community comments
899
900#### 3C: Moderation Tools
901- Hide/remove comments
902- Flag system for user reports
903- Moderator queue
904- Audit log
905
906#### 3D: Notifications
907- Notify users of replies to their comments
908- Notify post authors of new comments
909- Mention notifications (@user)
910- Customizable notification preferences
911
912#### 3E: Enhanced Features
913- Comment edit history tracking
914- Save/bookmark comments
915- Sort by "controversial" (high engagement, low score)
916- Collapsible comment threads
917- User-specific comment history API
918- Community-wide comment stats/analytics
919
920---
921
922### 📋 Phase 4: Namespace Migration (Separate Task)
923
924**Scope:**
925- Migrate existing `social.coves.feed.comment` records to `social.coves.community.comment`
926- Update all AT-URIs in database
927- Update Jetstream consumer collection filter
928- Migration script with rollback capability
929- Zero-downtime deployment strategy
930
931**Note:** Currently out of scope - will be tackled separately when needed.
932
933---
934
935## Performance Considerations
936
937### Database Indexes
938
939All critical query patterns are indexed:
940- **Threading queries**: `idx_comments_root`, `idx_comments_parent`
941- **Sorting by score**: `idx_comments_parent_score`
942- **User history**: `idx_comments_commenter`
943- **Vote targeting**: `idx_comments_uri_active`
944
945### Denormalized Counts
946
947Vote counts and reply counts are denormalized for performance:
948- Avoids `COUNT(*)` queries on large datasets
949- Updated atomically with comment operations
950- Indexed for fast sorting
951
952### Pagination Support
953
954All list queries support limit/offset pagination:
955- `ListByRoot(ctx, rootURI, limit, offset)`
956- `ListByParent(ctx, parentURI, limit, offset)`
957- `ListByCommenter(ctx, commenterDID, limit, offset)`
958
959### N+1 Query Prevention
960
961**Problem Solved:** The initial implementation had a classic N+1 query problem where nested reply loading made separate database queries for each comment's children. For a post with 50 top-level comments and 3 levels of depth, this could result in ~1,551 queries.
962
963**Solution Implemented:** Batch loading strategy using window functions:
9641. Collect all parent URIs at each depth level
9652. Execute single batch query using `ListByParentsBatch()` with PostgreSQL window functions
9663. Group results by parent URI in memory
9674. Recursively process next level
968
969**Performance Improvement:**
970- Old: 1 + N + (N × M) + (N × M × P) queries per request
971- New: 1 query per depth level (max 4 queries for depth 3)
972- Example with depth 3, 50 comments: 1,551 queries → 4 queries (99.7% reduction)
973
974**Implementation Details:**
975```sql
976-- Uses ROW_NUMBER() window function to limit per parent efficiently
977WITH ranked_comments AS (
978 SELECT *,
979 ROW_NUMBER() OVER (
980 PARTITION BY parent_uri
981 ORDER BY hot_rank DESC
982 ) as rn
983 FROM comments
984 WHERE parent_uri = ANY($1)
985)
986SELECT * FROM ranked_comments WHERE rn <= $2
987```
988
989### Hot Rank Caching Strategy
990
991**Current Implementation:**
992Hot rank is computed on-demand for every query using the Lemmy algorithm:
993```sql
994log(greatest(2, score + 2)) /
995 power(((EXTRACT(EPOCH FROM (NOW() - created_at)) / 3600) + 2), 1.8)
996```
997
998**Performance Impact:**
999- Computed for every comment in every hot-sorted query
1000- PostgreSQL handles this efficiently for moderate loads (<1000 comments per post)
1001- No noticeable performance degradation in testing
1002
1003**Future Optimization (if needed):**
1004
1005If hot rank computation becomes a bottleneck at scale:
1006
10071. **Add cached column:**
1008```sql
1009ALTER TABLE comments ADD COLUMN hot_rank_cached NUMERIC;
1010CREATE INDEX idx_comments_parent_hot_rank_cached
1011 ON comments(parent_uri, hot_rank_cached DESC)
1012 WHERE deleted_at IS NULL;
1013```
1014
10152. **Background recomputation job:**
1016```go
1017// Run every 5-15 minutes
1018func (j *HotRankJob) UpdateHotRanks(ctx context.Context) error {
1019 query := `
1020 UPDATE comments
1021 SET hot_rank_cached = log(greatest(2, score + 2)) /
1022 power(((EXTRACT(EPOCH FROM (NOW() - created_at)) / 3600) + 2), 1.8)
1023 WHERE deleted_at IS NULL
1024 `
1025 _, err := j.db.ExecContext(ctx, query)
1026 return err
1027}
1028```
1029
10303. **Use cached value in queries:**
1031```sql
1032SELECT * FROM comments
1033WHERE parent_uri = $1
1034ORDER BY hot_rank_cached DESC, score DESC
1035```
1036
1037**When to implement:**
1038- Monitor query performance in production
1039- If p95 query latency > 200ms for hot-sorted queries
1040- If database CPU usage from hot rank computation > 20%
1041- Only optimize if measurements show actual bottleneck
1042
1043**Trade-offs:**
1044- **Cached approach:** Faster queries, but ranks update every 5-15 minutes (slightly stale)
1045- **On-demand approach:** Always fresh ranks, slightly higher query cost
1046- For comment discussions, 5-15 minute staleness is acceptable (comments age slowly)
1047
1048---
1049
1050## Conclusion
1051
1052The comment system has successfully completed **Phase 1 (Indexing)** and **Phase 2A (Query API)**, providing a production-ready threaded discussion system for Coves:
1053
1054✅ **Phase 1 Complete**: Full indexing infrastructure with Jetstream consumer
1055✅ **Phase 2A Complete**: Query API with hot ranking, threading, and pagination
1056✅ **Fully Tested**: 30+ integration tests across indexing and query layers
1057✅ **Secure**: Input validation, parameterized queries, optional auth
1058✅ **Scalable**: Indexed queries, denormalized counts, cursor pagination
1059✅ **atProto Native**: User-owned records, Jetstream indexing, Bluesky patterns
1060
1061**Next milestones:**
1062- Phase 2B: Vote integration for comment voting
1063- Phase 2C: Post/user integration for complete views
1064- Phase 3: Advanced features (moderation, notifications, search)
1065
1066The implementation provides a solid foundation for building rich threaded discussions in Coves while maintaining compatibility with the broader atProto ecosystem and following established patterns from platforms like Lemmy and Reddit.
1067
1068---
1069
1070## Appendix: Command Reference
1071
1072### Run Tests
1073
1074**Phase 1 - Indexing Tests:**
1075```bash
1076TEST_DATABASE_URL="postgres://test_user:test_password@localhost:5434/coves_test?sslmode=disable" \
1077 go test -v ./tests/integration/comment_consumer_test.go \
1078 ./tests/integration/user_test.go \
1079 ./tests/integration/helpers.go \
1080 -run "TestCommentConsumer" -timeout 60s
1081```
1082
1083**Phase 2A - Query API Tests:**
1084```bash
1085TEST_DATABASE_URL="postgres://test_user:test_password@localhost:5434/coves_test?sslmode=disable" \
1086 go test -v ./tests/integration/comment_query_test.go \
1087 ./tests/integration/user_test.go \
1088 ./tests/integration/helpers.go \
1089 -run "TestCommentQuery" -timeout 120s
1090```
1091
1092**All Comment Tests:**
1093```bash
1094TEST_DATABASE_URL="postgres://test_user:test_password@localhost:5434/coves_test?sslmode=disable" \
1095 go test -v ./tests/integration/comment_*.go \
1096 ./tests/integration/user_test.go \
1097 ./tests/integration/helpers.go \
1098 -timeout 120s
1099```
1100
1101### Apply Migration
1102```bash
1103GOOSE_DRIVER=postgres \
1104GOOSE_DBSTRING="postgres://test_user:test_password@localhost:5434/coves_test?sslmode=disable" \
1105 goose -dir internal/db/migrations up
1106```
1107
1108### Build Server
1109```bash
1110go build ./cmd/server
1111```
1112
1113### Environment Variables
1114```bash
1115# Jetstream URL (optional, defaults to localhost:6008)
1116export COMMENT_JETSTREAM_URL="ws://localhost:6008/subscribe?wantedCollections=social.coves.feed.comment"
1117
1118# Database URL
1119export TEST_DATABASE_URL="postgres://test_user:test_password@localhost:5434/coves_test?sslmode=disable"
1120```
1121
1122---
1123
1124**Last Updated:** November 6, 2025
1125**Status:** ✅ Phase 1 & 2A Complete - Production-Ready with All PR Fixes