# Aggregators PRD: Automated Content Posting System

**Status:** In Development - Phase 1 (Core Infrastructure)
**Owner:** Platform Team
**Last Updated:** 2025-10-20

---

## Overview

Coves Aggregators are autonomous services that automatically post content to communities. Each aggregator is identified by its own DID and operates as a specialized actor within the atProto ecosystem. This enables communities to have automated content feeds (RSS, sports results, TV/movie discussion threads, Bluesky mirrors, etc.) while maintaining full community control.

**Key Differentiator:** Unlike other platforms where users manually aggregate content, Coves communities can enable automated aggregators to handle routine posting tasks, creating a more dynamic and up-to-date community experience.

---

## Architecture Principles

### ✅ atProto-Compliant Design

Aggregators follow established atProto patterns for autonomous services (Feed Generators + Labelers model):

1. **Aggregators are Actors, Not a Separate System**
   - Each aggregator has its own DID
   - Authenticates as itself via JWT
   - Uses the existing `social.coves.community.post.create` endpoint
   - Post record's `author` field = aggregator DID (server-populated)
   - No separate posting API needed

2. **Community Authorization Model**
   - Communities create `social.coves.aggregator.authorization` records in their repo
   - Records grant specific aggregators permission to post
   - Records include aggregator-specific configuration
   - Can be enabled/disabled without deletion

3. **Hybrid Hosting**
   - Coves can host official aggregators
   - Third parties can build and host their own
   - All use the same authorization system

---

## Core Components

### 1. Service Declaration Record
**Lexicon:** `social.coves.aggregator.service`
**Location:** Aggregator's repository
**Key:** `literal:self`

Declares aggregator existence and provides metadata for discovery.

**Required Fields:**
- `did` - Aggregator's DID (must match repo)
- `displayName` - Human-readable name
- `createdAt` - Creation timestamp

**Optional Fields:**
- `description` - What this aggregator does
- `avatar` - Avatar image blob
- `configSchema` - JSON Schema for community config validation
- `sourceUrl` - Link to source code (transparency)
- `maintainer` - DID of maintainer

---

### 2. Authorization Record
**Lexicon:** `social.coves.aggregator.authorization`
**Location:** Community's repository
**Key:** `any`

Grants an aggregator permission to post with specific configuration.

**Required Fields:**
- `aggregatorDid` - DID of authorized aggregator
- `communityDid` - DID of community (must match repo)
- `enabled` - Active status (toggleable)
- `createdAt` - When authorized

**Optional Fields:**
- `config` - Aggregator-specific config (validated against schema)
- `createdBy` - Moderator who authorized
- `disabledAt` / `disabledBy` - Audit trail

---

## Data Flow

```
Aggregator Service (External)
    │
    │ 1. Authenticates as aggregator DID (JWT)
    │ 2. Calls social.coves.community.post.create
    ▼
Coves AppView Handler
    │
    │ 1. Extract DID from JWT
    │ 2. Check if DID is registered aggregator
    │ 3. Validate authorization exists & enabled
    │ 4. Apply aggregator rate limits
    │ 5. Create post with author = aggregator DID
    ▼
Jetstream → AppView Indexing
    │
    │ Post indexed with aggregator attribution
    │ UI shows: "🤖 Posted by [Aggregator Name]"
    ▼
Community Feed
```

---

## XRPC Methods

### For Communities (Moderators)

- **`social.coves.aggregator.enable`** - Create authorization record
- **`social.coves.aggregator.disable`** - Set enabled=false
- **`social.coves.aggregator.updateConfig`** - Update config
- **`social.coves.aggregator.listForCommunity`** - List aggregators for community

### For Aggregators

- **`social.coves.community.post.create`** - Modified to handle aggregator auth
- **`social.coves.aggregator.getAuthorizations`** - Query authorized communities

### For Discovery

- **`social.coves.aggregator.getServices`** - Fetch aggregator details by DID(s)

---

## Database Schema

### `aggregators` Table
Indexes aggregator service declarations from Jetstream.

**Key Columns:**
- `did` (PK) - Aggregator DID
- `display_name`, `description` - Service metadata
- `config_schema` - JSON Schema for config validation
- `avatar_url`, `source_url`, `maintainer_did` - Metadata
- `record_uri`, `record_cid` - atProto record metadata
- `communities_using`, `posts_created` - Cached stats (updated by triggers)

### `aggregator_authorizations` Table
Indexes community authorization records from Jetstream.

**Key Columns:**
- `aggregator_did`, `community_did` - Authorization pair (unique together)
- `enabled` - Active status
- `config` - Community-specific JSON config
- `created_by`, `disabled_by` - Audit trail
- `record_uri`, `record_cid` - atProto record metadata

**Critical Indexes:**
- `idx_aggregator_auth_lookup` - Fast (aggregator_did, community_did, enabled) lookups for post creation

### `aggregator_posts` Table
AppView-only tracking for rate limiting and stats (not from lexicon).

**Key Columns:**
- `aggregator_did`, `community_did`, `post_uri`
- `created_at` - For rate limit calculations

---

## Security

### Authentication
- DID-based authentication via JWT signatures
- No shared secrets or API keys
- Aggregators can only post to authorized communities

### Authorization Checks
- Server validates aggregator status (not client-provided)
- Checks `aggregator_authorizations` table on every post
- Config validated against aggregator's JSON schema

### Rate Limiting
- Aggregators: 10 posts/hour per community
- Tracked via `aggregator_posts` table
- Prevents spam

### Audit Trail
- `created_by` / `disabled_by` track moderator actions
- Full history preserved in authorization records

---

## Implementation Phases

### ✅ Phase 1: Core Infrastructure (COMPLETE)
**Status:** ✅ COMPLETE - All components implemented and tested
**Goal:** Enable aggregator authentication and authorization

**Components:**
- ✅ Lexicon schemas (9 files)
- ✅ Database migrations (2 migrations: 3 tables, 2 triggers, indexes)
- ✅ Repository layer (CRUD operations, bulk queries, optimized indexes)
- ✅ Service layer (business logic, validation, rate limiting)
- ✅ Modified post creation handler (aggregator authentication & authorization)
- ✅ XRPC query handlers (getServices, getAuthorizations, listForCommunity)
- ✅ Jetstream consumer (indexes service & authorization records from firehose)
- ✅ Integration tests (10+ test suites, E2E validation)
- ✅ E2E test validation (verified records exist in both PDS and AppView)

**Milestone:** ✅ ACHIEVED - Aggregators can authenticate and post to authorized communities

**Deferred to Phase 2:**
- Write-forward operations (enable, disable, updateConfig) - require PDS integration
- Moderator permission checks - require community ownership validation

---

## 🚨 Alpha Blockers

### Aggregator User Registration
**Status:** ❌ BLOCKING ALPHA - Must implement before aggregators can post
**Priority:** CRITICAL
**Discovered:** 2025-10-24 during Kagi News aggregator E2E testing

**Problem:**
Aggregators cannot create posts because they aren't indexed as users in the AppView database. The post consumer rejects posts with:
```
🚨 SECURITY: Rejecting post event: author not found: <aggregator-did> - cannot index post before author
```

This security check (in `post_consumer.go:181-196`) ensures referential integrity by requiring all post authors to exist as users before posts can be indexed.

**Root Cause:**
Users are normally indexed through Jetstream identity events when they create accounts on a PDS. Aggregators don't have PDSs connected to Jetstream, so they never emit identity events and are never automatically indexed.

**Solution: Aggregator Registration Endpoint**

Implement a `social.coves.aggregator.register` XRPC endpoint that allows aggregators to self-register as users.

**Implementation:**
```go
// Handler: internal/api/handlers/aggregator/register.go
// POST /xrpc/social.coves.aggregator.register

type RegisterRequest struct {
	AggregatorDID string `json:"aggregatorDid"`
	Handle        string `json:"handle"`
}

func (h *Handler) Register(ctx context.Context, req *RegisterRequest) error {
	// Steps 1-3 are outlined only; step 4 is the part shown below.
	// 1. Validate aggregator DID format
	// 2. Validate handle is available
	// 3. Verify aggregator controls the DID (via DID document)
	// 4. Create user entry in database
	_, err := h.userService.CreateUser(ctx, users.CreateUserRequest{
		DID:    req.AggregatorDID,
		Handle: req.Handle,
		PDSURL: "https://api.coves.social", // Aggregators "hosted" by Coves
	})
	return err
}
```

**Acceptance Criteria:**
- [ ] Endpoint implemented and tested
- [ ] Aggregator can register with DID + handle
- [ ] Registration validates DID ownership
- [ ] Duplicate registrations handled gracefully
- [ ] Kagi News aggregator can successfully post after registration
- [ ] Documentation updated with registration flow

**Alternative (Quick Fix for Testing):**
Manual SQL insert for known aggregators during bootstrap:
```sql
INSERT INTO users (did, handle, pds_url, created_at, updated_at)
VALUES ('did:plc:...', 'aggregator-name.coves.social', 'https://api.coves.social', NOW(), NOW());
```

---

### Phase 2: Aggregator SDK (Post-Alpha)
**Deferred** - Will build SDK after Phase 1 is validated in production.

Core functionality works without SDK - aggregators just need to:
1. Create atProto account (get DID)
2. Publish service declaration record
3. Sign JWTs with their DID keys
4. Call existing XRPC endpoints

---

### Phase 3: Reference Implementation (Future)
**Deferred** - First aggregator will likely be built inline to validate the system.

Potential first aggregator: RSS news bot for select communities.

---

## Key Design Decisions

### 2025-10-20: Remove `aggregatorType` Field
**Decision:** Removed `aggregatorType` enum from service declaration and database.

**Rationale:**
- Pre-production - can break things
- Over-engineering for alpha
- Description field is sufficient for discovery
- Avoids rigid categorization
- Can add tags later if needed

**Impact:**
- Simplified lexicons
- Removed database constraint
- More flexible for third-party developers

---

### 2025-10-19: Reuse `social.coves.community.post.create` Endpoint
**Decision:** Aggregators use existing post creation endpoint.

**Rationale:**
- Post record already server-populates `author` from JWT
- Simpler: one code path for all post creation
- Follows atProto principle: actors are actors
- `federatedFrom` field handles external content attribution

**Implementation:**
- Add branching logic in post handler: if aggregator, check authorization; else check membership
- Apply different rate limits based on actor type

---

### 2025-10-19: Config as JSON Schema
**Decision:** Aggregators declare `configSchema` in service record.

**Rationale:**
- Communities need to know what config options are available
- JSON Schema is standard and well-supported
- Enables UI auto-generation (forms from schema)
- Validation at authorization creation time
- Flexible: each aggregator has different config needs

---

## Use Cases

### RSS News Aggregator
Watches configured RSS feeds, uses LLM for deduplication, posts news articles to community.

**Community Config Example:**
```json
{
  "feeds": ["https://techcrunch.com/feed"],
  "topics": ["technology"],
  "dedupeWindow": "6h"
}
```

---

### Bluesky Post Mirror
Monitors specific users/hashtags on Bluesky, creates posts in community with original author metadata.

**Community Config Example:**
```json
{
  "mirrorUsers": ["alice.bsky.social"],
  "hashtags": ["covesalpha"],
  "minLikes": 10
}
```

---

### Sports Results
Monitors sports APIs, creates post-game threads with scores and stats.

**Community Config Example:**
```json
{
  "league": "NBA",
  "teams": ["Lakers", "Warriors"],
  "includeStats": true
}
```

---

## Success Metrics

### Alpha Goals
- ✅ Lexicons validated
- ✅ Database migrations tested
- ✅ Jetstream consumer indexes records
- ✅ Post creation validates aggregator auth
- ✅ Rate limiting prevents spam
- ✅ Integration tests passing
- ❌ **BLOCKER:** Aggregator registration endpoint (see Alpha Blockers section)

### Beta Goals (Future)
- First aggregator deployed in production
- 3+ communities using aggregators
- < 0.1% spam posts
- Third-party developer documentation

---

## Out of Scope (Future)

- Aggregator marketplace with ratings/reviews
- UI for aggregator management (alpha uses XRPC only)
- Scheduled posts
- Interactive aggregators (respond to comments)
- Cross-instance aggregator discovery
- SDK (deferred until post-alpha)
- LLM features (deferred)

---

## References

- atProto Lexicon Spec: https://atproto.com/specs/lexicon
- Feed Generator Pattern: https://github.com/bluesky-social/feed-generator
- Labeler Pattern: https://github.com/bluesky-social/atproto/tree/main/packages/ozone
- JSON Schema: https://json-schema.org/