A community-based topic aggregation platform built on atproto
# Aggregators PRD: Automated Content Posting System

**Status:** In Development - Phase 1 (Core Infrastructure)
**Owner:** Platform Team
**Last Updated:** 2025-10-20

---

## Overview

Coves Aggregators are autonomous services that automatically post content to communities. Each aggregator is identified by its own DID and operates as a specialized actor within the atProto ecosystem. This enables communities to have automated content feeds (RSS, sports results, TV/movie discussion threads, Bluesky mirrors, etc.) while maintaining full community control.

**Key Differentiator:** Unlike other platforms where users manually aggregate content, Coves communities can enable automated aggregators to handle routine posting tasks, creating a more dynamic and up-to-date community experience.

---

## Architecture Principles

### ✅ atProto-Compliant Design

Aggregators follow established atProto patterns for autonomous services (Feed Generators + Labelers model):

1. **Aggregators are Actors, Not a Separate System**
   - Each aggregator has its own DID
   - Authenticate as themselves via JWT
   - Use existing `social.coves.community.post.create` endpoint
   - Post record's `author` field = aggregator DID (server-populated)
   - No separate posting API needed

2. **Community Authorization Model**
   - Communities create `social.coves.aggregator.authorization` records in their repo
   - Records grant specific aggregators permission to post
   - Include aggregator-specific configuration
   - Can be enabled/disabled without deletion

3. **Hybrid Hosting**
   - Coves can host official aggregators
   - Third parties can build and host their own
   - All use same authorization system

---

## Core Components

### 1. Service Declaration Record
**Lexicon:** `social.coves.aggregator.service`
**Location:** Aggregator's repository
**Key:** `literal:self`

Declares aggregator existence and provides metadata for discovery.

**Required Fields:**
- `did` - Aggregator's DID (must match repo)
- `displayName` - Human-readable name
- `createdAt` - Creation timestamp

**Optional Fields:**
- `description` - What this aggregator does
- `avatar` - Avatar image blob
- `configSchema` - JSON Schema for community config validation
- `sourceUrl` - Link to source code (transparency)
- `maintainer` - DID of maintainer

---

### 2. Authorization Record
**Lexicon:** `social.coves.aggregator.authorization`
**Location:** Community's repository
**Key:** `any`

Grants an aggregator permission to post with specific configuration.

**Required Fields:**
- `aggregatorDid` - DID of authorized aggregator
- `communityDid` - DID of community (must match repo)
- `enabled` - Active status (toggleable)
- `createdAt` - When authorized

**Optional Fields:**
- `config` - Aggregator-specific config (validated against schema)
- `createdBy` - Moderator who authorized
- `disabledAt` / `disabledBy` - Audit trail

---

## Data Flow

```
Aggregator Service (External)
    │
    │ 1. Authenticates as aggregator DID (JWT)
    │ 2. Calls social.coves.community.post.create
    ▼
Coves AppView Handler
    │
    │ 1. Extract DID from JWT
    │ 2. Check if DID is registered aggregator
    │ 3. Validate authorization exists & enabled
    │ 4. Apply aggregator rate limits
    │ 5. Create post with author = aggregator DID
    ▼
Jetstream → AppView Indexing
    │
    │ Post indexed with aggregator attribution
    │ UI shows: "🤖 Posted by [Aggregator Name]"
    ▼
Community Feed
```

---

## XRPC Methods

### For Communities (Moderators)

- **`social.coves.aggregator.enable`** - Create authorization record
- **`social.coves.aggregator.disable`** - Set enabled=false
- **`social.coves.aggregator.updateConfig`** - Update config
- **`social.coves.aggregator.listForCommunity`** - List aggregators for community

### For Aggregators

- **`social.coves.community.post.create`** - Modified to handle aggregator auth
- **`social.coves.aggregator.getAuthorizations`** - Query authorized communities

### For Discovery

- **`social.coves.aggregator.getServices`** - Fetch aggregator details by DID(s)

---

## Database Schema

### `aggregators` Table
Indexes aggregator service declarations from Jetstream.

**Key Columns:**
- `did` (PK) - Aggregator DID
- `display_name`, `description` - Service metadata
- `config_schema` - JSON Schema for config validation
- `avatar_url`, `source_url`, `maintainer_did` - Metadata
- `record_uri`, `record_cid` - atProto record metadata
- `communities_using`, `posts_created` - Cached stats (updated by triggers)

### `aggregator_authorizations` Table
Indexes community authorization records from Jetstream.

**Key Columns:**
- `aggregator_did`, `community_did` - Authorization pair (unique together)
- `enabled` - Active status
- `config` - Community-specific JSON config
- `created_by`, `disabled_by` - Audit trail
- `record_uri`, `record_cid` - atProto record metadata

**Critical Indexes:**
- `idx_aggregator_auth_lookup` - Fast (aggregator_did, community_did, enabled) lookups for post creation

### `aggregator_posts` Table
AppView-only tracking for rate limiting and stats (not from lexicon).

**Key Columns:**
- `aggregator_did`, `community_did`, `post_uri`
- `created_at` - For rate limit calculations

---

## Security

### Authentication
- DID-based authentication via JWT signatures
- No shared secrets or API keys
- Aggregators can only post to authorized communities

### Authorization Checks
- Server validates aggregator status (not client-provided)
- Checks `aggregator_authorizations` table on every post
- Config validated against aggregator's JSON schema

### Rate Limiting
- Aggregators: 10 posts/hour per community
- Tracked via `aggregator_posts` table
- Prevents spam

### Audit Trail
- `created_by` / `disabled_by` track moderator actions
- Full history preserved in authorization records

---

## Implementation Phases

### ✅ Phase 1: Core Infrastructure (COMPLETE)
**Status:** ✅ COMPLETE - All components implemented and tested
**Goal:** Enable aggregator authentication and authorization

**Components:**
- ✅ Lexicon schemas (9 files)
- ✅ Database migrations (2 migrations: 3 tables, 2 triggers, indexes)
- ✅ Repository layer (CRUD operations, bulk queries, optimized indexes)
- ✅ Service layer (business logic, validation, rate limiting)
- ✅ Modified post creation handler (aggregator authentication & authorization)
- ✅ XRPC query handlers (getServices, getAuthorizations, listForCommunity)
- ✅ Jetstream consumer (indexes service & authorization records from firehose)
- ✅ Integration tests (10+ test suites, E2E validation)
- ✅ E2E test validation (verified records exist in both PDS and AppView)

**Milestone:** ✅ ACHIEVED - Aggregators can authenticate and post to authorized communities

**Deferred to Phase 2:**
- Write-forward operations (enable, disable, updateConfig) - require PDS integration
- Moderator permission checks - require community ownership validation

---

## 🚨 Alpha Blockers

### Aggregator User Registration
**Status:** ❌ BLOCKING ALPHA - Must implement before aggregators can post
**Priority:** CRITICAL
**Discovered:** 2025-10-24 during Kagi News aggregator E2E testing

**Problem:**
Aggregators cannot create posts because they aren't indexed as users in the AppView database. The post consumer rejects posts with:
```
🚨 SECURITY: Rejecting post event: author not found: <aggregator-did> - cannot index post before author
```

This security check (in `post_consumer.go:181-196`) ensures referential integrity by requiring all post authors to exist as users before posts can be indexed.

**Root Cause:**
Users are normally indexed through Jetstream identity events when they create accounts on a PDS. Aggregators don't have PDSs connected to Jetstream, so they never emit identity events and are never automatically indexed.

**Solution: Aggregator Registration Endpoint**

Implement a `social.coves.aggregator.register` XRPC endpoint to allow aggregators to self-register as users.

**Implementation:**
```go
// Handler: internal/api/handlers/aggregator/register.go
// POST /xrpc/social.coves.aggregator.register

type RegisterRequest struct {
	AggregatorDID string `json:"aggregatorDid"`
	Handle        string `json:"handle"`
}

func (h *Handler) Register(ctx context.Context, req *RegisterRequest) error {
	// Steps 1-3 are outlined only; this sketch implements step 4.
	// 1. Validate aggregator DID format
	// 2. Validate handle is available
	// 3. Verify aggregator controls the DID (via DID document)
	// 4. Create user entry in database
	_, err := h.userService.CreateUser(ctx, users.CreateUserRequest{
		DID:    req.AggregatorDID,
		Handle: req.Handle,
		PDSURL: "https://api.coves.social", // Aggregators "hosted" by Coves
	})
	return err
}
```

**Acceptance Criteria:**
- [ ] Endpoint implemented and tested
- [ ] Aggregator can register with DID + handle
- [ ] Registration validates DID ownership
- [ ] Duplicate registrations handled gracefully
- [ ] Kagi News aggregator can successfully post after registration
- [ ] Documentation updated with registration flow

**Alternative (Quick Fix for Testing):**
Manual SQL insert for known aggregators during bootstrap:
```sql
INSERT INTO users (did, handle, pds_url, created_at, updated_at)
VALUES ('did:plc:...', 'aggregator-name.coves.social', 'https://api.coves.social', NOW(), NOW());
```

---

### Phase 2: Aggregator SDK (Post-Alpha)
**Deferred** - Will build SDK after Phase 1 is validated in production.

Core functionality works without SDK - aggregators just need to:
1. Create atProto account (get DID)
2. Publish service declaration record
3. Sign JWTs with their DID keys
4. Call existing XRPC endpoints

---

### Phase 3: Reference Implementation (Future)
**Deferred** - First aggregator will likely be built inline to validate the system.

Potential first aggregator: RSS news bot for select communities.

---

## Key Design Decisions

### 2025-10-20: Remove `aggregatorType` Field
**Decision:** Removed `aggregatorType` enum from service declaration and database.

**Rationale:**
- Pre-production - can break things
- Over-engineering for alpha
- Description field is sufficient for discovery
- Avoids rigid categorization
- Can add tags later if needed

**Impact:**
- Simplified lexicons
- Removed database constraint
- More flexible for third-party developers

---

### 2025-10-19: Reuse `social.coves.community.post.create` Endpoint
**Decision:** Aggregators use existing post creation endpoint.

**Rationale:**
- Post record already server-populates `author` from JWT
- Simpler: one code path for all post creation
- Follows atProto principle: actors are actors
- `federatedFrom` field handles external content attribution

**Implementation:**
- Add branching logic in post handler: if aggregator, check authorization; else check membership
- Apply different rate limits based on actor type

---

### 2025-10-19: Config as JSON Schema
**Decision:** Aggregators declare `configSchema` in service record.

**Rationale:**
- Communities need to know what config options are available
- JSON Schema is standard and well-supported
- Enables UI auto-generation (forms from schema)
- Validation at authorization creation time
- Flexible: each aggregator has different config needs

---

## Use Cases

### RSS News Aggregator
Watches configured RSS feeds, uses an LLM for deduplication, and posts news articles to the community.

**Community Config Example:**
```json
{
  "feeds": ["https://techcrunch.com/feed"],
  "topics": ["technology"],
  "dedupeWindow": "6h"
}
```

---

### Bluesky Post Mirror
Monitors specific users/hashtags on Bluesky and creates posts in the community with original author metadata.

**Community Config Example:**
```json
{
  "mirrorUsers": ["alice.bsky.social"],
  "hashtags": ["covesalpha"],
  "minLikes": 10
}
```

---

### Sports Results
Monitors sports APIs, creates post-game threads with scores and stats.

**Community Config Example:**
```json
{
  "league": "NBA",
  "teams": ["Lakers", "Warriors"],
  "includeStats": true
}
```

---

## Success Metrics

### Alpha Goals
- ✅ Lexicons validated
- ✅ Database migrations tested
- ✅ Jetstream consumer indexes records
- ✅ Post creation validates aggregator auth
- ✅ Rate limiting prevents spam
- ✅ Integration tests passing
- **BLOCKER:** Aggregator registration endpoint (see Alpha Blockers section)

### Beta Goals (Future)
- First aggregator deployed in production
- 3+ communities using aggregators
- < 0.1% spam posts
- Third-party developer documentation

---

## Out of Scope (Future)

- Aggregator marketplace with ratings/reviews
- UI for aggregator management (alpha uses XRPC only)
- Scheduled posts
- Interactive aggregators (respond to comments)
- Cross-instance aggregator discovery
- SDK (deferred until post-alpha)
- LLM features (deferred)

---

## References

- atProto Lexicon Spec: https://atproto.com/specs/lexicon
- Feed Generator Pattern: https://github.com/bluesky-social/feed-generator
- Labeler Pattern: https://github.com/bluesky-social/atproto/tree/main/packages/ozone
- JSON Schema: https://json-schema.org/