# Aggregators PRD: Automated Content Posting System

**Status:** In Development - Phase 1 (Core Infrastructure)
**Owner:** Platform Team
**Last Updated:** 2025-10-20

---

## Overview

Coves Aggregators are autonomous services that automatically post content to communities. Each aggregator is identified by its own DID and operates as a specialized actor within the atProto ecosystem. This enables communities to have automated content feeds (RSS, sports results, TV/movie discussion threads, Bluesky mirrors, etc.) while maintaining full community control.

**Key Differentiator:** Unlike other platforms where users manually aggregate content, Coves communities can enable automated aggregators to handle routine posting tasks, creating a more dynamic and up-to-date community experience.

---

## Architecture Principles

### ✅ atProto-Compliant Design

Aggregators follow established atProto patterns for autonomous services (Feed Generators + Labelers model):

1. **Aggregators are Actors, Not a Separate System**
   - Each aggregator has its own DID
   - Authenticate as themselves via JWT
   - Use existing `social.coves.post.create` endpoint
   - Post record's `author` field = aggregator DID (server-populated)
   - No separate posting API needed

2. **Community Authorization Model**
   - Communities create `social.coves.aggregator.authorization` records in their repo
   - Records grant specific aggregators permission to post
   - Include aggregator-specific configuration
   - Can be enabled/disabled without deletion

3. **Hybrid Hosting**
   - Coves can host official aggregators
   - Third parties can build and host their own
   - All use same authorization system

---

## Core Components

### 1. Service Declaration Record
**Lexicon:** `social.coves.aggregator.service`
**Location:** Aggregator's repository
**Key:** `literal:self`

Declares aggregator existence and provides metadata for discovery.
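
As a rough illustration of what such a declaration could contain, the sketch below assembles a record from the required and optional fields listed in this section. The helper name `build_service_declaration`, the example DID, and the exact value formats are assumptions made for illustration; the lexicon files remain the authoritative definition.

```python
import json
from datetime import datetime, timezone

# Field names taken from this section; everything else here is hypothetical.
REQUIRED_FIELDS = ("did", "displayName", "createdAt")
OPTIONAL_FIELDS = {"description", "avatar", "configSchema", "sourceUrl", "maintainer"}


def build_service_declaration(did, display_name, **optional):
    """Return a dict shaped like a social.coves.aggregator.service record."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown optional fields: {sorted(unknown)}")
    record = {
        "$type": "social.coves.aggregator.service",
        "did": did,  # must match the repo the record is written to
        "displayName": display_name,
        "createdAt": datetime.now(timezone.utc).isoformat(),
    }
    record.update(optional)
    return record


record = build_service_declaration(
    "did:plc:exampleaggregator",  # hypothetical DID, for illustration only
    "RSS News Bot",
    description="Posts articles from configured RSS feeds",
)
print(json.dumps(record, indent=2))
```

Rejecting unknown optional fields is a design choice of this sketch, not a documented requirement; a real client might instead defer entirely to lexicon validation on the server.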

**Required Fields:**
- `did` - Aggregator's DID (must match repo)
- `displayName` - Human-readable name
- `createdAt` - Creation timestamp

**Optional Fields:**
- `description` - What this aggregator does
- `avatar` - Avatar image blob
- `configSchema` - JSON Schema for community config validation
- `sourceUrl` - Link to source code (transparency)
- `maintainer` - DID of maintainer

---

### 2. Authorization Record
**Lexicon:** `social.coves.aggregator.authorization`
**Location:** Community's repository
**Key:** `any`

Grants an aggregator permission to post with specific configuration.

**Required Fields:**
- `aggregatorDid` - DID of authorized aggregator
- `communityDid` - DID of community (must match repo)
- `enabled` - Active status (toggleable)
- `createdAt` - When authorized

**Optional Fields:**
- `config` - Aggregator-specific config (validated against schema)
- `createdBy` - Moderator who authorized
- `disabledAt` / `disabledBy` - Audit trail

---

## Data Flow

```
Aggregator Service (External)
    │
    │ 1. Authenticates as aggregator DID (JWT)
    │ 2. Calls social.coves.post.create
    ▼
Coves AppView Handler
    │
    │ 1. Extract DID from JWT
    │ 2. Check if DID is registered aggregator
    │ 3. Validate authorization exists & enabled
    │ 4. Apply aggregator rate limits
    │ 5. Create post with author = aggregator DID
    ▼
Jetstream → AppView Indexing
    │
    │ Post indexed with aggregator attribution
    │ UI shows: "🤖 Posted by [Aggregator Name]"
    ▼
Community Feed
```

---

## XRPC Methods

### For Communities (Moderators)

- **`social.coves.aggregator.enable`** - Create authorization record
- **`social.coves.aggregator.disable`** - Set enabled=false
- **`social.coves.aggregator.updateConfig`** - Update config
- **`social.coves.aggregator.listForCommunity`** - List aggregators for community

### For Aggregators

- **`social.coves.post.create`** - Modified to handle aggregator auth
- **`social.coves.aggregator.getAuthorizations`** - Query authorized communities

### For Discovery

- **`social.coves.aggregator.getServices`** - Fetch aggregator details by DID(s)

---

## Database Schema

### `aggregators` Table
Indexes aggregator service declarations from Jetstream.

**Key Columns:**
- `did` (PK) - Aggregator DID
- `display_name`, `description` - Service metadata
- `config_schema` - JSON Schema for config validation
- `avatar_url`, `source_url`, `maintainer_did` - Metadata
- `record_uri`, `record_cid` - atProto record metadata
- `communities_using`, `posts_created` - Cached stats (updated by triggers)

### `aggregator_authorizations` Table
Indexes community authorization records from Jetstream.

**Key Columns:**
- `aggregator_did`, `community_did` - Authorization pair (unique together)
- `enabled` - Active status
- `config` - Community-specific JSON config
- `created_by`, `disabled_by` - Audit trail
- `record_uri`, `record_cid` - atProto record metadata

**Critical Indexes:**
- `idx_aggregator_auth_lookup` - Fast (aggregator_did, community_did, enabled) lookups for post creation

### `aggregator_posts` Table
AppView-only tracking for rate limiting and stats (not from lexicon).
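
To make the rate-limiting role of this table concrete, here is a minimal sketch of how the `created_at` values for one (`aggregator_did`, `community_did`) pair might be checked against the 10 posts/hour per-community limit given under Security. The function name `is_rate_limited` and the sliding one-hour window are assumptions; the document does not say whether the actual service uses a sliding or fixed window.

```python
from datetime import datetime, timedelta, timezone

MAX_POSTS_PER_HOUR = 10      # limit stated in the Security section
WINDOW = timedelta(hours=1)  # assumed sliding window


def is_rate_limited(post_times, now, limit=MAX_POSTS_PER_HOUR, window=WINDOW):
    """Return True if another post would exceed the limit.

    post_times: created_at values of aggregator_posts rows for a single
    (aggregator_did, community_did) pair.
    """
    cutoff = now - window
    recent = sum(1 for t in post_times if t > cutoff)
    return recent >= limit


now = datetime(2025, 10, 20, 12, 0, tzinfo=timezone.utc)

# Ten posts spread over the last 45 minutes: the next post is rejected.
ten_recent = [now - timedelta(minutes=5 * i) for i in range(10)]
print(is_rate_limited(ten_recent, now))  # True

# Nine recent posts plus one from two hours ago: the next post is allowed.
nine_plus_old = ten_recent[:9] + [now - timedelta(hours=2)]
print(is_rate_limited(nine_plus_old, now))  # False
```

In a database-backed implementation the same count would more likely be a single query filtered on the pair and on `created_at`, so only the comparison against the limit happens in application code.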

**Key Columns:**
- `aggregator_did`, `community_did`, `post_uri`
- `created_at` - For rate limit calculations

---

## Security

### Authentication
- DID-based authentication via JWT signatures
- No shared secrets or API keys
- Aggregators can only post to authorized communities

### Authorization Checks
- Server validates aggregator status (not client-provided)
- Checks `aggregator_authorizations` table on every post
- Config validated against aggregator's JSON schema

### Rate Limiting
- Aggregators: 10 posts/hour per community
- Tracked via `aggregator_posts` table
- Prevents spam

### Audit Trail
- `created_by` / `disabled_by` track moderator actions
- Full history preserved in authorization records

---

## Implementation Phases

### ✅ Phase 1: Core Infrastructure (COMPLETE)
**Status:** ✅ COMPLETE - All components implemented and tested
**Goal:** Enable aggregator authentication and authorization

**Components:**
- ✅ Lexicon schemas (9 files)
- ✅ Database migrations (2 migrations: 3 tables, 2 triggers, indexes)
- ✅ Repository layer (CRUD operations, bulk queries, optimized indexes)
- ✅ Service layer (business logic, validation, rate limiting)
- ✅ Modified post creation handler (aggregator authentication & authorization)
- ✅ XRPC query handlers (getServices, getAuthorizations, listForCommunity)
- ✅ Jetstream consumer (indexes service & authorization records from firehose)
- ✅ Integration tests (10+ test suites, E2E validation)
- ✅ E2E test validation (verified records exist in both PDS and AppView)

**Milestone:** ✅ ACHIEVED - Aggregators can authenticate and post to authorized communities

**Deferred to Phase 2:**
- Write-forward operations (enable, disable, updateConfig) - require PDS integration
- Moderator permission checks - require communities ownership validation

---

### Phase 2: Aggregator SDK (Post-Alpha)
**Deferred** - Will build SDK after Phase 1 is validated in production.

Core functionality works without SDK - aggregators just need to:
1. Create atProto account (get DID)
2. Publish service declaration record
3. Sign JWTs with their DID keys
4. Call existing XRPC endpoints

---

### Phase 3: Reference Implementation (Future)
**Deferred** - First aggregator will likely be built inline to validate the system.

Potential first aggregator: RSS news bot for select communities.

---

## Key Design Decisions

### 2025-10-20: Remove `aggregatorType` Field
**Decision:** Removed `aggregatorType` enum from service declaration and database.

**Rationale:**
- Pre-production - can break things
- Over-engineering for alpha
- Description field is sufficient for discovery
- Avoids rigid categorization
- Can add tags later if needed

**Impact:**
- Simplified lexicons
- Removed database constraint
- More flexible for third-party developers

---

### 2025-10-19: Reuse `social.coves.post.create` Endpoint
**Decision:** Aggregators use existing post creation endpoint.

**Rationale:**
- Post record already server-populates `author` from JWT
- Simpler: one code path for all post creation
- Follows atProto principle: actors are actors
- `federatedFrom` field handles external content attribution

**Implementation:**
- Add branching logic in post handler: if aggregator, check authorization; else check membership
- Apply different rate limits based on actor type

---

### 2025-10-19: Config as JSON Schema
**Decision:** Aggregators declare `configSchema` in service record.

**Rationale:**
- Communities need to know what config options are available
- JSON Schema is standard and well-supported
- Enables UI auto-generation (forms from schema)
- Validation at authorization creation time
- Flexible: each aggregator has different config needs

---

## Use Cases

### RSS News Aggregator
Watches configured RSS feeds, uses LLM for deduplication, posts news articles to community.

**Community Config Example:**
```json
{
  "feeds": ["https://techcrunch.com/feed"],
  "topics": ["technology"],
  "dedupeWindow": "6h"
}
```

---

### Bluesky Post Mirror
Monitors specific users/hashtags on Bluesky, creates posts in community with original author metadata.

**Community Config Example:**
```json
{
  "mirrorUsers": ["alice.bsky.social"],
  "hashtags": ["covesalpha"],
  "minLikes": 10
}
```

---

### Sports Results
Monitors sports APIs, creates post-game threads with scores and stats.

**Community Config Example:**
```json
{
  "league": "NBA",
  "teams": ["Lakers", "Warriors"],
  "includeStats": true
}
```

---

## Success Metrics

### Alpha Goals
- ✅ Lexicons validated
- ✅ Database migrations tested
- ⏳ Jetstream consumer indexes records
- ⏳ Post creation validates aggregator auth
- ⏳ Rate limiting prevents spam
- ⏳ Integration tests passing

### Beta Goals (Future)
- First aggregator deployed in production
- 3+ communities using aggregators
- < 0.1% spam posts
- Third-party developer documentation

---

## Out of Scope (Future)

- Aggregator marketplace with ratings/reviews
- UI for aggregator management (alpha uses XRPC only)
- Scheduled posts
- Interactive aggregators (respond to comments)
- Cross-instance aggregator discovery
- SDK (deferred until post-alpha)
- LLM features (deferred)

---

## References

- atProto Lexicon Spec: https://atproto.com/specs/lexicon
- Feed Generator Pattern: https://github.com/bluesky-social/feed-generator
- Labeler Pattern: https://github.com/bluesky-social/atproto/tree/main/packages/ozone
- JSON Schema: https://json-schema.org/