Aggregators PRD: Automated Content Posting System#
Status: In Development - Phase 1 (Core Infrastructure)
Owner: Platform Team
Last Updated: 2025-10-20
Overview#
Coves Aggregators are autonomous services that automatically post content to communities. Each aggregator is identified by its own DID and operates as a specialized actor within the atProto ecosystem. This enables communities to have automated content feeds (RSS, sports results, TV/movie discussion threads, Bluesky mirrors, etc.) while maintaining full community control.
Key Differentiator: Unlike other platforms where users manually aggregate content, Coves communities can enable automated aggregators to handle routine posting tasks, creating a more dynamic and up-to-date community experience.
Architecture Principles#
✅ atProto-Compliant Design#
Aggregators follow established atProto patterns for autonomous services (Feed Generators + Labelers model):
- Aggregators are Actors, Not a Separate System
  - Each aggregator has its own DID
  - Authenticate as themselves via JWT
  - Use existing social.coves.community.post.create endpoint
  - Post record's author field = aggregator DID (server-populated)
  - No separate posting API needed
- Community Authorization Model
  - Communities create social.coves.aggregator.authorization records in their repo
  - Records grant specific aggregators permission to post
  - Include aggregator-specific configuration
  - Can be enabled/disabled without deletion
- Hybrid Hosting
  - Coves can host official aggregators
  - Third parties can build and host their own
  - All use same authorization system
Core Components#
1. Service Declaration Record#
Lexicon: social.coves.aggregator.service
Location: Aggregator's repository
Key: literal:self
Declares aggregator existence and provides metadata for discovery.
Required Fields:
- did - Aggregator's DID (must match repo)
- displayName - Human-readable name
- createdAt - Creation timestamp
Optional Fields:
- description - What this aggregator does
- avatar - Avatar image blob
- configSchema - JSON Schema for community config validation
- sourceUrl - Link to source code (transparency)
- maintainer - DID of maintainer
2. Authorization Record#
Lexicon: social.coves.aggregator.authorization
Location: Community's repository
Key: any
Grants an aggregator permission to post with specific configuration.
Required Fields:
- aggregatorDid - DID of authorized aggregator
- communityDid - DID of community (must match repo)
- enabled - Active status (toggleable)
- createdAt - When authorized
Optional Fields:
- config - Aggregator-specific config (validated against schema)
- createdBy - Moderator who authorized
- disabledAt/disabledBy - Audit trail
Data Flow#
Aggregator Service (External)
│
│ 1. Authenticates as aggregator DID (JWT)
│ 2. Calls social.coves.community.post.create
▼
Coves AppView Handler
│
│ 1. Extract DID from JWT
│ 2. Check if DID is registered aggregator
│ 3. Validate authorization exists & enabled
│ 4. Apply aggregator rate limits
│ 5. Create post with author = aggregator DID
▼
Jetstream → AppView Indexing
│
│ Post indexed with aggregator attribution
│ UI shows: "🤖 Posted by [Aggregator Name]"
▼
Community Feed
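The handler checks in the flow above (steps 2 to 4) can be sketched as a single function. This is a minimal sketch under stated assumptions: the Store interface, its method names, and the error values are hypothetical, the caller DID is assumed to have been extracted from a verified JWT already (step 1), and creating the post (step 5) is left to the caller.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrNotAggregator = errors.New("caller is not a registered aggregator")
	ErrNotAuthorized = errors.New("no enabled authorization for this community")
	ErrRateLimited   = errors.New("aggregator rate limit exceeded")
)

// maxPostsPerHour is the per-community limit given in the Rate Limiting section.
const maxPostsPerHour = 10

// Store is a hypothetical read interface over the AppView tables.
type Store interface {
	IsAggregator(did string) bool
	HasEnabledAuthorization(aggregatorDID, communityDID string) bool
	PostsInLastHour(aggregatorDID, communityDID string) int
}

// CheckAggregatorPost runs handler steps 2-4 for an already authenticated DID.
// A nil result means the post may be created with author = callerDID.
func CheckAggregatorPost(s Store, callerDID, communityDID string) error {
	if !s.IsAggregator(callerDID) {
		return ErrNotAggregator
	}
	if !s.HasEnabledAuthorization(callerDID, communityDID) {
		return ErrNotAuthorized
	}
	if s.PostsInLastHour(callerDID, communityDID) >= maxPostsPerHour {
		return ErrRateLimited
	}
	return nil
}

// memStore is an in-memory Store used only for the demonstration below.
type memStore struct {
	aggregators map[string]bool
	enabled     map[[2]string]bool
	recent      map[[2]string]int
}

func (m memStore) IsAggregator(did string) bool { return m.aggregators[did] }
func (m memStore) HasEnabledAuthorization(a, c string) bool {
	return m.enabled[[2]string{a, c}]
}
func (m memStore) PostsInLastHour(a, c string) int { return m.recent[[2]string{a, c}] }

func main() {
	pair := [2]string{"did:plc:agg", "did:plc:community"}
	s := memStore{
		aggregators: map[string]bool{"did:plc:agg": true},
		enabled:     map[[2]string]bool{pair: true},
		recent:      map[[2]string]int{pair: 3},
	}
	fmt.Println(CheckAggregatorPost(s, "did:plc:agg", "did:plc:community"))
	fmt.Println(CheckAggregatorPost(s, "did:plc:agg", "did:plc:elsewhere"))
}
```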
XRPC Methods#
For Communities (Moderators)#
- social.coves.aggregator.enable - Create authorization record
- social.coves.aggregator.disable - Set enabled=false
- social.coves.aggregator.updateConfig - Update config
- social.coves.aggregator.listForCommunity - List aggregators for community
For Aggregators#
- social.coves.community.post.create - Modified to handle aggregator auth
- social.coves.aggregator.getAuthorizations - Query authorized communities
For Discovery#
- social.coves.aggregator.getServices - Fetch aggregator details by DID(s)
Database Schema#
aggregators Table#
Indexes aggregator service declarations from Jetstream.
Key Columns:
- did (PK) - Aggregator DID
- display_name, description - Service metadata
- config_schema - JSON Schema for config validation
- avatar_url, source_url, maintainer_did - Metadata
- record_uri, record_cid - atProto record metadata
- communities_using, posts_created - Cached stats (updated by triggers)
aggregator_authorizations Table#
Indexes community authorization records from Jetstream.
Key Columns:
- aggregator_did, community_did - Authorization pair (unique together)
- enabled - Active status
- config - Community-specific JSON config
- created_by, disabled_by - Audit trail
- record_uri, record_cid - atProto record metadata
Critical Indexes:
- idx_aggregator_auth_lookup - Fast (aggregator_did, community_did, enabled) lookups for post creation
aggregator_posts Table#
AppView-only tracking for rate limiting and stats (not from lexicon).
Key Columns:
- aggregator_did, community_did, post_uri
- created_at - For rate limit calculations
Security#
Authentication#
- DID-based authentication via JWT signatures
- No shared secrets or API keys
- Aggregators can only post to authorized communities
Authorization Checks#
- Server validates aggregator status (not client-provided)
- Checks aggregator_authorizations table on every post
- Config validated against aggregator's JSON schema
Rate Limiting#
- Aggregators: 10 posts/hour per community
- Tracked via aggregator_posts table
- Prevents spam
Audit Trail#
- created_by/disabled_by track moderator actions
- Full history preserved in authorization records
Implementation Phases#
✅ Phase 1: Core Infrastructure (COMPLETE)#
Status: ✅ COMPLETE - All components implemented and tested
Goal: Enable aggregator authentication and authorization
Components:
- ✅ Lexicon schemas (9 files)
- ✅ Database migrations (2 migrations: 3 tables, 2 triggers, indexes)
- ✅ Repository layer (CRUD operations, bulk queries, optimized indexes)
- ✅ Service layer (business logic, validation, rate limiting)
- ✅ Modified post creation handler (aggregator authentication & authorization)
- ✅ XRPC query handlers (getServices, getAuthorizations, listForCommunity)
- ✅ Jetstream consumer (indexes service & authorization records from firehose)
- ✅ Integration tests (10+ test suites, E2E validation)
- ✅ E2E test validation (verified records exist in both PDS and AppView)
Milestone: ✅ ACHIEVED - Aggregators can authenticate and post to authorized communities
Deferred to Phase 2:
- Write-forward operations (enable, disable, updateConfig) - require PDS integration
- Moderator permission checks - require community ownership validation
🚨 Alpha Blockers#
Aggregator User Registration#
Status: ❌ BLOCKING ALPHA - Must implement before aggregators can post
Priority: CRITICAL
Discovered: 2025-10-24 during Kagi News aggregator E2E testing
Problem: Aggregators cannot create posts because they aren't indexed as users in the AppView database. The post consumer rejects posts with:
🚨 SECURITY: Rejecting post event: author not found: <aggregator-did> - cannot index post before author
This security check (in post_consumer.go:181-196) ensures referential integrity by requiring all post authors to exist as users before posts can be indexed.
Root Cause: Users are normally indexed through Jetstream identity events when they create accounts on a PDS. Aggregators don't have PDSs connected to Jetstream, so they never emit identity events and are never automatically indexed.
Solution: Aggregator Registration Endpoint
Implement social.coves.aggregator.register XRPC endpoint to allow aggregators to self-register as users.
Implementation:
// Handler: internal/api/handlers/aggregator/register.go
// POST /xrpc/social.coves.aggregator.register
type RegisterRequest struct {
    AggregatorDID string `json:"aggregatorDid"`
    Handle        string `json:"handle"`
}

func (h *Handler) Register(ctx context.Context, req *RegisterRequest) error {
    // 1. Validate aggregator DID format
    // 2. Validate handle is available
    // 3. Verify aggregator controls the DID (via DID document)
    // 4. Create user entry in database
    _, err := h.userService.CreateUser(ctx, users.CreateUserRequest{
        DID:    req.AggregatorDID,
        Handle: req.Handle,
        PDSURL: "https://api.coves.social", // Aggregators "hosted" by Coves
    })
    return err
}
Acceptance Criteria:
- Endpoint implemented and tested
- Aggregator can register with DID + handle
- Registration validates DID ownership
- Duplicate registrations handled gracefully
- Kagi News aggregator can successfully post after registration
- Documentation updated with registration flow
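Two of the criteria above, DID format validation and graceful handling of duplicate registrations, can be sketched with an in-memory registry. The Registry type and its semantics are assumptions for illustration: a repeat registration with the same DID and handle is treated as a no-op, and a handle owned by another DID is rejected. The pattern approximates generic DID syntax and does not verify DID ownership, which would still require resolving the DID document.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// didPattern approximates generic DID syntax (did:<method>:<identifier>).
var didPattern = regexp.MustCompile(`^did:[a-z]+:[a-zA-Z0-9._:%-]*[a-zA-Z0-9._-]$`)

var ErrHandleTaken = errors.New("handle is already registered to another DID")

// Registry is a hypothetical in-memory stand-in for the users table.
type Registry struct {
	handleByDID map[string]string
	didByHandle map[string]string
}

func NewRegistry() *Registry {
	return &Registry{handleByDID: map[string]string{}, didByHandle: map[string]string{}}
}

// Register returns created=true for a new registration and created=false,
// with no error, when the same DID and handle were registered before.
func (r *Registry) Register(did, handle string) (created bool, err error) {
	if !didPattern.MatchString(did) {
		return false, fmt.Errorf("invalid DID %q", did)
	}
	if owner, ok := r.didByHandle[handle]; ok && owner != did {
		return false, ErrHandleTaken
	}
	if existing, ok := r.handleByDID[did]; ok {
		if existing == handle {
			return false, nil // idempotent repeat
		}
		return false, fmt.Errorf("DID %q is already registered as %q", did, existing)
	}
	r.handleByDID[did] = handle
	r.didByHandle[handle] = did
	return true, nil
}

func main() {
	r := NewRegistry()
	fmt.Println(r.Register("did:plc:abc123", "news.coves.social"))
	fmt.Println(r.Register("did:plc:abc123", "news.coves.social"))
	fmt.Println(r.Register("did:plc:other", "news.coves.social"))
	fmt.Println(r.Register("not-a-did", "bot.coves.social"))
}
```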
Alternative (Quick Fix for Testing): Manual SQL insert for known aggregators during bootstrap:
INSERT INTO users (did, handle, pds_url, created_at, updated_at)
VALUES ('did:plc:...', 'aggregator-name.coves.social', 'https://api.coves.social', NOW(), NOW());
Phase 2: Aggregator SDK (Post-Alpha)#
Deferred - Will build SDK after Phase 1 is validated in production.
Core functionality works without SDK - aggregators just need to:
- Create atProto account (get DID)
- Publish service declaration record
- Sign JWTs with their DID keys
- Call existing XRPC endpoints
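Without an SDK, the last step amounts to a plain HTTP call. The sketch below builds the request an aggregator might send to social.coves.community.post.create. The PostInput fields (community, title, content) and the host are hypothetical, and JWT signing is out of scope here: the token is assumed to be already signed with the aggregator's DID key.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// PostInput is a hypothetical request body; the real input schema is
// defined by the social.coves.community.post.create lexicon.
type PostInput struct {
	Community string `json:"community"`
	Title     string `json:"title"`
	Content   string `json:"content"`
}

// NewCreatePostRequest builds the XRPC procedure call (HTTP POST) with the
// aggregator's signed JWT as a bearer token.
func NewCreatePostRequest(baseURL, jwt string, in PostInput) (*http.Request, error) {
	body, err := json.Marshal(in)
	if err != nil {
		return nil, err
	}
	endpoint := baseURL + "/xrpc/social.coves.community.post.create"
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+jwt)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := NewCreatePostRequest("https://appview.example", "signed-jwt-placeholder", PostInput{
		Community: "did:plc:community",
		Title:     "Example headline",
		Content:   "Example body",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

The author field is deliberately absent from the body, since the server populates it from the JWT.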
Phase 3: Reference Implementation (Future)#
Deferred - First aggregator will likely be built inline to validate the system.
Potential first aggregator: RSS news bot for select communities.
Key Design Decisions#
2025-10-20: Remove aggregatorType Field#
Decision: Removed aggregatorType enum from service declaration and database.
Rationale:
- The system is pre-production, so breaking changes are still acceptable
- A type enum is over-engineering for alpha
- Description field is sufficient for discovery
- Avoids rigid categorization
- Can add tags later if needed
Impact:
- Simplified lexicons
- Removed database constraint
- More flexible for third-party developers
2025-10-19: Reuse social.coves.community.post.create Endpoint#
Decision: Aggregators use existing post creation endpoint.
Rationale:
- Post record already server-populates author from JWT
- Simpler: one code path for all post creation
- Follows atProto principle: actors are actors
- federatedFrom field handles external content attribution
Implementation:
- Add branching logic in post handler: if aggregator, check authorization; else check membership
- Apply different rate limits based on actor type
2025-10-19: Config as JSON Schema#
Decision: Aggregators declare configSchema in service record.
Rationale:
- Communities need to know what config options are available
- JSON Schema is standard and well-supported
- Enables UI auto-generation (forms from schema)
- Validation at authorization creation time
- Flexible: each aggregator has different config needs
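To make validation at authorization creation time concrete, the sketch below checks a community config against a small subset of JSON Schema: the top-level required list and the type of each top-level property. It is a hand-rolled illustration using only the standard library, not a full JSON Schema validator (for example, it rejects schemas where type is an array); a production implementation would presumably use a dedicated validation library. The function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// schema captures only the keywords this sketch understands.
type schema struct {
	Required   []string `json:"required"`
	Properties map[string]struct {
		Type string `json:"type"`
	} `json:"properties"`
}

// ValidateConfig checks required fields and top-level property types.
func ValidateConfig(schemaJSON, configJSON []byte) error {
	var s schema
	if err := json.Unmarshal(schemaJSON, &s); err != nil {
		return fmt.Errorf("invalid schema: %w", err)
	}
	var cfg map[string]any
	if err := json.Unmarshal(configJSON, &cfg); err != nil {
		return fmt.Errorf("invalid config: %w", err)
	}
	for _, name := range s.Required {
		if _, ok := cfg[name]; !ok {
			return fmt.Errorf("missing required field %q", name)
		}
	}
	for name, value := range cfg {
		prop, ok := s.Properties[name]
		if !ok || prop.Type == "" {
			continue // undeclared properties are not checked in this sketch
		}
		if !matches(prop.Type, value) {
			return fmt.Errorf("field %q: expected type %s", name, prop.Type)
		}
	}
	return nil
}

// matches maps decoded Go values onto JSON Schema primitive type names.
func matches(want string, v any) bool {
	switch x := v.(type) {
	case string:
		return want == "string"
	case bool:
		return want == "boolean"
	case float64:
		return want == "number" || (want == "integer" && x == math.Trunc(x))
	case []any:
		return want == "array"
	case map[string]any:
		return want == "object"
	case nil:
		return want == "null"
	}
	return false
}

func main() {
	schemaJSON := []byte(`{"required":["feeds"],"properties":{"feeds":{"type":"array"},"dedupeWindow":{"type":"string"}}}`)
	fmt.Println(ValidateConfig(schemaJSON, []byte(`{"feeds":["https://techcrunch.com/feed"],"dedupeWindow":"6h"}`)))
	fmt.Println(ValidateConfig(schemaJSON, []byte(`{"dedupeWindow":"6h"}`)))
}
```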
Use Cases#
RSS News Aggregator#
Watches configured RSS feeds, uses LLM for deduplication, posts news articles to community.
Community Config Example:
{
  "feeds": ["https://techcrunch.com/feed"],
  "topics": ["technology"],
  "dedupeWindow": "6h"
}
Bluesky Post Mirror#
Monitors specific users/hashtags on Bluesky, creates posts in community with original author metadata.
Community Config Example:
{
  "mirrorUsers": ["alice.bsky.social"],
  "hashtags": ["covesalpha"],
  "minLikes": 10
}
Sports Results#
Monitors sports APIs, creates post-game threads with scores and stats.
Community Config Example:
{
  "league": "NBA",
  "teams": ["Lakers", "Warriors"],
  "includeStats": true
}
Success Metrics#
Alpha Goals#
- ✅ Lexicons validated
- ✅ Database migrations tested
- ✅ Jetstream consumer indexes records
- ✅ Post creation validates aggregator auth
- ✅ Rate limiting prevents spam
- ✅ Integration tests passing
- ❌ BLOCKER: Aggregator registration endpoint (see Alpha Blockers section)
Beta Goals (Future)#
- First aggregator deployed in production
- 3+ communities using aggregators
- < 0.1% spam posts
- Third-party developer documentation
Out of Scope (Future)#
- Aggregator marketplace with ratings/reviews
- UI for aggregator management (alpha uses XRPC only)
- Scheduled posts
- Interactive aggregators (respond to comments)
- Cross-instance aggregator discovery
- SDK (deferred until post-alpha)
- LLM features (deferred)
References#
- atProto Lexicon Spec: https://atproto.com/specs/lexicon
- Feed Generator Pattern: https://github.com/bluesky-social/feed-generator
- Labeler Pattern: https://github.com/bluesky-social/atproto/tree/main/packages/ozone
- JSON Schema: https://json-schema.org/