A community-based topic aggregation platform built on atproto
Kagi News RSS Aggregator#
A Python-based RSS aggregator that posts Kagi News stories to Coves communities using rich text formatting.
Overview#
This aggregator:
- Fetches RSS feeds from Kagi News daily via CRON
- Parses HTML descriptions to extract structured content (highlights, perspectives, sources)
- Formats posts using Coves rich text with facets (bold, italic, links)
- Hot-links images from Kagi's proxy (no blob upload)
- Posts to configured communities via XRPC
Project Structure#
aggregators/kagi-news/
├── src/
│   ├── models.py              # Data models (KagiStory, Perspective, etc.)
│   ├── rss_fetcher.py         # RSS feed fetching with retry logic
│   ├── html_parser.py         # Parse Kagi HTML to structured data
│   ├── richtext_formatter.py  # Format content with rich text facets (TODO)
│   ├── atproto_client.py      # ATProto authentication and operations (TODO)
│   ├── state_manager.py       # Deduplication state tracking (TODO)
│   ├── config.py              # Configuration loading (TODO)
│   └── main.py                # Entry point (TODO)
├── tests/
│   ├── test_rss_fetcher.py    # RSS fetcher tests ✓
│   ├── test_html_parser.py    # HTML parser tests ✓
│   └── fixtures/
│       ├── sample_rss_item.xml
│       └── world.xml
├── scripts/
│   └── generate_did.py        # Helper to generate aggregator DID (TODO)
├── requirements.txt           # Python dependencies
├── config.example.yaml        # Example configuration
├── .env.example               # Environment variables template
├── crontab                    # CRON schedule
└── README.md
Setup#
Prerequisites#
- Python 3.11+
- python3-venv package (for example, apt install python3.12-venv)
Installation#
- Create virtual environment:
  python3 -m venv venv
  source venv/bin/activate
- Install dependencies:
  pip install -r requirements.txt
- Copy configuration templates:
  cp config.example.yaml config.yaml
  cp .env.example .env
- Edit config.yaml to map RSS feeds to communities
- Set environment variables in .env (aggregator DID and private key)
Running Tests#
# Activate virtual environment
source venv/bin/activate
# Run all tests
pytest -v
# Run specific test file
pytest tests/test_html_parser.py -v
# Run with coverage
pytest --cov=src --cov-report=html
Development Status#
✅ Phase 1-2 Complete (Oct 24, 2025)#
- Project structure created
- Data models defined (KagiStory, Perspective, Quote, Source)
- RSS fetcher with retry logic and tests
- HTML parser extracting all sections (summary, highlights, perspectives, sources, quote, image)
- Test fixtures from real Kagi News feed
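The retry logic mentioned for the RSS fetcher can be pictured roughly as follows. This is a minimal sketch, not the code in rss_fetcher.py: the function name fetch_with_retry, its parameters, and the exponential backoff schedule are all assumptions made for illustration. The fetch step is passed in as a callable so the sketch stays free of network and third-party dependencies.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until it succeeds or max_attempts is exhausted.

    Hypothetical helper: waits base_delay * 2**(attempt - 1) seconds
    between attempts (1s, 2s, 4s, ... with the defaults).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            # A real fetcher would likely catch only network/HTTP errors.
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

In the aggregator, the fetch callable would presumably wrap the HTTP GET of the feed URL and hand the response body to feedparser. Injecting sleep makes the backoff testable without real delays.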
🚧 Next Steps (Phase 3-4)#
- Rich text formatter (convert to Coves format with facets)
- State manager for deduplication
- Configuration loader
- ATProto client for post creation
- Main orchestration script
- End-to-end tests
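One possible shape for the deduplication state manager listed above is a small JSON file of already-posted story identifiers. The sketch below is an assumption, not the planned design: the class name SeenStore, its methods, and the choice of a JSON file keyed by story ID (for example the RSS item GUID) are all hypothetical.

```python
import json
from pathlib import Path


class SeenStore:
    """Tracks which story identifiers have already been posted (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.seen: set[str] = set()
        if path.exists():
            # State is stored as a JSON array of identifier strings.
            self.seen = set(json.loads(path.read_text(encoding="utf-8")))

    def is_new(self, story_id: str) -> bool:
        return story_id not in self.seen

    def mark_posted(self, story_id: str) -> None:
        self.seen.add(story_id)
        # Write to a temp file, then rename, so a crash mid-write
        # cannot leave a truncated state file behind.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(sorted(self.seen)), encoding="utf-8")
        tmp.replace(self.path)
```

Persisting after every post keeps the file consistent if a daily run is interrupted partway through a feed, at the cost of one small write per story.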
Configuration#
Edit config.yaml to define feed-to-community mappings:
coves_api_url: "https://api.coves.social"
feeds:
  - name: "World News"
    url: "https://news.kagi.com/world.xml"
    community_handle: "world-news.coves.social"
    enabled: true
  - name: "Tech News"
    url: "https://news.kagi.com/tech.xml"
    community_handle: "tech.coves.social"
    enabled: true
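Since the configuration loader is still a TODO, here is a rough sketch of how the feed mappings might be validated once the file has been parsed into a dict (for instance with PyYAML's yaml.safe_load). FeedConfig and enabled_feeds are hypothetical names, and treating a missing enabled key as true is an assumption rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedConfig:
    name: str
    url: str
    community_handle: str
    enabled: bool = True  # assumption: feeds are enabled unless stated otherwise


def enabled_feeds(raw: dict) -> list[FeedConfig]:
    """Validate the 'feeds' list and return only the enabled entries."""
    feeds = []
    for i, entry in enumerate(raw.get("feeds") or []):
        missing = [k for k in ("name", "url", "community_handle") if not entry.get(k)]
        if missing:
            raise ValueError(f"feeds[{i}] is missing: {', '.join(missing)}")
        feed = FeedConfig(
            name=entry["name"],
            url=entry["url"],
            community_handle=entry["community_handle"],
            enabled=bool(entry.get("enabled", True)),
        )
        if feed.enabled:
            feeds.append(feed)
    return feeds
```

Failing fast on a missing key means a typo in config.yaml surfaces at startup instead of midway through a scheduled run.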
Architecture#
Data Flow#
Kagi RSS Feed
↓ (HTTP GET)
RSS Fetcher
↓ (feedparser)
Parsed RSS Items
↓ (for each item)
HTML Parser
↓ (BeautifulSoup)
Structured KagiStory
↓
Rich Text Formatter
↓ (with facets)
Post Record
↓ (XRPC)
Coves Community
Rich Text Format#
Posts use Coves rich text with UTF-8 byte-positioned facets:
{
  "content": "Summary text...\n\nHighlights:\n• Point 1\n...",
  "facets": [
    {
      "index": {"byteStart": 20, "byteEnd": 31},
      "features": [{"$type": "social.coves.richtext.facet#bold"}]
    },
    {
      "index": {"byteStart": 50, "byteEnd": 75},
      "features": [{"$type": "social.coves.richtext.facet#link", "uri": "https://..."}]
    }
  ]
}
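Because facet positions are UTF-8 byte offsets, they differ from Python string indices whenever the content contains non-ASCII characters such as the "•" bullet. The sketch below shows one way to compute them. byte_span and bold_facet are hypothetical helpers, and the sketch assumes byteEnd is exclusive, as in atproto's own rich text facets.

```python
def byte_span(text: str, substring: str, start: int = 0) -> tuple[int, int]:
    """Return (byteStart, byteEnd) of the first match of substring in text,
    measured in UTF-8 bytes. Raises ValueError if substring is absent."""
    char_start = text.index(substring, start)
    # Encode only the prefix to learn how many bytes precede the match.
    byte_start = len(text[:char_start].encode("utf-8"))
    byte_end = byte_start + len(substring.encode("utf-8"))
    return byte_start, byte_end


def bold_facet(text: str, substring: str) -> dict:
    """Build a bold facet covering the first occurrence of substring."""
    byte_start, byte_end = byte_span(text, substring)
    return {
        "index": {"byteStart": byte_start, "byteEnd": byte_end},
        "features": [{"$type": "social.coves.richtext.facet#bold"}],
    }


content = "Résumé\n\nHighlights:\n• Point 1"
# "é" takes 2 bytes, so "Highlights:" starts at byte 10 (character index 8).
print(byte_span(content, "Highlights:"))  # (10, 21)
# "•" takes 3 bytes, pushing "Point 1" to byte 26 (character index 22).
print(byte_span(content, "Point 1"))  # (26, 33)
```

Using character indices directly would place the facets too early in any post with accented letters, bullets, or emoji before the formatted span.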
License#
See the parent Coves project's license.