# Kagi News RSS Aggregator
A Python-based RSS aggregator that posts Kagi News stories to Coves communities using rich text formatting.

- Fetches RSS feeds from Kagi News daily via CRON
- Parses HTML descriptions to extract structured content (highlights, perspectives, sources)
- Formats posts using Coves rich text with facets (bold, italic, links)
- Hot-links images from Kagi's proxy (no blob upload)
- Posts to configured communities via XRPC
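
The parsing step in the list above can be sketched as follows. This is a minimal illustration, not the project's `html_parser.py`: the assumed markup (an `<h3>` heading containing "Highlights" followed by `<li>` items) and the names `HighlightExtractor` and `extract_highlights` are hypothetical, and the standard-library `html.parser` is used only to keep the example dependency-free.

```python
from html.parser import HTMLParser


class HighlightExtractor(HTMLParser):
    """Collect <li> text that follows an <h3> heading containing 'Highlights'.

    The heading/list layout is an assumption for illustration; the real
    Kagi News markup may differ.
    """

    def __init__(self):
        super().__init__()
        self.highlights = []
        self._in_heading = False
        self._in_section = False
        self._in_item = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            # A new heading ends whatever section came before it.
            self._in_heading = True
            self._in_section = False
            self._buffer = []
        elif tag == "li" and self._in_section:
            self._in_item = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "h3":
            self._in_heading = False
            heading_text = "".join(self._buffer).lower()
            self._in_section = "highlights" in heading_text
        elif tag == "li" and self._in_item:
            self._in_item = False
            self.highlights.append("".join(self._buffer).strip())

    def handle_data(self, data):
        # Text inside nested tags such as <b> is still collected.
        if self._in_heading or self._in_item:
            self._buffer.append(data)


def extract_highlights(description_html):
    """Return the highlight strings found in an RSS item description."""
    parser = HighlightExtractor()
    parser.feed(description_html)
    parser.close()
    return parser.highlights
```

Perspectives and sources could be collected the same way by keying on their own headings, again assuming the feed marks them up similarly.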
## Project Structure

    aggregators/kagi-news/
    │ ├── models.py  # Data models (KagiStory, Perspective, etc.)
    │ ├── rss_fetcher.py  # RSS feed fetching with retry logic
    │ ├── html_parser.py  # Parse Kagi HTML to structured data
    │ ├── richtext_formatter.py  # Format content with rich text facets (TODO)
    │ ├── atproto_client.py  # ATProto authentication and operations (TODO)
    │ ├── state_manager.py  # Deduplication state tracking (TODO)
    │ ├── config.py  # Configuration loading (TODO)
    │ └── main.py  # Entry point (TODO)
    │ ├── test_rss_fetcher.py  # RSS fetcher tests ✓
    │ ├── test_html_parser.py  # HTML parser tests ✓
    │ ├── sample_rss_item.xml
    │ └── generate_did.py  # Helper to generate aggregator DID (TODO)
    ├── requirements.txt  # Python dependencies
    ├── config.example.yaml  # Example configuration
    ├── .env.example  # Environment variables template
    ├── crontab  # CRON schedule

- python3-venv package (`apt install python3.12-venv`)

1. Create virtual environment:

       python3 -m venv venv
       source venv/bin/activate

2. Install dependencies:

       pip install -r requirements.txt

3. Copy configuration templates:

       cp config.example.yaml config.yaml
       cp .env.example .env

4. Edit `config.yaml` to map RSS feeds to communities
5. Set environment variables in `.env` (aggregator DID and private key)
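
Reading the two credentials from step 5 could look like the sketch below. The variable names `AGGREGATOR_DID` and `AGGREGATOR_PRIVATE_KEY` and the `load_credentials` helper are hypothetical (check `.env.example` for the real names), and the sketch assumes the values from `.env` have already been exported into the process environment.

```python
import os


class MissingCredentialError(RuntimeError):
    """Raised when a required credential is absent or empty."""


def load_credentials(environ=None):
    """Return the aggregator credentials as a dict, failing fast if any is unset.

    `environ` defaults to os.environ; passing a plain dict makes testing easy.
    """
    environ = os.environ if environ is None else environ
    # Hypothetical variable names, chosen for illustration only.
    required = ("AGGREGATOR_DID", "AGGREGATOR_PRIVATE_KEY")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise MissingCredentialError(
            "Missing environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in required}
```

Failing at startup with the list of missing names is usually easier to debug from a CRON log than an authentication error later in the run.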

    # Activate virtual environment
    source venv/bin/activate

    # Run specific test file
    pytest tests/test_html_parser.py -v

    pytest --cov=src --cov-report=html

## Development Status

### ✅ Phase 1-2 Complete (Oct 24, 2025)
- [x] Project structure created
- [x] Data models defined (KagiStory, Perspective, Quote, Source)
- [x] RSS fetcher with retry logic and tests
- [x] HTML parser extracting all sections (summary, highlights, perspectives, sources, quote, image)
- [x] Test fixtures from real Kagi News feed
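
For readers unfamiliar with the pattern, retry logic for a feed fetch can be sketched as below. This is not the code in `rss_fetcher.py`: exponential backoff, the `fetch_with_retry` name, its parameters, and the 30-second timeout are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request


def _default_fetch(url):
    """Download the feed body with the standard library."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def fetch_with_retry(url, fetch=None, max_attempts=3, base_delay=1.0,
                     sleep=time.sleep):
    """Return the body at `url`, retrying network errors with exponential backoff.

    `fetch` and `sleep` are injectable so the function can be tested
    without network access or real waiting.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    fetch = _default_fetch if fetch is None else fetch
    last_error = None
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except (urllib.error.URLError, TimeoutError) as exc:
            last_error = exc
            # Wait 1x, 2x, 4x ... base_delay, but not after the final attempt.
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

Injecting `fetch` is a design choice for testability: a test can pass a function that fails a fixed number of times and then returns a fixture such as `sample_rss_item.xml`.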

### 🚧 Next Steps (Phase 3-4)
- [ ] Rich text formatter (convert to Coves format with facets)
- [ ] State manager for deduplication
- [ ] Configuration loader
- [ ] ATProto client for post creation
- [ ] Main orchestration script
- [ ] End-to-end tests
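
As a starting point for the deduplication item above, one possible shape for the state manager is sketched here. Nothing in it is decided by the project: keying on a story GUID, storing state as a JSON file, and the `PostedState` name are assumptions.

```python
import json
from pathlib import Path


class PostedState:
    """Remember which story GUIDs were already posted, persisted as JSON."""

    def __init__(self, path):
        self.path = Path(path)
        self._seen = set()
        if self.path.exists():
            self._seen = set(json.loads(self.path.read_text(encoding="utf-8")))

    def is_new(self, guid):
        """Return True if this GUID has not been recorded yet."""
        return guid not in self._seen

    def mark_posted(self, guid):
        """Record the GUID and write the state file via a temporary file."""
        self._seen.add(guid)
        tmp_path = self.path.with_suffix(".tmp")
        tmp_path.write_text(json.dumps(sorted(self._seen)), encoding="utf-8")
        # Replacing the old file in one step avoids a half-written state file
        # if the CRON job is interrupted mid-write.
        tmp_path.replace(self.path)
```

A run would call `is_new(guid)` before posting each story and `mark_posted(guid)` only after the post succeeds, so a failed post is retried on the next run.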

Edit `config.yaml` to define feed-to-community mappings:

    coves_api_url: "https://api.coves.social"

    - name: "World News"
      url: "https://news.kagi.com/world.xml"
      community_handle: "world-news.coves.social"

    - name: "Tech News"
      url: "https://news.kagi.com/tech.xml"
      community_handle: "tech.coves.social"
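
Once the YAML has been parsed, the feed entries could be validated along these lines. This is a sketch under assumptions: `FeedMapping` and `parse_feed_mappings` are hypothetical names, and the function takes the already-parsed list of entries rather than reading `config.yaml` itself, since the configuration loader is not implemented yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedMapping:
    """One feed-to-community mapping, mirroring the three keys shown above."""
    name: str
    url: str
    community_handle: str


def parse_feed_mappings(entries):
    """Turn parsed config entries (a list of dicts) into FeedMapping objects.

    Raises ValueError naming the entry position and the missing keys.
    """
    required = ("name", "url", "community_handle")
    mappings = []
    for position, entry in enumerate(entries):
        missing = [key for key in required if not entry.get(key)]
        if missing:
            raise ValueError(
                f"feed entry {position} is missing: {', '.join(missing)}"
            )
        mappings.append(FeedMapping(**{key: entry[key] for key in required}))
    return mappings
```

Validating every entry up front means a typo in one mapping is reported before any feed is fetched or posted.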

Structured KagiStory
Rich Text Formatter

### Rich Text Format

Posts use Coves rich text with UTF-8 byte-positioned facets:

    "content": "Summary text...\n\nHighlights:\n• Point 1\n...",

    "index": {"byteStart": 20, "byteEnd": 31},
    "features": [{"$type": "social.coves.richtext.facet#bold"}]

    "index": {"byteStart": 50, "byteEnd": 75},
    "features": [{"$type": "social.coves.richtext.facet#link", "uri": "https://..."}]
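
Because facet positions count UTF-8 bytes rather than characters, offsets shift whenever the text contains non-ASCII characters such as `•`. The sketch below shows one way to compute them; `byte_span` and `bold_facet` are hypothetical helpers, and treating `byteEnd` as exclusive is an assumption that should be checked against the Coves rich text lexicon.

```python
def byte_span(text, substring, start=0):
    """Return (byteStart, byteEnd) of the first match of `substring` in `text`.

    Offsets are measured in UTF-8 bytes; byteEnd is exclusive (assumed).
    Raises ValueError if the substring is not found.
    """
    char_index = text.index(substring, start)
    byte_start = len(text[:char_index].encode("utf-8"))
    byte_end = byte_start + len(substring.encode("utf-8"))
    return byte_start, byte_end


def bold_facet(text, substring):
    """Build a bold facet dict covering the first match of `substring`."""
    byte_start, byte_end = byte_span(text, substring)
    return {
        "index": {"byteStart": byte_start, "byteEnd": byte_end},
        "features": [{"$type": "social.coves.richtext.facet#bold"}],
    }


content = "Résumé\n\nHighlights:\n• Point 1"
facet = bold_facet(content, "Highlights:")
# "Highlights:" starts at character 8 but at byte 10, because each "é"
# occupies two bytes in UTF-8.
```

Using Python string indices directly would place the facet two bytes too early in this example, which is why the offsets are derived from the encoded prefix.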
See parent Coves project license.

## Related Documentation

- [PRD: Kagi News Aggregator](../../docs/aggregators/PRD_KAGI_NEWS_RSS.md)
- [PRD: Aggregator System](../../docs/aggregators/PRD_AGGREGATORS.md)
- [Coves Rich Text Lexicon](../../internal/atproto/lexicon/social/coves/richtext/README.md)