Manage Atom feeds in a persistent git repository

Compare changes

+5
.gitignore
···
.streamlit/secrets.toml
thicket.yaml
+
+# Bot configuration files with secrets
+bot-config/zuliprc
+bot-config/*.key
+bot-config/*.secret
+3 -5
ARCH.md
···
"links": {
"https://example.com/post/123": {
"referencing_entries": ["https://blog.user.com/entry/456"],
-
"is_tracked_post": true,
"target_username": "user2"
},
"https://external-site.com/article": {
-
"referencing_entries": ["https://blog.user.com/entry/789"],
-
"is_tracked_post": false
+
"referencing_entries": ["https://blog.user.com/entry/789"]
}
},
"reverse_mapping": {
···
```
This unified structure eliminates duplication by:
-
- Storing each URL only once with metadata flags
+
- Storing each URL only once with minimal metadata
- Including all link data, reference data, and mappings in one file
-
- Using `is_tracked_post` to identify internal vs external links
+
- Using presence of `target_username` to identify tracked vs external links
- Providing bidirectional mappings for efficient queries
### Unified Structure Benefits
+26
README.md
···
- **Duplicate Management**: Manual curation of duplicate entries across feeds
- **Modern CLI**: Built with Typer and Rich for beautiful terminal output
- **Comprehensive Parsing**: Supports RSS 0.9x, RSS 1.0, RSS 2.0, and Atom feeds
+- **Zulip Bot Integration**: Automatically post new feed articles to Zulip chat
- **Cron-Friendly**: Designed for scheduled execution
## Installation
···
# Remove duplicate mapping
thicket duplicates remove "https://example.com/dup"
```
+
+### Zulip Bot Integration
+
+```bash
+# Test bot functionality
+thicket bot test
+
+# Show bot status
+thicket bot status
+
+# Run bot (requires configuration)
+thicket bot run --config bot-config/zuliprc
+```
+
+**Bot Setup:**
+1. Create a Zulip bot in your organization
+2. Copy `bot-config/zuliprc.template` to `bot-config/zuliprc`
+3. Configure with your bot's credentials
+4. Run the bot and configure via Zulip chat:
+   ```
+   @thicket config path /path/to/thicket.yaml
+   @thicket config stream general
+   @thicket config topic "Feed Updates"
+   ```
+
+See [docs/ZULIP_BOT.md](docs/ZULIP_BOT.md) for detailed setup instructions.
## Configuration
## Configuration
+400
SPEC.md
···
# Thicket Git Store Specification

This document defines the JSON format and structure of the Thicket Git repository, enabling third-party clients to read from and write to the store while leveraging Thicket's existing Python classes for data validation and business logic.

## Overview

The Thicket Git store is a structured repository that persists Atom/RSS feed entries in JSON format. The store is designed to be both human-readable and machine-parseable, with a clear directory structure and standardized JSON schemas.

## Repository Structure

```
<git_store>/
├── index.json            # Main index of all users and metadata
├── duplicates.json       # Maps duplicate entry IDs to canonical IDs
├── index.opml            # OPML export of all feeds (generated)
├── <username1>/          # User directory (sanitized username)
│   ├── <entry_id1>.json  # Individual feed entry
│   ├── <entry_id2>.json  # Individual feed entry
│   └── ...
├── <username2>/
│   ├── <entry_id3>.json
│   └── ...
└── ...
```

## JSON Schemas

### 1. Index File (`index.json`)

The main index tracks all users, their metadata, and repository statistics.

**Schema:**
```json
{
  "users": {
    "<username>": {
      "username": "string",
      "display_name": "string | null",
      "email": "string | null",
      "homepage": "string (URL) | null",
      "icon": "string (URL) | null",
      "feeds": ["string (URL)", ...],
      "zulip_associations": [
        {
          "server": "string",
          "user_id": "string"
        },
        ...
      ],
      "directory": "string",
      "created": "string (ISO 8601 datetime)",
      "last_updated": "string (ISO 8601 datetime)",
      "entry_count": "integer"
    }
  },
  "created": "string (ISO 8601 datetime)",
  "last_updated": "string (ISO 8601 datetime)",
  "total_entries": "integer"
}
```

**Example:**
```json
{
  "users": {
    "johndoe": {
      "username": "johndoe",
      "display_name": "John Doe",
      "email": "john@example.com",
      "homepage": "https://johndoe.blog",
      "icon": "https://johndoe.blog/avatar.png",
      "feeds": [
        "https://johndoe.blog/feed.xml",
        "https://johndoe.blog/categories/tech/feed.xml"
      ],
      "zulip_associations": [
        {
          "server": "myorg.zulipchat.com",
          "user_id": "john.doe"
        },
        {
          "server": "community.zulipchat.com",
          "user_id": "johndoe@example.com"
        }
      ],
      "directory": "johndoe",
      "created": "2024-01-15T10:30:00",
      "last_updated": "2024-01-20T14:22:00",
      "entry_count": 42
    }
  },
  "created": "2024-01-15T10:30:00",
  "last_updated": "2024-01-20T14:22:00",
  "total_entries": 42
}
```
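
A third-party reader can validate this file against the `GitStoreIndex` model listed under Python Class Integration below. A minimal sketch, assuming the model mirrors this schema (`GitStore._load_index()` does the equivalent internally):

```python
import json
from pathlib import Path

from thicket.models import GitStoreIndex

# Parse and validate index.json straight from the store (illustrative path)
raw = json.loads(Path("/path/to/git/store/index.json").read_text())
index = GitStoreIndex(**raw)
print(index.total_entries)
```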

### 2. Duplicates File (`duplicates.json`)

Maps duplicate entry IDs to their canonical representations to handle feed entries that appear with different IDs but identical content.

**Schema:**
```json
{
  "duplicates": {
    "<duplicate_id>": "<canonical_id>"
  },
  "comment": "Entry IDs that map to the same canonical content"
}
```

**Example:**
```json
{
  "duplicates": {
    "https://example.com/posts/123?utm_source=rss": "https://example.com/posts/123",
    "https://example.com/feed/item-duplicate": "https://example.com/feed/item-original"
  },
  "comment": "Entry IDs that map to the same canonical content"
}
```
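
Resolving an entry ID through this map is a single dictionary lookup; a minimal sketch (the helper name is illustrative, not part of Thicket's API):

```python
def canonical_id(entry_id: str, duplicates: dict[str, str]) -> str:
    """Resolve an entry ID to its canonical form, or return it unchanged."""
    # The store keeps direct duplicate -> canonical pairs, so one lookup suffices
    return duplicates.get(entry_id, entry_id)
```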

### 3. Feed Entry Files (`<username>/<entry_id>.json`)

Individual feed entries are stored as normalized Atom entries, regardless of their original format (RSS/Atom).

**Schema:**
```json
{
  "id": "string",
  "title": "string",
  "link": "string (URL)",
  "updated": "string (ISO 8601 datetime)",
  "published": "string (ISO 8601 datetime) | null",
  "summary": "string | null",
  "content": "string | null",
  "content_type": "html | text | xhtml",
  "author": {
    "name": "string | null",
    "email": "string | null",
    "uri": "string (URL) | null"
  } | null,
  "categories": ["string", ...],
  "rights": "string | null",
  "source": "string (URL) | null"
}
```

**Example:**
```json
{
  "id": "https://johndoe.blog/posts/my-first-post",
  "title": "My First Blog Post",
  "link": "https://johndoe.blog/posts/my-first-post",
  "updated": "2024-01-20T14:22:00",
  "published": "2024-01-20T09:00:00",
  "summary": "This is a summary of my first blog post.",
  "content": "<p>This is the full content of my <strong>first</strong> blog post with HTML formatting.</p>",
  "content_type": "html",
  "author": {
    "name": "John Doe",
    "email": "john@example.com",
    "uri": "https://johndoe.blog"
  },
  "categories": ["blogging", "personal"],
  "rights": "Copyright 2024 John Doe",
  "source": "https://johndoe.blog/feed.xml"
}
```

## Python Class Integration

To leverage Thicket's existing validation and business logic, third-party clients should use the following Python classes from the `thicket.models` package:

### Core Data Models

```python
from thicket.models import (
    AtomEntry,        # Feed entry representation
    GitStoreIndex,    # Repository index
    UserMetadata,     # User information
    DuplicateMap,     # Duplicate ID mappings
    FeedMetadata,     # Feed-level metadata
    ThicketConfig,    # Configuration
    UserConfig,       # User configuration
    ZulipAssociation  # Zulip server/user_id pairs
)
```

### Repository Operations

```python
from pathlib import Path

from thicket.core.git_store import GitStore
from thicket.core.feed_parser import FeedParser

# Initialize git store
store = GitStore(Path("/path/to/git/store"))

# Read data
index = store._load_index()                # Load index.json
user = store.get_user("username")          # Get user metadata
entries = store.list_entries("username", limit=10)
entry = store.get_entry("username", "entry_id")
duplicates = store.get_duplicates()        # Load duplicates.json

# Write data
store.add_user("username", display_name="Display Name")
store.store_entry("username", atom_entry)
store.add_duplicate("duplicate_id", "canonical_id")
store.commit_changes("Commit message")

# Zulip associations
store.add_zulip_association("username", "myorg.zulipchat.com", "user@example.com")
store.remove_zulip_association("username", "myorg.zulipchat.com", "user@example.com")
associations = store.get_zulip_associations("username")

# Search and statistics
results = store.search_entries("query", username="optional")
stats = store.get_stats()
```

### Feed Processing

```python
from thicket.core.feed_parser import FeedParser
from pydantic import HttpUrl

parser = FeedParser()

# Fetch and parse feeds
content = await parser.fetch_feed(HttpUrl("https://example.com/feed.xml"))
feed_metadata, entries = parser.parse_feed(content, source_url)

# Entry ID sanitization for filenames
safe_filename = parser.sanitize_entry_id(entry.id)
```

## File Naming and ID Sanitization

Entry IDs from feeds are sanitized to create safe filenames using `FeedParser.sanitize_entry_id()`:

- URLs are parsed and the path component is used as the base
- Characters are limited to alphanumeric, hyphens, underscores, and periods
- Other characters are replaced with underscores
- Maximum length is 200 characters
- Empty results default to "entry"

**Examples:**
- `https://example.com/posts/my-post` → `posts_my-post.json`
- `https://blog.com/2024/01/title?utm=source` → `2024_01_title.json`
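
A minimal sketch of these rules (illustrative only; `FeedParser.sanitize_entry_id()` is the authoritative implementation):

```python
import re
from urllib.parse import urlparse


def sanitize_entry_id(entry_id: str) -> str:
    """Sketch of the filename sanitization rules described above."""
    # Use the URL path as the base; query strings are dropped
    path = urlparse(entry_id).path.strip("/") or entry_id
    # Replace everything outside [A-Za-z0-9._-] with underscores
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", path)
    # Enforce the 200-character cap and the "entry" fallback
    return safe[:200] or "entry"
```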

## Data Validation

All JSON data should be validated using Pydantic models before writing to the store:

```python
from thicket.models import AtomEntry
from pydantic import ValidationError

try:
    entry = AtomEntry(**json_data)
    # Data is valid, safe to store
    store.store_entry(username, entry)
except ValidationError as e:
    # Handle validation errors
    print(f"Invalid entry data: {e}")
```

## Timestamps

All timestamps use ISO 8601 format in UTC:

- `created`: When the record was first created
- `last_updated`: When the record was last modified
- `updated`: When the feed entry was last updated (from feed)
- `published`: When the feed entry was originally published (from feed)
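
These strings parse directly with the standard library (Pydantic performs the equivalent parsing automatically when the models above are used):

```python
from datetime import datetime

# ISO 8601 strings as they appear in the store
created = datetime.fromisoformat("2024-01-15T10:30:00")
updated = datetime.fromisoformat("2024-01-20T14:22:00")
assert updated > created
```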

## Content Sanitization

HTML content in entries is sanitized using the `FeedParser._sanitize_html()` method to prevent XSS attacks. Allowed tags and attributes are strictly controlled.

**Allowed HTML tags:**
`a`, `abbr`, `acronym`, `b`, `blockquote`, `br`, `code`, `em`, `i`, `li`, `ol`, `p`, `pre`, `strong`, `ul`, `h1`-`h6`, `img`, `div`, `span`

**Allowed attributes:**
- `a`: `href`, `title`
- `img`: `src`, `alt`, `title`, `width`, `height`
- `blockquote`: `cite`
- `abbr`/`acronym`: `title`
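
Since `bleach` is already a Thicket dependency, equivalent sanitization can be sketched as follows (an illustration of the policy above, not the actual `_sanitize_html()` body):

```python
import bleach

# Tag and attribute policy from the lists above
ALLOWED_TAGS = {
    "a", "abbr", "acronym", "b", "blockquote", "br", "code", "em", "i",
    "li", "ol", "p", "pre", "strong", "ul",
    "h1", "h2", "h3", "h4", "h5", "h6", "img", "div", "span",
}
ALLOWED_ATTRIBUTES = {
    "a": ["href", "title"],
    "img": ["src", "alt", "title", "width", "height"],
    "blockquote": ["cite"],
    "abbr": ["title"],
    "acronym": ["title"],
}

clean_html = bleach.clean(
    '<p onclick="alert(1)">Hello</p>',
    tags=ALLOWED_TAGS,
    attributes=ALLOWED_ATTRIBUTES,
    strip=True,  # drop disallowed tags instead of escaping them
)
# The disallowed onclick attribute is removed: '<p>Hello</p>'
```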

## Error Handling and Robustness

The store is designed to be fault-tolerant:

- Invalid entries are skipped during processing with error logging
- Malformed JSON files are ignored in listings
- Missing files return `None` rather than raising exceptions
- Git operations are atomic where possible
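
For example, a failed lookup simply yields `None` (using the `GitStore` accessors shown earlier):

```python
entry = store.get_entry("johndoe", "no-such-entry")
if entry is None:
    # Missing or unreadable entry files are reported as None, not exceptions
    print("entry not found")
```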

## Example Usage

### Reading the Store

```python
from pathlib import Path
from thicket.core.git_store import GitStore

# Initialize
store = GitStore(Path("/path/to/thicket/store"))

# Get all users
index = store._load_index()
for username, user_metadata in index.users.items():
    print(f"User: {user_metadata.display_name} ({username})")
    print(f"  Feeds: {user_metadata.feeds}")
    print(f"  Entries: {user_metadata.entry_count}")

# Get recent entries for a user
entries = store.list_entries("johndoe", limit=5)
for entry in entries:
    print(f"  - {entry.title} ({entry.updated})")
```

### Adding Data

```python
from datetime import datetime

from pydantic import HttpUrl

from thicket.models import AtomEntry

# Create entry
entry = AtomEntry(
    id="https://example.com/new-post",
    title="New Post",
    link=HttpUrl("https://example.com/new-post"),
    updated=datetime.now(),
    content="<p>Post content</p>",
    content_type="html"
)

# Store entry
store.store_entry("johndoe", entry)
store.commit_changes("Add new blog post")
```

## Zulip Integration

The Thicket Git store supports Zulip bot integration for automatic feed posting with user mentions.

### Zulip Associations

Users can be associated with their Zulip identities to enable @mentions:

```python
# UserMetadata includes a zulip_associations field
user.zulip_associations = [
    ZulipAssociation(server="myorg.zulipchat.com", user_id="alice"),
    ZulipAssociation(server="other.zulipchat.com", user_id="alice@example.com")
]

# Methods for managing associations
user.add_zulip_association("myorg.zulipchat.com", "alice")
user.get_zulip_mention("myorg.zulipchat.com")  # Returns "alice"
user.remove_zulip_association("myorg.zulipchat.com", "alice")
```

### CLI Management

```bash
# Add association
thicket zulip-add alice myorg.zulipchat.com alice@example.com

# Remove association
thicket zulip-remove alice myorg.zulipchat.com alice@example.com

# List associations
thicket zulip-list        # All users
thicket zulip-list alice  # Specific user

# Bulk import from CSV
thicket zulip-import associations.csv
```

### Bot Behavior

When the Thicket Zulip bot posts articles:

1. It checks for Zulip associations matching the current server
2. If found, it adds an @mention to the post: `@**alice** posted:`
3. The mentioned user receives a notification in Zulip

This enables automatic notifications when someone's blog post is shared.
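
A sketch of that mention logic in terms of the `UserMetadata` methods documented above (message formatting simplified):

```python
server = "myorg.zulipchat.com"
mention = user.get_zulip_mention(server)
if mention:
    prefix = f"@**{mention}** posted:"
else:
    # No association for this server: fall back to a plain name
    prefix = f"**{user.display_name or user.username}** posted:"
```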

## Versioning and Compatibility

This specification describes version 1.1 of the Thicket Git store format. Changes from 1.0:

- Added `zulip_associations` field to `UserMetadata` (backwards compatible; defaults to an empty list)
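
In Pydantic terms, the backwards-compatible default can be sketched like this (field names match the schema above; not the exact model definitions):

```python
from pydantic import BaseModel, Field


class ZulipAssociation(BaseModel):
    server: str
    user_id: str


class UserMetadata(BaseModel):
    username: str
    # Stores written before 1.1 lack this key; it defaults to an empty list,
    # so older index.json files still validate unchanged.
    zulip_associations: list[ZulipAssociation] = Field(default_factory=list)
```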

Future versions will maintain backward compatibility where possible, with migration tools provided for breaking changes.

To check the store format version, examine the repository structure and JSON schemas. Stores created by Thicket 0.1.0+ follow this specification.
+97
bot-config/README.md
···
# Thicket Bot Configuration

This directory contains configuration files for the Thicket Zulip bot.

## Setup Instructions

### 1. Zulip Bot Configuration

1. Copy `zuliprc.template` to `zuliprc`:
   ```bash
   cp bot-config/zuliprc.template bot-config/zuliprc
   ```

2. Create a bot in your Zulip organization:
   - Go to Settings > Your bots > Add a new bot
   - Choose the "Generic bot" type
   - Give it a name like "Thicket" and a username like "thicket"
   - Copy the bot's email and API key

3. Edit `bot-config/zuliprc` with your bot's credentials:
   ```ini
   [api]
   email=thicket-bot@your-org.zulipchat.com
   key=your-actual-api-key-here
   site=https://your-org.zulipchat.com
   ```

### 2. Bot Behavior Configuration (Optional)

1. Copy `botrc.template` to `botrc` to customize bot behavior:
   ```bash
   cp bot-config/botrc.template bot-config/botrc
   ```

2. Edit `bot-config/botrc` to customize:
   - Sync intervals and batch sizes
   - Default stream/topic settings
   - Rate limiting parameters
   - Notification preferences

**Note**: The bot will work with default settings if no `botrc` file exists.

## File Descriptions

### `zuliprc` (Required)

Contains Zulip API credentials for the bot. This file should **never** be committed to version control.

### `botrc` (Optional)

Contains bot behavior configuration and defaults. This file can be committed to version control as it contains no secrets.

### Template Files

- `zuliprc.template` - Template for Zulip credentials
- `botrc.template` - Template for bot behavior settings

## Running the Bot

Once configured, run the bot with:

```bash
# Run in foreground
thicket bot run

# Run in background (daemon mode)
thicket bot run --daemon

# Debug mode (sends DMs instead of stream posts)
thicket bot run --debug-user your-thicket-username

# Custom config paths
thicket bot run --config bot-config/zuliprc --botrc bot-config/botrc
```

## Bot Commands

Once running, interact with the bot in Zulip:

- `@thicket help` - Show available commands
- `@thicket status` - Show bot status and configuration
- `@thicket sync now` - Force immediate sync
- `@thicket schedule` - Show sync schedule
- `@thicket claim <username>` - Claim a thicket username
- `@thicket config <setting> <value>` - Change bot settings

## Security Notes

- **Never commit `zuliprc` with real credentials**
- Add `bot-config/zuliprc` to `.gitignore`
- The `botrc` file contains no secrets and can be safely committed
- Bot settings changed via chat are stored in Zulip's persistent storage

## Troubleshooting

- Check bot status: `thicket bot status`
- View bot logs when running in foreground mode
- Verify Zulip credentials are correct
- Ensure the thicket.yaml configuration exists
- Test bot functionality: `thicket bot test`
+28
bot-config/botrc
···
[bot]
# Default RSS feed polling interval in seconds (minimum 60)
sync_interval = 300

# Maximum number of entries to post per sync cycle
max_entries_per_sync = 10

# Default stream and topic for posting (can be overridden via chat commands)
# Leave empty to require configuration via chat
default_stream =
default_topic =

# Rate limiting: seconds to wait between batches of posts
rate_limit_delay = 5

# Number of posts per batch before applying the rate limit
posts_per_batch = 5

[catchup]
# Number of entries to post on first run (catchup mode)
catchup_entries = 5

[notifications]
# Whether to send notifications when bot configuration changes
config_change_notifications = true

# Whether to send notifications when users claim usernames
username_claim_notifications = true
+34
bot-config/botrc.template
···
[bot]
# Default RSS feed polling interval in seconds (minimum 60)
sync_interval = 300

# Maximum number of entries to post per sync cycle (1-50)
max_entries_per_sync = 10

# Default stream and topic for posting (can be overridden via chat commands)
# Leave empty to require configuration via chat
default_stream =
default_topic =

# Rate limiting: seconds to wait between batches of posts
rate_limit_delay = 5

# Number of posts per batch before applying the rate limit
posts_per_batch = 5

[catchup]
# Number of entries to post on first run (catchup mode)
catchup_entries = 5

[notifications]
# Whether to send notifications when bot configuration changes
config_change_notifications = true

# Whether to send notifications when users claim usernames
username_claim_notifications = true

# Instructions:
# 1. Copy this file to botrc (without the .template extension) to customize bot behavior
# 2. The bot will use these defaults if no botrc file is found
# 3. All settings can be overridden via chat commands (e.g., @mention config interval 600)
# 4. Settings changed via chat are persisted in Zulip storage and take precedence
+16
bot-config/zuliprc.template
···
[api]
# Your bot's email address (create this in Zulip Settings > Bots)
email=your-bot@your-organization.zulipchat.com

# Your bot's API key (found in Zulip Settings > Bots)
key=YOUR_BOT_API_KEY_HERE

# Your Zulip server URL
site=https://your-organization.zulipchat.com

# Instructions:
# 1. Copy this file to zuliprc (without the .template extension)
# 2. Replace the placeholder values with your actual bot credentials
# 3. Create a bot in your Zulip organization at Settings > Bots
# 4. Use the bot's email and API key from the Zulip interface
# 5. Never commit the actual zuliprc file with real credentials to version control
+12 -5
pyproject.toml
···
"bleach>=6.0.0",
"platformdirs>=4.0.0",
"pyyaml>=6.0.0",
-
"email_validator"
+
"email_validator",
+
"typesense>=1.1.1",
+
"zulip>=0.9.0",
+
"zulip-bots>=0.9.0",
+
"importlib-metadata>=8.7.0",
+
"markdownify>=1.2.0",
]
[project.optional-dependencies]
···
"-ra",
"--strict-markers",
"--strict-config",
-
"--cov=src/thicket",
-
"--cov-report=term-missing",
-
"--cov-report=html",
-
"--cov-report=xml",
]
filterwarnings = [
"error",
···
"class .*\\bProtocol\\):",
"@(abc\\.)?abstractmethod",
]
+
+[dependency-groups]
+dev = [
+    "mypy>=1.17.0",
+    "pytest>=8.4.1",
+]
+5
src/thicket/bots/__init__.py
···
"""Zulip bot integration for thicket."""

from .thicket_bot import ThicketBotHandler

__all__ = ["ThicketBotHandler"]
+7
src/thicket/bots/requirements.txt
···
# Requirements for Thicket Zulip bot
# These are already included in the main thicket package
pydantic>=2.11.0
GitPython>=3.1.40
feedparser>=6.0.11
httpx>=0.28.0
pyyaml>=6.0.0
+201
src/thicket/bots/test_bot.py
···
"""Test utilities for the Thicket Zulip bot."""

import json
from pathlib import Path
from typing import Any, Optional

from ..models import AtomEntry
from .thicket_bot import ThicketBotHandler


class MockBotHandler:
    """Mock BotHandler for testing the Thicket bot."""

    def __init__(self) -> None:
        """Initialize mock bot handler."""
        self.storage_data: dict[str, str] = {}
        self.sent_messages: list[dict[str, Any]] = []
        self.config_info = {
            "full_name": "Thicket Bot",
            "email": "thicket-bot@example.com",
        }

    def get_config_info(self) -> dict[str, str]:
        """Return bot configuration info."""
        return self.config_info

    def send_reply(self, message: dict[str, Any], content: str) -> None:
        """Mock sending a reply."""
        reply = {
            "type": "reply",
            "to": message.get("sender_id"),
            "content": content,
            "original_message": message,
        }
        self.sent_messages.append(reply)

    def send_message(self, message: dict[str, Any]) -> None:
        """Mock sending a message."""
        self.sent_messages.append(message)

    @property
    def storage(self) -> "MockStorage":
        """Return mock storage."""
        return MockStorage(self.storage_data)


class MockStorage:
    """Mock storage for bot state."""

    def __init__(self, storage_data: dict[str, str]) -> None:
        """Initialize with storage data."""
        self.storage_data = storage_data

    def __enter__(self) -> "MockStorage":
        """Context manager entry."""
        return self

    def __exit__(self, exc_type: Any, exc_val: Any, exc_tb: Any) -> None:
        """Context manager exit."""
        pass

    def get(self, key: str) -> Optional[str]:
        """Get value from storage."""
        return self.storage_data.get(key)

    def put(self, key: str, value: str) -> None:
        """Put value in storage."""
        self.storage_data[key] = value

    def contains(self, key: str) -> bool:
        """Check if key exists in storage."""
        return key in self.storage_data


def create_test_message(
    content: str,
    sender: str = "Test User",
    sender_id: int = 12345,
    message_type: str = "stream",
) -> dict[str, Any]:
    """Create a test message for bot testing."""
    return {
        "content": content,
        "sender_full_name": sender,
        "sender_id": sender_id,
        "type": message_type,
        "timestamp": 1642694400,  # 2022-01-20 12:00:00 UTC
        "stream_id": 1,
        "subject": "test topic",
    }


def create_test_entry(
    entry_id: str = "test-entry-1",
    title: str = "Test Article",
    link: str = "https://example.com/test-article",
) -> AtomEntry:
    """Create a test AtomEntry for testing."""
    from datetime import datetime

    from pydantic import HttpUrl

    return AtomEntry(
        id=entry_id,
        title=title,
        link=HttpUrl(link),
        updated=datetime(2024, 1, 20, 12, 0, 0),
        published=datetime(2024, 1, 20, 10, 0, 0),
        summary="This is a test article summary",
        content="<p>This is test article content</p>",
        author={"name": "Test Author", "email": "author@example.com"},
    )


class BotTester:
    """Helper class for testing bot functionality."""

    def __init__(self, config_path: Optional[Path] = None) -> None:
        """Initialize bot tester."""
        self.bot = ThicketBotHandler()
        self.handler = MockBotHandler()

        if config_path:
            # Configure bot with test config
            self.configure_bot(config_path, "test-stream", "test-topic")

    def configure_bot(
        self, config_path: Path, stream: str = "test-stream", topic: str = "test-topic"
    ) -> None:
        """Configure the bot for testing."""
        # Set bot configuration
        config_data = {
            "stream_name": stream,
            "topic_name": topic,
            "sync_interval": 300,
            "max_entries_per_sync": 10,
            "config_path": str(config_path),
        }

        self.handler.storage_data["bot_config"] = json.dumps(config_data)

        # Initialize bot
        self.bot._load_bot_config(self.handler)

    def send_command(
        self, command: str, sender: str = "Test User"
    ) -> list[dict[str, Any]]:
        """Send a command to the bot and return responses."""
        message = create_test_message(f"@thicket {command}", sender)

        # Clear previous messages
        self.handler.sent_messages.clear()

        # Send command
        self.bot.handle_message(message, self.handler)

        return self.handler.sent_messages.copy()

    def get_last_response_content(self) -> Optional[str]:
        """Get the content of the last bot response."""
        if self.handler.sent_messages:
            return self.handler.sent_messages[-1].get("content")
        return None

    def get_last_message(self) -> Optional[dict[str, Any]]:
        """Get the last sent message."""
        if self.handler.sent_messages:
            return self.handler.sent_messages[-1]
        return None

    def assert_response_contains(self, text: str) -> None:
        """Assert that the last response contains specific text."""
        content = self.get_last_response_content()
        assert content is not None, "No response received"
        assert text in content, f"Response does not contain '{text}': {content}"


# Example usage for testing
if __name__ == "__main__":
    # Create a test config file
    test_config = Path("/tmp/test_thicket.yaml")

    # Create bot tester
    tester = BotTester()

    # Test help command
    responses = tester.send_command("help")
    print(f"Help response: {tester.get_last_response_content()}")

    # Test status command
    responses = tester.send_command("status")
    print(f"Status response: {tester.get_last_response_content()}")

    # Test configuration
    responses = tester.send_command("config stream general")
    tester.assert_response_contains("Stream set to")

    responses = tester.send_command("config topic 'Feed Updates'")
    tester.assert_response_contains("Topic set to")

    print("All tests passed!")
+1257
src/thicket/bots/thicket_bot.py
···
"""Zulip bot for automatically posting thicket feed updates."""

import asyncio
import json
import logging
import os
import time
from pathlib import Path
from typing import Any, Optional

from zulip_bots.lib import BotHandler

# Handle imports for both direct execution and package import
try:
    from ..cli.commands.sync import sync_feed
    from ..core.git_store import GitStore
    from ..models import AtomEntry, ThicketConfig
except ImportError:
    # When run directly by zulip-bots, add the package to path
    import sys

    src_dir = Path(__file__).parent.parent.parent
    if str(src_dir) not in sys.path:
        sys.path.insert(0, str(src_dir))

    from thicket.cli.commands.sync import sync_feed
    from thicket.core.git_store import GitStore
    from thicket.models import AtomEntry, ThicketConfig


class ThicketBotHandler:
    """Zulip bot that monitors thicket feeds and posts new articles."""

    def __init__(self) -> None:
        """Initialize the thicket bot."""
        self.logger = logging.getLogger(__name__)
        self.git_store: Optional[GitStore] = None
        self.config: Optional[ThicketConfig] = None
        self.posted_entries: set[str] = set()

        # Bot configuration from storage
        self.stream_name: Optional[str] = None
        self.topic_name: Optional[str] = None
        self.sync_interval: int = 300  # 5 minutes default
        self.max_entries_per_sync: int = 10
        self.config_path: Optional[Path] = None

        # Bot behavior settings (loaded from botrc)
        self.rate_limit_delay: int = 5
        self.posts_per_batch: int = 5
        self.catchup_entries: int = 5
        self.config_change_notifications: bool = True
        self.username_claim_notifications: bool = True

        # Track last sync time for schedule queries
        self.last_sync_time: Optional[float] = None

        # Debug mode configuration
        self.debug_user: Optional[str] = None
        self.debug_zulip_user_id: Optional[str] = None

    def usage(self) -> str:
        """Return bot usage instructions."""
        return """
**Thicket Feed Bot**

This bot automatically monitors thicket feeds and posts new articles.

Commands:
- `@mention status` - Show current bot status and configuration
- `@mention sync now` - Force an immediate sync
- `@mention reset` - Clear posting history (will repost recent entries)
- `@mention config stream <stream_name>` - Set target stream
- `@mention config topic <topic_name>` - Set target topic
- `@mention config interval <seconds>` - Set sync interval
- `@mention schedule` - Show sync schedule and next run time
- `@mention claim <username>` - Claim a thicket username for your Zulip account
- `@mention help` - Show this help message
"""

    def initialize(self, bot_handler: BotHandler) -> None:
        """Initialize the bot with persistent storage."""
        self.logger.info("Initializing ThicketBot")

        # Get configuration from environment (set by CLI)
        self.debug_user = os.getenv("THICKET_DEBUG_USER")
        config_path_env = os.getenv("THICKET_CONFIG_PATH")
        if config_path_env:
            self.config_path = Path(config_path_env)
            self.logger.info(f"Using thicket config: {self.config_path}")

        # Load default configuration from botrc file
        self._load_botrc_defaults()

        # Load bot configuration from persistent storage
        self._load_bot_config(bot_handler)

        # Initialize thicket components
        if self.config_path:
            try:
                self._initialize_thicket()
                self._load_posted_entries(bot_handler)

                # Validate debug mode if enabled
                if self.debug_user:
                    self._validate_debug_mode(bot_handler)

            except Exception as e:
                self.logger.error(f"Failed to initialize thicket: {e}")

        # Start background sync loop
        self._schedule_sync(bot_handler)

    def handle_message(self, message: dict[str, Any], bot_handler: BotHandler) -> None:
        """Handle incoming Zulip messages."""
        content = message["content"].strip()
        sender = message["sender_full_name"]

        # Only respond to mentions
        if not self._is_mentioned(content, bot_handler):
            return

        # Parse command
        cleaned_content = self._clean_mention(content, bot_handler)
        command_parts = cleaned_content.split()

        if not command_parts:
            self._send_help(message, bot_handler)
            return

        command = command_parts[0].lower()

        try:
            if command == "help":
                self._send_help(message, bot_handler)
            elif command == "status":
                self._send_status(message, bot_handler, sender)
            elif (
                command == "sync"
                and len(command_parts) > 1
                and command_parts[1] == "now"
            ):
                self._handle_force_sync(message, bot_handler, sender)
            elif command == "reset":
                self._handle_reset_command(message, bot_handler, sender)
            elif command == "config":
                self._handle_config_command(
                    message, bot_handler, command_parts[1:], sender
                )
            elif command == "schedule":
                self._handle_schedule_command(message, bot_handler, sender)
            elif command == "claim":
                self._handle_claim_command(
                    message, bot_handler, command_parts[1:], sender
                )
            else:
                bot_handler.send_reply(
                    message,
                    f"Unknown command: {command}. Type `@mention help` for usage.",
                )
        except Exception as e:
            self.logger.error(f"Error handling command '{command}': {e}")
            bot_handler.send_reply(message, f"Error processing command: {str(e)}")

    def _is_mentioned(self, content: str, bot_handler: BotHandler) -> bool:
        """Check if the bot is mentioned in the message."""
        try:
            # Get bot's actual name from Zulip
            bot_info = bot_handler._client.get_profile()
            if bot_info.get("result") == "success":
                bot_name = bot_info.get("full_name", "").lower()
                if bot_name:
                    return (
                        f"@{bot_name}" in content.lower()
                        or f"@**{bot_name}**" in content.lower()
                    )
        except Exception as e:
            self.logger.debug(f"Could not get bot profile: {e}")

        # Fallback to generic check
        return "@thicket" in content.lower()

    def _clean_mention(self, content: str, bot_handler: BotHandler) -> str:
        """Remove bot mention from message content."""
        import re

        try:
            # Get bot's actual name from Zulip
            bot_info = bot_handler._client.get_profile()
            if bot_info.get("result") == "success":
                bot_name = bot_info.get("full_name", "")
                if bot_name:
                    # Remove @bot_name or @**bot_name**
                    escaped_name = re.escape(bot_name)
                    content = re.sub(
                        rf"@(?:\*\*)?{escaped_name}(?:\*\*)?",
                        "",
                        content,
                        flags=re.IGNORECASE,
                    ).strip()
                    return content
        except Exception as e:
            self.logger.debug(f"Could not get bot profile for mention cleaning: {e}")

        # Fallback to removing @thicket
        content = re.sub(
            r"@(?:\*\*)?thicket(?:\*\*)?", "", content, flags=re.IGNORECASE
        ).strip()
        return content

    def _send_help(self, message: dict[str, Any], bot_handler: BotHandler) -> None:
        """Send help message."""
        bot_handler.send_reply(message, self.usage())

    def _send_status(
        self, message: dict[str, Any], bot_handler: BotHandler, sender: str
    ) -> None:
        """Send bot status information."""
        status_lines = [
            f"**Thicket Bot Status** (requested by {sender})",
            "",
        ]

        # Debug mode status
        if self.debug_user:
            status_lines.extend(
                [
                    "🐛 **Debug Mode:** ENABLED",
                    f"🎯 **Debug User:** {self.debug_user}",
                    "",
                ]
            )
        else:
            status_lines.extend(
                [
                    f"📍 **Stream:** {self.stream_name or 'Not configured'}",
                    f"📍 **Topic:** {self.topic_name or 'Not configured'}",
                    "",
                ]
            )

        status_lines.extend(
            [
                f"⏱️ **Sync Interval:** {self.sync_interval}s ({self.sync_interval // 60}m {self.sync_interval % 60}s)",
                f"📊 **Max Entries/Sync:** {self.max_entries_per_sync}",
                f"📁 **Config Path:** {self.config_path or 'Not configured'}",
                "",
                f"📄 **Tracked Entries:** {len(self.posted_entries)}",
                f"🔄 **Catchup Mode:** {'Active (first run)' if len(self.posted_entries) == 0 else 'Inactive'}",
                f"✅ **Thicket Initialized:** {'Yes' if self.git_store else 'No'}",
                "",
                self._get_schedule_info(),
            ]
        )

        bot_handler.send_reply(message, "\n".join(status_lines))

    def _handle_force_sync(
        self, message: dict[str, Any], bot_handler: BotHandler, sender: str
    ) -> None:
        """Handle immediate sync request."""
        if not self._check_initialization(message, bot_handler):
            return

        bot_handler.send_reply(
            message, f"🔄 Starting immediate sync... (requested by {sender})"
        )

        try:
            new_entries = self._perform_sync(bot_handler)
            bot_handler.send_reply(
                message, f"✅ Sync completed! Found {len(new_entries)} new entries."
            )
        except Exception as e:
            self.logger.error(f"Force sync failed: {e}")
            bot_handler.send_reply(message, f"❌ Sync failed: {str(e)}")

    def _handle_reset_command(
        self, message: dict[str, Any], bot_handler: BotHandler, sender: str
    ) -> None:
        """Handle reset command to clear posted entries tracking."""
        try:
            self.posted_entries.clear()
            self._save_posted_entries(bot_handler)
            bot_handler.send_reply(
                message,
                f"✅ Posting history reset! Recent entries will be posted on next sync. (requested by {sender})",
            )
            self.logger.info(f"Posted entries tracking reset by {sender}")
        except Exception as e:
            self.logger.error(f"Reset failed: {e}")
            bot_handler.send_reply(message, f"❌ Reset failed: {str(e)}")

    def _handle_schedule_command(
        self, message: dict[str, Any], bot_handler: BotHandler, sender: str
    ) -> None:
        """Handle schedule query command."""
        schedule_info = self._get_schedule_info()
        bot_handler.send_reply(
            message,
            f"**Thicket Bot Schedule** (requested by {sender})\n\n{schedule_info}",
        )

    def _handle_claim_command(
        self,
        message: dict[str, Any],
        bot_handler: BotHandler,
        args: list[str],
        sender: str,
    ) -> None:
        """Handle username claiming command."""
        if not args:
            bot_handler.send_reply(message, "Usage: `@mention claim <username>`")
            return

        if not self._check_initialization(message, bot_handler):
            return

        username = args[0].strip()

        # Get sender's Zulip user info
        sender_user_id = message.get("sender_id")
        sender_email = message.get("sender_email")

        if not sender_user_id or not sender_email:
            bot_handler.send_reply(
                message, "❌ Could not determine your Zulip user information."
            )
            return

        try:
            # Get current Zulip server from environment
            zulip_site_url = os.getenv("THICKET_ZULIP_SITE_URL", "")
            server_url = zulip_site_url.replace("https://", "").replace("http://", "")

            if not server_url:
                bot_handler.send_reply(
                    message, "❌ Could not determine Zulip server URL."
                )
                return

            # Check if username exists in thicket
            user = self.git_store.get_user(username)
            if not user:
                bot_handler.send_reply(
                    message,
                    f"❌ Username `{username}` not found in thicket. Available users: {', '.join(self.git_store.list_users())}",
                )
                return

            # Check if username is already claimed for this server
            existing_zulip_id = user.get_zulip_mention(server_url)
            if existing_zulip_id:
                # Check if it's claimed by the same user
                if existing_zulip_id == sender_email or str(existing_zulip_id) == str(
                    sender_user_id
                ):
                    bot_handler.send_reply(
                        message,
                        f"✅ Username `{username}` is already claimed by you on {server_url}!",
                    )
                else:
                    bot_handler.send_reply(
                        message,
                        f"❌ Username `{username}` is already claimed by another user on {server_url}.",
                    )
                return

            # Claim the username - prefer email for consistency
            success = self.git_store.add_zulip_association(
                username, server_url, sender_email
            )

            if success:
                reply_msg = (
                    f"🎉 Successfully claimed username `{username}` for **{sender}** on {server_url}!\n"
                    + "You will now be mentioned when new articles are posted from this user's feeds."
                )
                bot_handler.send_reply(message, reply_msg)

                # Send notification to configured stream if enabled and not in debug mode
                if (
                    self.username_claim_notifications
                    and not self.debug_user
                    and self.stream_name
                    and self.topic_name
                ):
                    try:
                        notification_msg = f"👋 **{sender}** claimed thicket username `{username}` on {server_url}"
                        bot_handler.send_message(
                            {
                                "type": "stream",
                                "to": self.stream_name,
                                "subject": self.topic_name,
                                "content": notification_msg,
                            }
                        )
                    except Exception as e:
                        self.logger.error(
                            f"Failed to send username claim notification: {e}"
                        )

                self.logger.info(
                    f"User {sender} ({sender_email}) claimed username {username} on {server_url}"
                )
            else:
                bot_handler.send_reply(
                    message,
                    f"❌ Failed to claim username `{username}`. This shouldn't happen - please contact an administrator.",
                )

        except Exception as e:
            self.logger.error(f"Error processing claim for {username} by {sender}: {e}")
            bot_handler.send_reply(message, f"❌ Error processing claim: {str(e)}")

    def _handle_config_command(
        self,
        message: dict[str, Any],
        bot_handler: BotHandler,
        args: list[str],
        sender: str,
    ) -> None:
        """Handle configuration commands."""
        if len(args) < 2:
            bot_handler.send_reply(
                message, "Usage: `@mention config <setting> <value>`"
            )
            return

        setting = args[0].lower()
        value = " ".join(args[1:])

        if setting == "stream":
            old_value = self.stream_name
            self.stream_name = value
            self._save_bot_config(bot_handler)
            bot_handler.send_reply(
                message, f"✅ Stream set to: **{value}** (by {sender})"
            )
            self._send_config_change_notification(
                bot_handler, sender, "stream", old_value, value
            )

        elif setting == "topic":
            old_value = self.topic_name
            self.topic_name = value
            self._save_bot_config(bot_handler)
            bot_handler.send_reply(
                message, f"✅ Topic set to: **{value}** (by {sender})"
            )
            self._send_config_change_notification(
                bot_handler, sender, "topic", old_value, value
            )

        elif setting == "interval":
            try:
                interval = int(value)
                if interval < 60:
                    bot_handler.send_reply(
                        message, "❌ Interval must be at least 60 seconds"
                    )
                    return
                old_value = self.sync_interval
                self.sync_interval = interval
                self._save_bot_config(bot_handler)
                bot_handler.send_reply(
                    message, f"✅ Sync interval set to: **{interval}s** (by {sender})"
                )
                self._send_config_change_notification(
                    bot_handler,
                    sender,
                    "sync interval",
                    f"{old_value}s",
                    f"{interval}s",
                )
            except ValueError:
                bot_handler.send_reply(
                    message, "❌ Invalid interval value. Must be a number of seconds."
                )

        elif setting == "max_entries":
            try:
                max_entries = int(value)
                if max_entries < 1 or max_entries > 50:
                    bot_handler.send_reply(
                        message, "❌ Max entries must be between 1 and 50"
                    )
                    return
                old_value = self.max_entries_per_sync
                self.max_entries_per_sync = max_entries
                self._save_bot_config(bot_handler)
                bot_handler.send_reply(
                    message,
                    f"✅ Max entries per sync set to: **{max_entries}** (by {sender})",
                )
                self._send_config_change_notification(
                    bot_handler,
                    sender,
                    "max entries per sync",
                    str(old_value),
                    str(max_entries),
                )
            except ValueError:
                bot_handler.send_reply(
                    message, "❌ Invalid max entries value. Must be a number."
                )

        else:
            bot_handler.send_reply(
                message,
                f"❌ Unknown setting: {setting}. Available: stream, topic, interval, max_entries",
            )

    def _load_bot_config(self, bot_handler: BotHandler) -> None:
        """Load bot configuration from persistent storage."""
        try:
            config_data = bot_handler.storage.get("bot_config")
            if config_data:
                config = json.loads(config_data)
                self.stream_name = config.get("stream_name")
                self.topic_name = config.get("topic_name")
                self.sync_interval = config.get("sync_interval", 300)
                self.max_entries_per_sync = config.get("max_entries_per_sync", 10)
                self.last_sync_time = config.get("last_sync_time")
        except Exception:
            # Bot config not found on first run is expected
            pass

    def _save_bot_config(self, bot_handler: BotHandler) -> None:
        """Save bot configuration to persistent storage."""
        try:
            config_data = {
                "stream_name": self.stream_name,
                "topic_name": self.topic_name,
                "sync_interval": self.sync_interval,
                "max_entries_per_sync": self.max_entries_per_sync,
                "last_sync_time": self.last_sync_time,
            }
            bot_handler.storage.put("bot_config", json.dumps(config_data))
        except Exception as e:
            self.logger.error(f"Error saving bot config: {e}")

    def _load_botrc_defaults(self) -> None:
        """Load default configuration from botrc file."""
        try:
            import configparser

            botrc_path = Path("bot-config/botrc")
            if not botrc_path.exists():
                self.logger.info("No botrc file found, using hardcoded defaults")
                return

            config = configparser.ConfigParser()
            config.read(botrc_path)

            if "bot" in config:
                bot_section = config["bot"]
                self.sync_interval = bot_section.getint("sync_interval", 300)
                self.max_entries_per_sync = bot_section.getint(
                    "max_entries_per_sync", 10
                )
                self.rate_limit_delay = bot_section.getint("rate_limit_delay", 5)
                self.posts_per_batch = bot_section.getint("posts_per_batch", 5)

                # Set defaults only if not already configured
                default_stream = bot_section.get("default_stream", "").strip()
                default_topic = bot_section.get("default_topic", "").strip()
                if default_stream:
                    self.stream_name = default_stream
                if default_topic:
                    self.topic_name = default_topic

            if "catchup" in config:
                catchup_section = config["catchup"]
                self.catchup_entries = catchup_section.getint("catchup_entries", 5)

            if "notifications" in config:
                notifications_section = config["notifications"]
                self.config_change_notifications = notifications_section.getboolean(
                    "config_change_notifications", True
                )
                self.username_claim_notifications = notifications_section.getboolean(
                    "username_claim_notifications", True
                )

            self.logger.info(f"Loaded configuration from {botrc_path}")

        except Exception as e:
            self.logger.error(f"Error loading botrc defaults: {e}")
            self.logger.info("Using hardcoded defaults")

    def _initialize_thicket(self) -> None:
        """Initialize thicket components."""
        if not self.config_path or not self.config_path.exists():
            raise ValueError("Thicket config file not found")

        # Load thicket configuration
        import yaml

        with open(self.config_path) as f:
            config_data = yaml.safe_load(f)
        self.config = ThicketConfig(**config_data)

        # Initialize git store
        self.git_store = GitStore(self.config.git_store)

        self.logger.info("Thicket components initialized successfully")

    def _validate_debug_mode(self, bot_handler: BotHandler) -> None:
        """Validate debug mode configuration."""
        if not self.debug_user or not self.git_store:
            return

        # Get current Zulip server from environment
        zulip_site_url = os.getenv("THICKET_ZULIP_SITE_URL", "")
        server_url = zulip_site_url.replace("https://", "").replace("http://", "")

        # Check if debug user exists in thicket
        user = self.git_store.get_user(self.debug_user)
        if not user:
            raise ValueError(f"Debug user '{self.debug_user}' not found in thicket")

        # Check if user has a Zulip association for this server
        if not server_url:
            raise ValueError("Could not determine Zulip server URL")

        zulip_user_id = user.get_zulip_mention(server_url)
        if not zulip_user_id:
            raise ValueError(
                f"User '{self.debug_user}' has no Zulip association for server '{server_url}'"
            )

        # Try to look up the actual Zulip user ID from the email address,
        # but don't fail if we can't - we'll try again when sending messages
        actual_user_id = self._lookup_zulip_user_id(bot_handler, zulip_user_id)
        if actual_user_id and actual_user_id != zulip_user_id:
            # Successfully resolved to numeric ID
            self.debug_zulip_user_id = actual_user_id
            self.logger.info(
                f"Debug mode enabled: Will send DMs to {self.debug_user} (email: {zulip_user_id}, user_id: {actual_user_id}) on {server_url}"
            )
        else:
            # Keep the email address, will resolve later when sending
            self.debug_zulip_user_id = zulip_user_id
            self.logger.info(
                f"Debug mode enabled: Will send DMs to {self.debug_user} ({zulip_user_id}) on {server_url} (will resolve user ID when sending)"
            )

    def _lookup_zulip_user_id(
        self, bot_handler: BotHandler, email_or_id: str
    ) -> Optional[str]:
        """Look up Zulip user ID from email address or return the ID if it's already numeric."""
        # If it's already a numeric user ID, return it
        if email_or_id.isdigit():
            return email_or_id

        try:
            client = bot_handler._client
            if not client:
                self.logger.error("No Zulip client available for user lookup")
                return None

            # First try the get_user_by_email API if available
            try:
                user_result = client.get_user_by_email(email_or_id)
                if user_result.get("result") == "success":
                    user_data = user_result.get("user", {})
                    user_id = user_data.get("user_id")
                    if user_id:
                        self.logger.info(
                            f"Found user ID {user_id} for '{email_or_id}' via get_user_by_email API"
                        )
                        return str(user_id)
            except Exception:
                pass

            # Fallback: Get all users and search through them
            users_result = client.get_users()
            if users_result.get("result") == "success":
                for user in users_result["members"]:
                    user_email = user.get("email", "")
                    delivery_email = user.get("delivery_email", "")

                    if (
                        user_email == email_or_id
                        or delivery_email == email_or_id
                        or str(user.get("user_id")) == email_or_id
                    ):
                        user_id = user.get("user_id")
                        return str(user_id)

                self.logger.error(
                    f"No user found with identifier '{email_or_id}'. Searched {len(users_result['members'])} users."
                )
                return None
            else:
                self.logger.error(
                    f"Failed to get users: {users_result.get('msg', 'Unknown error')}"
                )
                return None

        except Exception as e:
            self.logger.error(f"Error looking up user ID for '{email_or_id}': {e}")
            return None

    def _lookup_zulip_user_info(
        self, bot_handler: BotHandler, email_or_id: str
    ) -> tuple[Optional[str], Optional[str]]:
        """Look up both Zulip user ID and full name from email address."""
        if email_or_id.isdigit():
            return email_or_id, None

        try:
            client = bot_handler._client
            if not client:
                return None, None

            # Try get_user_by_email API first
            try:
                user_result = client.get_user_by_email(email_or_id)
                if user_result.get("result") == "success":
                    user_data = user_result.get("user", {})
                    user_id = user_data.get("user_id")
                    full_name = user_data.get("full_name", "")
                    if user_id:
                        return str(user_id), full_name
            except AttributeError:
                pass

            # Fallback: search all users
            users_result = client.get_users()
            if users_result.get("result") == "success":
                for user in users_result["members"]:
                    if (
                        user.get("email") == email_or_id
                        or user.get("delivery_email") == email_or_id
                    ):
                        return str(user.get("user_id")), user.get("full_name", "")

            return None, None

        except Exception as e:
            self.logger.error(f"Error looking up user info for '{email_or_id}': {e}")
            return None, None

    def _load_posted_entries(self, bot_handler: BotHandler) -> None:
        """Load the set of already posted entries."""
        try:
            posted_data = bot_handler.storage.get("posted_entries")
            if posted_data:
                self.posted_entries = set(json.loads(posted_data))
        except Exception:
            # An empty set on first run is expected
            self.posted_entries = set()

    def _save_posted_entries(self, bot_handler: BotHandler) -> None:
        """Save the set of posted entries."""
        try:
            bot_handler.storage.put(
                "posted_entries", json.dumps(list(self.posted_entries))
            )
        except Exception as e:
            self.logger.error(f"Error saving posted entries: {e}")

    def _check_initialization(
        self, message: dict[str, Any], bot_handler: BotHandler
    ) -> bool:
        """Check if thicket is properly initialized."""
        if not self.git_store or not self.config:
            bot_handler.send_reply(
                message, "❌ Thicket not initialized. Please check configuration."
            )
            return False

        # In debug mode, we don't need stream/topic configuration
        if self.debug_user:
            return True

        if not self.stream_name or not self.topic_name:
            bot_handler.send_reply(
                message,
                "❌ Stream and topic must be configured first. Use `@mention config stream <name>` and `@mention config topic <name>`",
            )
            return False

        return True

    def _schedule_sync(self, bot_handler: BotHandler) -> None:
        """Schedule periodic sync operations."""

        def sync_loop():
            while True:
                try:
                    # Check if we can sync
                    can_sync = self.git_store and (
                        (self.stream_name and self.topic_name) or self.debug_user
                    )

                    if can_sync:
                        self._perform_sync(bot_handler)

                    time.sleep(self.sync_interval)
                except Exception as e:
                    self.logger.error(f"Error in sync loop: {e}")
                    time.sleep(60)  # Wait before retrying

        # Start background thread
        import threading

        sync_thread = threading.Thread(target=sync_loop, daemon=True)
        sync_thread.start()

def _perform_sync(self, bot_handler: BotHandler) -> list[AtomEntry]:
+
"""Perform thicket sync and return new entries."""
+
if not self.config or not self.git_store:
+
return []
+
+
new_entries: list[tuple[AtomEntry, str]] = [] # (entry, username) pairs
+
is_first_run = len(self.posted_entries) == 0
+
+
# Get all users and their feeds from git store
+
users_with_feeds = self.git_store.list_all_users_with_feeds()
+
+
# Sync each user's feeds
+
for username, feed_urls in users_with_feeds:
+
for feed_url in feed_urls:
+
try:
+
# Run async sync function
+
loop = asyncio.new_event_loop()
+
asyncio.set_event_loop(loop)
+
try:
+
new_count, _ = loop.run_until_complete(
+
sync_feed(
+
self.git_store, username, str(feed_url), dry_run=False
+
)
+
)
+
+
entries_to_check = []
+
+
if new_count > 0:
+
# Get the newly added entries
+
entries_to_check = self.git_store.list_entries(
+
username, limit=new_count
+
)
+
+
# Always check for catchup mode on first run
+
if is_first_run:
+
# Catchup mode: get configured number of entries on first run
+
catchup_entries = self.git_store.list_entries(
+
username, limit=self.catchup_entries
+
)
+
entries_to_check = (
+
catchup_entries
+
if not entries_to_check
+
else entries_to_check
+
)
+
+
for entry in entries_to_check:
+
entry_key = f"{username}:{entry.id}"
+
if entry_key not in self.posted_entries:
+
new_entries.append((entry, username))
+
if len(new_entries) >= self.max_entries_per_sync:
+
break
+
+
finally:
+
loop.close()
+
+
except Exception as e:
+
self.logger.error(
+
f"Error syncing feed {feed_url} for user {username}: {e}"
+
)
+
+
if len(new_entries) >= self.max_entries_per_sync:
+
break
+
+
# Post new entries to Zulip with rate limiting
+
if new_entries:
+
posted_count = 0
+
+
for i, (entry, username) in enumerate(new_entries):
+
self._post_entry_to_zulip(entry, bot_handler, username)
+
self.posted_entries.add(f"{username}:{entry.id}")
+
posted_count += 1
+
+
# Rate limiting: pause after configured number of messages
+
if (
+
posted_count % self.posts_per_batch == 0
+
and i < len(new_entries) - 1
+
):
+
time.sleep(self.rate_limit_delay)
+
+
self._save_posted_entries(bot_handler)
+
+
# Update last sync time
+
self.last_sync_time = time.time()
+
+
return [entry for entry, _ in new_entries]
+
+
def _post_entry_to_zulip(
+
self, entry: AtomEntry, bot_handler: BotHandler, username: str
+
) -> None:
+
"""Post a single entry to the configured Zulip stream/topic or debug user DM."""
+
try:
+
# Get current Zulip server from environment
+
zulip_site_url = os.getenv("THICKET_ZULIP_SITE_URL", "")
+
server_url = zulip_site_url.replace("https://", "").replace("http://", "")
+
+
# Build author/date info consistently
+
mention_info = ""
+
if server_url and self.git_store:
+
user = self.git_store.get_user(username)
+
if user:
+
zulip_user_id = user.get_zulip_mention(server_url)
+
if zulip_user_id:
+
# Look up the actual Zulip full name for proper @mention
+
_, zulip_full_name = self._lookup_zulip_user_info(
+
bot_handler, zulip_user_id
+
)
+
display_name = zulip_full_name or user.display_name or username
+
+
# Check if author is different from the user - avoid redundancy
+
author_name = entry.author and entry.author.get("name")
+
if author_name and author_name.lower() != display_name.lower():
+
author_info = f" (by {author_name})"
+
else:
+
author_info = ""
+
+
published_info = ""
+
if entry.published:
+
published_info = (
+
f" โ€ข {entry.published.strftime('%Y-%m-%d')}"
+
)
+
+
mention_info = f"@**{display_name}** posted{author_info}{published_info}:\n\n"
+
+
# If no Zulip user found, use consistent format without @mention
+
if not mention_info:
+
user = self.git_store.get_user(username) if self.git_store else None
+
display_name = user.display_name if user else username
+
+
author_name = entry.author and entry.author.get("name")
+
if author_name and author_name.lower() != display_name.lower():
+
author_info = f" (by {author_name})"
+
else:
+
author_info = ""
+
+
published_info = ""
+
if entry.published:
+
published_info = f" โ€ข {entry.published.strftime('%Y-%m-%d')}"
+
+
mention_info = (
+
f"**{display_name}** posted{author_info}{published_info}:\n\n"
+
)
+
+
# Format the message with HTML processing
+
message_lines = [
+
f"**{entry.title}**",
+
f"๐Ÿ”— {entry.link}",
+
]
+
+
if entry.summary:
+
# Process HTML in summary and truncate if needed
+
processed_summary = self._process_html_content(entry.summary)
+
if len(processed_summary) > 400:
+
processed_summary = processed_summary[:397] + "..."
+
message_lines.append(f"\n{processed_summary}")
+
+
message_content = mention_info + "\n".join(message_lines)
+
+
# Choose destination based on mode
+
if self.debug_user and self.debug_zulip_user_id:
+
# Debug mode: send DM
+
debug_message = f"๐Ÿ› **DEBUG:** New article from thicket user `{username}`:\n\n{message_content}"
+
+
# Ensure we have the numeric user ID
+
user_id_to_use = self.debug_zulip_user_id
+
if not user_id_to_use.isdigit():
+
# Need to look up the numeric ID
+
resolved_id = self._lookup_zulip_user_id(
+
bot_handler, user_id_to_use
+
)
+
if resolved_id:
+
user_id_to_use = resolved_id
+
self.logger.debug(
+
f"Resolved {self.debug_zulip_user_id} to user ID {user_id_to_use}"
+
)
+
else:
+
self.logger.error(
+
f"Could not resolve user ID for {self.debug_zulip_user_id}"
+
)
+
return
+
+
try:
+
# For private messages, user_id needs to be an integer, not string
+
user_id_int = int(user_id_to_use)
+
bot_handler.send_message(
+
{
+
"type": "private",
+
"to": [user_id_int], # Use integer user ID
+
"content": debug_message,
+
}
+
)
+
except ValueError:
+
# If conversion to int fails, user_id_to_use might be an email
+
try:
+
bot_handler.send_message(
+
{
+
"type": "private",
+
"to": [user_id_to_use], # Try as string (email)
+
"content": debug_message,
+
}
+
)
+
except Exception as e2:
+
self.logger.error(
+
f"Failed to send DM to {self.debug_user} (tried both int and string): {e2}"
+
)
+
return
+
except Exception as e:
+
self.logger.error(
+
f"Failed to send DM to {self.debug_user} ({user_id_to_use}): {e}"
+
)
+
return
+
self.logger.info(
+
f"Posted entry to debug user {self.debug_user}: {entry.title}"
+
)
+
else:
+
# Normal mode: send to stream/topic
+
bot_handler.send_message(
+
{
+
"type": "stream",
+
"to": self.stream_name,
+
"subject": self.topic_name,
+
"content": message_content,
+
}
+
)
+
self.logger.info(
+
f"Posted entry to stream: {entry.title} (user: {username})"
+
)
+
+
except Exception as e:
+
self.logger.error(f"Error posting entry to Zulip: {e}")
+
+
def _process_html_content(self, html_content: str) -> str:
+
"""Process HTML content from feeds to clean Zulip-compatible markdown."""
+
if not html_content:
+
return ""
+
+
try:
+
# Try to use markdownify for proper HTML to Markdown conversion
+
from markdownify import markdownify as md
+
+
# Convert HTML to Markdown with compact settings for summaries
+
markdown = md(
+
html_content,
+
heading_style="ATX", # Use # for headings (but we'll post-process these)
+
bullets="-", # Use - for bullets
+
convert=[
+
"a",
+
"b",
+
"strong",
+
"i",
+
"em",
+
"code",
+
"pre",
+
"p",
+
"br",
+
"ul",
+
"ol",
+
"li",
+
"h1",
+
"h2",
+
"h3",
+
"h4",
+
"h5",
+
"h6",
+
],
+
).strip()
+
+
# Post-process to convert headings to bold for compact summaries
+
import re
+
+
# Convert markdown headers to bold with period
+
markdown = re.sub(
+
r"^#{1,6}\s*(.+)$", r"**\1.**", markdown, flags=re.MULTILINE
+
)
+
+
# Clean up excessive newlines and make more compact
+
markdown = re.sub(
+
r"\n\s*\n\s*\n+", " ", markdown
+
) # Multiple newlines become space
+
markdown = re.sub(
+
r"\n\s*\n", ". ", markdown
+
) # Double newlines become sentence breaks
+
markdown = re.sub(r"\n", " ", markdown) # Single newlines become spaces
+
+
# Clean up double periods and excessive whitespace
+
markdown = re.sub(r"\.\.+", ".", markdown)
+
markdown = re.sub(r"\s+", " ", markdown)
+
return markdown.strip()
+
+
except ImportError:
+
# Fallback: manual HTML processing
+
import re
+
+
content = html_content
+
+
# Convert headings to bold with periods for compact summaries
+
content = re.sub(
+
r"<h[1-6](?:\s[^>]*)?>([^<]*)</h[1-6]>",
+
r"**\1.** ",
+
content,
+
flags=re.IGNORECASE,
+
)
+
+
# Convert common HTML elements to Markdown
+
content = re.sub(
+
r"<(?:strong|b)(?:\s[^>]*)?>([^<]*)</(?:strong|b)>",
+
r"**\1**",
+
content,
+
flags=re.IGNORECASE,
+
)
+
content = re.sub(
+
r"<(?:em|i)(?:\s[^>]*)?>([^<]*)</(?:em|i)>",
+
r"*\1*",
+
content,
+
flags=re.IGNORECASE,
+
)
+
content = re.sub(
+
r"<code(?:\s[^>]*)?>([^<]*)</code>",
+
r"`\1`",
+
content,
+
flags=re.IGNORECASE,
+
)
+
content = re.sub(
+
r'<a(?:\s[^>]*?)?\s*href=["\']([^"\']*)["\'](?:\s[^>]*)?>([^<]*)</a>',
+
r"[\2](\1)",
+
content,
+
flags=re.IGNORECASE,
+
)
+
+
# Convert block elements to spaces instead of newlines for compactness
+
content = re.sub(r"<br\s*/?>", " ", content, flags=re.IGNORECASE)
+
content = re.sub(r"</p>\s*<p>", ". ", content, flags=re.IGNORECASE)
+
content = re.sub(
+
r"</?(?:p|div)(?:\s[^>]*)?>", " ", content, flags=re.IGNORECASE
+
)
+
+
# Remove remaining HTML tags
+
content = re.sub(r"<[^>]+>", "", content)
+
+
# Clean up whitespace and make compact
+
content = re.sub(
+
r"\s+", " ", content
+
) # Multiple whitespace becomes single space
+
content = re.sub(
+
r"\.\.+", ".", content
+
) # Multiple periods become single period
+
return content.strip()
+
+
except Exception as e:
+
self.logger.error(f"Error processing HTML content: {e}")
+
# Last resort: just strip HTML tags
+
import re
+
+
return re.sub(r"<[^>]+>", "", html_content).strip()
+
+
def _get_schedule_info(self) -> str:
+
"""Get schedule information string."""
+
lines = []
+
+
if self.last_sync_time:
+
import datetime
+
+
last_sync = datetime.datetime.fromtimestamp(self.last_sync_time)
+
next_sync = last_sync + datetime.timedelta(seconds=self.sync_interval)
+
now = datetime.datetime.now()
+
+
# Calculate time until next sync
+
time_until_next = next_sync - now
+
+
if time_until_next.total_seconds() > 0:
+
minutes, seconds = divmod(int(time_until_next.total_seconds()), 60)
+
hours, minutes = divmod(minutes, 60)
+
+
if hours > 0:
+
time_str = f"{hours}h {minutes}m {seconds}s"
+
elif minutes > 0:
+
time_str = f"{minutes}m {seconds}s"
+
else:
+
time_str = f"{seconds}s"
+
+
lines.extend(
+
[
+
f"๐Ÿ• **Last Sync:** {last_sync.strftime('%H:%M:%S')}",
+
f"โฐ **Next Sync:** {next_sync.strftime('%H:%M:%S')} (in {time_str})",
+
]
+
)
+
else:
+
lines.extend(
+
[
+
f"๐Ÿ• **Last Sync:** {last_sync.strftime('%H:%M:%S')}",
+
f"โฐ **Next Sync:** Due now (running every {self.sync_interval}s)",
+
]
+
)
+
else:
+
lines.append("๐Ÿ• **Last Sync:** Never (bot starting up)")
+
+
# Add sync frequency info
+
if self.sync_interval >= 3600:
+
frequency_str = (
+
f"{self.sync_interval // 3600}h {(self.sync_interval % 3600) // 60}m"
+
)
+
elif self.sync_interval >= 60:
+
frequency_str = f"{self.sync_interval // 60}m {self.sync_interval % 60}s"
+
else:
+
frequency_str = f"{self.sync_interval}s"
+
+
lines.append(f"๐Ÿ”„ **Sync Frequency:** Every {frequency_str}")
+
+
return "\n".join(lines)
+
+
def _send_config_change_notification(
+
self,
+
bot_handler: BotHandler,
+
changer: str,
+
setting: str,
+
old_value: Optional[str],
+
new_value: str,
+
) -> None:
+
"""Send configuration change notification if enabled."""
+
if not self.config_change_notifications or self.debug_user:
+
return
+
+
# Don't send notification if stream/topic aren't configured yet
+
if not self.stream_name or not self.topic_name:
+
return
+
+
try:
+
old_display = old_value if old_value else "(not set)"
+
notification_msg = (
+
f"โš™๏ธ **{changer}** changed {setting}: `{old_display}` โ†’ `{new_value}`"
+
)
+
+
bot_handler.send_message(
+
{
+
"type": "stream",
+
"to": self.stream_name,
+
"subject": self.topic_name,
+
"content": notification_msg,
+
}
+
)
+
except Exception as e:
+
self.logger.error(f"Failed to send config change notification: {e}")
+
+
+
handler_class = ThicketBotHandler
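The module-level `handler_class` assignment is the hook the `zulip_bots` runner looks for when it loads a bot from a file path; a minimal sketch of that convention (the echo bot here is illustrative, not part of thicket):

```python
class EchoBotHandler:
    def usage(self) -> str:
        return "Echoes every message back."

    def handle_message(self, message: dict, bot_handler) -> None:
        # send_reply answers in the same stream/topic or DM thread.
        bot_handler.send_reply(message, message["content"])

handler_class = EchoBotHandler
```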
+24 -2
src/thicket/cli/commands/__init__.py
···
"""CLI commands for thicket."""
# Import all commands to register them with the main app
-
from . import add, duplicates, index_cmd, info_cmd, init, links_cmd, list_cmd, sync
+
from . import (
+
add,
+
bot,
+
duplicates,
+
info_cmd,
+
init,
+
list_cmd,
+
search,
+
sync,
+
upload,
+
zulip,
+
)
-
__all__ = ["add", "duplicates", "index_cmd", "info_cmd", "init", "links_cmd", "list_cmd", "sync"]
+
__all__ = [
+
"add",
+
"bot",
+
"duplicates",
+
"info_cmd",
+
"init",
+
"list_cmd",
+
"search",
+
"sync",
+
"upload",
+
"zulip",
+
]
+44 -9
src/thicket/cli/commands/add.py
···
def add_command(
subcommand: str = typer.Argument(..., help="Subcommand: 'user' or 'feed'"),
username: str = typer.Argument(..., help="Username"),
-
feed_url: Optional[str] = typer.Argument(None, help="Feed URL (required for 'user' command)"),
+
feed_url: Optional[str] = typer.Argument(
+
None, help="Feed URL (required for 'user' command)"
+
),
email: Optional[str] = typer.Option(None, "--email", "-e", help="User email"),
-
homepage: Optional[str] = typer.Option(None, "--homepage", "-h", help="User homepage"),
+
homepage: Optional[str] = typer.Option(
+
None, "--homepage", "-h", help="User homepage"
+
),
icon: Optional[str] = typer.Option(None, "--icon", "-i", help="User icon URL"),
-
display_name: Optional[str] = typer.Option(None, "--display-name", "-d", help="User display name"),
+
display_name: Optional[str] = typer.Option(
+
None, "--display-name", "-d", help="User display name"
+
),
config_file: Optional[Path] = typer.Option(
Path("thicket.yaml"), "--config", help="Configuration file path"
),
auto_discover: bool = typer.Option(
-
True, "--auto-discover/--no-auto-discover", help="Auto-discover user metadata from feed"
+
True,
+
"--auto-discover/--no-auto-discover",
+
help="Auto-discover user metadata from feed",
),
) -> None:
"""Add a user or feed to thicket."""
if subcommand == "user":
-
add_user(username, feed_url, email, homepage, icon, display_name, config_file, auto_discover)
+
add_user(
+
username,
+
feed_url,
+
email,
+
homepage,
+
icon,
+
display_name,
+
config_file,
+
auto_discover,
+
)
elif subcommand == "feed":
add_feed(username, feed_url, config_file)
else:
···
discovered_metadata = asyncio.run(discover_feed_metadata(validated_feed_url))
# Prepare user data with manual overrides taking precedence
-
user_display_name = display_name or (discovered_metadata.author_name or discovered_metadata.title if discovered_metadata else None)
-
user_email = email or (discovered_metadata.author_email if discovered_metadata else None)
-
user_homepage = homepage or (str(discovered_metadata.author_uri or discovered_metadata.link) if discovered_metadata else None)
-
user_icon = icon or (str(discovered_metadata.logo or discovered_metadata.icon or discovered_metadata.image_url) if discovered_metadata else None)
+
user_display_name = display_name or (
+
discovered_metadata.author_name or discovered_metadata.title
+
if discovered_metadata
+
else None
+
)
+
user_email = email or (
+
discovered_metadata.author_email if discovered_metadata else None
+
)
+
user_homepage = homepage or (
+
str(discovered_metadata.author_uri or discovered_metadata.link)
+
if discovered_metadata
+
else None
+
)
+
user_icon = icon or (
+
str(
+
discovered_metadata.logo
+
or discovered_metadata.icon
+
or discovered_metadata.image_url
+
)
+
if discovered_metadata
+
else None
+
)
# Add user to Git store
git_store.add_user(
+247
src/thicket/cli/commands/bot.py
···
+
"""Bot management commands for thicket."""
+
+
import subprocess
+
import sys
+
from pathlib import Path
+
from typing import Optional
+
+
import typer
+
from rich.console import Console
+
+
from ..main import app
+
from ..utils import print_error, print_info, print_success
+
+
console = Console()
+
+
+
@app.command()
+
def bot(
+
action: str = typer.Argument(..., help="Action: run, test, or status"),
+
config_file: Path = typer.Option(
+
Path("bot-config/zuliprc"),
+
"--config",
+
"-c",
+
help="Zulip bot configuration file",
+
),
+
thicket_config: Path = typer.Option(
+
Path("thicket.yaml"),
+
"--thicket-config",
+
help="Path to thicket configuration file",
+
),
+
daemon: bool = typer.Option(
+
False,
+
"--daemon",
+
"-d",
+
help="Run bot in daemon mode (background)",
+
),
+
debug_user: Optional[str] = typer.Option(
+
None,
+
"--debug-user",
+
help="Debug mode: send DMs to this thicket username instead of posting to streams",
+
),
+
) -> None:
+
"""Manage the Thicket Zulip bot.
+
+
Actions:
+
- run: Start the Zulip bot
+
- test: Test bot functionality
+
- status: Show bot status
+
"""
+
+
if action == "run":
+
_run_bot(config_file, thicket_config, daemon, debug_user)
+
elif action == "test":
+
_test_bot()
+
elif action == "status":
+
_bot_status(config_file)
+
else:
+
print_error(f"Unknown action: {action}")
+
print_info("Available actions: run, test, status")
+
raise typer.Exit(1)
+
+
+
def _run_bot(
+
config_file: Path, thicket_config: Path, daemon: bool, debug_user: Optional[str] = None
+
) -> None:
+
"""Run the Zulip bot."""
+
if not config_file.exists():
+
print_error(f"Configuration file not found: {config_file}")
+
print_info(
+
f"Copy bot-config/zuliprc.template to {config_file} and configure it"
+
)
+
print_info("See bot-config/README.md for setup instructions")
+
raise typer.Exit(1)
+
+
if not thicket_config.exists():
+
print_error(f"Thicket configuration file not found: {thicket_config}")
+
print_info("Run `thicket init` to create a thicket.yaml file")
+
raise typer.Exit(1)
+
+
# Parse zuliprc to extract server URL
+
zulip_site_url = _parse_zulip_config(config_file)
+
+
print_info(f"Starting Thicket Zulip bot with config: {config_file}")
+
print_info(f"Using thicket config: {thicket_config}")
+
+
if debug_user:
+
print_info(
+
f"๐Ÿ› DEBUG MODE: Will send DMs to thicket user '{debug_user}' instead of posting to streams"
+
)
+
+
if daemon:
+
print_info("Running in daemon mode...")
+
else:
+
print_info("Bot will be available as @thicket in your Zulip chat")
+
print_info("Press Ctrl+C to stop the bot")
+
+
try:
+
# Build the command
+
cmd = [
+
sys.executable,
+
"-m",
+
"zulip_bots.run",
+
"src/thicket/bots/thicket_bot.py",
+
"--config-file",
+
str(config_file),
+
]
+
+
# Add environment variables for bot configuration
+
import os
+
+
env = os.environ.copy()
+
+
# Always pass thicket config path
+
env["THICKET_CONFIG_PATH"] = str(thicket_config.absolute())
+
+
# Add debug user if specified
+
if debug_user:
+
env["THICKET_DEBUG_USER"] = debug_user
+
+
# Pass Zulip server URL to bot
+
if zulip_site_url:
+
env["THICKET_ZULIP_SITE_URL"] = zulip_site_url
+
+
if daemon:
+
# Run in background
+
process = subprocess.Popen(
+
cmd,
+
stdout=subprocess.DEVNULL,
+
stderr=subprocess.DEVNULL,
+
start_new_session=True,
+
env=env,
+
)
+
print_success(f"Bot started in background with PID {process.pid}")
+
else:
+
# Run in foreground
+
subprocess.run(cmd, check=True, env=env)
+
+
except subprocess.CalledProcessError as e:
+
print_error(f"Failed to start bot: {e}")
+
raise typer.Exit(1) from e
+
except KeyboardInterrupt:
+
print_info("Bot stopped by user")
+
+
+
def _parse_zulip_config(config_file: Path) -> str:
+
"""Parse zuliprc file to extract the site URL."""
+
try:
+
import configparser
+
+
config = configparser.ConfigParser()
+
config.read(config_file)
+
+
if "api" in config and "site" in config["api"]:
+
site_url = config["api"]["site"]
+
print_info(f"Detected Zulip server: {site_url}")
+
return site_url
+
else:
+
print_error("Could not find 'site' in zuliprc [api] section")
+
return ""
+
+
except Exception as e:
+
print_error(f"Error parsing zuliprc: {e}")
+
return ""
+
+
+
def _test_bot() -> None:
+
"""Test bot functionality."""
+
print_info("Testing Thicket Zulip bot...")
+
+
try:
+
from ...bots.test_bot import BotTester
+
+
# Create bot tester
+
tester = BotTester()
+
+
# Test basic functionality
+
console.print("โœ“ Testing help command...", style="green")
+
responses = tester.send_command("help")
+
assert len(responses) == 1
+
assert "Thicket Feed Bot" in tester.get_last_response_content()
+
+
console.print("โœ“ Testing status command...", style="green")
+
responses = tester.send_command("status")
+
assert len(responses) == 1
+
assert "Status" in tester.get_last_response_content()
+
+
console.print("โœ“ Testing config commands...", style="green")
+
responses = tester.send_command("config stream test-stream")
+
tester.assert_response_contains("Stream set to")
+
+
responses = tester.send_command("config topic test-topic")
+
tester.assert_response_contains("Topic set to")
+
+
responses = tester.send_command("config interval 300")
+
tester.assert_response_contains("Sync interval set to")
+
+
print_success("All bot tests passed!")
+
+
except Exception as e:
+
print_error(f"Bot test failed: {e}")
+
raise typer.Exit(1) from e
+
+
+
def _bot_status(config_file: Path) -> None:
+
"""Show bot status."""
+
console.print("Thicket Zulip Bot Status", style="bold blue")
+
console.print()
+
+
# Check config file
+
if config_file.exists():
+
console.print(f"โœ“ Config file: {config_file}", style="green")
+
else:
+
console.print(f"โœ— Config file not found: {config_file}", style="red")
+
console.print(
+
" Copy bot-config/zuliprc.template and configure it", style="yellow"
+
)
+
console.print(
+
" See bot-config/README.md for setup instructions", style="yellow"
+
)
+
+
# Check dependencies
+
try:
+
import zulip_bots
+
+
version = getattr(zulip_bots, "__version__", "unknown")
+
console.print(f"โœ“ zulip-bots version: {version}", style="green")
+
except ImportError:
+
console.print("โœ— zulip-bots not installed", style="red")
+
+
try:
+
from ...bots.thicket_bot import ThicketBotHandler # noqa: F401
+
+
console.print("โœ“ ThicketBotHandler available", style="green")
+
except ImportError as e:
+
console.print(f"โœ— Bot handler not available: {e}", style="red")
+
+
# Check bot file
+
bot_file = Path("src/thicket/bots/thicket_bot.py")
+
if bot_file.exists():
+
console.print(f"โœ“ Bot file: {bot_file}", style="green")
+
else:
+
console.print(f"โœ— Bot file not found: {bot_file}", style="red")
+
+
console.print()
+
console.print("To run the bot:", style="bold")
+
console.print(f" thicket bot run --config {config_file}")
+
console.print()
+
console.print("For help setting up the bot, see: docs/ZULIP_BOT.md", style="dim")
+7 -3
src/thicket/cli/commands/duplicates.py
···
from ..main import app
from ..utils import (
console,
+
get_tsv_mode,
load_config,
print_error,
print_info,
print_success,
-
get_tsv_mode,
)
···
print_info(f"Total duplicates: {len(duplicates.duplicates)}")
-
def add_duplicate(git_store: GitStore, duplicate_id: Optional[str], canonical_id: Optional[str]) -> None:
+
def add_duplicate(
+
git_store: GitStore, duplicate_id: Optional[str], canonical_id: Optional[str]
+
) -> None:
"""Add a duplicate mapping."""
if not duplicate_id:
print_error("Duplicate ID is required")
···
# Remove the mapping
if git_store.remove_duplicate(duplicate_id):
# Commit changes
-
git_store.commit_changes(f"Remove duplicate mapping: {duplicate_id} -> {canonical_id}")
+
git_store.commit_changes(
+
f"Remove duplicate mapping: {duplicate_id} -> {canonical_id}"
+
)
print_success(f"Removed duplicate mapping: {duplicate_id} -> {canonical_id}")
else:
print_error(f"Failed to remove duplicate mapping: {duplicate_id}")
-427
src/thicket/cli/commands/index_cmd.py
···
-
"""CLI command for building reference index from blog entries."""
-
-
import json
-
from pathlib import Path
-
from typing import Optional
-
-
import typer
-
from rich.console import Console
-
from rich.progress import (
-
BarColumn,
-
Progress,
-
SpinnerColumn,
-
TaskProgressColumn,
-
TextColumn,
-
)
-
from rich.table import Table
-
-
from ...core.git_store import GitStore
-
from ...core.reference_parser import ReferenceIndex, ReferenceParser
-
from ..main import app
-
from ..utils import get_tsv_mode, load_config
-
-
console = Console()
-
-
-
@app.command()
-
def index(
-
config_file: Optional[Path] = typer.Option(
-
None,
-
"--config",
-
"-c",
-
help="Path to configuration file",
-
),
-
output_file: Optional[Path] = typer.Option(
-
None,
-
"--output",
-
"-o",
-
help="Path to output index file (default: updates links.json in git store)",
-
),
-
verbose: bool = typer.Option(
-
False,
-
"--verbose",
-
"-v",
-
help="Show detailed progress information",
-
),
-
) -> None:
-
"""Build a reference index showing which blog entries reference others.
-
-
This command analyzes all blog entries to detect cross-references between
-
different blogs, creating an index that can be used to build threaded
-
views of related content.
-
-
Updates the unified links.json file with reference data.
-
"""
-
try:
-
# Load configuration
-
config = load_config(config_file)
-
-
# Initialize Git store
-
git_store = GitStore(config.git_store)
-
-
# Initialize reference parser
-
parser = ReferenceParser()
-
-
# Build user domain mapping
-
if verbose:
-
console.print("Building user domain mapping...")
-
user_domains = parser.build_user_domain_mapping(git_store)
-
-
if verbose:
-
console.print(f"Found {len(user_domains)} users with {sum(len(d) for d in user_domains.values())} total domains")
-
-
# Initialize reference index
-
ref_index = ReferenceIndex()
-
ref_index.user_domains = user_domains
-
-
# Get all users
-
index = git_store._load_index()
-
users = list(index.users.keys())
-
-
if not users:
-
console.print("[yellow]No users found in Git store[/yellow]")
-
raise typer.Exit(0)
-
-
# Process all entries
-
total_entries = 0
-
total_references = 0
-
all_references = []
-
-
with Progress(
-
SpinnerColumn(),
-
TextColumn("[progress.description]{task.description}"),
-
BarColumn(),
-
TaskProgressColumn(),
-
console=console,
-
) as progress:
-
-
# Count total entries first
-
counting_task = progress.add_task("Counting entries...", total=len(users))
-
entry_counts = {}
-
for username in users:
-
entries = git_store.list_entries(username)
-
entry_counts[username] = len(entries)
-
total_entries += len(entries)
-
progress.advance(counting_task)
-
-
progress.remove_task(counting_task)
-
-
# Process entries - extract references
-
processing_task = progress.add_task(
-
f"Extracting references from {total_entries} entries...",
-
total=total_entries
-
)
-
-
for username in users:
-
entries = git_store.list_entries(username)
-
-
for entry in entries:
-
# Extract references from this entry
-
references = parser.extract_references(entry, username, user_domains)
-
all_references.extend(references)
-
-
progress.advance(processing_task)
-
-
if verbose and references:
-
console.print(f" Found {len(references)} references in {username}:{entry.title[:50]}...")
-
-
progress.remove_task(processing_task)
-
-
# Resolve target_entry_ids for references
-
if all_references:
-
resolve_task = progress.add_task(
-
f"Resolving {len(all_references)} references...",
-
total=len(all_references)
-
)
-
-
if verbose:
-
console.print(f"Resolving target entry IDs for {len(all_references)} references...")
-
-
resolved_references = parser.resolve_target_entry_ids(all_references, git_store)
-
-
# Count resolved references
-
resolved_count = sum(1 for ref in resolved_references if ref.target_entry_id is not None)
-
if verbose:
-
console.print(f"Resolved {resolved_count} out of {len(all_references)} references")
-
-
# Add resolved references to index
-
for ref in resolved_references:
-
ref_index.add_reference(ref)
-
total_references += 1
-
progress.advance(resolve_task)
-
-
progress.remove_task(resolve_task)
-
-
# Determine output path
-
if output_file:
-
output_path = output_file
-
else:
-
output_path = config.git_store / "links.json"
-
-
# Load existing links data or create new structure
-
if output_path.exists() and not output_file:
-
# Load existing unified structure
-
with open(output_path) as f:
-
existing_data = json.load(f)
-
else:
-
# Create new structure
-
existing_data = {
-
"links": {},
-
"reverse_mapping": {},
-
"user_domains": {}
-
}
-
-
# Update with reference data
-
existing_data["references"] = ref_index.to_dict()["references"]
-
existing_data["user_domains"] = {k: list(v) for k, v in user_domains.items()}
-
-
# Save updated structure
-
with open(output_path, "w") as f:
-
json.dump(existing_data, f, indent=2, default=str)
-
-
# Show summary
-
if not get_tsv_mode():
-
console.print("\n[green]โœ“ Reference index built successfully[/green]")
-
-
# Create summary table or TSV output
-
if get_tsv_mode():
-
print("Metric\tCount")
-
print(f"Total Users\t{len(users)}")
-
print(f"Total Entries\t{total_entries}")
-
print(f"Total References\t{total_references}")
-
print(f"Outbound Refs\t{len(ref_index.outbound_refs)}")
-
print(f"Inbound Refs\t{len(ref_index.inbound_refs)}")
-
print(f"Output File\t{output_path}")
-
else:
-
table = Table(title="Reference Index Summary")
-
table.add_column("Metric", style="cyan")
-
table.add_column("Count", style="green")
-
-
table.add_row("Total Users", str(len(users)))
-
table.add_row("Total Entries", str(total_entries))
-
table.add_row("Total References", str(total_references))
-
table.add_row("Outbound Refs", str(len(ref_index.outbound_refs)))
-
table.add_row("Inbound Refs", str(len(ref_index.inbound_refs)))
-
table.add_row("Output File", str(output_path))
-
-
console.print(table)
-
-
# Show some interesting statistics
-
if total_references > 0:
-
if not get_tsv_mode():
-
console.print("\n[bold]Reference Statistics:[/bold]")
-
-
# Most referenced users
-
target_counts = {}
-
unresolved_domains = set()
-
-
for ref in ref_index.references:
-
if ref.target_username:
-
target_counts[ref.target_username] = target_counts.get(ref.target_username, 0) + 1
-
else:
-
# Track unresolved domains
-
from urllib.parse import urlparse
-
domain = urlparse(ref.target_url).netloc.lower()
-
unresolved_domains.add(domain)
-
-
if target_counts:
-
if get_tsv_mode():
-
print("Referenced User\tReference Count")
-
for username, count in sorted(target_counts.items(), key=lambda x: x[1], reverse=True)[:5]:
-
print(f"{username}\t{count}")
-
else:
-
console.print("\nMost referenced users:")
-
for username, count in sorted(target_counts.items(), key=lambda x: x[1], reverse=True)[:5]:
-
console.print(f" {username}: {count} references")
-
-
if unresolved_domains and verbose:
-
if get_tsv_mode():
-
print("Unresolved Domain\tCount")
-
for domain in sorted(list(unresolved_domains)[:10]):
-
print(f"{domain}\t1")
-
if len(unresolved_domains) > 10:
-
print(f"... and {len(unresolved_domains) - 10} more\t...")
-
else:
-
console.print(f"\nUnresolved domains: {len(unresolved_domains)}")
-
for domain in sorted(list(unresolved_domains)[:10]):
-
console.print(f" {domain}")
-
if len(unresolved_domains) > 10:
-
console.print(f" ... and {len(unresolved_domains) - 10} more")
-
-
except Exception as e:
-
console.print(f"[red]Error building reference index: {e}[/red]")
-
if verbose:
-
console.print_exception()
-
raise typer.Exit(1)
-
-
-
@app.command()
-
def threads(
-
config_file: Optional[Path] = typer.Option(
-
None,
-
"--config",
-
"-c",
-
help="Path to configuration file",
-
),
-
index_file: Optional[Path] = typer.Option(
-
None,
-
"--index",
-
"-i",
-
help="Path to reference index file (default: links.json in git store)",
-
),
-
username: Optional[str] = typer.Option(
-
None,
-
"--username",
-
"-u",
-
help="Show threads for specific username only",
-
),
-
entry_id: Optional[str] = typer.Option(
-
None,
-
"--entry",
-
"-e",
-
help="Show thread for specific entry ID",
-
),
-
min_size: int = typer.Option(
-
2,
-
"--min-size",
-
"-m",
-
help="Minimum thread size to display",
-
),
-
) -> None:
-
"""Show threaded view of related blog entries.
-
-
This command uses the reference index to show which blog entries
-
are connected through cross-references, creating an email-style
-
threaded view of the conversation.
-
-
Reads reference data from the unified links.json file.
-
"""
-
try:
-
# Load configuration
-
config = load_config(config_file)
-
-
# Determine index file path
-
if index_file:
-
index_path = index_file
-
else:
-
index_path = config.git_store / "links.json"
-
-
if not index_path.exists():
-
console.print(f"[red]Links file not found: {index_path}[/red]")
-
console.print("Run 'thicket links' and 'thicket index' first to build the reference index")
-
raise typer.Exit(1)
-
-
# Load unified data
-
with open(index_path) as f:
-
unified_data = json.load(f)
-
-
# Check if references exist in the unified structure
-
if "references" not in unified_data:
-
console.print(f"[red]No references found in {index_path}[/red]")
-
console.print("Run 'thicket index' first to build the reference index")
-
raise typer.Exit(1)
-
-
# Extract reference data and reconstruct ReferenceIndex
-
ref_index = ReferenceIndex.from_dict({
-
"references": unified_data["references"],
-
"user_domains": unified_data.get("user_domains", {})
-
})
-
-
# Initialize Git store to get entry details
-
git_store = GitStore(config.git_store)
-
-
if entry_id and username:
-
# Show specific thread
-
thread_members = ref_index.get_thread_members(username, entry_id)
-
_display_thread(thread_members, ref_index, git_store, f"Thread for {username}:{entry_id}")
-
-
elif username:
-
# Show all threads involving this user
-
user_index = git_store._load_index()
-
user = user_index.get_user(username)
-
if not user:
-
console.print(f"[red]User not found: {username}[/red]")
-
raise typer.Exit(1)
-
-
entries = git_store.list_entries(username)
-
threads_found = set()
-
-
console.print(f"[bold]Threads involving {username}:[/bold]\n")
-
-
for entry in entries:
-
thread_members = ref_index.get_thread_members(username, entry.id)
-
if len(thread_members) >= min_size:
-
thread_key = tuple(sorted(thread_members))
-
if thread_key not in threads_found:
-
threads_found.add(thread_key)
-
_display_thread(thread_members, ref_index, git_store, f"Thread #{len(threads_found)}")
-
-
else:
-
# Show all threads
-
console.print("[bold]All conversation threads:[/bold]\n")
-
-
all_threads = set()
-
processed_entries = set()
-
-
# Get all entries
-
user_index = git_store._load_index()
-
for username in user_index.users.keys():
-
entries = git_store.list_entries(username)
-
for entry in entries:
-
entry_key = (username, entry.id)
-
if entry_key in processed_entries:
-
continue
-
-
thread_members = ref_index.get_thread_members(username, entry.id)
-
if len(thread_members) >= min_size:
-
thread_key = tuple(sorted(thread_members))
-
if thread_key not in all_threads:
-
all_threads.add(thread_key)
-
_display_thread(thread_members, ref_index, git_store, f"Thread #{len(all_threads)}")
-
-
# Mark all members as processed
-
for member in thread_members:
-
processed_entries.add(member)
-
-
if not all_threads:
-
console.print("[yellow]No conversation threads found[/yellow]")
-
console.print(f"(minimum thread size: {min_size})")
-
-
except Exception as e:
-
console.print(f"[red]Error showing threads: {e}[/red]")
-
raise typer.Exit(1)
-
-
-
def _display_thread(thread_members, ref_index, git_store, title):
-
"""Display a single conversation thread."""
-
console.print(f"[bold cyan]{title}[/bold cyan]")
-
console.print(f"Thread size: {len(thread_members)} entries")
-
-
# Get entry details for each member
-
thread_entries = []
-
for username, entry_id in thread_members:
-
entry = git_store.get_entry(username, entry_id)
-
if entry:
-
thread_entries.append((username, entry))
-
-
# Sort by publication date
-
thread_entries.sort(key=lambda x: x[1].published or x[1].updated)
-
-
# Display entries
-
for i, (username, entry) in enumerate(thread_entries):
-
prefix = "โ”œโ”€" if i < len(thread_entries) - 1 else "โ””โ”€"
-
-
# Get references for this entry
-
outbound = ref_index.get_outbound_refs(username, entry.id)
-
inbound = ref_index.get_inbound_refs(username, entry.id)
-
-
ref_info = ""
-
if outbound or inbound:
-
ref_info = f" ({len(outbound)} out, {len(inbound)} in)"
-
-
console.print(f" {prefix} [{username}] {entry.title[:60]}...{ref_info}")
-
-
if entry.published:
-
console.print(f" Published: {entry.published.strftime('%Y-%m-%d')}")
-
-
console.print() # Empty line after each thread
+106 -119
src/thicket/cli/commands/info_cmd.py
···
"""CLI command for displaying detailed information about a specific atom entry."""
-
import json
from pathlib import Path
from typing import Optional
···
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
-
from rich.text import Text
from ...core.git_store import GitStore
-
from ...core.reference_parser import ReferenceIndex
from ..main import app
-
from ..utils import load_config, get_tsv_mode
+
from ..utils import get_tsv_mode, load_config
console = Console()
···
@app.command()
def info(
identifier: str = typer.Argument(
-
...,
-
help="The atom ID or URL of the entry to display information about"
+
..., help="The atom ID or URL of the entry to display information about"
),
username: Optional[str] = typer.Option(
None,
"--username",
"-u",
-
help="Username to search for the entry (if not provided, searches all users)"
+
help="Username to search for the entry (if not provided, searches all users)",
),
config_file: Optional[Path] = typer.Option(
Path("thicket.yaml"),
···
help="Path to configuration file",
),
show_content: bool = typer.Option(
-
False,
-
"--content",
-
help="Include the full content of the entry in the output"
+
False, "--content", help="Include the full content of the entry in the output"
),
) -> None:
"""Display detailed information about a specific atom entry.
-
+
You can specify the entry using either its atom ID or URL.
Shows all metadata for the given entry, including title, dates, categories,
and summarizes all inbound and outbound links to/from other posts.
···
try:
# Load configuration
config = load_config(config_file)
-
+
# Initialize Git store
git_store = GitStore(config.git_store)
-
+
# Find the entry
entry = None
found_username = None
-
+
# Check if identifier looks like a URL
-
is_url = identifier.startswith(('http://', 'https://'))
-
+
is_url = identifier.startswith(("http://", "https://"))
+
if username:
# Search specific username
if is_url:
···
if entry:
found_username = user
break
-
+
if not entry or not found_username:
if username:
-
console.print(f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found for user '{username}'[/red]")
+
console.print(
+
f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found for user '{username}'[/red]"
+
)
else:
-
console.print(f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found in any user's entries[/red]")
+
console.print(
+
f"[red]Entry with {'URL' if is_url else 'atom ID'} '{identifier}' not found in any user's entries[/red]"
+
)
raise typer.Exit(1)
-
-
# Load reference index if available
-
links_path = config.git_store / "links.json"
-
ref_index = None
-
if links_path.exists():
-
with open(links_path) as f:
-
unified_data = json.load(f)
-
-
# Check if references exist in the unified structure
-
if "references" in unified_data:
-
ref_index = ReferenceIndex.from_dict({
-
"references": unified_data["references"],
-
"user_domains": unified_data.get("user_domains", {})
-
})
-
+
# Display information
if get_tsv_mode():
-
_display_entry_info_tsv(entry, found_username, ref_index, show_content)
+
_display_entry_info_tsv(entry, found_username, show_content)
else:
_display_entry_info(entry, found_username)
-
-
if ref_index:
-
_display_link_info(entry, found_username, ref_index)
-
else:
-
console.print("\n[yellow]No reference index found. Run 'thicket links' and 'thicket index' to build cross-reference data.[/yellow]")
-
+
+
# Display links and backlinks from entry fields
+
_display_link_info(entry, found_username, git_store)
+
# Optionally display content
if show_content and entry.content:
_display_content(entry.content)
-
+
except Exception as e:
console.print(f"[red]Error displaying entry info: {e}[/red]")
-
raise typer.Exit(1)
+
raise typer.Exit(1) from e
def _display_entry_info(entry, username: str) -> None:
"""Display basic entry information in a structured format."""
-
+
# Create main info panel
info_table = Table.grid(padding=(0, 2))
info_table.add_column("Field", style="cyan bold", width=15)
info_table.add_column("Value", style="white")
-
+
info_table.add_row("User", f"[green]{username}[/green]")
info_table.add_row("Atom ID", f"[blue]{entry.id}[/blue]")
info_table.add_row("Title", entry.title)
info_table.add_row("Link", str(entry.link))
-
+
if entry.published:
-
info_table.add_row("Published", entry.published.strftime("%Y-%m-%d %H:%M:%S UTC"))
-
+
info_table.add_row(
+
"Published", entry.published.strftime("%Y-%m-%d %H:%M:%S UTC")
+
)
+
info_table.add_row("Updated", entry.updated.strftime("%Y-%m-%d %H:%M:%S UTC"))
-
+
if entry.summary:
# Truncate long summaries
-
summary = entry.summary[:200] + "..." if len(entry.summary) > 200 else entry.summary
+
summary = (
+
entry.summary[:200] + "..." if len(entry.summary) > 200 else entry.summary
+
)
info_table.add_row("Summary", summary)
-
+
if entry.categories:
categories_text = ", ".join(entry.categories)
info_table.add_row("Categories", categories_text)
-
+
if entry.author:
author_info = []
if "name" in entry.author:
···
author_info.append(f"<{entry.author['email']}>")
if author_info:
info_table.add_row("Author", " ".join(author_info))
-
+
if entry.content_type:
info_table.add_row("Content Type", entry.content_type)
-
+
if entry.rights:
info_table.add_row("Rights", entry.rights)
-
+
if entry.source:
info_table.add_row("Source Feed", entry.source)
-
+
panel = Panel(
-
info_table,
-
title=f"[bold]Entry Information[/bold]",
-
border_style="blue"
+
info_table, title="[bold]Entry Information[/bold]", border_style="blue"
)
-
+
console.print(panel)
-
def _display_link_info(entry, username: str, ref_index: ReferenceIndex) -> None:
+
def _display_link_info(entry, username: str, git_store: GitStore) -> None:
"""Display inbound and outbound link information."""
-
-
# Get links
-
outbound_refs = ref_index.get_outbound_refs(username, entry.id)
-
inbound_refs = ref_index.get_inbound_refs(username, entry.id)
-
-
if not outbound_refs and not inbound_refs:
+
+
# Get links from entry fields
+
outbound_links = getattr(entry, "links", [])
+
backlinks = getattr(entry, "backlinks", [])
+
+
if not outbound_links and not backlinks:
console.print("\n[dim]No cross-references found for this entry.[/dim]")
return
-
+
# Create links table
links_table = Table(title="Cross-References")
links_table.add_column("Direction", style="cyan", width=10)
-
links_table.add_column("Target/Source", style="green", width=20)
-
links_table.add_column("URL", style="blue", width=50)
-
-
# Add outbound references
-
for ref in outbound_refs:
-
target_info = f"{ref.target_username}:{ref.target_entry_id}" if ref.target_username and ref.target_entry_id else "External"
-
links_table.add_row("โ†’ Out", target_info, ref.target_url)
-
-
# Add inbound references
-
for ref in inbound_refs:
-
source_info = f"{ref.source_username}:{ref.source_entry_id}"
-
links_table.add_row("โ† In", source_info, ref.target_url)
-
+
links_table.add_column("Target/Source", style="green", width=30)
+
links_table.add_column("URL/ID", style="blue", width=60)
+
+
# Add outbound links
+
for link in outbound_links:
+
links_table.add_row("โ†’ Out", "External/Other", link)
+
+
# Add backlinks (inbound references)
+
for backlink_id in backlinks:
+
# Try to find which user this entry belongs to
+
source_info = backlink_id
+
# Could enhance this by looking up the actual entry to get username
+
links_table.add_row("โ† In", "Entry", source_info)
+
console.print()
console.print(links_table)
-
+
# Summary
-
console.print(f"\n[bold]Summary:[/bold] {len(outbound_refs)} outbound, {len(inbound_refs)} inbound references")
+
console.print(
+
f"\n[bold]Summary:[/bold] {len(outbound_links)} outbound links, {len(backlinks)} inbound backlinks"
+
)
def _display_content(content: str) -> None:
"""Display the full content of the entry."""
-
+
# Truncate very long content
display_content = content
if len(content) > 5000:
display_content = content[:5000] + "\n\n[... content truncated ...]"
-
+
panel = Panel(
display_content,
title="[bold]Entry Content[/bold]",
border_style="green",
-
expand=False
+
expand=False,
)
-
+
console.print()
console.print(panel)
-
def _display_entry_info_tsv(entry, username: str, ref_index: Optional[ReferenceIndex], show_content: bool) -> None:
+
def _display_entry_info_tsv(entry, username: str, show_content: bool) -> None:
"""Display entry information in TSV format."""
-
+
# Basic info
print("Field\tValue")
print(f"User\t{username}")
print(f"Atom ID\t{entry.id}")
-
print(f"Title\t{entry.title.replace(chr(9), ' ').replace(chr(10), ' ').replace(chr(13), ' ')}")
+
print(
+
f"Title\t{entry.title.replace(chr(9), ' ').replace(chr(10), ' ').replace(chr(13), ' ')}"
+
)
print(f"Link\t{entry.link}")
-
+
if entry.published:
print(f"Published\t{entry.published.strftime('%Y-%m-%d %H:%M:%S UTC')}")
-
+
print(f"Updated\t{entry.updated.strftime('%Y-%m-%d %H:%M:%S UTC')}")
-
+
if entry.summary:
# Escape tabs and newlines in summary
-
summary = entry.summary.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ')
+
summary = entry.summary.replace("\t", " ").replace("\n", " ").replace("\r", " ")
print(f"Summary\t{summary}")
-
+
if entry.categories:
print(f"Categories\t{', '.join(entry.categories)}")
-
+
if entry.author:
author_info = []
if "name" in entry.author:
···
author_info.append(f"<{entry.author['email']}>")
if author_info:
print(f"Author\t{' '.join(author_info)}")
-
+
if entry.content_type:
print(f"Content Type\t{entry.content_type}")
-
+
if entry.rights:
print(f"Rights\t{entry.rights}")
-
+
if entry.source:
print(f"Source Feed\t{entry.source}")
-
-
# Add reference info if available
-
if ref_index:
-
outbound_refs = ref_index.get_outbound_refs(username, entry.id)
-
inbound_refs = ref_index.get_inbound_refs(username, entry.id)
-
-
print(f"Outbound References\t{len(outbound_refs)}")
-
print(f"Inbound References\t{len(inbound_refs)}")
-
-
# Show each reference
-
for ref in outbound_refs:
-
target_info = f"{ref.target_username}:{ref.target_entry_id}" if ref.target_username and ref.target_entry_id else "External"
-
print(f"Outbound Reference\t{target_info}\t{ref.target_url}")
-
-
for ref in inbound_refs:
-
source_info = f"{ref.source_username}:{ref.source_entry_id}"
-
print(f"Inbound Reference\t{source_info}\t{ref.target_url}")
-
+
+
# Add links info from entry fields
+
outbound_links = getattr(entry, "links", [])
+
backlinks = getattr(entry, "backlinks", [])
+
+
if outbound_links or backlinks:
+
print(f"Outbound Links\t{len(outbound_links)}")
+
print(f"Backlinks\t{len(backlinks)}")
+
+
# Show each link
+
for link in outbound_links:
+
print(f"โ†’ Link\t{link}")
+
+
for backlink_id in backlinks:
+
print(f"โ† Backlink\t{backlink_id}")
+
# Show content if requested
if show_content and entry.content:
# Escape tabs and newlines in content
-
content = entry.content.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ')
-
print(f"Content\t{content}")
+
content = entry.content.replace("\t", " ").replace("\n", " ").replace("\r", " ")
+
print(f"Content\t{content}")
+5 -6
src/thicket/cli/commands/init.py
···
@app.command()
def init(
-
git_store: Path = typer.Argument(..., help="Path to Git repository for storing feeds"),
+
git_store: Path = typer.Argument(
+
..., help="Path to Git repository for storing feeds"
+
),
cache_dir: Optional[Path] = typer.Option(
None, "--cache-dir", "-c", help="Cache directory (default: ~/.cache/thicket)"
),
···
# Set default paths
if cache_dir is None:
from platformdirs import user_cache_dir
+
cache_dir = Path(user_cache_dir("thicket"))
if config_file is None:
···
# Create configuration
try:
-
config = ThicketConfig(
-
git_store=git_store,
-
cache_dir=cache_dir,
-
users=[]
-
)
+
config = ThicketConfig(git_store=git_store, cache_dir=cache_dir, users=[])
save_config(config, config_file)
print_success(f"Created configuration file: {config_file}")
-423
src/thicket/cli/commands/links_cmd.py
···
-
"""CLI command for extracting and categorizing all outbound links from blog entries."""
-
-
import json
-
import re
-
from pathlib import Path
-
from typing import Dict, List, Optional, Set
-
from urllib.parse import urljoin, urlparse
-
-
import typer
-
from rich.console import Console
-
from rich.progress import Progress, SpinnerColumn, TextColumn, BarColumn, TaskProgressColumn
-
from rich.table import Table
-
-
from ...core.git_store import GitStore
-
from ..main import app
-
from ..utils import load_config, get_tsv_mode
-
-
console = Console()
-
-
-
class LinkData:
-
"""Represents a link found in a blog entry."""
-
-
def __init__(self, url: str, entry_id: str, username: str):
-
self.url = url
-
self.entry_id = entry_id
-
self.username = username
-
-
def to_dict(self) -> dict:
-
"""Convert to dictionary for JSON serialization."""
-
return {
-
"url": self.url,
-
"entry_id": self.entry_id,
-
"username": self.username
-
}
-
-
@classmethod
-
def from_dict(cls, data: dict) -> "LinkData":
-
"""Create from dictionary."""
-
return cls(
-
url=data["url"],
-
entry_id=data["entry_id"],
-
username=data["username"]
-
)
-
-
-
class LinkCategorizer:
-
"""Categorizes links as internal, user, or unknown."""
-
-
def __init__(self, user_domains: Dict[str, Set[str]]):
-
self.user_domains = user_domains
-
# Create reverse mapping of domain -> username
-
self.domain_to_user = {}
-
for username, domains in user_domains.items():
-
for domain in domains:
-
self.domain_to_user[domain] = username
-
-
def categorize_url(self, url: str, source_username: str) -> tuple[str, Optional[str]]:
-
"""
-
Categorize a URL as 'internal', 'user', or 'unknown'.
-
Returns (category, target_username).
-
"""
-
try:
-
parsed = urlparse(url)
-
domain = parsed.netloc.lower()
-
-
# Check if it's a link to the same user's domain (internal)
-
if domain in self.user_domains.get(source_username, set()):
-
return "internal", source_username
-
-
# Check if it's a link to another user's domain
-
if domain in self.domain_to_user:
-
return "user", self.domain_to_user[domain]
-
-
# Everything else is unknown
-
return "unknown", None
-
-
except Exception:
-
return "unknown", None
-
-
-
class LinkExtractor:
-
"""Extracts and resolves links from blog entries."""
-
-
def __init__(self):
-
# Pattern for extracting links from HTML
-
self.link_pattern = re.compile(r'<a[^>]+href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL)
-
self.url_pattern = re.compile(r'https?://[^\s<>"]+')
-
-
def extract_links_from_html(self, html_content: str, base_url: str) -> List[tuple[str, str]]:
-
"""Extract all links from HTML content and resolve them against base URL."""
-
links = []
-
-
# Extract links from <a> tags
-
for match in self.link_pattern.finditer(html_content):
-
url = match.group(1)
-
text = re.sub(r'<[^>]+>', '', match.group(2)).strip() # Remove HTML tags from link text
-
-
# Resolve relative URLs against base URL
-
resolved_url = urljoin(base_url, url)
-
links.append((resolved_url, text))
-
-
return links
-
-
-
def extract_links_from_entry(self, entry, username: str, base_url: str) -> List[LinkData]:
-
"""Extract all links from a blog entry."""
-
links = []
-
-
# Combine all text content for analysis
-
content_to_search = []
-
if entry.content:
-
content_to_search.append(entry.content)
-
if entry.summary:
-
content_to_search.append(entry.summary)
-
-
for content in content_to_search:
-
extracted_links = self.extract_links_from_html(content, base_url)
-
-
for url, link_text in extracted_links:
-
# Skip empty URLs
-
if not url or url.startswith('#'):
-
continue
-
-
link_data = LinkData(
-
url=url,
-
entry_id=entry.id,
-
username=username
-
)
-
-
links.append(link_data)
-
-
return links
-
-
-
@app.command()
-
def links(
-
config_file: Optional[Path] = typer.Option(
-
Path("thicket.yaml"),
-
"--config",
-
"-c",
-
help="Path to configuration file",
-
),
-
output_file: Optional[Path] = typer.Option(
-
None,
-
"--output",
-
"-o",
-
help="Path to output unified links file (default: links.json in git store)",
-
),
-
verbose: bool = typer.Option(
-
False,
-
"--verbose",
-
"-v",
-
help="Show detailed progress information",
-
),
-
) -> None:
-
"""Extract and categorize all outbound links from blog entries.
-
-
This command analyzes all blog entries to extract outbound links,
-
resolve them properly with respect to the feed's base URL, and
-
categorize them as internal, user, or unknown links.
-
-
Creates a unified links.json file containing all link data.
-
"""
-
try:
-
# Load configuration
-
config = load_config(config_file)
-
-
# Initialize Git store
-
git_store = GitStore(config.git_store)
-
-
# Build user domain mapping
-
if verbose:
-
console.print("Building user domain mapping...")
-
-
index = git_store._load_index()
-
user_domains = {}
-
-
for username, user_metadata in index.users.items():
-
domains = set()
-
-
# Add domains from feeds
-
for feed_url in user_metadata.feeds:
-
domain = urlparse(feed_url).netloc.lower()
-
if domain:
-
domains.add(domain)
-
-
# Add domain from homepage
-
if user_metadata.homepage:
-
domain = urlparse(str(user_metadata.homepage)).netloc.lower()
-
if domain:
-
domains.add(domain)
-
-
user_domains[username] = domains
-
-
if verbose:
-
console.print(f"Found {len(user_domains)} users with {sum(len(d) for d in user_domains.values())} total domains")
-
-
# Initialize components
-
link_extractor = LinkExtractor()
-
categorizer = LinkCategorizer(user_domains)
-
-
# Get all users
-
users = list(index.users.keys())
-
-
if not users:
-
console.print("[yellow]No users found in Git store[/yellow]")
-
raise typer.Exit(0)
-
-
# Process all entries
-
all_links = []
-
link_categories = {"internal": [], "user": [], "unknown": []}
-
link_dict = {} # Dictionary with link URL as key, maps to list of atom IDs
-
reverse_dict = {} # Dictionary with atom ID as key, maps to list of URLs
-
-
with Progress(
-
SpinnerColumn(),
-
TextColumn("[progress.description]{task.description}"),
-
BarColumn(),
-
TaskProgressColumn(),
-
console=console,
-
) as progress:
-
-
# Count total entries first
-
counting_task = progress.add_task("Counting entries...", total=len(users))
-
total_entries = 0
-
-
for username in users:
-
entries = git_store.list_entries(username)
-
total_entries += len(entries)
-
progress.advance(counting_task)
-
-
progress.remove_task(counting_task)
-
-
# Process entries
-
processing_task = progress.add_task(
-
f"Processing {total_entries} entries...",
-
total=total_entries
-
)
-
-
for username in users:
-
entries = git_store.list_entries(username)
-
user_metadata = index.users[username]
-
-
# Get base URL for this user (use first feed URL)
-
base_url = str(user_metadata.feeds[0]) if user_metadata.feeds else "https://example.com"
-
-
for entry in entries:
-
# Extract links from this entry
-
entry_links = link_extractor.extract_links_from_entry(entry, username, base_url)
-
-
# Track unique links per entry
-
entry_urls_seen = set()
-
-
# Categorize each link
-
for link_data in entry_links:
-
# Skip if we've already seen this URL in this entry
-
if link_data.url in entry_urls_seen:
-
continue
-
entry_urls_seen.add(link_data.url)
-
-
category, target_username = categorizer.categorize_url(link_data.url, username)
-
-
# Add to link dictionary (URL as key, maps to list of atom IDs)
-
if link_data.url not in link_dict:
-
link_dict[link_data.url] = []
-
if link_data.entry_id not in link_dict[link_data.url]:
-
link_dict[link_data.url].append(link_data.entry_id)
-
-
# Also add to reverse mapping (atom ID -> list of URLs)
-
if link_data.entry_id not in reverse_dict:
-
reverse_dict[link_data.entry_id] = []
-
if link_data.url not in reverse_dict[link_data.entry_id]:
-
reverse_dict[link_data.entry_id].append(link_data.url)
-
-
# Add category info to link data for categories tracking
-
link_info = link_data.to_dict()
-
link_info["category"] = category
-
link_info["target_username"] = target_username
-
-
all_links.append(link_info)
-
link_categories[category].append(link_info)
-
-
progress.advance(processing_task)
-
-
if verbose and entry_links:
-
console.print(f" Found {len(entry_links)} links in {username}:{entry.title[:50]}...")
-
-
# Determine output path
-
if output_file:
-
output_path = output_file
-
else:
-
output_path = config.git_store / "links.json"
-
-
# Save all extracted links (not just filtered ones)
-
if verbose:
-
console.print("Preparing output data...")
-
-
# Build a set of all URLs that correspond to posts in the git database
-
registered_urls = set()
-
-
# Get all entries from all users and build URL mappings
-
for username in users:
-
entries = git_store.list_entries(username)
-
user_metadata = index.users[username]
-
-
for entry in entries:
-
# Try to match entry URLs with extracted links
-
if hasattr(entry, 'link') and entry.link:
-
registered_urls.add(str(entry.link))
-
-
# Also check entry alternate links if they exist
-
if hasattr(entry, 'links') and entry.links:
-
for link in entry.links:
-
if hasattr(link, 'href') and link.href:
-
registered_urls.add(str(link.href))
-
-
# Build unified structure with metadata
-
unified_links = {}
-
reverse_mapping = {}
-
-
for url, entry_ids in link_dict.items():
-
is_tracked = url in registered_urls
-
target_username = None
-
-
# Find target username if this is a tracked post
-
if is_tracked:
-
for username in users:
-
user_domains_set = {domain for domain in user_domains.get(username, [])}
-
if any(domain in url for domain in user_domains_set):
-
target_username = username
-
break
-
-
unified_links[url] = {
-
"referencing_entries": entry_ids,
-
"is_tracked_post": is_tracked
-
}
-
-
if target_username:
-
unified_links[url]["target_username"] = target_username
-
-
# Build reverse mapping
-
for entry_id in entry_ids:
-
if entry_id not in reverse_mapping:
-
reverse_mapping[entry_id] = []
-
if url not in reverse_mapping[entry_id]:
-
reverse_mapping[entry_id].append(url)
-
-
# Create unified output data
-
output_data = {
-
"links": unified_links,
-
"reverse_mapping": reverse_mapping,
-
"user_domains": {k: list(v) for k, v in user_domains.items()}
-
}
-
-
if verbose:
-
console.print(f"Found {len(registered_urls)} registered post URLs")
-
console.print(f"Found {len(link_dict)} total links, {sum(1 for link in unified_links.values() if link['is_tracked_post'])} tracked posts")
-
-
# Save unified data
-
with open(output_path, "w") as f:
-
json.dump(output_data, f, indent=2, default=str)
-
-
# Show summary
-
if not get_tsv_mode():
-
console.print("\n[green]โœ“ Links extraction completed successfully[/green]")
-
-
# Create summary table or TSV output
-
if get_tsv_mode():
-
print("Category\tCount\tDescription")
-
print(f"Internal\t{len(link_categories['internal'])}\tLinks to same user's domain")
-
print(f"User\t{len(link_categories['user'])}\tLinks to other tracked users")
-
print(f"Unknown\t{len(link_categories['unknown'])}\tLinks to external sites")
-
print(f"Total Extracted\t{len(all_links)}\tAll extracted links")
-
print(f"Saved to Output\t{len(output_data['links'])}\tLinks saved to output file")
-
print(f"Cross-references\t{sum(1 for link in unified_links.values() if link['is_tracked_post'])}\tLinks to registered posts only")
-
else:
-
table = Table(title="Links Summary")
-
table.add_column("Category", style="cyan")
-
table.add_column("Count", style="green")
-
table.add_column("Description", style="white")
-
-
table.add_row("Internal", str(len(link_categories["internal"])), "Links to same user's domain")
-
table.add_row("User", str(len(link_categories["user"])), "Links to other tracked users")
-
table.add_row("Unknown", str(len(link_categories["unknown"])), "Links to external sites")
-
table.add_row("Total Extracted", str(len(all_links)), "All extracted links")
-
table.add_row("Saved to Output", str(len(output_data['links'])), "Links saved to output file")
-
table.add_row("Cross-references", str(sum(1 for link in unified_links.values() if link['is_tracked_post'])), "Links to registered posts only")
-
-
console.print(table)
-
-
# Show user links if verbose
-
if verbose and link_categories["user"]:
-
if get_tsv_mode():
-
print("User Link Source\tUser Link Target\tLink Count")
-
user_link_counts = {}
-
-
for link in link_categories["user"]:
-
key = f"{link['username']} -> {link['target_username']}"
-
user_link_counts[key] = user_link_counts.get(key, 0) + 1
-
-
for link_pair, count in sorted(user_link_counts.items(), key=lambda x: x[1], reverse=True)[:10]:
-
source, target = link_pair.split(" -> ")
-
print(f"{source}\t{target}\t{count}")
-
else:
-
console.print("\n[bold]User-to-user links:[/bold]")
-
user_link_counts = {}
-
-
for link in link_categories["user"]:
-
key = f"{link['username']} -> {link['target_username']}"
-
user_link_counts[key] = user_link_counts.get(key, 0) + 1
-
-
for link_pair, count in sorted(user_link_counts.items(), key=lambda x: x[1], reverse=True)[:10]:
-
console.print(f" {link_pair}: {count} links")
-
-
if not get_tsv_mode():
-
console.print(f"\nUnified links data saved to: {output_path}")
-
-
except Exception as e:
-
console.print(f"[red]Error extracting links: {e}[/red]")
-
if verbose:
-
console.print_exception()
-
raise typer.Exit(1)
+11 -11
src/thicket/cli/commands/list_cmd.py
···
from ..main import app
from ..utils import (
console,
+
get_tsv_mode,
load_config,
+
print_entries_tsv,
print_error,
-
print_feeds_table,
print_feeds_table_from_git,
print_info,
-
print_users_table,
print_users_table_from_git,
-
print_entries_tsv,
-
get_tsv_mode,
)
···
"""List all users."""
index = git_store._load_index()
users = list(index.users.values())
-
+
if not users:
print_info("No users configured")
return
···
print_feeds_table_from_git(git_store, username)
-
def list_entries(git_store: GitStore, username: Optional[str] = None, limit: Optional[int] = None) -> None:
+
def list_entries(
+
git_store: GitStore, username: Optional[str] = None, limit: Optional[int] = None
+
) -> None:
"""List entries, optionally filtered by user."""
if username:
···
"""Clean HTML content for display in table."""
if not content:
return ""
-
+
# Remove HTML tags
-
clean_text = re.sub(r'<[^>]+>', ' ', content)
+
clean_text = re.sub(r"<[^>]+>", " ", content)
# Replace multiple whitespace with single space
-
clean_text = re.sub(r'\s+', ' ', clean_text)
+
clean_text = re.sub(r"\s+", " ", clean_text)
# Strip and limit length
clean_text = clean_text.strip()
if len(clean_text) > 100:
clean_text = clean_text[:97] + "..."
-
+
return clean_text
···
if get_tsv_mode():
print_entries_tsv(entries_by_user, usernames)
return
-
+
table = Table(title="Feed Entries")
table.add_column("User", style="cyan", no_wrap=True)
table.add_column("Title", style="bold")
+301
src/thicket/cli/commands/search.py
···
+
"""Search command for thicket CLI."""
+
+
import logging
+
from pathlib import Path
+
from typing import Optional
+
+
import typer
+
from rich.console import Console
+
from rich.table import Table
+
+
from ...core.typesense_client import TypesenseClient, TypesenseConfig
+
from ..main import app
+
+
console = Console()
+
logger = logging.getLogger(__name__)
+
+
+
def _load_typesense_config() -> tuple[Optional[str], Optional[str]]:
+
"""Load Typesense URL and API key from ~/.typesense directory."""
+
typesense_dir = Path.home() / ".typesense"
+
url_file = typesense_dir / "url"
+
key_file = typesense_dir / "api_key"
+
+
url = None
+
api_key = None
+
+
try:
+
if url_file.exists():
+
url = url_file.read_text().strip()
+
except Exception as e:
+
logger.debug(f"Could not read Typesense URL from {url_file}: {e}")
+
+
try:
+
if key_file.exists():
+
api_key = key_file.read_text().strip()
+
except Exception as e:
+
logger.debug(f"Could not read Typesense API key from {key_file}: {e}")
+
+
return url, api_key
+
+
+
@app.command("search")
+
def search_command(
+
query: str = typer.Argument(..., help="Search query"),
+
typesense_url: Optional[str] = typer.Option(
+
None,
+
"--typesense-url",
+
"-u",
+
help="Typesense server URL (e.g., http://localhost:8108). Defaults to ~/.typesense/url",
+
),
+
api_key: Optional[str] = typer.Option(
+
None,
+
"--api-key",
+
"-k",
+
help="Typesense API key. Defaults to ~/.typesense/api_key",
+
hide_input=True,
+
),
+
collection_name: str = typer.Option(
+
"thicket",
+
"--collection",
+
"-c",
+
help="Typesense collection name",
+
),
+
config_path: Optional[str] = typer.Option(
+
None,
+
"--config",
+
"-C",
+
help="Path to thicket configuration file",
+
),
+
limit: int = typer.Option(
+
20,
+
"--limit",
+
"-l",
+
help="Maximum number of results to display",
+
),
+
user: Optional[str] = typer.Option(
+
None,
+
"--user",
+
help="Filter results by specific user",
+
),
+
timeout: int = typer.Option(
+
10,
+
"--timeout",
+
"-t",
+
help="Connection timeout in seconds",
+
),
+
raw: bool = typer.Option(
+
False,
+
"--raw",
+
help="Display raw JSON output instead of formatted table",
+
),
+
) -> None:
+
"""Search thicket entries using Typesense full-text and semantic search.
+
+
This command searches through all entries in the Typesense collection
+
using the provided query. The search covers entry titles, content,
+
summaries, user information, and metadata.
+
+
Examples:
+
+
# Basic search
+
thicket search "machine learning"
+
+
# Search with user filter
+
thicket search "python programming" --user avsm
+
+
# Limit results
+
thicket search "web development" --limit 10
+
+
# Get raw JSON output
+
thicket search "database" --raw
+
"""
+
try:
+
# Load Typesense configuration from defaults if not provided
+
default_url, default_api_key = _load_typesense_config()
+
+
# Use provided values or defaults
+
final_url = typesense_url or default_url
+
final_api_key = api_key or default_api_key
+
+
# Check that we have required configuration
+
if not final_url:
+
console.print("[red]Error: Typesense URL is required[/red]")
+
console.print(
+
"Either provide --typesense-url or create ~/.typesense/url file"
+
)
+
raise typer.Exit(1)
+
+
if not final_api_key:
+
console.print("[red]Error: Typesense API key is required[/red]")
+
console.print(
+
"Either provide --api-key or create ~/.typesense/api_key file"
+
)
+
raise typer.Exit(1)
+
+
# Create Typesense configuration
+
typesense_config = TypesenseConfig.from_url(
+
final_url, final_api_key, collection_name
+
)
+
typesense_config.connection_timeout = timeout
+
+
console.print("[bold blue]Searching thicket entries[/bold blue]")
+
console.print(f"Query: [cyan]{query}[/cyan]")
+
if user:
+
console.print(f"User filter: [yellow]{user}[/yellow]")
+
+
# Initialize Typesense client
+
typesense_client = TypesenseClient(typesense_config)
+
+
# Prepare search parameters
+
search_params = {
+
"per_page": limit,
+
}
+
+
# Add user filter if specified
+
if user:
+
search_params["filter_by"] = f"username:{user}"
+
+
# Perform search
+
try:
+
results = typesense_client.search(query, search_params)
+
+
if raw:
+
import json
+
+
console.print(json.dumps(results, indent=2))
+
return
+
+
# Display results
+
_display_search_results(results, query)
+
+
except Exception as e:
+
console.print(f"[red]โŒ Search failed: {e}[/red]")
+
raise typer.Exit(1) from e
+
+
except Exception as e:
+
logger.error(f"Search failed: {e}")
+
console.print(f"[red]Error: {e}[/red]")
+
raise typer.Exit(1) from e
+
+
+
def _display_search_results(results: dict, query: str) -> None:
+
"""Display search results in a formatted table."""
+
hits = results.get("hits", [])
+
found = results.get("found", 0)
+
search_time = results.get("search_time_ms", 0)
+
+
if not hits:
+
console.print("\n[yellow]No results found.[/yellow]")
+
return
+
+
console.print(f"\n[green]Found {found} results in {search_time}ms[/green]")
+
+
table = Table(title=f"Search Results for '{query}'", show_lines=True)
+
table.add_column("Score", style="green", width=8, no_wrap=True)
+
table.add_column("User", style="cyan", width=15, no_wrap=True)
+
table.add_column("Title", style="bold", width=45)
+
table.add_column("Updated", style="blue", width=12, no_wrap=True)
+
table.add_column("Summary", style="dim", width=50)
+
+
for hit in hits:
+
doc = hit["document"]
+
+
# Format score
+
score = f"{hit.get('text_match', 0):.2f}"
+
+
# Format user
+
user_display = doc.get("user_display_name", doc.get("username", "Unknown"))
+
if len(user_display) > 12:
+
user_display = user_display[:9] + "..."
+
+
# Format title
+
title = doc.get("title", "Untitled")
+
if len(title) > 40:
+
title = title[:37] + "..."
+
+
# Format date
+
updated_timestamp = doc.get("updated", 0)
+
if updated_timestamp:
+
from datetime import datetime
+
+
updated_date = datetime.fromtimestamp(updated_timestamp)
+
updated_str = updated_date.strftime("%Y-%m-%d")
+
else:
+
updated_str = "Unknown"
+
+
# Format summary
+
summary = doc.get("summary") or doc.get("content", "")
+
if summary:
+
# Remove HTML tags and truncate
+
import re
+
+
summary = re.sub(r"<[^>]+>", "", summary)
+
summary = summary.strip()
+
if len(summary) > 60:
+
summary = summary[:57] + "..."
+
else:
+
summary = ""
+
+
table.add_row(score, user_display, title, updated_str, summary)
+
+
console.print(table)
+
+
# Show additional info
+
console.print(f"\n[dim]Showing {len(hits)} of {found} results[/dim]")
+
if len(hits) < found:
+
console.print(
+
f"[dim]Use --limit to see more results (current limit: {len(hits)})[/dim]"
+
)
+
+
+
def _display_compact_results(results: dict, query: str) -> None:
+
"""Display search results in a compact format."""
+
hits = results.get("hits", [])
+
found = results.get("found", 0)
+
+
if not hits:
+
console.print("\n[yellow]No results found.[/yellow]")
+
return
+
+
console.print(f"\n[green]Found {found} results[/green]\n")
+
+
for i, hit in enumerate(hits, 1):
+
doc = hit["document"]
+
score = hit.get("text_match", 0)
+
+
# Header with score and user
+
user = doc.get("user_display_name", doc.get("username", "Unknown"))
+
console.print(
+
f"[green]{i:2d}.[/green] [cyan]{user}[/cyan] [dim](score: {score:.2f})[/dim]"
+
)
+
+
# Title
+
title = doc.get("title", "Untitled")
+
console.print(f" [bold]{title}[/bold]")
+
+
# Date and link
+
updated_timestamp = doc.get("updated", 0)
+
if updated_timestamp:
+
from datetime import datetime
+
+
updated_date = datetime.fromtimestamp(updated_timestamp)
+
updated_str = updated_date.strftime("%Y-%m-%d %H:%M")
+
else:
+
updated_str = "Unknown date"
+
+
link = doc.get("link", "")
+
console.print(f" [blue]{updated_str}[/blue] - [link={link}]{link}[/link]")
+
+
# Summary
+
summary = doc.get("summary") or doc.get("content", "")
+
if summary:
+
import re
+
+
summary = re.sub(r"<[^>]+>", "", summary)
+
summary = summary.strip()
+
if len(summary) > 150:
+
summary = summary[:147] + "..."
+
console.print(f" [dim]{summary}[/dim]")
+
+
console.print() # Empty line between results
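For orientation, this is roughly the raw Typesense query that the `search` command above maps onto, assuming the official `typesense` Python client; the `query_by` field list is an assumption here, since the real list lives inside `TypesenseClient.search`:

```python
import typesense

# Hypothetical connection details; substitute your own server and key.
client = typesense.Client({
    "nodes": [{"host": "localhost", "port": 8108, "protocol": "http"}],
    "api_key": "your-api-key",
    "connection_timeout_seconds": 10,
})

results = client.collections["thicket_entries"].documents.search({
    "q": "machine learning",
    "query_by": "title,content,summary",  # assumed searchable fields
    "per_page": 20,
    "filter_by": "username:avsm",         # optional, mirrors --user
})
print(results["found"], "hits in", results["search_time_ms"], "ms")
```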
+37 -7
src/thicket/cli/commands/sync.py
···
from typing import Optional
import typer
+
from pydantic import HttpUrl
from rich.progress import track
from ...core.feed_parser import FeedParser
from ...core.git_store import GitStore
+
from ...core.opml_generator import OPMLGenerator
from ..main import app
from ..utils import (
load_config,
···
user_updated_entries = 0
# Sync each feed for the user
-
for feed_url in track(user_metadata.feeds, description=f"Syncing {user_metadata.username}'s feeds"):
+
for feed_url in track(
+
user_metadata.feeds, description=f"Syncing {user_metadata.username}'s feeds"
+
):
try:
new_entries, updated_entries = asyncio.run(
sync_feed(git_store, user_metadata.username, feed_url, dry_run)
···
print_error(f"Failed to sync feed {feed_url}: {e}")
continue
-
print_info(f"User {user_metadata.username}: {user_new_entries} new, {user_updated_entries} updated")
+
print_info(
+
f"User {user_metadata.username}: {user_new_entries} new, {user_updated_entries} updated"
+
)
total_new_entries += user_new_entries
total_updated_entries += user_updated_entries
···
git_store.commit_changes(commit_message)
print_success(f"Committed changes: {commit_message}")
+
# Generate OPML file with all feeds
+
if not dry_run:
+
try:
+
opml_generator = OPMLGenerator()
+
index = git_store._load_index()
+
opml_path = config.git_store / "index.opml"
+
+
opml_generator.generate_opml(
+
users=index.users,
+
title="Thicket Feed Collection",
+
output_path=opml_path,
+
)
+
print_info(f"Generated OPML file: {opml_path}")
+
+
except Exception as e:
+
print_error(f"Failed to generate OPML file: {e}")
+
# Summary
if dry_run:
-
print_info(f"Dry run complete: would sync {total_new_entries} new entries, {total_updated_entries} updated")
+
print_info(
+
f"Dry run complete: would sync {total_new_entries} new entries, {total_updated_entries} updated"
+
)
else:
-
print_success(f"Sync complete: {total_new_entries} new entries, {total_updated_entries} updated")
+
print_success(
+
f"Sync complete: {total_new_entries} new entries, {total_updated_entries} updated"
+
)
-
async def sync_feed(git_store: GitStore, username: str, feed_url, dry_run: bool) -> tuple[int, int]:
+
async def sync_feed(
+
git_store: GitStore, username: str, feed_url: str, dry_run: bool
+
) -> tuple[int, int]:
"""Sync a single feed for a user."""
parser = FeedParser()
try:
# Fetch and parse feed
-
content = await parser.fetch_feed(feed_url)
-
metadata, entries = parser.parse_feed(content, feed_url)
+
validated_feed_url = HttpUrl(feed_url)
+
content = await parser.fetch_feed(validated_feed_url)
+
metadata, entries = parser.parse_feed(content, validated_feed_url)
new_entries = 0
updated_entries = 0
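The `HttpUrl(feed_url)` call above is the new validation step in `sync_feed`; a minimal sketch of what it does, assuming pydantic v2 (where URL types validate on construction):

```python
from pydantic import HttpUrl, ValidationError

ok = HttpUrl("https://blog.example.com/atom.xml")  # validates and normalises

try:
    HttpUrl("not-a-url")
except ValidationError as e:
    print(f"rejected before any network fetch: {e}")
```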
+323
src/thicket/cli/commands/upload.py
···
+
"""Upload command for thicket CLI."""
+
+
import logging
+
from pathlib import Path
+
from typing import Optional
+
+
import typer
+
from rich.console import Console
+
from rich.progress import Progress, SpinnerColumn, TextColumn
+
+
from ...core.git_store import GitStore
+
from ...core.typesense_client import TypesenseClient, TypesenseConfig
+
from ...models.config import ThicketConfig
+
from ..main import app
+
from ..utils import load_config
+
+
console = Console()
+
logger = logging.getLogger(__name__)
+
+
+
def _load_typesense_config() -> tuple[Optional[str], Optional[str]]:
+
"""Load Typesense URL and API key from ~/.typesense directory."""
+
typesense_dir = Path.home() / ".typesense"
+
url_file = typesense_dir / "url"
+
key_file = typesense_dir / "api_key"
+
+
url = None
+
api_key = None
+
+
try:
+
if url_file.exists():
+
url = url_file.read_text().strip()
+
except Exception as e:
+
logger.debug(f"Could not read Typesense URL from {url_file}: {e}")
+
+
try:
+
if key_file.exists():
+
api_key = key_file.read_text().strip()
+
except Exception as e:
+
logger.debug(f"Could not read Typesense API key from {key_file}: {e}")
+
+
return url, api_key
+
+
+
def _save_typesense_config(
+
url: Optional[str] = None, api_key: Optional[str] = None
+
) -> None:
+
"""Save Typesense URL and API key to ~/.typesense directory."""
+
typesense_dir = Path.home() / ".typesense"
+
typesense_dir.mkdir(exist_ok=True, mode=0o700) # Secure permissions
+
+
if url:
+
url_file = typesense_dir / "url"
+
url_file.write_text(url)
+
url_file.chmod(0o600)
+
+
if api_key:
+
key_file = typesense_dir / "api_key"
+
key_file.write_text(api_key)
+
key_file.chmod(0o600) # Keep API key secure
+
+
+
@app.command("upload")
+
def upload_command(
+
typesense_url: Optional[str] = typer.Option(
+
None,
+
"--typesense-url",
+
"-u",
+
help="Typesense server URL (e.g., http://localhost:8108). Defaults to ~/.typesense/url",
+
),
+
api_key: Optional[str] = typer.Option(
+
None,
+
"--api-key",
+
"-k",
+
help="Typesense API key. Defaults to ~/.typesense/api_key",
+
hide_input=True,
+
),
+
collection_name: str = typer.Option(
+
"thicket_entries",
+
"--collection",
+
"-c",
+
help="Typesense collection name",
+
),
+
config_path: Optional[str] = typer.Option(
+
None,
+
"--config",
+
"-C",
+
help="Path to thicket configuration file",
+
),
+
git_store_path: Optional[str] = typer.Option(
+
None,
+
"--git-store",
+
"-g",
+
help="Path to Git store (overrides config)",
+
),
+
timeout: int = typer.Option(
+
10,
+
"--timeout",
+
"-t",
+
help="Connection timeout in seconds",
+
),
+
dry_run: bool = typer.Option(
+
False,
+
"--dry-run",
+
help="Show what would be uploaded without actually uploading",
+
),
+
) -> None:
+
"""Upload thicket entries to a Typesense search engine.
+
+
This command uploads all entries from the Git store to a Typesense server
+
for full-text and semantic search capabilities. The uploaded data includes
+
entry content, metadata, user information, and searchable text fields
+
optimized for embedding-based queries.
+
+
Configuration defaults can be stored in the ~/.typesense/ directory:
+
- URL in ~/.typesense/url
+
- API key in ~/.typesense/api_key
+
+
Examples:
+
+
# Upload using saved defaults (first run will save config)
+
thicket upload -u http://localhost:8108 -k your-api-key
+
+
# Subsequent runs can omit URL and key if saved
+
thicket upload
+
+
# Upload to remote server with custom collection name
+
thicket upload -u https://search.example.com -k api-key -c my_blog_entries
+
+
# Dry run to see what would be uploaded
+
thicket upload --dry-run
+
"""
+
try:
+
# Load Typesense configuration from defaults if not provided
+
default_url, default_api_key = _load_typesense_config()
+
+
# Use provided values or defaults
+
final_url = typesense_url or default_url
+
final_api_key = api_key or default_api_key
+
+
# Check that we have required configuration
+
if not final_url:
+
console.print("[red]Error: Typesense URL is required[/red]")
+
console.print(
+
"Either provide --typesense-url or create ~/.typesense/url file"
+
)
+
raise typer.Exit(1)
+
+
if not final_api_key:
+
console.print("[red]Error: Typesense API key is required[/red]")
+
console.print(
+
"Either provide --api-key or create ~/.typesense/api_key file"
+
)
+
raise typer.Exit(1)
+
+
# Save configuration if provided via command line (for future use)
+
if typesense_url or api_key:
+
_save_typesense_config(typesense_url, api_key)
+
+
# Load thicket configuration
+
config_path_obj = Path(config_path) if config_path else None
+
config = load_config(config_path_obj)
+
+
# Override git store path if provided
+
if git_store_path:
+
config.git_store = Path(git_store_path)
+
+
console.print("[bold blue]Thicket Typesense Upload[/bold blue]")
+
console.print(f"Git store: {config.git_store}")
+
console.print(f"Typesense URL: {final_url}")
+
+
# Show where config is loaded from
+
if not typesense_url and default_url:
+
console.print("[dim] (URL loaded from ~/.typesense/url)[/dim]")
+
if not api_key and default_api_key:
+
console.print("[dim] (API key loaded from ~/.typesense/api_key)[/dim]")
+
+
console.print(f"Collection: {collection_name}")
+
+
if dry_run:
+
console.print("[yellow]DRY RUN MODE - No data will be uploaded[/yellow]")
+
+
# Initialize Git store
+
git_store = GitStore(config.git_store)
+
if not git_store.repo or not config.git_store.exists():
+
console.print("[red]Error: Git store is not valid or not initialized[/red]")
+
console.print("Run 'thicket init' first to set up the Git store.")
+
raise typer.Exit(1)
+
+
# Create Typesense configuration
+
typesense_config = TypesenseConfig.from_url(
+
final_url, final_api_key, collection_name
+
)
+
typesense_config.connection_timeout = timeout
+
+
if dry_run:
+
_dry_run_upload(git_store, config, typesense_config)
+
else:
+
_perform_upload(git_store, config, typesense_config)
+
+
except Exception as e:
+
logger.error(f"Upload failed: {e}")
+
console.print(f"[red]Error: {e}[/red]")
+
raise typer.Exit(1) from e
+
+
+
def _dry_run_upload(
+
git_store: GitStore, config: ThicketConfig, typesense_config: TypesenseConfig
+
) -> None:
+
"""Perform a dry run showing what would be uploaded."""
+
console.print("\n[bold]Dry run analysis:[/bold]")
+
+
index = git_store._load_index()
+
total_entries = 0
+
+
for username, user_metadata in index.users.items():
+
try:
+
user_dir = git_store.repo_path / user_metadata.directory
+
if not user_dir.exists():
+
console.print(f" โš ๏ธ User {username}: Directory not found")
+
continue
+
+
entry_files = list(user_dir.glob("*.json"))
+
total_entries += len(entry_files)
+
console.print(
+
f" โœ… User {username}: {len(entry_files)} entries would be uploaded"
+
)
+
except Exception as e:
+
console.print(f" โŒ User {username}: Error loading entries - {e}")
+
+
console.print("\n[bold]Summary:[/bold]")
+
console.print(f" โ€ข Total users: {len(index.users)}")
+
console.print(f" โ€ข Total entries to upload: {total_entries}")
+
console.print(f" โ€ข Target collection: {typesense_config.collection_name}")
+
console.print(
+
f" โ€ข Typesense server: {typesense_config.protocol}://{typesense_config.host}:{typesense_config.port}"
+
)
+
+
if total_entries > 0:
+
console.print("\n[green]Ready to upload! Remove --dry-run to proceed.[/green]")
+
else:
+
console.print("\n[yellow]No entries found to upload.[/yellow]")
+
+
+
def _perform_upload(
+
git_store: GitStore, config: ThicketConfig, typesense_config: TypesenseConfig
+
) -> None:
+
"""Perform the actual upload to Typesense."""
+
with Progress(
+
SpinnerColumn(),
+
TextColumn("[progress.description]{task.description}"),
+
console=console,
+
) as progress:
+
# Test connection
+
progress.add_task("Testing Typesense connection...", total=None)
+
+
try:
+
typesense_client = TypesenseClient(typesense_config)
+
# Test connection by attempting to list collections
+
typesense_client.client.collections.retrieve()
+
progress.stop()
+
console.print("[green]โœ… Connected to Typesense server[/green]")
+
except Exception as e:
+
progress.stop()
+
console.print(f"[red]โŒ Failed to connect to Typesense: {e}[/red]")
+
raise typer.Exit(1) from e
+
+
# Perform upload
+
with Progress(
+
SpinnerColumn(),
+
TextColumn("[progress.description]{task.description}"),
+
console=console,
+
) as upload_progress:
+
upload_progress.add_task("Uploading entries to Typesense...", total=None)
+
+
try:
+
result = typesense_client.upload_from_git_store(git_store, config)
+
upload_progress.stop()
+
+
# Parse results if available
+
if result:
+
if isinstance(result, list):
+
# Batch import results
+
success_count = sum(1 for r in result if r.get("success"))
+
total_count = len(result)
+
console.print(
+
f"[green]โœ… Upload completed: {success_count}/{total_count} documents uploaded successfully[/green]"
+
)
+
+
# Show any errors
+
errors = [r for r in result if not r.get("success")]
+
if errors:
+
console.print(
+
f"[yellow]โš ๏ธ {len(errors)} documents had errors[/yellow]"
+
)
+
for i, error in enumerate(
+
errors[:5]
+
): # Show first 5 errors
+
console.print(f" Error {i + 1}: {error}")
+
if len(errors) > 5:
+
console.print(
+
f" ... and {len(errors) - 5} more errors"
+
)
+
else:
+
console.print("[green]โœ… Upload completed successfully[/green]")
+
else:
+
console.print(
+
"[yellow]โš ๏ธ Upload completed but no result data available[/yellow]"
+
)
+
+
console.print("\n[bold]Collection information:[/bold]")
+
console.print(
+
f" โ€ข Server: {typesense_config.protocol}://{typesense_config.host}:{typesense_config.port}"
+
)
+
console.print(f" โ€ข Collection: {typesense_config.collection_name}")
+
console.print(
+
"\n[dim]You can now search your entries using the Typesense API or dashboard.[/dim]"
+
)
+
+
except Exception as e:
+
upload_progress.stop()
+
console.print(f"[red]โŒ Upload failed: {e}[/red]")
+
raise typer.Exit(1) from e
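The result parsing in `_perform_upload` assumes Typesense's bulk import format: one dict per document with a `success` flag (and an `error` message on failure). A sketch, with illustrative document fields drawn from `TypesenseDocument`:

```python
import typesense

client = typesense.Client({
    "nodes": [{"host": "localhost", "port": 8108, "protocol": "http"}],
    "api_key": "your-api-key",
})

results = client.collections["thicket_entries"].documents.import_(
    [{
        "id": "2024_01_hello-world",                        # sanitized entry ID
        "original_id": "tag:blog.example.com,2024:post-1",  # original Atom ID
        "title": "Hello World",
        "link": "https://blog.example.com/2024/01/hello-world",
        "updated": 1700000000,                              # unix timestamp
    }],
    {"action": "upsert"},
)
# e.g. [{"success": True}] or [{"success": False, "error": "..."}]
print(sum(1 for r in results if r.get("success")), "of", len(results), "imported")
```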
+270
src/thicket/cli/commands/zulip.py
···
+
"""Zulip association management commands for thicket."""
+
+
from pathlib import Path
+
from typing import Optional
+
+
import typer
+
from rich.console import Console
+
from rich.table import Table
+
+
from ...core.git_store import GitStore
+
from ..main import app
+
from ..utils import load_config, print_error, print_info, print_success
+
+
console = Console()
+
+
+
@app.command()
+
def zulip_add(
+
username: str = typer.Argument(..., help="Username to associate with Zulip"),
+
server: str = typer.Argument(
+
..., help="Zulip server (e.g., yourorg.zulipchat.com)"
+
),
+
user_id: str = typer.Argument(..., help="Zulip user ID or email for @mentions"),
+
config_file: Path = typer.Option(
+
Path("thicket.yaml"),
+
"--config",
+
"-c",
+
help="Path to thicket configuration file",
+
),
+
) -> None:
+
"""Add a Zulip association for a user.
+
+
This associates a thicket user with their Zulip identity, enabling
+
@mentions when the bot posts their articles.
+
+
Example:
+
thicket zulip-add alice myorg.zulipchat.com alice@example.com
+
"""
+
try:
+
config = load_config(config_file)
+
git_store = GitStore(config.git_store)
+
+
# Check if user exists
+
user = git_store.get_user(username)
+
if not user:
+
print_error(f"User '{username}' not found")
+
raise typer.Exit(1)
+
+
# Add association
+
if git_store.add_zulip_association(username, server, user_id):
+
print_success(f"Added Zulip association for {username}: {user_id}@{server}")
+
git_store.commit_changes(f"Add Zulip association for {username}")
+
else:
+
print_info(f"Association already exists for {username}: {user_id}@{server}")
+
+
except Exception as e:
+
print_error(f"Failed to add Zulip association: {e}")
+
raise typer.Exit(1) from e
+
+
+
@app.command()
+
def zulip_remove(
+
username: str = typer.Argument(..., help="Username to remove association from"),
+
server: str = typer.Argument(..., help="Zulip server"),
+
user_id: str = typer.Argument(..., help="Zulip user ID or email"),
+
config_file: Path = typer.Option(
+
Path("thicket.yaml"),
+
"--config",
+
"-c",
+
help="Path to thicket configuration file",
+
),
+
) -> None:
+
"""Remove a Zulip association from a user.
+
+
Example:
+
thicket zulip-remove alice myorg.zulipchat.com alice@example.com
+
"""
+
try:
+
config = load_config(config_file)
+
git_store = GitStore(config.git_store)
+
+
# Check if user exists
+
user = git_store.get_user(username)
+
if not user:
+
print_error(f"User '{username}' not found")
+
raise typer.Exit(1)
+
+
# Remove association
+
if git_store.remove_zulip_association(username, server, user_id):
+
print_success(
+
f"Removed Zulip association for {username}: {user_id}@{server}"
+
)
+
git_store.commit_changes(f"Remove Zulip association for {username}")
+
else:
+
print_error(f"Association not found for {username}: {user_id}@{server}")
+
raise typer.Exit(1)
+
+
except Exception as e:
+
print_error(f"Failed to remove Zulip association: {e}")
+
raise typer.Exit(1) from e
+
+
+
@app.command()
+
def zulip_list(
+
username: Optional[str] = typer.Argument(
+
None, help="Username to list associations for"
+
),
+
config_file: Path = typer.Option(
+
Path("thicket.yaml"),
+
"--config",
+
"-c",
+
help="Path to thicket configuration file",
+
),
+
) -> None:
+
"""List Zulip associations for users.
+
+
If no username is provided, lists associations for all users.
+
+
Examples:
+
thicket zulip-list # List all associations
+
thicket zulip-list alice # List associations for alice
+
"""
+
try:
+
config = load_config(config_file)
+
git_store = GitStore(config.git_store)
+
+
# Create table
+
table = Table(title="Zulip Associations")
+
table.add_column("Username", style="cyan")
+
table.add_column("Server", style="green")
+
table.add_column("User ID", style="yellow")
+
+
if username:
+
# List for specific user
+
user = git_store.get_user(username)
+
if not user:
+
print_error(f"User '{username}' not found")
+
raise typer.Exit(1)
+
+
if not user.zulip_associations:
+
print_info(f"No Zulip associations for {username}")
+
return
+
+
for assoc in user.zulip_associations:
+
table.add_row(username, assoc.server, assoc.user_id)
+
else:
+
# List for all users
+
index = git_store._load_index()
+
has_associations = False
+
+
for username, user in index.users.items():
+
for assoc in user.zulip_associations:
+
table.add_row(username, assoc.server, assoc.user_id)
+
has_associations = True
+
+
if not has_associations:
+
print_info("No Zulip associations found")
+
return
+
+
console.print(table)
+
+
except Exception as e:
+
print_error(f"Failed to list Zulip associations: {e}")
+
raise typer.Exit(1) from e
+
+
+
@app.command()
+
def zulip_import(
+
csv_file: Path = typer.Argument(..., help="CSV file with username,server,user_id"),
+
config_file: Path = typer.Option(
+
Path("thicket.yaml"),
+
"--config",
+
"-c",
+
help="Path to thicket configuration file",
+
),
+
dry_run: bool = typer.Option(
+
False,
+
"--dry-run",
+
help="Show what would be imported without making changes",
+
),
+
) -> None:
+
"""Import Zulip associations from a CSV file.
+
+
CSV format (no header):
+
username,server,user_id
+
alice,myorg.zulipchat.com,alice@example.com
+
bob,myorg.zulipchat.com,bob.smith
+
+
Example:
+
thicket zulip-import associations.csv
+
"""
+
import csv
+
+
try:
+
config = load_config(config_file)
+
git_store = GitStore(config.git_store)
+
+
if not csv_file.exists():
+
print_error(f"CSV file not found: {csv_file}")
+
raise typer.Exit(1)
+
+
added = 0
+
skipped = 0
+
errors = 0
+
+
with open(csv_file) as f:
+
reader = csv.reader(f)
+
for row_num, row in enumerate(reader, 1):
+
if not row:
+
continue  # skip blank rows (e.g. a trailing newline)
+
if len(row) != 3:
+
print_error(f"Line {row_num}: Invalid format (expected 3 columns)")
+
errors += 1
+
continue
+
+
username, server, user_id = [col.strip() for col in row]
+
+
# Skip rows with an empty username
+
if not username:
+
continue
+
+
# Check if user exists
+
user = git_store.get_user(username)
+
if not user:
+
print_error(f"Line {row_num}: User '{username}' not found")
+
errors += 1
+
continue
+
+
if dry_run:
+
# Check if association would be added
+
exists = any(
+
a.server == server and a.user_id == user_id
+
for a in user.zulip_associations
+
)
+
if exists:
+
print_info(
+
f"Would skip existing: {username} -> {user_id}@{server}"
+
)
+
skipped += 1
+
else:
+
print_info(f"Would add: {username} -> {user_id}@{server}")
+
added += 1
+
else:
+
# Actually add association
+
if git_store.add_zulip_association(username, server, user_id):
+
print_success(f"Added: {username} -> {user_id}@{server}")
+
added += 1
+
else:
+
print_info(
+
f"Skipped existing: {username} -> {user_id}@{server}"
+
)
+
skipped += 1
+
+
# Summary
+
console.print()
+
if dry_run:
+
console.print("[bold]Dry run summary:[/bold]")
+
console.print(f" Would add: {added}")
+
else:
+
console.print("[bold]Import summary:[/bold]")
+
console.print(f" Added: {added}")
+
if not dry_run and added > 0:
+
git_store.commit_changes(f"Import {added} Zulip associations from CSV")
+
+
console.print(f" Skipped: {skipped}")
+
console.print(f" Errors: {errors}")
+
+
except Exception as e:
+
print_error(f"Failed to import Zulip associations: {e}")
+
raise typer.Exit(1) from e
+11 -1
src/thicket/cli/main.py
···
# Import commands to register them
-
from .commands import add, duplicates, index_cmd, info_cmd, init, links_cmd, list_cmd, sync
+
from .commands import ( # noqa: F401, E402
+
add,
+
duplicates,
+
info_cmd,
+
init,
+
list_cmd,
+
search,
+
sync,
+
upload,
+
zulip,
+
)
if __name__ == "__main__":
app()
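Command modules register themselves with the shared Typer app as an import side effect, which is why each new module (search, upload, zulip) must appear in this import list; a minimal sketch of the pattern:

```python
import typer

app = typer.Typer()

@app.command("hello")  # registration runs when the module is imported
def hello(name: str) -> None:
    print(f"Hello, {name}")

if __name__ == "__main__":
    app()
```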
+32 -20
src/thicket/cli/utils.py
···
from rich.progress import Progress, SpinnerColumn, TextColumn
from rich.table import Table
-
from ..models import ThicketConfig, UserMetadata
from ..core.git_store import GitStore
+
from ..models import ThicketConfig, UserMetadata
console = Console()
···
def get_tsv_mode() -> bool:
"""Get the global TSV mode setting."""
from .main import tsv_mode
+
return tsv_mode
···
default_config = Path("thicket.yaml")
if default_config.exists():
import yaml
+
with open(default_config) as f:
config_data = yaml.safe_load(f)
return ThicketConfig(**config_data)
-
+
# Fall back to environment variables
return ThicketConfig()
except Exception as e:
console.print(f"[red]Error loading configuration: {e}[/red]")
-
console.print("[yellow]Run 'thicket init' to create a new configuration.[/yellow]")
+
console.print(
+
"[yellow]Run 'thicket init' to create a new configuration.[/yellow]"
+
)
raise typer.Exit(1) from e
···
if get_tsv_mode():
print_users_tsv(config)
return
-
+
table = Table(title="Users and Feeds")
table.add_column("Username", style="cyan", no_wrap=True)
table.add_column("Display Name", style="magenta")
···
if get_tsv_mode():
print_feeds_tsv(config, username)
return
-
+
table = Table(title=f"Feeds{f' for {username}' if username else ''}")
table.add_column("Username", style="cyan", no_wrap=True)
table.add_column("Feed URL", style="blue")
···
if get_tsv_mode():
print_users_tsv_from_git(users)
return
-
+
table = Table(title="Users and Feeds")
table.add_column("Username", style="cyan", no_wrap=True)
table.add_column("Display Name", style="magenta")
···
console.print(table)
-
def print_feeds_table_from_git(git_store: GitStore, username: Optional[str] = None) -> None:
+
def print_feeds_table_from_git(
+
git_store: GitStore, username: Optional[str] = None
+
) -> None:
"""Print a table of feeds from git repository."""
if get_tsv_mode():
print_feeds_tsv_from_git(git_store, username)
return
-
+
table = Table(title=f"Feeds{f' for {username}' if username else ''}")
table.add_column("Username", style="cyan", no_wrap=True)
table.add_column("Feed URL", style="blue")
···
print("Username\tDisplay Name\tEmail\tHomepage\tFeeds")
for user in config.users:
feeds_str = ",".join(str(feed) for feed in user.feeds)
-
print(f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}")
+
print(
+
f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}"
+
)
def print_users_tsv_from_git(users: list[UserMetadata]) -> None:
···
print("Username\tDisplay Name\tEmail\tHomepage\tFeeds")
for user in users:
feeds_str = ",".join(user.feeds)
-
print(f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}")
+
print(
+
f"{user.username}\t{user.display_name or ''}\t{user.email or ''}\t{user.homepage or ''}\t{feeds_str}"
+
)
def print_feeds_tsv(config: ThicketConfig, username: Optional[str] = None) -> None:
···
print("Username\tFeed URL\tStatus")
users = [config.find_user(username)] if username else config.users
users = [u for u in users if u is not None]
-
+
for user in users:
for feed in user.feeds:
print(f"{user.username}\t{feed}\tActive")
-
def print_feeds_tsv_from_git(git_store: GitStore, username: Optional[str] = None) -> None:
+
def print_feeds_tsv_from_git(
+
git_store: GitStore, username: Optional[str] = None
+
) -> None:
"""Print feeds from git repository in TSV format."""
print("Username\tFeed URL\tStatus")
-
+
if username:
user = git_store.get_user(username)
users = [user] if user else []
else:
index = git_store._load_index()
users = list(index.users.values())
-
+
for user in users:
for feed in user.feeds:
print(f"{user.username}\t{feed}\tActive")
···
def print_entries_tsv(entries_by_user: list[list], usernames: list[str]) -> None:
"""Print entries in TSV format."""
print("User\tAtom ID\tTitle\tUpdated\tURL")
-
+
# Combine all entries with usernames
all_entries = []
for entries, username in zip(entries_by_user, usernames):
for entry in entries:
all_entries.append((username, entry))
-
+
# Sort by updated time (newest first)
all_entries.sort(key=lambda x: x[1].updated, reverse=True)
-
+
for username, entry in all_entries:
# Format updated time
updated_str = entry.updated.strftime("%Y-%m-%d %H:%M")
-
+
# Escape tabs and newlines in title to preserve TSV format
-
title = entry.title.replace('\t', ' ').replace('\n', ' ').replace('\r', ' ')
-
+
title = entry.title.replace("\t", " ").replace("\n", " ").replace("\r", " ")
+
print(f"{username}\t{entry.id}\t{title}\t{updated_str}\t{entry.link}")
+84 -55
src/thicket/core/feed_parser.py
···
"""Initialize the feed parser."""
self.user_agent = user_agent
self.allowed_tags = [
-
"a", "abbr", "acronym", "b", "blockquote", "br", "code", "em",
-
"i", "li", "ol", "p", "pre", "strong", "ul", "h1", "h2", "h3",
-
"h4", "h5", "h6", "img", "div", "span",
+
"a",
+
"abbr",
+
"acronym",
+
"b",
+
"blockquote",
+
"br",
+
"code",
+
"em",
+
"i",
+
"li",
+
"ol",
+
"p",
+
"pre",
+
"strong",
+
"ul",
+
"h1",
+
"h2",
+
"h3",
+
"h4",
+
"h5",
+
"h6",
+
"img",
+
"div",
+
"span",
]
self.allowed_attributes = {
"a": ["href", "title"],
···
response.raise_for_status()
return response.text
-
def parse_feed(self, content: str, source_url: Optional[HttpUrl] = None) -> tuple[FeedMetadata, list[AtomEntry]]:
+
def parse_feed(
+
self, content: str, source_url: Optional[HttpUrl] = None
+
) -> tuple[FeedMetadata, list[AtomEntry]]:
"""Parse feed content and return metadata and entries."""
parsed = feedparser.parse(content)
···
author_email = None
author_uri = None
-
if hasattr(feed, 'author_detail'):
-
author_name = feed.author_detail.get('name')
-
author_email = feed.author_detail.get('email')
-
author_uri = feed.author_detail.get('href')
-
elif hasattr(feed, 'author'):
+
if hasattr(feed, "author_detail"):
+
author_name = feed.author_detail.get("name")
+
author_email = feed.author_detail.get("email")
+
author_uri = feed.author_detail.get("href")
+
elif hasattr(feed, "author"):
author_name = feed.author
# Parse managing editor for RSS feeds
-
if not author_email and hasattr(feed, 'managingEditor'):
+
if not author_email and hasattr(feed, "managingEditor"):
author_email = feed.managingEditor
# Parse feed link
feed_link = None
-
if hasattr(feed, 'link'):
+
if hasattr(feed, "link"):
try:
feed_link = HttpUrl(feed.link)
except ValidationError:
···
icon = None
image_url = None
-
if hasattr(feed, 'image'):
+
if hasattr(feed, "image"):
try:
-
image_url = HttpUrl(feed.image.get('href', feed.image.get('url', '')))
+
image_url = HttpUrl(feed.image.get("href", feed.image.get("url", "")))
except (ValidationError, AttributeError):
pass
-
if hasattr(feed, 'icon'):
+
if hasattr(feed, "icon"):
try:
icon = HttpUrl(feed.icon)
except ValidationError:
pass
-
if hasattr(feed, 'logo'):
+
if hasattr(feed, "logo"):
try:
logo = HttpUrl(feed.logo)
except ValidationError:
pass
return FeedMetadata(
-
title=getattr(feed, 'title', None),
+
title=getattr(feed, "title", None),
author_name=author_name,
author_email=author_email,
author_uri=HttpUrl(author_uri) if author_uri else None,
···
logo=logo,
icon=icon,
image_url=image_url,
-
description=getattr(feed, 'description', None),
+
description=getattr(feed, "description", None),
)
-
def _normalize_entry(self, entry: feedparser.FeedParserDict, source_url: Optional[HttpUrl] = None) -> AtomEntry:
+
def _normalize_entry(
+
self, entry: feedparser.FeedParserDict, source_url: Optional[HttpUrl] = None
+
) -> AtomEntry:
"""Normalize an entry to Atom format."""
# Parse timestamps
-
updated = self._parse_timestamp(entry.get('updated_parsed') or entry.get('published_parsed'))
-
published = self._parse_timestamp(entry.get('published_parsed'))
+
updated = self._parse_timestamp(
+
entry.get("updated_parsed") or entry.get("published_parsed")
+
)
+
published = self._parse_timestamp(entry.get("published_parsed"))
# Parse content
content = self._extract_content(entry)
···
# Parse categories/tags
categories = []
-
if hasattr(entry, 'tags'):
-
categories = [tag.get('term', '') for tag in entry.tags if tag.get('term')]
+
if hasattr(entry, "tags"):
+
categories = [tag.get("term", "") for tag in entry.tags if tag.get("term")]
# Sanitize HTML content
if content:
content = self._sanitize_html(content)
-
summary = entry.get('summary', '')
+
summary = entry.get("summary", "")
if summary:
summary = self._sanitize_html(summary)
return AtomEntry(
-
id=entry.get('id', entry.get('link', '')),
-
title=entry.get('title', ''),
-
link=HttpUrl(entry.get('link', '')),
+
id=entry.get("id", entry.get("link", "")),
+
title=entry.get("title", ""),
+
link=HttpUrl(entry.get("link", "")),
updated=updated,
published=published,
summary=summary or None,
···
content_type=content_type,
author=author,
categories=categories,
-
rights=entry.get('rights', None),
+
rights=entry.get("rights", None),
source=str(source_url) if source_url else None,
)
···
def _extract_content(self, entry: feedparser.FeedParserDict) -> Optional[str]:
"""Extract the best content from an entry."""
# Prefer content over summary
-
if hasattr(entry, 'content') and entry.content:
+
if hasattr(entry, "content") and entry.content:
# Find the best content (prefer text/html, then text/plain)
for content_item in entry.content:
-
if content_item.get('type') in ['text/html', 'html']:
-
return content_item.get('value', '')
-
elif content_item.get('type') in ['text/plain', 'text']:
-
return content_item.get('value', '')
+
if content_item.get("type") in ["text/html", "html"]:
+
return content_item.get("value", "")
+
elif content_item.get("type") in ["text/plain", "text"]:
+
return content_item.get("value", "")
# Fallback to first content item
-
return entry.content[0].get('value', '')
+
return entry.content[0].get("value", "")
# Fallback to summary
-
return entry.get('summary', '')
+
return entry.get("summary", "")
def _extract_content_type(self, entry: feedparser.FeedParserDict) -> str:
"""Extract content type from entry."""
-
if hasattr(entry, 'content') and entry.content:
-
content_type = entry.content[0].get('type', 'html')
+
if hasattr(entry, "content") and entry.content:
+
content_type = entry.content[0].get("type", "html")
# Normalize content type
-
if content_type in ['text/html', 'html']:
-
return 'html'
-
elif content_type in ['text/plain', 'text']:
-
return 'text'
-
elif content_type == 'xhtml':
-
return 'xhtml'
-
return 'html'
+
if content_type in ["text/html", "html"]:
+
return "html"
+
elif content_type in ["text/plain", "text"]:
+
return "text"
+
elif content_type == "xhtml":
+
return "xhtml"
+
return "html"
def _extract_author(self, entry: feedparser.FeedParserDict) -> Optional[dict]:
"""Extract author information from entry."""
author = {}
-
if hasattr(entry, 'author_detail'):
-
author.update({
-
'name': entry.author_detail.get('name'),
-
'email': entry.author_detail.get('email'),
-
'uri': entry.author_detail.get('href'),
-
})
-
elif hasattr(entry, 'author'):
-
author['name'] = entry.author
+
if hasattr(entry, "author_detail"):
+
author.update(
+
{
+
"name": entry.author_detail.get("name"),
+
"email": entry.author_detail.get("email"),
+
"uri": entry.author_detail.get("href"),
+
}
+
)
+
elif hasattr(entry, "author"):
+
author["name"] = entry.author
return author if author else None
···
# Start with the path component
if parsed.path:
# Remove leading slash and replace problematic characters
-
safe_id = parsed.path.lstrip('/').replace('/', '_').replace('\\', '_')
+
safe_id = parsed.path.lstrip("/").replace("/", "_").replace("\\", "_")
else:
# Use the entire ID as fallback
safe_id = entry_id
···
# Replace problematic characters
safe_chars = []
for char in safe_id:
-
if char.isalnum() or char in '-_.':
+
if char.isalnum() or char in "-_.":
safe_chars.append(char)
else:
-
safe_chars.append('_')
+
safe_chars.append("_")
-
safe_id = ''.join(safe_chars)
+
safe_id = "".join(safe_chars)
# Ensure it's not too long (max 200 chars)
if len(safe_id) > 200:
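A condensed, standalone sketch of the sanitisation rules in the hunk above (take the URL path, turn path separators into underscores, map anything outside alphanumerics and `-_.` to underscores, cap at 200 characters):

```python
from urllib.parse import urlparse

def sanitize(entry_id: str) -> str:
    parsed = urlparse(entry_id)
    # Prefer the path component; fall back to the whole ID
    safe = (
        parsed.path.lstrip("/").replace("/", "_").replace("\\", "_")
        if parsed.path
        else entry_id
    )
    safe = "".join(c if c.isalnum() or c in "-_." else "_" for c in safe)
    return safe[:200]  # keep filenames a manageable length

print(sanitize("https://blog.example.com/2024/01/hello-world"))
# -> 2024_01_hello-world
```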
+107 -18
src/thicket/core/git_store.py
···
"""Save the index to index.json."""
index_path = self.repo_path / "index.json"
with open(index_path, "w") as f:
-
json.dump(index.model_dump(mode="json", exclude_none=True), f, indent=2, default=str)
+
json.dump(
+
index.model_dump(mode="json", exclude_none=True),
+
f,
+
indent=2,
+
default=str,
+
)
def _load_index(self) -> GitStoreIndex:
"""Load the index from index.json."""
···
return DuplicateMap(**data)
-
def add_user(self, username: str, display_name: Optional[str] = None,
-
email: Optional[str] = None, homepage: Optional[str] = None,
-
icon: Optional[str] = None, feeds: Optional[list[str]] = None) -> UserMetadata:
+
def add_user(
+
self,
+
username: str,
+
display_name: Optional[str] = None,
+
email: Optional[str] = None,
+
homepage: Optional[str] = None,
+
icon: Optional[str] = None,
+
feeds: Optional[list[str]] = None,
+
) -> UserMetadata:
"""Add a new user to the Git store."""
index = self._load_index()
···
created=datetime.now(),
last_updated=datetime.now(),
)
-
# Update index
index.add_user(user_metadata)
···
user.update_timestamp()
-
# Update index
index.add_user(user)
self._save_index(index)
return True
+
def add_zulip_association(self, username: str, server: str, user_id: str) -> bool:
+
"""Add a Zulip association to a user."""
+
index = self._load_index()
+
user = index.get_user(username)
+
+
if not user:
+
return False
+
+
result = user.add_zulip_association(server, user_id)
+
if result:
+
index.add_user(user)
+
self._save_index(index)
+
+
return result
+
+
def remove_zulip_association(
+
self, username: str, server: str, user_id: str
+
) -> bool:
+
"""Remove a Zulip association from a user."""
+
index = self._load_index()
+
user = index.get_user(username)
+
+
if not user:
+
return False
+
+
result = user.remove_zulip_association(server, user_id)
+
if result:
+
index.add_user(user)
+
self._save_index(index)
+
+
return result
+
+
def get_zulip_associations(self, username: str) -> list:
+
"""Get all Zulip associations for a user."""
+
user = self.get_user(username)
+
if user:
+
return user.zulip_associations
+
return []
+
def store_entry(self, username: str, entry: AtomEntry) -> bool:
"""Store an entry in the user's directory."""
user = self.get_user(username)
···
# Sanitize entry ID for filename
from .feed_parser import FeedParser
+
parser = FeedParser()
safe_id = parser.sanitize_entry_id(entry.id)
···
# Save entry
with open(entry_path, "w") as f:
-
json.dump(entry.model_dump(mode="json", exclude_none=True), f, indent=2, default=str)
+
json.dump(
+
entry.model_dump(mode="json", exclude_none=True),
+
f,
+
indent=2,
+
default=str,
+
)
# Update user metadata if new entry
if not entry_exists:
···
# Sanitize entry ID
from .feed_parser import FeedParser
+
parser = FeedParser()
safe_id = parser.sanitize_entry_id(entry_id)
···
return AtomEntry(**data)
-
def list_entries(self, username: str, limit: Optional[int] = None) -> list[AtomEntry]:
+
def list_entries(
+
self, username: str, limit: Optional[int] = None
+
) -> list[AtomEntry]:
"""List entries for a user."""
user = self.get_user(username)
if not user:
···
return []
entries = []
-
entry_files = sorted(user_dir.glob("*.json"), key=lambda p: p.stat().st_mtime, reverse=True)
-
+
entry_files = sorted(
+
user_dir.glob("*.json"), key=lambda p: p.stat().st_mtime, reverse=True
+
)
if limit:
entry_files = entry_files[:limit]
···
"total_entries": index.total_entries,
"total_duplicates": len(duplicates.duplicates),
"last_updated": index.last_updated,
-
"repository_size": sum(f.stat().st_size for f in self.repo_path.rglob("*") if f.is_file()),
+
"repository_size": sum(
+
f.stat().st_size for f in self.repo_path.rglob("*") if f.is_file()
+
),
}
-
def search_entries(self, query: str, username: Optional[str] = None,
-
limit: Optional[int] = None) -> list[tuple[str, AtomEntry]]:
+
def search_entries(
+
self, query: str, username: Optional[str] = None, limit: Optional[int] = None
+
) -> list[tuple[str, AtomEntry]]:
"""Search entries by content."""
results = []
···
entry = AtomEntry(**data)
# Simple text search in title, summary, and content
-
searchable_text = " ".join(filter(None, [
-
entry.title,
-
entry.summary or "",
-
entry.content or "",
-
])).lower()
+
searchable_text = " ".join(
+
filter(
+
None,
+
[
+
entry.title,
+
entry.summary or "",
+
entry.content or "",
+
],
+
)
+
).lower()
if query.lower() in searchable_text:
results.append((user.username, entry))
···
results.sort(key=lambda x: x[1].updated, reverse=True)
return results[:limit] if limit else results
+
+
def list_users(self) -> list[str]:
+
"""Get list of all usernames in the git store."""
+
index = self._load_index()
+
return list(index.users.keys())
+
+
def get_user_feeds(self, username: str) -> list[str]:
+
"""Get list of feed URLs for a specific user from their metadata."""
+
user = self.get_user(username)
+
if not user:
+
return []
+
+
# Feed URLs are stored in the user metadata
+
return user.feeds
+
+
def list_all_users_with_feeds(self) -> list[tuple[str, list[str]]]:
+
"""Get all users and their feed URLs."""
+
result = []
+
for username in self.list_users():
+
feeds = self.get_user_feeds(username)
+
if feeds: # Only include users that have feeds configured
+
result.append((username, feeds))
+
return result
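A sketch of the new read-only helpers in use, e.g. from a bot that needs every user's feeds and Zulip identities (the repository path is illustrative):

```python
from pathlib import Path

from thicket.core.git_store import GitStore

store = GitStore(Path.home() / "thicket-store")  # hypothetical location
for username, feeds in store.list_all_users_with_feeds():
    assocs = store.get_zulip_associations(username)
    print(f"{username}: {len(feeds)} feeds, {len(assocs)} Zulip associations")
```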
+166
src/thicket/core/opml_generator.py
···
+
"""OPML generation for thicket."""
+
+
import xml.etree.ElementTree as ET
+
from datetime import datetime, timezone
+
from pathlib import Path
+
from typing import Optional
+
from xml.dom import minidom
+
+
from ..models import UserMetadata
+
+
+
class OPMLGenerator:
+
"""Generates OPML files from feed collections."""
+
+
def __init__(self) -> None:
+
"""Initialize the OPML generator."""
+
pass
+
+
def generate_opml(
+
self,
+
users: dict[str, UserMetadata],
+
title: str = "Thicket Feeds",
+
output_path: Optional[Path] = None,
+
) -> str:
+
"""Generate OPML XML content from user metadata.
+
+
Args:
+
users: Dictionary of username -> UserMetadata
+
title: Title for the OPML file
+
output_path: Optional path to write the OPML file
+
+
Returns:
+
OPML XML content as string
+
"""
+
# Create root OPML element
+
opml = ET.Element("opml", version="2.0")
+
+
# Create head section
+
head = ET.SubElement(opml, "head")
+
title_elem = ET.SubElement(head, "title")
+
title_elem.text = title
+
+
date_created = ET.SubElement(head, "dateCreated")
+
date_created.text = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S %z")
+
+
date_modified = ET.SubElement(head, "dateModified")
+
date_modified.text = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S %z")
+
+
# Create body section
+
body = ET.SubElement(opml, "body")
+
+
# Add each user as an outline with their feeds as sub-outlines
+
for username, user_metadata in sorted(users.items()):
+
user_outline = ET.SubElement(body, "outline")
+
user_outline.set("text", user_metadata.display_name or username)
+
user_outline.set("title", user_metadata.display_name or username)
+
+
# Add user metadata as attributes if available
+
if user_metadata.homepage:
+
user_outline.set("htmlUrl", user_metadata.homepage)
+
if user_metadata.email:
+
user_outline.set("email", user_metadata.email)
+
+
# Add each feed as a sub-outline
+
for feed_url in sorted(user_metadata.feeds):
+
feed_outline = ET.SubElement(user_outline, "outline")
+
feed_outline.set("type", "rss")
+
feed_outline.set("text", feed_url)
+
feed_outline.set("title", feed_url)
+
feed_outline.set("xmlUrl", feed_url)
+
feed_outline.set("htmlUrl", feed_url)
+
+
# Convert to pretty-printed XML string
+
xml_str = self._prettify_xml(opml)
+
+
# Write to file if path provided
+
if output_path:
+
output_path.write_text(xml_str, encoding="utf-8")
+
+
return xml_str
+
+
def _prettify_xml(self, elem: ET.Element) -> str:
+
"""Return a pretty-printed XML string for the Element."""
+
rough_string = ET.tostring(elem, encoding="unicode")
+
reparsed = minidom.parseString(rough_string)
+
return reparsed.toprettyxml(indent=" ")
+
+
def generate_flat_opml(
+
self,
+
users: dict[str, UserMetadata],
+
title: str = "Thicket Feeds (Flat)",
+
output_path: Optional[Path] = None,
+
) -> str:
+
"""Generate a flat OPML file with all feeds at the top level.
+
+
This format may be more compatible with some feed readers.
+
+
Args:
+
users: Dictionary of username -> UserMetadata
+
title: Title for the OPML file
+
output_path: Optional path to write the OPML file
+
+
Returns:
+
OPML XML content as string
+
"""
+
# Create root OPML element
+
opml = ET.Element("opml", version="2.0")
+
+
# Create head section
+
head = ET.SubElement(opml, "head")
+
title_elem = ET.SubElement(head, "title")
+
title_elem.text = title
+
+
date_created = ET.SubElement(head, "dateCreated")
+
date_created.text = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S %z")
+
+
date_modified = ET.SubElement(head, "dateModified")
+
date_modified.text = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S %z")
+
+
# Create body section
+
body = ET.SubElement(opml, "body")
+
+
# Collect all feeds with their associated user info
+
all_feeds = []
+
for username, user_metadata in users.items():
+
for feed_url in user_metadata.feeds:
+
all_feeds.append(
+
{
+
"url": feed_url,
+
"username": username,
+
"display_name": user_metadata.display_name or username,
+
"homepage": user_metadata.homepage,
+
}
+
)
+
+
# Sort feeds by URL for consistency
+
all_feeds.sort(key=lambda f: f["url"] or "")
+
+
# Add each feed as a top-level outline
+
for feed_info in all_feeds:
+
feed_outline = ET.SubElement(body, "outline")
+
feed_outline.set("type", "rss")
+
+
# Create a descriptive title that includes the user
+
title_text = f"{feed_info['display_name']}: {feed_info['url']}"
+
feed_outline.set("text", title_text)
+
feed_outline.set("title", title_text)
+
url = feed_info["url"] or ""
+
feed_outline.set("xmlUrl", url)
+
homepage_url = feed_info.get("homepage") or url
+
feed_outline.set("htmlUrl", homepage_url or "")
+
+
# Add custom attributes for user info
+
feed_outline.set("thicketUser", feed_info["username"] or "")
+
homepage = feed_info.get("homepage")
+
if homepage:
+
feed_outline.set("thicketHomepage", homepage)
+
+
# Convert to pretty-printed XML string
+
xml_str = self._prettify_xml(opml)
+
+
# Write to file if path provided
+
if output_path:
+
output_path.write_text(xml_str, encoding="utf-8")
+
+
return xml_str
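For reference, the hierarchical generator produces output of this shape (values illustrative; minidom's pretty-printer emits each outline on a single line):

```xml
<?xml version="1.0" ?>
<opml version="2.0">
  <head>
    <title>Thicket Feed Collection</title>
    <dateCreated>Mon, 01 Jan 2024 12:00:00 +0000</dateCreated>
    <dateModified>Mon, 01 Jan 2024 12:00:00 +0000</dateModified>
  </head>
  <body>
    <outline text="Alice" title="Alice" htmlUrl="https://alice.example.com">
      <outline type="rss" text="https://alice.example.com/atom.xml" title="https://alice.example.com/atom.xml" xmlUrl="https://alice.example.com/atom.xml" htmlUrl="https://alice.example.com/atom.xml"/>
    </outline>
  </body>
</opml>
```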
-301
src/thicket/core/reference_parser.py
···
-
"""Reference detection and parsing for blog entries."""
-
-
import re
-
from typing import Optional
-
from urllib.parse import urlparse
-
-
from ..models import AtomEntry
-
-
-
class BlogReference:
-
"""Represents a reference from one blog entry to another."""
-
-
def __init__(
-
self,
-
source_entry_id: str,
-
source_username: str,
-
target_url: str,
-
target_username: Optional[str] = None,
-
target_entry_id: Optional[str] = None,
-
):
-
self.source_entry_id = source_entry_id
-
self.source_username = source_username
-
self.target_url = target_url
-
self.target_username = target_username
-
self.target_entry_id = target_entry_id
-
-
def to_dict(self) -> dict:
-
"""Convert to dictionary for JSON serialization."""
-
result = {
-
"source_entry_id": self.source_entry_id,
-
"source_username": self.source_username,
-
"target_url": self.target_url,
-
}
-
-
# Only include optional fields if they are not None
-
if self.target_username is not None:
-
result["target_username"] = self.target_username
-
if self.target_entry_id is not None:
-
result["target_entry_id"] = self.target_entry_id
-
-
return result
-
-
@classmethod
-
def from_dict(cls, data: dict) -> "BlogReference":
-
"""Create from dictionary."""
-
return cls(
-
source_entry_id=data["source_entry_id"],
-
source_username=data["source_username"],
-
target_url=data["target_url"],
-
target_username=data.get("target_username"),
-
target_entry_id=data.get("target_entry_id"),
-
)
-
-
-
class ReferenceIndex:
-
"""Index of blog-to-blog references for creating threaded views."""
-
-
def __init__(self):
-
self.references: list[BlogReference] = []
-
self.outbound_refs: dict[
-
str, list[BlogReference]
-
] = {} # entry_id -> outbound refs
-
self.inbound_refs: dict[
-
str, list[BlogReference]
-
] = {} # entry_id -> inbound refs
-
self.user_domains: dict[str, set[str]] = {} # username -> set of domains
-
-
def add_reference(self, ref: BlogReference) -> None:
-
"""Add a reference to the index."""
-
self.references.append(ref)
-
-
# Update outbound references
-
source_key = f"{ref.source_username}:{ref.source_entry_id}"
-
if source_key not in self.outbound_refs:
-
self.outbound_refs[source_key] = []
-
self.outbound_refs[source_key].append(ref)
-
-
# Update inbound references if we can identify the target
-
if ref.target_username and ref.target_entry_id:
-
target_key = f"{ref.target_username}:{ref.target_entry_id}"
-
if target_key not in self.inbound_refs:
-
self.inbound_refs[target_key] = []
-
self.inbound_refs[target_key].append(ref)
-
-
def get_outbound_refs(self, username: str, entry_id: str) -> list[BlogReference]:
-
"""Get all outbound references from an entry."""
-
key = f"{username}:{entry_id}"
-
return self.outbound_refs.get(key, [])
-
-
def get_inbound_refs(self, username: str, entry_id: str) -> list[BlogReference]:
-
"""Get all inbound references to an entry."""
-
key = f"{username}:{entry_id}"
-
return self.inbound_refs.get(key, [])
-
-
def get_thread_members(self, username: str, entry_id: str) -> set[tuple[str, str]]:
-
"""Get all entries that are part of the same thread."""
-
visited = set()
-
to_visit = [(username, entry_id)]
-
thread_members = set()
-
-
while to_visit:
-
current_user, current_entry = to_visit.pop()
-
if (current_user, current_entry) in visited:
-
continue
-
-
visited.add((current_user, current_entry))
-
thread_members.add((current_user, current_entry))
-
-
# Add outbound references
-
for ref in self.get_outbound_refs(current_user, current_entry):
-
if ref.target_username and ref.target_entry_id:
-
to_visit.append((ref.target_username, ref.target_entry_id))
-
-
# Add inbound references
-
for ref in self.get_inbound_refs(current_user, current_entry):
-
to_visit.append((ref.source_username, ref.source_entry_id))
-
-
return thread_members
-
-
def to_dict(self) -> dict:
-
"""Convert to dictionary for JSON serialization."""
-
return {
-
"references": [ref.to_dict() for ref in self.references],
-
"user_domains": {k: list(v) for k, v in self.user_domains.items()},
-
}
-
-
@classmethod
-
def from_dict(cls, data: dict) -> "ReferenceIndex":
-
"""Create from dictionary."""
-
index = cls()
-
for ref_data in data.get("references", []):
-
ref = BlogReference.from_dict(ref_data)
-
index.add_reference(ref)
-
-
for username, domains in data.get("user_domains", {}).items():
-
index.user_domains[username] = set(domains)
-
-
return index
-
-
-
class ReferenceParser:
-
"""Parses blog entries to detect references to other blogs."""
-
-
def __init__(self):
-
# Common blog platforms and patterns
-
self.blog_patterns = [
-
r"https?://[^/]+\.(?:org|com|net|io|dev|me|co\.uk)/.*", # Common blog domains
-
r"https?://[^/]+\.github\.io/.*", # GitHub Pages
-
r"https?://[^/]+\.substack\.com/.*", # Substack
-
r"https?://medium\.com/.*", # Medium
-
r"https?://[^/]+\.wordpress\.com/.*", # WordPress.com
-
r"https?://[^/]+\.blogspot\.com/.*", # Blogger
-
]
-
-
# Compile regex patterns
-
self.link_pattern = re.compile(
-
r'<a[^>]+href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL
-
)
-
self.url_pattern = re.compile(r'https?://[^\s<>"]+')
-
-
def extract_links_from_html(self, html_content: str) -> list[tuple[str, str]]:
-
"""Extract all links from HTML content."""
-
links = []
-
-
# Extract links from <a> tags
-
for match in self.link_pattern.finditer(html_content):
-
url = match.group(1)
-
text = re.sub(
-
r"<[^>]+>", "", match.group(2)
-
).strip() # Remove HTML tags from link text
-
links.append((url, text))
-
-
return links
-
-
def is_blog_url(self, url: str) -> bool:
-
"""Check if a URL likely points to a blog post."""
-
for pattern in self.blog_patterns:
-
if re.match(pattern, url):
-
return True
-
return False
-
-
def resolve_target_user(
-
self, url: str, user_domains: dict[str, set[str]]
-
) -> Optional[str]:
-
"""Try to resolve a URL to a known user based on domain mapping."""
-
parsed_url = urlparse(url)
-
domain = parsed_url.netloc.lower()
-
-
for username, domains in user_domains.items():
-
if domain in domains:
-
return username
-
-
return None
-
-
def extract_references(
-
self, entry: AtomEntry, username: str, user_domains: dict[str, set[str]]
-
) -> list[BlogReference]:
-
"""Extract all blog references from an entry."""
-
references = []
-
-
# Combine all text content for analysis
-
content_to_search = []
-
if entry.content:
-
content_to_search.append(entry.content)
-
if entry.summary:
-
content_to_search.append(entry.summary)
-
-
for content in content_to_search:
-
links = self.extract_links_from_html(content)
-
-
for url, _link_text in links:
-
# Skip internal links (same domain as the entry)
-
entry_domain = (
-
urlparse(str(entry.link)).netloc.lower() if entry.link else ""
-
)
-
link_domain = urlparse(url).netloc.lower()
-
-
if link_domain == entry_domain:
-
continue
-
-
# Check if this looks like a blog URL
-
if not self.is_blog_url(url):
-
continue
-
-
# Try to resolve to a known user
-
target_username = self.resolve_target_user(url, user_domains)
-
-
ref = BlogReference(
-
source_entry_id=entry.id,
-
source_username=username,
-
target_url=url,
-
target_username=target_username,
-
target_entry_id=None, # Will be resolved later if possible
-
)
-
-
references.append(ref)
-
-
return references
-
-
def build_user_domain_mapping(self, git_store: "GitStore") -> dict[str, set[str]]:
-
"""Build mapping of usernames to their known domains."""
-
user_domains = {}
-
index = git_store._load_index()
-
-
for username, user_metadata in index.users.items():
-
domains = set()
-
-
# Add domains from feeds
-
for feed_url in user_metadata.feeds:
-
domain = urlparse(feed_url).netloc.lower()
-
if domain:
-
domains.add(domain)
-
-
# Add domain from homepage
-
if user_metadata.homepage:
-
domain = urlparse(str(user_metadata.homepage)).netloc.lower()
-
if domain:
-
domains.add(domain)
-
-
user_domains[username] = domains
-
-
return user_domains
-
-
def resolve_target_entry_ids(
-
self, references: list[BlogReference], git_store: "GitStore"
-
) -> list[BlogReference]:
-
"""Resolve target_entry_id for references that have target_username but no target_entry_id."""
-
resolved_refs = []
-
-
for ref in references:
-
# If we already have a target_entry_id, keep the reference as-is
-
if ref.target_entry_id is not None:
-
resolved_refs.append(ref)
-
continue
-
-
# If we don't have a target_username, we can't resolve it
-
if ref.target_username is None:
-
resolved_refs.append(ref)
-
continue
-
-
# Try to find the entry by matching the URL
-
entries = git_store.list_entries(ref.target_username)
-
resolved_entry_id = None
-
-
for entry in entries:
-
# Check if the entry's link matches the target URL
-
if entry.link and str(entry.link) == ref.target_url:
-
resolved_entry_id = entry.id
-
break
-
-
# Create a new reference with the resolved target_entry_id
-
resolved_ref = BlogReference(
-
source_entry_id=ref.source_entry_id,
-
source_username=ref.source_username,
-
target_url=ref.target_url,
-
target_username=ref.target_username,
-
target_entry_id=resolved_entry_id,
-
)
-
resolved_refs.append(resolved_ref)
-
-
return resolved_refs
+428
src/thicket/core/typesense_client.py
···
+
"""Typesense integration for thicket."""
+
+
import json
+
import logging
+
from datetime import datetime
+
from typing import Any, Optional
+
from urllib.parse import urlparse
+
+
import typesense
+
from pydantic import BaseModel, ConfigDict
+
+
from ..models.config import ThicketConfig, UserConfig
+
from ..models.feed import AtomEntry
+
from ..models.user import UserMetadata
+
from .git_store import GitStore
+
+
logger = logging.getLogger(__name__)
+
+
+
class TypesenseConfig(BaseModel):
+
"""Configuration for Typesense connection."""
+
+
model_config = ConfigDict(str_strip_whitespace=True)
+
+
host: str
+
port: int = 8108
+
protocol: str = "http"
+
api_key: str
+
connection_timeout: int = 5
+
collection_name: str = "thicket_entries"
+
+
@classmethod
+
def from_url(
+
cls, url: str, api_key: str, collection_name: str = "thicket_entries"
+
) -> "TypesenseConfig":
+
"""Create config from Typesense URL."""
+
parsed = urlparse(url)
+
return cls(
+
host=parsed.hostname or "localhost",
+
port=parsed.port or (443 if parsed.scheme == "https" else 8108),
+
protocol=parsed.scheme or "http",
+
api_key=api_key,
+
collection_name=collection_name,
+
)
+
+
+
class TypesenseDocument(BaseModel):
+
"""Document model for Typesense indexing."""
+
+
model_config = ConfigDict(
+
json_encoders={datetime: lambda v: int(v.timestamp())},
+
str_strip_whitespace=True,
+
)
+
+
# Primary fields from AtomEntry
+
id: str # Sanitized entry ID
+
original_id: str # Original Atom ID
+
title: str
+
link: str
+
updated: int # Unix timestamp
+
published: Optional[int] = None # Unix timestamp
+
summary: Optional[str] = None
+
content: Optional[str] = None
+
content_type: str = "html"
+
categories: list[str] = []
+
rights: Optional[str] = None
+
source: Optional[str] = None
+
+
# User/feed metadata
+
username: str
+
user_display_name: Optional[str] = None
+
user_email: Optional[str] = None
+
user_homepage: Optional[str] = None
+
user_icon: Optional[str] = None
+
+
# Author information from entry
+
author_name: Optional[str] = None
+
author_email: Optional[str] = None
+
author_uri: Optional[str] = None
+
+
# Searchable text fields for embedding/semantic search
+
searchable_content: str # Combined title + summary + content
+
searchable_metadata: str # Combined user info + categories + author
+
+
@classmethod
+
def from_atom_entry_with_metadata(
+
cls,
+
entry: AtomEntry,
+
sanitized_id: str,
+
user_metadata: "UserMetadata", # Import will be added at top
+
) -> "TypesenseDocument":
+
"""Create TypesenseDocument from AtomEntry and UserMetadata from git store."""
+
# Extract author information if available
+
author_name = None
+
author_email = None
+
author_uri = None
+
if entry.author:
+
author_name = entry.author.get("name")
+
author_email = entry.author.get("email")
+
author_uri = entry.author.get("uri")
+
+
# Create searchable content combining all text fields
+
content_parts = [entry.title]
+
if entry.summary:
+
content_parts.append(entry.summary)
+
if entry.content:
+
content_parts.append(entry.content)
+
searchable_content = " ".join(content_parts)
+
+
# Create searchable metadata
+
metadata_parts = [user_metadata.username]
+
if user_metadata.display_name:
+
metadata_parts.append(user_metadata.display_name)
+
if author_name:
+
metadata_parts.append(author_name)
+
if entry.categories:
+
metadata_parts.extend(entry.categories)
+
searchable_metadata = " ".join(metadata_parts)
+
+
return cls(
+
id=sanitized_id,
+
original_id=entry.id,
+
title=entry.title,
+
link=str(entry.link),
+
updated=int(entry.updated.timestamp()),
+
published=int(entry.published.timestamp()) if entry.published else None,
+
summary=entry.summary,
+
content=entry.content,
+
content_type=entry.content_type or "html",
+
categories=entry.categories,
+
rights=entry.rights,
+
source=entry.source,
+
username=user_metadata.username,
+
user_display_name=user_metadata.display_name,
+
user_email=user_metadata.email,
+
user_homepage=user_metadata.homepage,
+
user_icon=user_metadata.icon if user_metadata.icon != "None" else None,  # treat a stringified "None" as missing
+
author_name=author_name,
+
author_email=author_email,
+
author_uri=author_uri,
+
searchable_content=searchable_content,
+
searchable_metadata=searchable_metadata,
+
)
+
+
@classmethod
+
def from_atom_entry(
+
cls,
+
entry: AtomEntry,
+
sanitized_id: str,
+
user_config: UserConfig,
+
) -> "TypesenseDocument":
+
"""Create TypesenseDocument from AtomEntry and UserConfig."""
+
# Extract author information if available
+
author_name = None
+
author_email = None
+
author_uri = None
+
if entry.author:
+
author_name = entry.author.get("name")
+
author_email = entry.author.get("email")
+
author_uri = entry.author.get("uri")
+
+
# Create searchable content combining all text fields
+
content_parts = [entry.title]
+
if entry.summary:
+
content_parts.append(entry.summary)
+
if entry.content:
+
content_parts.append(entry.content)
+
searchable_content = " ".join(content_parts)
+
+
# Create searchable metadata
+
metadata_parts = [user_config.username]
+
if user_config.display_name:
+
metadata_parts.append(user_config.display_name)
+
if author_name:
+
metadata_parts.append(author_name)
+
if entry.categories:
+
metadata_parts.extend(entry.categories)
+
searchable_metadata = " ".join(metadata_parts)
+
+
return cls(
+
id=sanitized_id,
+
original_id=entry.id,
+
title=entry.title,
+
link=str(entry.link),
+
updated=int(entry.updated.timestamp()),
+
published=int(entry.published.timestamp()) if entry.published else None,
+
summary=entry.summary,
+
content=entry.content,
+
content_type=entry.content_type or "html",
+
categories=entry.categories,
+
rights=entry.rights,
+
source=entry.source,
+
username=user_config.username,
+
user_display_name=user_config.display_name,
+
user_email=str(user_config.email) if user_config.email else None,
+
user_homepage=str(user_config.homepage) if user_config.homepage else None,
+
user_icon=str(user_config.icon) if user_config.icon else None,
+
author_name=author_name,
+
author_email=author_email,
+
author_uri=author_uri,
+
searchable_content=searchable_content,
+
searchable_metadata=searchable_metadata,
+
)
+
+
+
class TypesenseClient:
+
"""Client for interacting with Typesense search engine."""
+
+
def __init__(self, config: TypesenseConfig):
+
"""Initialize Typesense client."""
+
self.config = config
+
self.client = typesense.Client(
+
{
+
"nodes": [
+
{
+
"host": config.host,
+
"port": config.port,
+
"protocol": config.protocol,
+
}
+
],
+
"api_key": config.api_key,
+
"connection_timeout_seconds": config.connection_timeout,
+
}
+
)
+
+
def get_collection_schema(self) -> dict[str, Any]:
+
"""Get the Typesense collection schema for thicket entries."""
+
return {
+
"name": self.config.collection_name,
+
"fields": [
+
# Primary identifiers
+
{"name": "id", "type": "string", "facet": False},
+
{"name": "original_id", "type": "string", "facet": False},
+
# Content fields - optimized for search
+
{"name": "title", "type": "string", "facet": False},
+
{"name": "summary", "type": "string", "optional": True, "facet": False},
+
{"name": "content", "type": "string", "optional": True, "facet": False},
+
{"name": "content_type", "type": "string", "facet": True},
+
# Searchable combined fields for embeddings/semantic search
+
{"name": "searchable_content", "type": "string", "facet": False},
+
{"name": "searchable_metadata", "type": "string", "facet": False},
+
# Temporal fields
+
{"name": "updated", "type": "int64", "facet": False, "sort": True},
+
{
+
"name": "published",
+
"type": "int64",
+
"optional": True,
+
"facet": False,
+
"sort": True,
+
},
+
# Link and source
+
{"name": "link", "type": "string", "facet": False},
+
{"name": "source", "type": "string", "optional": True, "facet": False},
+
# Categories and classification
+
{
+
"name": "categories",
+
"type": "string[]",
+
"facet": True,
+
"optional": True,
+
},
+
{"name": "rights", "type": "string", "optional": True, "facet": False},
+
# User/feed metadata - facetable for filtering
+
{"name": "username", "type": "string", "facet": True},
+
{
+
"name": "user_display_name",
+
"type": "string",
+
"optional": True,
+
"facet": True,
+
},
+
{
+
"name": "user_email",
+
"type": "string",
+
"optional": True,
+
"facet": False,
+
},
+
{
+
"name": "user_homepage",
+
"type": "string",
+
"optional": True,
+
"facet": False,
+
},
+
{
+
"name": "user_icon",
+
"type": "string",
+
"optional": True,
+
"facet": False,
+
},
+
# Author information from entries
+
{
+
"name": "author_name",
+
"type": "string",
+
"optional": True,
+
"facet": True,
+
},
+
{
+
"name": "author_email",
+
"type": "string",
+
"optional": True,
+
"facet": False,
+
},
+
{
+
"name": "author_uri",
+
"type": "string",
+
"optional": True,
+
"facet": False,
+
},
+
],
+
"default_sorting_field": "updated",
+
}
+
+
def create_collection(self) -> dict[str, Any]:
+
"""Create the Typesense collection with the appropriate schema."""
+
try:
+
# Try to delete existing collection first
+
try:
+
self.client.collections[self.config.collection_name].delete()
+
logger.info(
+
f"Deleted existing collection: {self.config.collection_name}"
+
)
+
except typesense.exceptions.ObjectNotFound:
+
logger.info(
+
f"Collection {self.config.collection_name} does not exist, creating new one"
+
)
+
+
# Create new collection
+
schema = self.get_collection_schema()
+
result = self.client.collections.create(schema)
+
logger.info(f"Created collection: {self.config.collection_name}")
+
return result
+
+
except Exception as e:
+
logger.error(f"Failed to create collection: {e}")
+
raise
+
+
def index_documents(self, documents: list[TypesenseDocument]) -> dict[str, Any]:
+
"""Index a batch of documents in Typesense."""
+
try:
+
# Convert documents to dict format for Typesense
+
document_dicts = [doc.model_dump() for doc in documents]
+
+
# Use import endpoint for batch indexing
+
result = self.client.collections[
+
self.config.collection_name
+
].documents.import_(
+
document_dicts,
+
{"action": "upsert"}, # Update if exists, insert if not
+
)
+
+
logger.info(f"Indexed {len(documents)} documents")
+
return result
+
+
except Exception as e:
+
logger.error(f"Failed to index documents: {e}")
+
raise
+
+
def upload_from_git_store(
+
self, git_store: GitStore, config: ThicketConfig
+
) -> dict[str, Any]:
+
"""Upload all entries from the Git store to Typesense."""
+
logger.info("Starting Typesense upload from Git store")
+
+
# Create collection
+
self.create_collection()
+
+
documents = []
+
index = git_store._load_index()
+
+
for username, user_metadata in index.users.items():
+
logger.info(f"Processing entries for user: {username}")
+
+
# Load user entries from directory
+
try:
+
user_dir = git_store.repo_path / user_metadata.directory
+
if not user_dir.exists():
+
logger.warning(
+
f"Directory not found for user {username}: {user_dir}"
+
)
+
continue
+
+
entry_files = list(user_dir.glob("*.json"))
+
logger.info(f"Found {len(entry_files)} entry files for {username}")
+
+
for entry_file in entry_files:
+
try:
+
with open(entry_file) as f:
+
data = json.load(f)
+
+
entry = AtomEntry(**data)
+
sanitized_id = entry_file.stem # filename without extension
+
+
doc = TypesenseDocument.from_atom_entry_with_metadata(
+
entry, sanitized_id, user_metadata
+
)
+
documents.append(doc)
+
except Exception as e:
+
logger.error(
+
f"Failed to convert entry {entry_file} to document: {e}"
+
)
+
+
except Exception as e:
+
logger.error(f"Failed to load entries for user {username}: {e}")
+
+
if documents:
+
logger.info(f"Uploading {len(documents)} documents to Typesense")
+
result = self.index_documents(documents)
+
logger.info("Upload completed successfully")
+
return result
+
else:
+
logger.warning("No documents to upload")
+
return {}
+
+
def search(
+
self, query: str, search_parameters: Optional[dict[str, Any]] = None
+
) -> dict[str, Any]:
+
"""Search the collection."""
+
default_params = {
+
"q": query,
+
"query_by": "title,searchable_content,searchable_metadata",
+
"sort_by": "updated:desc",
+
"per_page": 20,
+
}
+
+
if search_parameters:
+
default_params.update(search_parameters)
+
+
return self.client.collections[self.config.collection_name].documents.search(
+
default_params
+
)
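
As a rough usage sketch of the new Typesense client (the server URL, API key, and query below are illustrative placeholders, not part of this change):

```python
from thicket.core.typesense_client import TypesenseClient, TypesenseConfig

# Hypothetical connection details -- substitute your own server and admin key.
config = TypesenseConfig.from_url("http://localhost:8108", api_key="xyz")
client = TypesenseClient(config)

# Recreate the collection, then run a keyword search sorted by recency.
client.create_collection()
results = client.search("feed readers", {"per_page": 5})
for hit in results["hits"]:
    print(hit["document"]["title"], hit["document"]["link"])
```

Note that `create_collection` drops any existing collection of the same name before recreating it, so `upload_from_git_store` always reindexes from scratch.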
+2 -1
src/thicket/models/__init__.py
···
from .config import ThicketConfig, UserConfig
from .feed import AtomEntry, DuplicateMap, FeedMetadata
-
from .user import GitStoreIndex, UserMetadata
+
from .user import GitStoreIndex, UserMetadata, ZulipAssociation
__all__ = [
"ThicketConfig",
···
"FeedMetadata",
"GitStoreIndex",
"UserMetadata",
+
"ZulipAssociation",
]
+24
src/thicket/models/config.py
···
git_store: Path
cache_dir: Path
users: list[UserConfig] = []
+
+
def find_user(self, username: str) -> Optional[UserConfig]:
+
"""Find a user by username."""
+
for user in self.users:
+
if user.username == username:
+
return user
+
return None
+
+
def add_user(self, user: UserConfig) -> bool:
+
"""Add a user to the configuration. Returns True if added, False if already exists."""
+
if self.find_user(user.username) is not None:
+
return False
+
self.users.append(user)
+
return True
+
+
def add_feed_to_user(self, username: str, feed_url: HttpUrl) -> bool:
+
"""Add a feed to an existing user. Returns True if added, False if user not found or feed already exists."""
+
user = self.find_user(username)
+
if user is None:
+
return False
+
if feed_url in user.feeds:
+
return False
+
user.feeds.append(feed_url)
+
return True
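
The new `ThicketConfig` helpers return booleans instead of raising, so callers can report duplicates cleanly. A minimal sketch, assuming only the fields visible in this diff are required:

```python
from pathlib import Path

from pydantic import HttpUrl

from thicket.models.config import ThicketConfig, UserConfig

config = ThicketConfig(git_store=Path("store"), cache_dir=Path("cache"))

assert config.add_user(UserConfig(username="alice", feeds=[]))
assert not config.add_user(UserConfig(username="alice", feeds=[]))  # already present

assert config.add_feed_to_user("alice", HttpUrl("https://alice.example/feed.xml"))
assert not config.add_feed_to_user("bob", HttpUrl("https://bob.example/feed.xml"))  # no such user
```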
+2 -2
src/thicket/models/feed.py
···
"""Feed and entry models for thicket."""
from datetime import datetime
-
from typing import TYPE_CHECKING, Optional
+
from typing import TYPE_CHECKING, Any, Optional
from pydantic import BaseModel, ConfigDict, EmailStr, HttpUrl
···
summary: Optional[str] = None
content: Optional[str] = None # Full body content from Atom entry
content_type: Optional[str] = "html" # text, html, xhtml
-
author: Optional[dict] = None
+
author: Optional[dict[str, Any]] = None
categories: list[str] = []
rights: Optional[str] = None # Copyright info
source: Optional[str] = None # Source feed URL
+41 -4
src/thicket/models/user.py
···
from datetime import datetime
from typing import Optional
-
from pydantic import BaseModel, ConfigDict
+
from pydantic import BaseModel, ConfigDict, Field
+
+
+
class ZulipAssociation(BaseModel):
+
"""Association between a user and their Zulip identity."""
+
+
server: str # Zulip server URL (e.g., "yourorg.zulipchat.com")
+
user_id: str # Zulip user ID or email for @mentions
+
+
def __hash__(self) -> int:
+
"""Make hashable for use in sets."""
+
return hash((self.server, self.user_id))
class UserMetadata(BaseModel):
···
homepage: Optional[str] = None
icon: Optional[str] = None
feeds: list[str] = []
+
zulip_associations: list[ZulipAssociation] = Field(
+
default_factory=list
+
) # Zulip server/user pairs
directory: str # Directory name in Git store
created: datetime
last_updated: datetime
···
self.entry_count += count
self.update_timestamp()
+
def add_zulip_association(self, server: str, user_id: str) -> bool:
+
"""Add a Zulip association if it doesn't exist. Returns True if added."""
+
association = ZulipAssociation(server=server, user_id=user_id)
+
if association not in self.zulip_associations:
+
self.zulip_associations.append(association)
+
self.update_timestamp()
+
return True
+
return False
+
+
def remove_zulip_association(self, server: str, user_id: str) -> bool:
+
"""Remove a Zulip association. Returns True if removed."""
+
association = ZulipAssociation(server=server, user_id=user_id)
+
if association in self.zulip_associations:
+
self.zulip_associations.remove(association)
+
self.update_timestamp()
+
return True
+
return False
+
+
def get_zulip_mention(self, server: str) -> Optional[str]:
+
"""Get the Zulip user_id for @mentions on a specific server."""
+
for association in self.zulip_associations:
+
if association.server == server:
+
return association.user_id
+
return None
+
class GitStoreIndex(BaseModel):
"""Index of all users and their directories in the Git store."""
-
model_config = ConfigDict(
-
json_encoders={datetime: lambda v: v.isoformat()}
-
)
+
model_config = ConfigDict(json_encoders={datetime: lambda v: v.isoformat()})
users: dict[str, UserMetadata] = {} # username -> UserMetadata
created: datetime
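
A short sketch of the new Zulip association helpers on `UserMetadata` (server and user values are placeholders):

```python
from datetime import datetime

from thicket.models.user import UserMetadata

meta = UserMetadata(
    username="alice",
    directory="alice",
    created=datetime.now(),
    last_updated=datetime.now(),
)

meta.add_zulip_association("example.zulipchat.com", "alice@example.com")  # True
meta.add_zulip_association("example.zulipchat.com", "alice@example.com")  # False: duplicate
meta.get_zulip_mention("example.zulipchat.com")  # "alice@example.com"
meta.remove_zulip_association("example.zulipchat.com", "alice@example.com")  # True
```

Each successful add or remove also bumps `last_updated` via `update_timestamp()`.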
+297
tests/test_bot.py
···
+
"""Tests for the Thicket Zulip bot."""
+
+
import pytest
+
+
from thicket.bots.test_bot import (
+
BotTester,
+
MockBotHandler,
+
create_test_entry,
+
create_test_message,
+
)
+
from thicket.bots.thicket_bot import ThicketBotHandler
+
+
+
class TestThicketBot:
+
"""Test suite for ThicketBotHandler."""
+
+
def setup_method(self) -> None:
+
"""Set up test environment."""
+
self.bot = ThicketBotHandler()
+
self.handler = MockBotHandler()
+
+
def test_usage(self) -> None:
+
"""Test bot usage message."""
+
usage = self.bot.usage()
+
assert "Thicket Feed Bot" in usage
+
assert "@thicket status" in usage
+
assert "@thicket config" in usage
+
+
def test_help_command(self) -> None:
+
"""Test help command response."""
+
message = create_test_message("@thicket help")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "Thicket Feed Bot" in response
+
+
def test_status_command_unconfigured(self) -> None:
+
"""Test status command when bot is not configured."""
+
message = create_test_message("@thicket status")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "Not configured" in response
+
assert "Stream:" in response
+
assert "Topic:" in response
+
+
def test_config_stream_command(self) -> None:
+
"""Test setting stream configuration."""
+
message = create_test_message("@thicket config stream general")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "Stream set to: **general**" in response
+
assert self.bot.stream_name == "general"
+
+
def test_config_topic_command(self) -> None:
+
"""Test setting topic configuration."""
+
message = create_test_message("@thicket config topic 'Feed Updates'")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "Topic set to:" in response and "Feed Updates" in response
+
assert self.bot.topic_name == "'Feed Updates'"  # surrounding quotes are kept verbatim
+
+
def test_config_interval_command(self) -> None:
+
"""Test setting sync interval."""
+
message = create_test_message("@thicket config interval 600")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "Sync interval set to: **600s**" in response
+
assert self.bot.sync_interval == 600
+
+
def test_config_interval_too_small(self) -> None:
+
"""Test setting sync interval that's too small."""
+
message = create_test_message("@thicket config interval 30")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "must be at least 60 seconds" in response
+
assert self.bot.sync_interval != 30
+
+
def test_config_path_nonexistent(self) -> None:
+
"""Test setting config path that doesn't exist."""
+
message = create_test_message("@thicket config path /nonexistent/config.yaml")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "Config file not found" in response
+
+
def test_unknown_command(self) -> None:
+
"""Test unknown command handling."""
+
message = create_test_message("@thicket unknown")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "Unknown command: unknown" in response
+
+
def test_config_persistence(self) -> None:
+
"""Test that configuration is persisted."""
+
# Set some config
+
self.bot.stream_name = "test-stream"
+
self.bot.topic_name = "test-topic"
+
self.bot.sync_interval = 600
+
+
# Save config
+
self.bot._save_bot_config(self.handler)
+
+
# Create new bot instance
+
new_bot = ThicketBotHandler()
+
new_bot._load_bot_config(self.handler)
+
+
# Check config was loaded
+
assert new_bot.stream_name == "test-stream"
+
assert new_bot.topic_name == "test-topic"
+
assert new_bot.sync_interval == 600
+
+
def test_posted_entries_persistence(self) -> None:
+
"""Test that posted entries are persisted."""
+
# Add some entries
+
self.bot.posted_entries = {"user1:entry1", "user2:entry2"}
+
+
# Save entries
+
self.bot._save_posted_entries(self.handler)
+
+
# Create new bot instance
+
new_bot = ThicketBotHandler()
+
new_bot._load_posted_entries(self.handler)
+
+
# Check entries were loaded
+
assert new_bot.posted_entries == {"user1:entry1", "user2:entry2"}
+
+
def test_mention_detection(self) -> None:
+
"""Test bot mention detection."""
+
assert self.bot._is_mentioned("@Thicket Bot help", self.handler)
+
assert self.bot._is_mentioned("@thicket status", self.handler)
+
assert not self.bot._is_mentioned("regular message", self.handler)
+
+
def test_mention_cleaning(self) -> None:
+
"""Test cleaning mentions from messages."""
+
cleaned = self.bot._clean_mention("@Thicket Bot status", self.handler)
+
assert cleaned == "status"
+
+
cleaned = self.bot._clean_mention("@thicket help", self.handler)
+
assert cleaned == "help"
+
+
def test_sync_now_uninitialized(self) -> None:
+
"""Test sync now command when not initialized."""
+
message = create_test_message("@thicket sync now")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "not initialized" in response.lower()
+
+
def test_debug_mode_initialization(self) -> None:
+
"""Test debug mode initialization."""
+
import os
+
+
# Mock environment variable
+
os.environ["THICKET_DEBUG_USER"] = "testuser"
+
+
try:
+
bot = ThicketBotHandler()
+
# Simulate initialize call
+
bot.debug_user = os.getenv("THICKET_DEBUG_USER")
+
+
assert bot.debug_user == "testuser"
+
assert bot.debug_zulip_user_id is None # Not validated yet
+
finally:
+
# Clean up
+
if "THICKET_DEBUG_USER" in os.environ:
+
del os.environ["THICKET_DEBUG_USER"]
+
+
def test_debug_mode_status(self) -> None:
+
"""Test status command in debug mode."""
+
self.bot.debug_user = "testuser"
+
self.bot.debug_zulip_user_id = "test.user"
+
+
message = create_test_message("@thicket status")
+
self.bot.handle_message(message, self.handler)
+
+
assert len(self.handler.sent_messages) == 1
+
response = self.handler.sent_messages[0]["content"]
+
assert "**Debug Mode:** ENABLED" in response
+
assert "**Debug User:** testuser" in response
+
assert "**Debug Zulip ID:** test.user" in response
+
+
def test_debug_mode_check_initialization(self) -> None:
+
"""Test initialization check in debug mode."""
+
from unittest.mock import Mock
+
+
# Setup mock git store and config
+
self.bot.git_store = Mock()
+
self.bot.config = Mock()
+
self.bot.debug_user = "testuser"
+
self.bot.debug_zulip_user_id = "test.user"
+
+
message = create_test_message("@thicket sync now")
+
+
# Should pass with debug mode properly set up
+
result = self.bot._check_initialization(message, self.handler)
+
assert result is True
+
+
# Should fail if debug_zulip_user_id is missing
+
self.bot.debug_zulip_user_id = None
+
result = self.bot._check_initialization(message, self.handler)
+
assert result is False
+
assert len(self.handler.sent_messages) == 1
+
assert (
+
"Debug mode validation failed" in self.handler.sent_messages[0]["content"]
+
)
+
+
def test_debug_mode_dm_posting(self) -> None:
+
"""Test that debug mode posts DMs instead of stream messages."""
+
from unittest.mock import Mock
+
+
# Setup bot in debug mode
+
self.bot.debug_user = "testuser"
+
self.bot.debug_zulip_user_id = "test.user@example.com"
+
self.bot.git_store = Mock()
+
+
# Create a test entry
+
entry = create_test_entry()
+
+
# Mock the handler config
+
self.handler.config_info = {
+
"full_name": "Thicket Bot",
+
"email": "thicket-bot@example.com",
+
"site": "https://example.zulipchat.com",
+
}
+
+
# Mock git store user
+
mock_user = Mock()
+
mock_user.get_zulip_mention.return_value = "author.user"
+
self.bot.git_store.get_user.return_value = mock_user
+
+
# Post entry
+
self.bot._post_entry_to_zulip(entry, self.handler, "testauthor")
+
+
# Check that a DM was sent
+
assert len(self.handler.sent_messages) == 1
+
message = self.handler.sent_messages[0]
+
+
# Verify it's a DM
+
assert message["type"] == "private"
+
assert message["to"] == ["test.user@example.com"]
+
assert "DEBUG:" in message["content"]
+
assert entry.title in message["content"]
+
assert "@**author.user** posted:" in message["content"]
+
+
+
class TestBotTester:
+
"""Test the bot testing utilities."""
+
+
def test_bot_tester_basic(self) -> None:
+
"""Test basic bot tester functionality."""
+
tester = BotTester()
+
+
# Test help command
+
responses = tester.send_command("help")
+
assert len(responses) == 1
+
assert "Thicket Feed Bot" in tester.get_last_response_content()
+
+
def test_bot_tester_config(self) -> None:
+
"""Test bot tester configuration."""
+
tester = BotTester()
+
+
# Configure stream
+
tester.send_command("config stream general")
+
tester.assert_response_contains("Stream set to")
+
+
# Configure topic
+
tester.send_command("config topic test")
+
tester.assert_response_contains("Topic set to")
+
+
def test_assert_response_contains(self) -> None:
+
"""Test response assertion helper."""
+
tester = BotTester()
+
+
# Send command
+
tester.send_command("help")
+
+
# This should pass
+
tester.assert_response_contains("Thicket Feed Bot")
+
+
# This should fail
+
with pytest.raises(AssertionError):
+
tester.assert_response_contains("nonexistent text")
+2 -1
tests/test_feed_parser.py
···
html_with_attrs = '<a href="https://example.com" onclick="alert()">Link</a>'
sanitized = parser._sanitize_html(html_with_attrs)
assert 'href="https://example.com"' in sanitized
-
assert 'onclick' not in sanitized
+
assert "onclick" not in sanitized
def test_extract_feed_metadata(self):
"""Test feed metadata extraction."""
···
# Test with feedparser parsed data
import feedparser
+
parsed = feedparser.parse("""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
<title>Test Feed</title>
+7 -2
tests/test_git_store.py
···
duplicates = store.get_duplicates()
assert len(duplicates.duplicates) == 1
assert duplicates.is_duplicate("https://example.com/dup")
-
assert duplicates.get_canonical("https://example.com/dup") == "https://example.com/canonical"
+
assert (
+
duplicates.get_canonical("https://example.com/dup")
+
== "https://example.com/canonical"
+
)
# Remove duplicate
result = store.remove_duplicate("https://example.com/dup")
···
entry = AtomEntry(
id=f"https://example.com/entry/{title.lower().replace(' ', '-')}",
title=title,
-
link=HttpUrl(f"https://example.com/entry/{title.lower().replace(' ', '-')}"),
+
link=HttpUrl(
+
f"https://example.com/entry/{title.lower().replace(' ', '-')}"
+
),
updated=datetime.now(),
summary=summary,
)
+88 -4
tests/test_models.py
···
ThicketConfig,
UserConfig,
UserMetadata,
+
ZulipAssociation,
)
···
git_store=temp_dir / "git_store",
cache_dir=temp_dir / "cache",
users=[
-
UserConfig(username="testuser", feeds=["https://example.com/feed1.xml"]),
+
UserConfig(
+
username="testuser", feeds=["https://example.com/feed1.xml"]
+
),
],
)
-
result = config.add_feed_to_user("testuser", HttpUrl("https://example.com/feed2.xml"))
+
result = config.add_feed_to_user(
+
"testuser", HttpUrl("https://example.com/feed2.xml")
+
)
assert result is True
user = config.find_user("testuser")
···
assert HttpUrl("https://example.com/feed2.xml") in user.feeds
# Test adding to non-existent user
-
result = config.add_feed_to_user("nonexistent", HttpUrl("https://example.com/feed.xml"))
+
result = config.add_feed_to_user(
+
"nonexistent", HttpUrl("https://example.com/feed.xml")
+
)
assert result is False
···
user_config = metadata.to_user_config("testuser", feed_url)
assert user_config.display_name == "Test Feed" # Falls back to title
-
assert user_config.homepage == HttpUrl("https://example.com") # Falls back to link
+
assert user_config.homepage == HttpUrl(
+
"https://example.com"
+
) # Falls back to link
assert user_config.icon == HttpUrl("https://example.com/icon.png")
assert user_config.email is None
···
assert metadata.entry_count == original_count + 3
assert metadata.last_updated > original_time
+
+
def test_zulip_associations(self):
+
"""Test Zulip association methods."""
+
metadata = UserMetadata(
+
username="testuser",
+
directory="testuser",
+
created=datetime.now(),
+
last_updated=datetime.now(),
+
)
+
+
# Test adding association
+
result = metadata.add_zulip_association("example.zulipchat.com", "alice")
+
assert result is True
+
assert len(metadata.zulip_associations) == 1
+
assert metadata.zulip_associations[0].server == "example.zulipchat.com"
+
assert metadata.zulip_associations[0].user_id == "alice"
+
+
# Test adding duplicate association
+
result = metadata.add_zulip_association("example.zulipchat.com", "alice")
+
assert result is False
+
assert len(metadata.zulip_associations) == 1
+
+
# Test adding different association
+
result = metadata.add_zulip_association("other.zulipchat.com", "alice")
+
assert result is True
+
assert len(metadata.zulip_associations) == 2
+
+
# Test get_zulip_mention
+
mention = metadata.get_zulip_mention("example.zulipchat.com")
+
assert mention == "alice"
+
+
mention = metadata.get_zulip_mention("other.zulipchat.com")
+
assert mention == "alice"
+
+
mention = metadata.get_zulip_mention("nonexistent.zulipchat.com")
+
assert mention is None
+
+
# Test removing association
+
result = metadata.remove_zulip_association("example.zulipchat.com", "alice")
+
assert result is True
+
assert len(metadata.zulip_associations) == 1
+
+
# Test removing non-existent association
+
result = metadata.remove_zulip_association("example.zulipchat.com", "alice")
+
assert result is False
+
assert len(metadata.zulip_associations) == 1
+
+
+
class TestZulipAssociation:
+
"""Test ZulipAssociation model."""
+
+
def test_valid_association(self):
+
"""Test creating valid Zulip association."""
+
assoc = ZulipAssociation(
+
server="example.zulipchat.com", user_id="alice@example.com"
+
)
+
+
assert assoc.server == "example.zulipchat.com"
+
assert assoc.user_id == "alice@example.com"
+
+
def test_association_hash(self):
+
"""Test that associations are hashable."""
+
assoc1 = ZulipAssociation(server="example.zulipchat.com", user_id="alice")
+
assoc2 = ZulipAssociation(server="example.zulipchat.com", user_id="alice")
+
assoc3 = ZulipAssociation(server="other.zulipchat.com", user_id="alice")
+
+
# Same associations should have same hash
+
assert hash(assoc1) == hash(assoc2)
+
+
# Different associations should have different hash
+
assert hash(assoc1) != hash(assoc3)
+
+
# Can be used in sets
+
assoc_set = {assoc1, assoc2, assoc3}
+
assert len(assoc_set) == 2 # assoc1 and assoc2 are considered the same
+345 -1
uv.lock
···
version = 1
-
revision = 2
+
revision = 3
requires-python = ">=3.9"
resolution-markers = [
"python_full_version >= '3.10'",
···
sdist = { url = "https://files.pythonhosted.org/packages/95/7d/4c1bd541d4dffa1b52bd83fb8527089e097a106fc90b467a7313b105f840/anyio-4.9.0.tar.gz", hash = "sha256:673c0c244e15788651a4ff38710fea9675823028a6f08a5eda409e0c9840a028", size = 190949, upload-time = "2025-03-17T00:02:54.77Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/a1/ee/48ca1a7c89ffec8b6a0c5d02b89c305671d5ffd8d3c94acf8b8c408575bb/anyio-4.9.0-py3-none-any.whl", hash = "sha256:9f76d541cad6e36af7beb62e978876f3b41e3e04f2c1fbf0884604c0a9c4d93c", size = 100916, upload-time = "2025-03-17T00:02:52.713Z" },
+
]
+
+
[[package]]
+
name = "beautifulsoup4"
+
version = "4.13.4"
+
source = { registry = "https://pypi.org/simple" }
+
dependencies = [
+
{ name = "soupsieve" },
+
{ name = "typing-extensions" },
+
]
+
sdist = { url = "https://files.pythonhosted.org/packages/d8/e4/0c4c39e18fd76d6a628d4dd8da40543d136ce2d1752bd6eeeab0791f4d6b/beautifulsoup4-4.13.4.tar.gz", hash = "sha256:dbb3c4e1ceae6aefebdaf2423247260cd062430a410e38c66f2baa50a8437195", size = 621067, upload-time = "2025-04-15T17:05:13.836Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/50/cd/30110dc0ffcf3b131156077b90e9f60ed75711223f306da4db08eff8403b/beautifulsoup4-4.13.4-py3-none-any.whl", hash = "sha256:9bbbb14bfde9d79f38b8cd5f8c7c85f4b8f2523190ebed90e950a8dea4cb1c4b", size = 187285, upload-time = "2025-04-15T17:05:12.221Z" },
]
[[package]]
···
]
[[package]]
+
name = "charset-normalizer"
+
version = "3.4.3"
+
source = { registry = "https://pypi.org/simple" }
+
sdist = { url = "https://files.pythonhosted.org/packages/83/2d/5fd176ceb9b2fc619e63405525573493ca23441330fcdaee6bef9460e924/charset_normalizer-3.4.3.tar.gz", hash = "sha256:6fce4b8500244f6fcb71465d4a4930d132ba9ab8e71a7859e6a5d59851068d14", size = 122371, upload-time = "2025-08-09T07:57:28.46Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/d6/98/f3b8013223728a99b908c9344da3aa04ee6e3fa235f19409033eda92fb78/charset_normalizer-3.4.3-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:fb7f67a1bfa6e40b438170ebdc8158b78dc465a5a67b6dde178a46987b244a72", size = 207695, upload-time = "2025-08-09T07:55:36.452Z" },
+
{ url = "https://files.pythonhosted.org/packages/21/40/5188be1e3118c82dcb7c2a5ba101b783822cfb413a0268ed3be0468532de/charset_normalizer-3.4.3-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:cc9370a2da1ac13f0153780040f465839e6cccb4a1e44810124b4e22483c93fe", size = 147153, upload-time = "2025-08-09T07:55:38.467Z" },
+
{ url = "https://files.pythonhosted.org/packages/37/60/5d0d74bc1e1380f0b72c327948d9c2aca14b46a9efd87604e724260f384c/charset_normalizer-3.4.3-cp310-cp310-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:07a0eae9e2787b586e129fdcbe1af6997f8d0e5abaa0bc98c0e20e124d67e601", size = 160428, upload-time = "2025-08-09T07:55:40.072Z" },
+
{ url = "https://files.pythonhosted.org/packages/85/9a/d891f63722d9158688de58d050c59dc3da560ea7f04f4c53e769de5140f5/charset_normalizer-3.4.3-cp310-cp310-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:74d77e25adda8581ffc1c720f1c81ca082921329452eba58b16233ab1842141c", size = 157627, upload-time = "2025-08-09T07:55:41.706Z" },
+
{ url = "https://files.pythonhosted.org/packages/65/1a/7425c952944a6521a9cfa7e675343f83fd82085b8af2b1373a2409c683dc/charset_normalizer-3.4.3-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d0e909868420b7049dafd3a31d45125b31143eec59235311fc4c57ea26a4acd2", size = 152388, upload-time = "2025-08-09T07:55:43.262Z" },
+
{ url = "https://files.pythonhosted.org/packages/f0/c9/a2c9c2a355a8594ce2446085e2ec97fd44d323c684ff32042e2a6b718e1d/charset_normalizer-3.4.3-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:c6f162aabe9a91a309510d74eeb6507fab5fff92337a15acbe77753d88d9dcf0", size = 150077, upload-time = "2025-08-09T07:55:44.903Z" },
+
{ url = "https://files.pythonhosted.org/packages/3b/38/20a1f44e4851aa1c9105d6e7110c9d020e093dfa5836d712a5f074a12bf7/charset_normalizer-3.4.3-cp310-cp310-musllinux_1_2_ppc64le.whl", hash = "sha256:4ca4c094de7771a98d7fbd67d9e5dbf1eb73efa4f744a730437d8a3a5cf994f0", size = 161631, upload-time = "2025-08-09T07:55:46.346Z" },
+
{ url = "https://files.pythonhosted.org/packages/a4/fa/384d2c0f57edad03d7bec3ebefb462090d8905b4ff5a2d2525f3bb711fac/charset_normalizer-3.4.3-cp310-cp310-musllinux_1_2_s390x.whl", hash = "sha256:02425242e96bcf29a49711b0ca9f37e451da7c70562bc10e8ed992a5a7a25cc0", size = 159210, upload-time = "2025-08-09T07:55:47.539Z" },
+
{ url = "https://files.pythonhosted.org/packages/33/9e/eca49d35867ca2db336b6ca27617deed4653b97ebf45dfc21311ce473c37/charset_normalizer-3.4.3-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:78deba4d8f9590fe4dae384aeff04082510a709957e968753ff3c48399f6f92a", size = 153739, upload-time = "2025-08-09T07:55:48.744Z" },
+
{ url = "https://files.pythonhosted.org/packages/2a/91/26c3036e62dfe8de8061182d33be5025e2424002125c9500faff74a6735e/charset_normalizer-3.4.3-cp310-cp310-win32.whl", hash = "sha256:d79c198e27580c8e958906f803e63cddb77653731be08851c7df0b1a14a8fc0f", size = 99825, upload-time = "2025-08-09T07:55:50.305Z" },
+
{ url = "https://files.pythonhosted.org/packages/e2/c6/f05db471f81af1fa01839d44ae2a8bfeec8d2a8b4590f16c4e7393afd323/charset_normalizer-3.4.3-cp310-cp310-win_amd64.whl", hash = "sha256:c6e490913a46fa054e03699c70019ab869e990270597018cef1d8562132c2669", size = 107452, upload-time = "2025-08-09T07:55:51.461Z" },
+
{ url = "https://files.pythonhosted.org/packages/7f/b5/991245018615474a60965a7c9cd2b4efbaabd16d582a5547c47ee1c7730b/charset_normalizer-3.4.3-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:b256ee2e749283ef3ddcff51a675ff43798d92d746d1a6e4631bf8c707d22d0b", size = 204483, upload-time = "2025-08-09T07:55:53.12Z" },
+
{ url = "https://files.pythonhosted.org/packages/c7/2a/ae245c41c06299ec18262825c1569c5d3298fc920e4ddf56ab011b417efd/charset_normalizer-3.4.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:13faeacfe61784e2559e690fc53fa4c5ae97c6fcedb8eb6fb8d0a15b475d2c64", size = 145520, upload-time = "2025-08-09T07:55:54.712Z" },
+
{ url = "https://files.pythonhosted.org/packages/3a/a4/b3b6c76e7a635748c4421d2b92c7b8f90a432f98bda5082049af37ffc8e3/charset_normalizer-3.4.3-cp311-cp311-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:00237675befef519d9af72169d8604a067d92755e84fe76492fef5441db05b91", size = 158876, upload-time = "2025-08-09T07:55:56.024Z" },
+
{ url = "https://files.pythonhosted.org/packages/e2/e6/63bb0e10f90a8243c5def74b5b105b3bbbfb3e7bb753915fe333fb0c11ea/charset_normalizer-3.4.3-cp311-cp311-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:585f3b2a80fbd26b048a0be90c5aae8f06605d3c92615911c3a2b03a8a3b796f", size = 156083, upload-time = "2025-08-09T07:55:57.582Z" },
+
{ url = "https://files.pythonhosted.org/packages/87/df/b7737ff046c974b183ea9aa111b74185ac8c3a326c6262d413bd5a1b8c69/charset_normalizer-3.4.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:0e78314bdc32fa80696f72fa16dc61168fda4d6a0c014e0380f9d02f0e5d8a07", size = 150295, upload-time = "2025-08-09T07:55:59.147Z" },
+
{ url = "https://files.pythonhosted.org/packages/61/f1/190d9977e0084d3f1dc169acd060d479bbbc71b90bf3e7bf7b9927dec3eb/charset_normalizer-3.4.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:96b2b3d1a83ad55310de8c7b4a2d04d9277d5591f40761274856635acc5fcb30", size = 148379, upload-time = "2025-08-09T07:56:00.364Z" },
+
{ url = "https://files.pythonhosted.org/packages/4c/92/27dbe365d34c68cfe0ca76f1edd70e8705d82b378cb54ebbaeabc2e3029d/charset_normalizer-3.4.3-cp311-cp311-musllinux_1_2_ppc64le.whl", hash = "sha256:939578d9d8fd4299220161fdd76e86c6a251987476f5243e8864a7844476ba14", size = 160018, upload-time = "2025-08-09T07:56:01.678Z" },
+
{ url = "https://files.pythonhosted.org/packages/99/04/baae2a1ea1893a01635d475b9261c889a18fd48393634b6270827869fa34/charset_normalizer-3.4.3-cp311-cp311-musllinux_1_2_s390x.whl", hash = "sha256:fd10de089bcdcd1be95a2f73dbe6254798ec1bda9f450d5828c96f93e2536b9c", size = 157430, upload-time = "2025-08-09T07:56:02.87Z" },
+
{ url = "https://files.pythonhosted.org/packages/2f/36/77da9c6a328c54d17b960c89eccacfab8271fdaaa228305330915b88afa9/charset_normalizer-3.4.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:1e8ac75d72fa3775e0b7cb7e4629cec13b7514d928d15ef8ea06bca03ef01cae", size = 151600, upload-time = "2025-08-09T07:56:04.089Z" },
+
{ url = "https://files.pythonhosted.org/packages/64/d4/9eb4ff2c167edbbf08cdd28e19078bf195762e9bd63371689cab5ecd3d0d/charset_normalizer-3.4.3-cp311-cp311-win32.whl", hash = "sha256:6cf8fd4c04756b6b60146d98cd8a77d0cdae0e1ca20329da2ac85eed779b6849", size = 99616, upload-time = "2025-08-09T07:56:05.658Z" },
+
{ url = "https://files.pythonhosted.org/packages/f4/9c/996a4a028222e7761a96634d1820de8a744ff4327a00ada9c8942033089b/charset_normalizer-3.4.3-cp311-cp311-win_amd64.whl", hash = "sha256:31a9a6f775f9bcd865d88ee350f0ffb0e25936a7f930ca98995c05abf1faf21c", size = 107108, upload-time = "2025-08-09T07:56:07.176Z" },
+
{ url = "https://files.pythonhosted.org/packages/e9/5e/14c94999e418d9b87682734589404a25854d5f5d0408df68bc15b6ff54bb/charset_normalizer-3.4.3-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:e28e334d3ff134e88989d90ba04b47d84382a828c061d0d1027b1b12a62b39b1", size = 205655, upload-time = "2025-08-09T07:56:08.475Z" },
+
{ url = "https://files.pythonhosted.org/packages/7d/a8/c6ec5d389672521f644505a257f50544c074cf5fc292d5390331cd6fc9c3/charset_normalizer-3.4.3-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0cacf8f7297b0c4fcb74227692ca46b4a5852f8f4f24b3c766dd94a1075c4884", size = 146223, upload-time = "2025-08-09T07:56:09.708Z" },
+
{ url = "https://files.pythonhosted.org/packages/fc/eb/a2ffb08547f4e1e5415fb69eb7db25932c52a52bed371429648db4d84fb1/charset_normalizer-3.4.3-cp312-cp312-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:c6fd51128a41297f5409deab284fecbe5305ebd7e5a1f959bee1c054622b7018", size = 159366, upload-time = "2025-08-09T07:56:11.326Z" },
+
{ url = "https://files.pythonhosted.org/packages/82/10/0fd19f20c624b278dddaf83b8464dcddc2456cb4b02bb902a6da126b87a1/charset_normalizer-3.4.3-cp312-cp312-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:3cfb2aad70f2c6debfbcb717f23b7eb55febc0bb23dcffc0f076009da10c6392", size = 157104, upload-time = "2025-08-09T07:56:13.014Z" },
+
{ url = "https://files.pythonhosted.org/packages/16/ab/0233c3231af734f5dfcf0844aa9582d5a1466c985bbed6cedab85af9bfe3/charset_normalizer-3.4.3-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:1606f4a55c0fd363d754049cdf400175ee96c992b1f8018b993941f221221c5f", size = 151830, upload-time = "2025-08-09T07:56:14.428Z" },
+
{ url = "https://files.pythonhosted.org/packages/ae/02/e29e22b4e02839a0e4a06557b1999d0a47db3567e82989b5bb21f3fbbd9f/charset_normalizer-3.4.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:027b776c26d38b7f15b26a5da1044f376455fb3766df8fc38563b4efbc515154", size = 148854, upload-time = "2025-08-09T07:56:16.051Z" },
+
{ url = "https://files.pythonhosted.org/packages/05/6b/e2539a0a4be302b481e8cafb5af8792da8093b486885a1ae4d15d452bcec/charset_normalizer-3.4.3-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:42e5088973e56e31e4fa58eb6bd709e42fc03799c11c42929592889a2e54c491", size = 160670, upload-time = "2025-08-09T07:56:17.314Z" },
+
{ url = "https://files.pythonhosted.org/packages/31/e7/883ee5676a2ef217a40ce0bffcc3d0dfbf9e64cbcfbdf822c52981c3304b/charset_normalizer-3.4.3-cp312-cp312-musllinux_1_2_s390x.whl", hash = "sha256:cc34f233c9e71701040d772aa7490318673aa7164a0efe3172b2981218c26d93", size = 158501, upload-time = "2025-08-09T07:56:18.641Z" },
+
{ url = "https://files.pythonhosted.org/packages/c1/35/6525b21aa0db614cf8b5792d232021dca3df7f90a1944db934efa5d20bb1/charset_normalizer-3.4.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:320e8e66157cc4e247d9ddca8e21f427efc7a04bbd0ac8a9faf56583fa543f9f", size = 153173, upload-time = "2025-08-09T07:56:20.289Z" },
+
{ url = "https://files.pythonhosted.org/packages/50/ee/f4704bad8201de513fdc8aac1cabc87e38c5818c93857140e06e772b5892/charset_normalizer-3.4.3-cp312-cp312-win32.whl", hash = "sha256:fb6fecfd65564f208cbf0fba07f107fb661bcd1a7c389edbced3f7a493f70e37", size = 99822, upload-time = "2025-08-09T07:56:21.551Z" },
+
{ url = "https://files.pythonhosted.org/packages/39/f5/3b3836ca6064d0992c58c7561c6b6eee1b3892e9665d650c803bd5614522/charset_normalizer-3.4.3-cp312-cp312-win_amd64.whl", hash = "sha256:86df271bf921c2ee3818f0522e9a5b8092ca2ad8b065ece5d7d9d0e9f4849bcc", size = 107543, upload-time = "2025-08-09T07:56:23.115Z" },
+
{ url = "https://files.pythonhosted.org/packages/65/ca/2135ac97709b400c7654b4b764daf5c5567c2da45a30cdd20f9eefe2d658/charset_normalizer-3.4.3-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:14c2a87c65b351109f6abfc424cab3927b3bdece6f706e4d12faaf3d52ee5efe", size = 205326, upload-time = "2025-08-09T07:56:24.721Z" },
+
{ url = "https://files.pythonhosted.org/packages/71/11/98a04c3c97dd34e49c7d247083af03645ca3730809a5509443f3c37f7c99/charset_normalizer-3.4.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:41d1fc408ff5fdfb910200ec0e74abc40387bccb3252f3f27c0676731df2b2c8", size = 146008, upload-time = "2025-08-09T07:56:26.004Z" },
+
{ url = "https://files.pythonhosted.org/packages/60/f5/4659a4cb3c4ec146bec80c32d8bb16033752574c20b1252ee842a95d1a1e/charset_normalizer-3.4.3-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:1bb60174149316da1c35fa5233681f7c0f9f514509b8e399ab70fea5f17e45c9", size = 159196, upload-time = "2025-08-09T07:56:27.25Z" },
+
{ url = "https://files.pythonhosted.org/packages/86/9e/f552f7a00611f168b9a5865a1414179b2c6de8235a4fa40189f6f79a1753/charset_normalizer-3.4.3-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:30d006f98569de3459c2fc1f2acde170b7b2bd265dc1943e87e1a4efe1b67c31", size = 156819, upload-time = "2025-08-09T07:56:28.515Z" },
+
{ url = "https://files.pythonhosted.org/packages/7e/95/42aa2156235cbc8fa61208aded06ef46111c4d3f0de233107b3f38631803/charset_normalizer-3.4.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:416175faf02e4b0810f1f38bcb54682878a4af94059a1cd63b8747244420801f", size = 151350, upload-time = "2025-08-09T07:56:29.716Z" },
+
{ url = "https://files.pythonhosted.org/packages/c2/a9/3865b02c56f300a6f94fc631ef54f0a8a29da74fb45a773dfd3dcd380af7/charset_normalizer-3.4.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:6aab0f181c486f973bc7262a97f5aca3ee7e1437011ef0c2ec04b5a11d16c927", size = 148644, upload-time = "2025-08-09T07:56:30.984Z" },
+
{ url = "https://files.pythonhosted.org/packages/77/d9/cbcf1a2a5c7d7856f11e7ac2d782aec12bdfea60d104e60e0aa1c97849dc/charset_normalizer-3.4.3-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:fdabf8315679312cfa71302f9bd509ded4f2f263fb5b765cf1433b39106c3cc9", size = 160468, upload-time = "2025-08-09T07:56:32.252Z" },
+
{ url = "https://files.pythonhosted.org/packages/f6/42/6f45efee8697b89fda4d50580f292b8f7f9306cb2971d4b53f8914e4d890/charset_normalizer-3.4.3-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:bd28b817ea8c70215401f657edef3a8aa83c29d447fb0b622c35403780ba11d5", size = 158187, upload-time = "2025-08-09T07:56:33.481Z" },
+
{ url = "https://files.pythonhosted.org/packages/70/99/f1c3bdcfaa9c45b3ce96f70b14f070411366fa19549c1d4832c935d8e2c3/charset_normalizer-3.4.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:18343b2d246dc6761a249ba1fb13f9ee9a2bcd95decc767319506056ea4ad4dc", size = 152699, upload-time = "2025-08-09T07:56:34.739Z" },
+
{ url = "https://files.pythonhosted.org/packages/a3/ad/b0081f2f99a4b194bcbb1934ef3b12aa4d9702ced80a37026b7607c72e58/charset_normalizer-3.4.3-cp313-cp313-win32.whl", hash = "sha256:6fb70de56f1859a3f71261cbe41005f56a7842cc348d3aeb26237560bfa5e0ce", size = 99580, upload-time = "2025-08-09T07:56:35.981Z" },
+
{ url = "https://files.pythonhosted.org/packages/9a/8f/ae790790c7b64f925e5c953b924aaa42a243fb778fed9e41f147b2a5715a/charset_normalizer-3.4.3-cp313-cp313-win_amd64.whl", hash = "sha256:cf1ebb7d78e1ad8ec2a8c4732c7be2e736f6e5123a4146c5b89c9d1f585f8cef", size = 107366, upload-time = "2025-08-09T07:56:37.339Z" },
+
{ url = "https://files.pythonhosted.org/packages/8e/91/b5a06ad970ddc7a0e513112d40113e834638f4ca1120eb727a249fb2715e/charset_normalizer-3.4.3-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:3cd35b7e8aedeb9e34c41385fda4f73ba609e561faedfae0a9e75e44ac558a15", size = 204342, upload-time = "2025-08-09T07:56:38.687Z" },
+
{ url = "https://files.pythonhosted.org/packages/ce/ec/1edc30a377f0a02689342f214455c3f6c2fbedd896a1d2f856c002fc3062/charset_normalizer-3.4.3-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:b89bc04de1d83006373429975f8ef9e7932534b8cc9ca582e4db7d20d91816db", size = 145995, upload-time = "2025-08-09T07:56:40.048Z" },
+
{ url = "https://files.pythonhosted.org/packages/17/e5/5e67ab85e6d22b04641acb5399c8684f4d37caf7558a53859f0283a650e9/charset_normalizer-3.4.3-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:2001a39612b241dae17b4687898843f254f8748b796a2e16f1051a17078d991d", size = 158640, upload-time = "2025-08-09T07:56:41.311Z" },
+
{ url = "https://files.pythonhosted.org/packages/f1/e5/38421987f6c697ee3722981289d554957c4be652f963d71c5e46a262e135/charset_normalizer-3.4.3-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:8dcfc373f888e4fb39a7bc57e93e3b845e7f462dacc008d9749568b1c4ece096", size = 156636, upload-time = "2025-08-09T07:56:43.195Z" },
+
{ url = "https://files.pythonhosted.org/packages/a0/e4/5a075de8daa3ec0745a9a3b54467e0c2967daaaf2cec04c845f73493e9a1/charset_normalizer-3.4.3-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:18b97b8404387b96cdbd30ad660f6407799126d26a39ca65729162fd810a99aa", size = 150939, upload-time = "2025-08-09T07:56:44.819Z" },
+
{ url = "https://files.pythonhosted.org/packages/02/f7/3611b32318b30974131db62b4043f335861d4d9b49adc6d57c1149cc49d4/charset_normalizer-3.4.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:ccf600859c183d70eb47e05a44cd80a4ce77394d1ac0f79dbd2dd90a69a3a049", size = 148580, upload-time = "2025-08-09T07:56:46.684Z" },
+
{ url = "https://files.pythonhosted.org/packages/7e/61/19b36f4bd67f2793ab6a99b979b4e4f3d8fc754cbdffb805335df4337126/charset_normalizer-3.4.3-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:53cd68b185d98dde4ad8990e56a58dea83a4162161b1ea9272e5c9182ce415e0", size = 159870, upload-time = "2025-08-09T07:56:47.941Z" },
+
{ url = "https://files.pythonhosted.org/packages/06/57/84722eefdd338c04cf3030ada66889298eaedf3e7a30a624201e0cbe424a/charset_normalizer-3.4.3-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:30a96e1e1f865f78b030d65241c1ee850cdf422d869e9028e2fc1d5e4db73b92", size = 157797, upload-time = "2025-08-09T07:56:49.756Z" },
+
{ url = "https://files.pythonhosted.org/packages/72/2a/aff5dd112b2f14bcc3462c312dce5445806bfc8ab3a7328555da95330e4b/charset_normalizer-3.4.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:d716a916938e03231e86e43782ca7878fb602a125a91e7acb8b5112e2e96ac16", size = 152224, upload-time = "2025-08-09T07:56:51.369Z" },
+
{ url = "https://files.pythonhosted.org/packages/b7/8c/9839225320046ed279c6e839d51f028342eb77c91c89b8ef2549f951f3ec/charset_normalizer-3.4.3-cp314-cp314-win32.whl", hash = "sha256:c6dbd0ccdda3a2ba7c2ecd9d77b37f3b5831687d8dc1b6ca5f56a4880cc7b7ce", size = 100086, upload-time = "2025-08-09T07:56:52.722Z" },
+
{ url = "https://files.pythonhosted.org/packages/ee/7a/36fbcf646e41f710ce0a563c1c9a343c6edf9be80786edeb15b6f62e17db/charset_normalizer-3.4.3-cp314-cp314-win_amd64.whl", hash = "sha256:73dc19b562516fc9bcf6e5d6e596df0b4eb98d87e4f79f3ae71840e6ed21361c", size = 107400, upload-time = "2025-08-09T07:56:55.172Z" },
+
{ url = "https://files.pythonhosted.org/packages/c2/ca/9a0983dd5c8e9733565cf3db4df2b0a2e9a82659fd8aa2a868ac6e4a991f/charset_normalizer-3.4.3-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:70bfc5f2c318afece2f5838ea5e4c3febada0be750fcf4775641052bbba14d05", size = 207520, upload-time = "2025-08-09T07:57:11.026Z" },
+
{ url = "https://files.pythonhosted.org/packages/39/c6/99271dc37243a4f925b09090493fb96c9333d7992c6187f5cfe5312008d2/charset_normalizer-3.4.3-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:23b6b24d74478dc833444cbd927c338349d6ae852ba53a0d02a2de1fce45b96e", size = 147307, upload-time = "2025-08-09T07:57:12.4Z" },
+
{ url = "https://files.pythonhosted.org/packages/e4/69/132eab043356bba06eb333cc2cc60c6340857d0a2e4ca6dc2b51312886b3/charset_normalizer-3.4.3-cp39-cp39-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:34a7f768e3f985abdb42841e20e17b330ad3aaf4bb7e7aeeb73db2e70f077b99", size = 160448, upload-time = "2025-08-09T07:57:13.712Z" },
+
{ url = "https://files.pythonhosted.org/packages/04/9a/914d294daa4809c57667b77470533e65def9c0be1ef8b4c1183a99170e9d/charset_normalizer-3.4.3-cp39-cp39-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:fb731e5deb0c7ef82d698b0f4c5bb724633ee2a489401594c5c88b02e6cb15f7", size = 157758, upload-time = "2025-08-09T07:57:14.979Z" },
+
{ url = "https://files.pythonhosted.org/packages/b0/a8/6f5bcf1bcf63cb45625f7c5cadca026121ff8a6c8a3256d8d8cd59302663/charset_normalizer-3.4.3-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:257f26fed7d7ff59921b78244f3cd93ed2af1800ff048c33f624c87475819dd7", size = 152487, upload-time = "2025-08-09T07:57:16.332Z" },
+
{ url = "https://files.pythonhosted.org/packages/c4/72/d3d0e9592f4e504f9dea08b8db270821c909558c353dc3b457ed2509f2fb/charset_normalizer-3.4.3-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:1ef99f0456d3d46a50945c98de1774da86f8e992ab5c77865ea8b8195341fc19", size = 150054, upload-time = "2025-08-09T07:57:17.576Z" },
+
{ url = "https://files.pythonhosted.org/packages/20/30/5f64fe3981677fe63fa987b80e6c01042eb5ff653ff7cec1b7bd9268e54e/charset_normalizer-3.4.3-cp39-cp39-musllinux_1_2_ppc64le.whl", hash = "sha256:2c322db9c8c89009a990ef07c3bcc9f011a3269bc06782f916cd3d9eed7c9312", size = 161703, upload-time = "2025-08-09T07:57:20.012Z" },
+
{ url = "https://files.pythonhosted.org/packages/e1/ef/dd08b2cac9284fd59e70f7d97382c33a3d0a926e45b15fc21b3308324ffd/charset_normalizer-3.4.3-cp39-cp39-musllinux_1_2_s390x.whl", hash = "sha256:511729f456829ef86ac41ca78c63a5cb55240ed23b4b737faca0eb1abb1c41bc", size = 159096, upload-time = "2025-08-09T07:57:21.329Z" },
+
{ url = "https://files.pythonhosted.org/packages/45/8c/dcef87cfc2b3f002a6478f38906f9040302c68aebe21468090e39cde1445/charset_normalizer-3.4.3-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:88ab34806dea0671532d3f82d82b85e8fc23d7b2dd12fa837978dad9bb392a34", size = 153852, upload-time = "2025-08-09T07:57:22.608Z" },
+
{ url = "https://files.pythonhosted.org/packages/63/86/9cbd533bd37883d467fcd1bd491b3547a3532d0fbb46de2b99feeebf185e/charset_normalizer-3.4.3-cp39-cp39-win32.whl", hash = "sha256:16a8770207946ac75703458e2c743631c79c59c5890c80011d536248f8eaa432", size = 99840, upload-time = "2025-08-09T07:57:23.883Z" },
+
{ url = "https://files.pythonhosted.org/packages/ce/d6/7e805c8e5c46ff9729c49950acc4ee0aeb55efb8b3a56687658ad10c3216/charset_normalizer-3.4.3-cp39-cp39-win_amd64.whl", hash = "sha256:d22dbedd33326a4a5190dd4fe9e9e693ef12160c77382d9e87919bce54f3d4ca", size = 107438, upload-time = "2025-08-09T07:57:25.287Z" },
+
{ url = "https://files.pythonhosted.org/packages/8a/1f/f041989e93b001bc4e44bb1669ccdcf54d3f00e628229a85b08d330615c5/charset_normalizer-3.4.3-py3-none-any.whl", hash = "sha256:ce571ab16d890d23b5c278547ba694193a45011ff86a9162a71307ed9f86759a", size = 53175, upload-time = "2025-08-09T07:57:26.864Z" },
+
]
+
+
[[package]]
name = "click"
version = "8.1.8"
source = { registry = "https://pypi.org/simple" }
···
]
[[package]]
+
name = "distro"
+
version = "1.9.0"
+
source = { registry = "https://pypi.org/simple" }
+
sdist = { url = "https://files.pythonhosted.org/packages/fc/f8/98eea607f65de6527f8a2e8885fc8015d3e6f5775df186e443e0964a11c3/distro-1.9.0.tar.gz", hash = "sha256:2fa77c6fd8940f116ee1d6b94a2f90b13b5ea8d019b98bc8bafdcabcdd9bdbed", size = 60722, upload-time = "2023-12-24T09:54:32.31Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/12/b3/231ffd4ab1fc9d679809f356cebee130ac7daa00d6d6f3206dd4fd137e9e/distro-1.9.0-py3-none-any.whl", hash = "sha256:7bffd925d65168f85027d8da9af6bddab658135b840670a223589bc0c8ef02b2", size = 20277, upload-time = "2023-12-24T09:54:30.421Z" },
+
]
+
+
[[package]]
name = "dnspython"
version = "2.7.0"
source = { registry = "https://pypi.org/simple" }
···
]
[[package]]
+
name = "html2text"
+
version = "2025.4.15"
+
source = { registry = "https://pypi.org/simple" }
+
sdist = { url = "https://files.pythonhosted.org/packages/f8/27/e158d86ba1e82967cc2f790b0cb02030d4a8bef58e0c79a8590e9678107f/html2text-2025.4.15.tar.gz", hash = "sha256:948a645f8f0bc3abe7fd587019a2197a12436cd73d0d4908af95bfc8da337588", size = 64316, upload-time = "2025-04-15T04:02:30.045Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/1d/84/1a0f9555fd5f2b1c924ff932d99b40a0f8a6b12f6dd625e2a47f415b00ea/html2text-2025.4.15-py3-none-any.whl", hash = "sha256:00569167ffdab3d7767a4cdf589b7f57e777a5ed28d12907d8c58769ec734acc", size = 34656, upload-time = "2025-04-15T04:02:28.44Z" },
+
]
+
+
[[package]]
name = "httpcore"
version = "1.0.9"
source = { registry = "https://pypi.org/simple" }
···
]
[[package]]
+
name = "importlib-metadata"
+
version = "8.7.0"
+
source = { registry = "https://pypi.org/simple" }
+
dependencies = [
+
{ name = "zipp" },
+
]
+
sdist = { url = "https://files.pythonhosted.org/packages/76/66/650a33bd90f786193e4de4b3ad86ea60b53c89b669a5c7be931fac31cdb0/importlib_metadata-8.7.0.tar.gz", hash = "sha256:d13b81ad223b890aa16c5471f2ac3056cf76c5f10f82d6f9292f0b415f389000", size = 56641, upload-time = "2025-04-27T15:29:01.736Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/20/b0/36bd937216ec521246249be3bf9855081de4c5e06a0c9b4219dbeda50373/importlib_metadata-8.7.0-py3-none-any.whl", hash = "sha256:e5dd1551894c77868a30651cef00984d50e1002d06942a7101d34870c5f02afd", size = 27656, upload-time = "2025-04-27T15:29:00.214Z" },
+
]
+
+
[[package]]
name = "iniconfig"
version = "2.1.0"
source = { registry = "https://pypi.org/simple" }
···
]
[[package]]
+
name = "lxml"
+
version = "6.0.0"
+
source = { registry = "https://pypi.org/simple" }
+
sdist = { url = "https://files.pythonhosted.org/packages/c5/ed/60eb6fa2923602fba988d9ca7c5cdbd7cf25faa795162ed538b527a35411/lxml-6.0.0.tar.gz", hash = "sha256:032e65120339d44cdc3efc326c9f660f5f7205f3a535c1fdbf898b29ea01fb72", size = 4096938, upload-time = "2025-06-26T16:28:19.373Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/4b/e9/9c3ca02fbbb7585116c2e274b354a2d92b5c70561687dd733ec7b2018490/lxml-6.0.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:35bc626eec405f745199200ccb5c6b36f202675d204aa29bb52e27ba2b71dea8", size = 8399057, upload-time = "2025-06-26T16:25:02.169Z" },
+
{ url = "https://files.pythonhosted.org/packages/86/25/10a6e9001191854bf283515020f3633b1b1f96fd1b39aa30bf8fff7aa666/lxml-6.0.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:246b40f8a4aec341cbbf52617cad8ab7c888d944bfe12a6abd2b1f6cfb6f6082", size = 4569676, upload-time = "2025-06-26T16:25:05.431Z" },
+
{ url = "https://files.pythonhosted.org/packages/f5/a5/378033415ff61d9175c81de23e7ad20a3ffb614df4ffc2ffc86bc6746ffd/lxml-6.0.0-cp310-cp310-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:2793a627e95d119e9f1e19720730472f5543a6d84c50ea33313ce328d870f2dd", size = 5291361, upload-time = "2025-06-26T16:25:07.901Z" },
+
{ url = "https://files.pythonhosted.org/packages/5a/a6/19c87c4f3b9362b08dc5452a3c3bce528130ac9105fc8fff97ce895ce62e/lxml-6.0.0-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:46b9ed911f36bfeb6338e0b482e7fe7c27d362c52fde29f221fddbc9ee2227e7", size = 5008290, upload-time = "2025-06-28T18:47:13.196Z" },
+
{ url = "https://files.pythonhosted.org/packages/09/d1/e9b7ad4b4164d359c4d87ed8c49cb69b443225cb495777e75be0478da5d5/lxml-6.0.0-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:2b4790b558bee331a933e08883c423f65bbcd07e278f91b2272489e31ab1e2b4", size = 5163192, upload-time = "2025-06-28T18:47:17.279Z" },
+
{ url = "https://files.pythonhosted.org/packages/56/d6/b3eba234dc1584744b0b374a7f6c26ceee5dc2147369a7e7526e25a72332/lxml-6.0.0-cp310-cp310-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e2030956cf4886b10be9a0285c6802e078ec2391e1dd7ff3eb509c2c95a69b76", size = 5076973, upload-time = "2025-06-26T16:25:10.936Z" },
+
{ url = "https://files.pythonhosted.org/packages/8e/47/897142dd9385dcc1925acec0c4afe14cc16d310ce02c41fcd9010ac5d15d/lxml-6.0.0-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4d23854ecf381ab1facc8f353dcd9adeddef3652268ee75297c1164c987c11dc", size = 5297795, upload-time = "2025-06-26T16:25:14.282Z" },
+
{ url = "https://files.pythonhosted.org/packages/fb/db/551ad84515c6f415cea70193a0ff11d70210174dc0563219f4ce711655c6/lxml-6.0.0-cp310-cp310-manylinux_2_31_armv7l.whl", hash = "sha256:43fe5af2d590bf4691531b1d9a2495d7aab2090547eaacd224a3afec95706d76", size = 4776547, upload-time = "2025-06-26T16:25:17.123Z" },
+
{ url = "https://files.pythonhosted.org/packages/e0/14/c4a77ab4f89aaf35037a03c472f1ccc54147191888626079bd05babd6808/lxml-6.0.0-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:74e748012f8c19b47f7d6321ac929a9a94ee92ef12bc4298c47e8b7219b26541", size = 5124904, upload-time = "2025-06-26T16:25:19.485Z" },
+
{ url = "https://files.pythonhosted.org/packages/70/b4/12ae6a51b8da106adec6a2e9c60f532350a24ce954622367f39269e509b1/lxml-6.0.0-cp310-cp310-musllinux_1_2_armv7l.whl", hash = "sha256:43cfbb7db02b30ad3926e8fceaef260ba2fb7df787e38fa2df890c1ca7966c3b", size = 4805804, upload-time = "2025-06-26T16:25:21.949Z" },
+
{ url = "https://files.pythonhosted.org/packages/a9/b6/2e82d34d49f6219cdcb6e3e03837ca5fb8b7f86c2f35106fb8610ac7f5b8/lxml-6.0.0-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:34190a1ec4f1e84af256495436b2d196529c3f2094f0af80202947567fdbf2e7", size = 5323477, upload-time = "2025-06-26T16:25:24.475Z" },
+
{ url = "https://files.pythonhosted.org/packages/a1/e6/b83ddc903b05cd08a5723fefd528eee84b0edd07bdf87f6c53a1fda841fd/lxml-6.0.0-cp310-cp310-win32.whl", hash = "sha256:5967fe415b1920a3877a4195e9a2b779249630ee49ece22021c690320ff07452", size = 3613840, upload-time = "2025-06-26T16:25:27.345Z" },
+
{ url = "https://files.pythonhosted.org/packages/40/af/874fb368dd0c663c030acb92612341005e52e281a102b72a4c96f42942e1/lxml-6.0.0-cp310-cp310-win_amd64.whl", hash = "sha256:f3389924581d9a770c6caa4df4e74b606180869043b9073e2cec324bad6e306e", size = 3993584, upload-time = "2025-06-26T16:25:29.391Z" },
+
{ url = "https://files.pythonhosted.org/packages/4a/f4/d296bc22c17d5607653008f6dd7b46afdfda12efd31021705b507df652bb/lxml-6.0.0-cp310-cp310-win_arm64.whl", hash = "sha256:522fe7abb41309e9543b0d9b8b434f2b630c5fdaf6482bee642b34c8c70079c8", size = 3681400, upload-time = "2025-06-26T16:25:31.421Z" },
+
{ url = "https://files.pythonhosted.org/packages/7c/23/828d4cc7da96c611ec0ce6147bbcea2fdbde023dc995a165afa512399bbf/lxml-6.0.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:4ee56288d0df919e4aac43b539dd0e34bb55d6a12a6562038e8d6f3ed07f9e36", size = 8438217, upload-time = "2025-06-26T16:25:34.349Z" },
+
{ url = "https://files.pythonhosted.org/packages/f1/33/5ac521212c5bcb097d573145d54b2b4a3c9766cda88af5a0e91f66037c6e/lxml-6.0.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:b8dd6dd0e9c1992613ccda2bcb74fc9d49159dbe0f0ca4753f37527749885c25", size = 4590317, upload-time = "2025-06-26T16:25:38.103Z" },
+
{ url = "https://files.pythonhosted.org/packages/2b/2e/45b7ca8bee304c07f54933c37afe7dd4d39ff61ba2757f519dcc71bc5d44/lxml-6.0.0-cp311-cp311-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:d7ae472f74afcc47320238b5dbfd363aba111a525943c8a34a1b657c6be934c3", size = 5221628, upload-time = "2025-06-26T16:25:40.878Z" },
+
{ url = "https://files.pythonhosted.org/packages/32/23/526d19f7eb2b85da1f62cffb2556f647b049ebe2a5aa8d4d41b1fb2c7d36/lxml-6.0.0-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:5592401cdf3dc682194727c1ddaa8aa0f3ddc57ca64fd03226a430b955eab6f6", size = 4949429, upload-time = "2025-06-28T18:47:20.046Z" },
+
{ url = "https://files.pythonhosted.org/packages/ac/cc/f6be27a5c656a43a5344e064d9ae004d4dcb1d3c9d4f323c8189ddfe4d13/lxml-6.0.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:58ffd35bd5425c3c3b9692d078bf7ab851441434531a7e517c4984d5634cd65b", size = 5087909, upload-time = "2025-06-28T18:47:22.834Z" },
+
{ url = "https://files.pythonhosted.org/packages/3b/e6/8ec91b5bfbe6972458bc105aeb42088e50e4b23777170404aab5dfb0c62d/lxml-6.0.0-cp311-cp311-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:f720a14aa102a38907c6d5030e3d66b3b680c3e6f6bc95473931ea3c00c59967", size = 5031713, upload-time = "2025-06-26T16:25:43.226Z" },
+
{ url = "https://files.pythonhosted.org/packages/33/cf/05e78e613840a40e5be3e40d892c48ad3e475804db23d4bad751b8cadb9b/lxml-6.0.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c2a5e8d207311a0170aca0eb6b160af91adc29ec121832e4ac151a57743a1e1e", size = 5232417, upload-time = "2025-06-26T16:25:46.111Z" },
+
{ url = "https://files.pythonhosted.org/packages/ac/8c/6b306b3e35c59d5f0b32e3b9b6b3b0739b32c0dc42a295415ba111e76495/lxml-6.0.0-cp311-cp311-manylinux_2_31_armv7l.whl", hash = "sha256:2dd1cc3ea7e60bfb31ff32cafe07e24839df573a5e7c2d33304082a5019bcd58", size = 4681443, upload-time = "2025-06-26T16:25:48.837Z" },
+
{ url = "https://files.pythonhosted.org/packages/59/43/0bd96bece5f7eea14b7220476835a60d2b27f8e9ca99c175f37c085cb154/lxml-6.0.0-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:2cfcf84f1defed7e5798ef4f88aa25fcc52d279be731ce904789aa7ccfb7e8d2", size = 5074542, upload-time = "2025-06-26T16:25:51.65Z" },
+
{ url = "https://files.pythonhosted.org/packages/e2/3d/32103036287a8ca012d8518071f8852c68f2b3bfe048cef2a0202eb05910/lxml-6.0.0-cp311-cp311-musllinux_1_2_armv7l.whl", hash = "sha256:a52a4704811e2623b0324a18d41ad4b9fabf43ce5ff99b14e40a520e2190c851", size = 4729471, upload-time = "2025-06-26T16:25:54.571Z" },
+
{ url = "https://files.pythonhosted.org/packages/ca/a8/7be5d17df12d637d81854bd8648cd329f29640a61e9a72a3f77add4a311b/lxml-6.0.0-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:c16304bba98f48a28ae10e32a8e75c349dd742c45156f297e16eeb1ba9287a1f", size = 5256285, upload-time = "2025-06-26T16:25:56.997Z" },
+
{ url = "https://files.pythonhosted.org/packages/cd/d0/6cb96174c25e0d749932557c8d51d60c6e292c877b46fae616afa23ed31a/lxml-6.0.0-cp311-cp311-win32.whl", hash = "sha256:f8d19565ae3eb956d84da3ef367aa7def14a2735d05bd275cd54c0301f0d0d6c", size = 3612004, upload-time = "2025-06-26T16:25:59.11Z" },
+
{ url = "https://files.pythonhosted.org/packages/ca/77/6ad43b165dfc6dead001410adeb45e88597b25185f4479b7ca3b16a5808f/lxml-6.0.0-cp311-cp311-win_amd64.whl", hash = "sha256:b2d71cdefda9424adff9a3607ba5bbfc60ee972d73c21c7e3c19e71037574816", size = 4003470, upload-time = "2025-06-26T16:26:01.655Z" },
+
{ url = "https://files.pythonhosted.org/packages/a0/bc/4c50ec0eb14f932a18efc34fc86ee936a66c0eb5f2fe065744a2da8a68b2/lxml-6.0.0-cp311-cp311-win_arm64.whl", hash = "sha256:8a2e76efbf8772add72d002d67a4c3d0958638696f541734304c7f28217a9cab", size = 3682477, upload-time = "2025-06-26T16:26:03.808Z" },
+
{ url = "https://files.pythonhosted.org/packages/89/c3/d01d735c298d7e0ddcedf6f028bf556577e5ab4f4da45175ecd909c79378/lxml-6.0.0-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:78718d8454a6e928470d511bf8ac93f469283a45c354995f7d19e77292f26108", size = 8429515, upload-time = "2025-06-26T16:26:06.776Z" },
+
{ url = "https://files.pythonhosted.org/packages/06/37/0e3eae3043d366b73da55a86274a590bae76dc45aa004b7042e6f97803b1/lxml-6.0.0-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:84ef591495ffd3f9dcabffd6391db7bb70d7230b5c35ef5148354a134f56f2be", size = 4601387, upload-time = "2025-06-26T16:26:09.511Z" },
+
{ url = "https://files.pythonhosted.org/packages/a3/28/e1a9a881e6d6e29dda13d633885d13acb0058f65e95da67841c8dd02b4a8/lxml-6.0.0-cp312-cp312-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:2930aa001a3776c3e2601cb8e0a15d21b8270528d89cc308be4843ade546b9ab", size = 5228928, upload-time = "2025-06-26T16:26:12.337Z" },
+
{ url = "https://files.pythonhosted.org/packages/9a/55/2cb24ea48aa30c99f805921c1c7860c1f45c0e811e44ee4e6a155668de06/lxml-6.0.0-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:219e0431ea8006e15005767f0351e3f7f9143e793e58519dc97fe9e07fae5563", size = 4952289, upload-time = "2025-06-28T18:47:25.602Z" },
+
{ url = "https://files.pythonhosted.org/packages/31/c0/b25d9528df296b9a3306ba21ff982fc5b698c45ab78b94d18c2d6ae71fd9/lxml-6.0.0-cp312-cp312-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:bd5913b4972681ffc9718bc2d4c53cde39ef81415e1671ff93e9aa30b46595e7", size = 5111310, upload-time = "2025-06-28T18:47:28.136Z" },
+
{ url = "https://files.pythonhosted.org/packages/e9/af/681a8b3e4f668bea6e6514cbcb297beb6de2b641e70f09d3d78655f4f44c/lxml-6.0.0-cp312-cp312-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:390240baeb9f415a82eefc2e13285016f9c8b5ad71ec80574ae8fa9605093cd7", size = 5025457, upload-time = "2025-06-26T16:26:15.068Z" },
+
{ url = "https://files.pythonhosted.org/packages/99/b6/3a7971aa05b7be7dfebc7ab57262ec527775c2c3c5b2f43675cac0458cad/lxml-6.0.0-cp312-cp312-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:d6e200909a119626744dd81bae409fc44134389e03fbf1d68ed2a55a2fb10991", size = 5657016, upload-time = "2025-07-03T19:19:06.008Z" },
+
{ url = "https://files.pythonhosted.org/packages/69/f8/693b1a10a891197143c0673fcce5b75fc69132afa81a36e4568c12c8faba/lxml-6.0.0-cp312-cp312-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:ca50bd612438258a91b5b3788c6621c1f05c8c478e7951899f492be42defc0da", size = 5257565, upload-time = "2025-06-26T16:26:17.906Z" },
+
{ url = "https://files.pythonhosted.org/packages/a8/96/e08ff98f2c6426c98c8964513c5dab8d6eb81dadcd0af6f0c538ada78d33/lxml-6.0.0-cp312-cp312-manylinux_2_31_armv7l.whl", hash = "sha256:c24b8efd9c0f62bad0439283c2c795ef916c5a6b75f03c17799775c7ae3c0c9e", size = 4713390, upload-time = "2025-06-26T16:26:20.292Z" },
+
{ url = "https://files.pythonhosted.org/packages/a8/83/6184aba6cc94d7413959f6f8f54807dc318fdcd4985c347fe3ea6937f772/lxml-6.0.0-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:afd27d8629ae94c5d863e32ab0e1d5590371d296b87dae0a751fb22bf3685741", size = 5066103, upload-time = "2025-06-26T16:26:22.765Z" },
+
{ url = "https://files.pythonhosted.org/packages/ee/01/8bf1f4035852d0ff2e36a4d9aacdbcc57e93a6cd35a54e05fa984cdf73ab/lxml-6.0.0-cp312-cp312-musllinux_1_2_armv7l.whl", hash = "sha256:54c4855eabd9fc29707d30141be99e5cd1102e7d2258d2892314cf4c110726c3", size = 4791428, upload-time = "2025-06-26T16:26:26.461Z" },
+
{ url = "https://files.pythonhosted.org/packages/29/31/c0267d03b16954a85ed6b065116b621d37f559553d9339c7dcc4943a76f1/lxml-6.0.0-cp312-cp312-musllinux_1_2_ppc64le.whl", hash = "sha256:c907516d49f77f6cd8ead1322198bdfd902003c3c330c77a1c5f3cc32a0e4d16", size = 5678523, upload-time = "2025-07-03T19:19:09.837Z" },
+
{ url = "https://files.pythonhosted.org/packages/5c/f7/5495829a864bc5f8b0798d2b52a807c89966523140f3d6fa3a58ab6720ea/lxml-6.0.0-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:36531f81c8214e293097cd2b7873f178997dae33d3667caaae8bdfb9666b76c0", size = 5281290, upload-time = "2025-06-26T16:26:29.406Z" },
+
{ url = "https://files.pythonhosted.org/packages/79/56/6b8edb79d9ed294ccc4e881f4db1023af56ba451909b9ce79f2a2cd7c532/lxml-6.0.0-cp312-cp312-win32.whl", hash = "sha256:690b20e3388a7ec98e899fd54c924e50ba6693874aa65ef9cb53de7f7de9d64a", size = 3613495, upload-time = "2025-06-26T16:26:31.588Z" },
+
{ url = "https://files.pythonhosted.org/packages/0b/1e/cc32034b40ad6af80b6fd9b66301fc0f180f300002e5c3eb5a6110a93317/lxml-6.0.0-cp312-cp312-win_amd64.whl", hash = "sha256:310b719b695b3dd442cdfbbe64936b2f2e231bb91d998e99e6f0daf991a3eba3", size = 4014711, upload-time = "2025-06-26T16:26:33.723Z" },
+
{ url = "https://files.pythonhosted.org/packages/55/10/dc8e5290ae4c94bdc1a4c55865be7e1f31dfd857a88b21cbba68b5fea61b/lxml-6.0.0-cp312-cp312-win_arm64.whl", hash = "sha256:8cb26f51c82d77483cdcd2b4a53cda55bbee29b3c2f3ddeb47182a2a9064e4eb", size = 3674431, upload-time = "2025-06-26T16:26:35.959Z" },
+
{ url = "https://files.pythonhosted.org/packages/79/21/6e7c060822a3c954ff085e5e1b94b4a25757c06529eac91e550f3f5cd8b8/lxml-6.0.0-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:6da7cd4f405fd7db56e51e96bff0865b9853ae70df0e6720624049da76bde2da", size = 8414372, upload-time = "2025-06-26T16:26:39.079Z" },
+
{ url = "https://files.pythonhosted.org/packages/a4/f6/051b1607a459db670fc3a244fa4f06f101a8adf86cda263d1a56b3a4f9d5/lxml-6.0.0-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:b34339898bb556a2351a1830f88f751679f343eabf9cf05841c95b165152c9e7", size = 4593940, upload-time = "2025-06-26T16:26:41.891Z" },
+
{ url = "https://files.pythonhosted.org/packages/8e/74/dd595d92a40bda3c687d70d4487b2c7eff93fd63b568acd64fedd2ba00fe/lxml-6.0.0-cp313-cp313-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:51a5e4c61a4541bd1cd3ba74766d0c9b6c12d6a1a4964ef60026832aac8e79b3", size = 5214329, upload-time = "2025-06-26T16:26:44.669Z" },
+
{ url = "https://files.pythonhosted.org/packages/52/46/3572761efc1bd45fcafb44a63b3b0feeb5b3f0066886821e94b0254f9253/lxml-6.0.0-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:d18a25b19ca7307045581b18b3ec9ead2b1db5ccd8719c291f0cd0a5cec6cb81", size = 4947559, upload-time = "2025-06-28T18:47:31.091Z" },
+
{ url = "https://files.pythonhosted.org/packages/94/8a/5e40de920e67c4f2eef9151097deb9b52d86c95762d8ee238134aff2125d/lxml-6.0.0-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:d4f0c66df4386b75d2ab1e20a489f30dc7fd9a06a896d64980541506086be1f1", size = 5102143, upload-time = "2025-06-28T18:47:33.612Z" },
+
{ url = "https://files.pythonhosted.org/packages/7c/4b/20555bdd75d57945bdabfbc45fdb1a36a1a0ff9eae4653e951b2b79c9209/lxml-6.0.0-cp313-cp313-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9f4b481b6cc3a897adb4279216695150bbe7a44c03daba3c894f49d2037e0a24", size = 5021931, upload-time = "2025-06-26T16:26:47.503Z" },
+
{ url = "https://files.pythonhosted.org/packages/b6/6e/cf03b412f3763d4ca23b25e70c96a74cfece64cec3addf1c4ec639586b13/lxml-6.0.0-cp313-cp313-manylinux_2_27_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:8a78d6c9168f5bcb20971bf3329c2b83078611fbe1f807baadc64afc70523b3a", size = 5645469, upload-time = "2025-07-03T19:19:13.32Z" },
+
{ url = "https://files.pythonhosted.org/packages/d4/dd/39c8507c16db6031f8c1ddf70ed95dbb0a6d466a40002a3522c128aba472/lxml-6.0.0-cp313-cp313-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2ae06fbab4f1bb7db4f7c8ca9897dc8db4447d1a2b9bee78474ad403437bcc29", size = 5247467, upload-time = "2025-06-26T16:26:49.998Z" },
+
{ url = "https://files.pythonhosted.org/packages/4d/56/732d49def0631ad633844cfb2664563c830173a98d5efd9b172e89a4800d/lxml-6.0.0-cp313-cp313-manylinux_2_31_armv7l.whl", hash = "sha256:1fa377b827ca2023244a06554c6e7dc6828a10aaf74ca41965c5d8a4925aebb4", size = 4720601, upload-time = "2025-06-26T16:26:52.564Z" },
+
{ url = "https://files.pythonhosted.org/packages/8f/7f/6b956fab95fa73462bca25d1ea7fc8274ddf68fb8e60b78d56c03b65278e/lxml-6.0.0-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:1676b56d48048a62ef77a250428d1f31f610763636e0784ba67a9740823988ca", size = 5060227, upload-time = "2025-06-26T16:26:55.054Z" },
+
{ url = "https://files.pythonhosted.org/packages/97/06/e851ac2924447e8b15a294855caf3d543424364a143c001014d22c8ca94c/lxml-6.0.0-cp313-cp313-musllinux_1_2_armv7l.whl", hash = "sha256:0e32698462aacc5c1cf6bdfebc9c781821b7e74c79f13e5ffc8bfe27c42b1abf", size = 4790637, upload-time = "2025-06-26T16:26:57.384Z" },
+
{ url = "https://files.pythonhosted.org/packages/06/d4/fd216f3cd6625022c25b336c7570d11f4a43adbaf0a56106d3d496f727a7/lxml-6.0.0-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:4d6036c3a296707357efb375cfc24bb64cd955b9ec731abf11ebb1e40063949f", size = 5662049, upload-time = "2025-07-03T19:19:16.409Z" },
+
{ url = "https://files.pythonhosted.org/packages/52/03/0e764ce00b95e008d76b99d432f1807f3574fb2945b496a17807a1645dbd/lxml-6.0.0-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:7488a43033c958637b1a08cddc9188eb06d3ad36582cebc7d4815980b47e27ef", size = 5272430, upload-time = "2025-06-26T16:27:00.031Z" },
+
{ url = "https://files.pythonhosted.org/packages/5f/01/d48cc141bc47bc1644d20fe97bbd5e8afb30415ec94f146f2f76d0d9d098/lxml-6.0.0-cp313-cp313-win32.whl", hash = "sha256:5fcd7d3b1d8ecb91445bd71b9c88bdbeae528fefee4f379895becfc72298d181", size = 3612896, upload-time = "2025-06-26T16:27:04.251Z" },
+
{ url = "https://files.pythonhosted.org/packages/f4/87/6456b9541d186ee7d4cb53bf1b9a0d7f3b1068532676940fdd594ac90865/lxml-6.0.0-cp313-cp313-win_amd64.whl", hash = "sha256:2f34687222b78fff795feeb799a7d44eca2477c3d9d3a46ce17d51a4f383e32e", size = 4013132, upload-time = "2025-06-26T16:27:06.415Z" },
+
{ url = "https://files.pythonhosted.org/packages/b7/42/85b3aa8f06ca0d24962f8100f001828e1f1f1a38c954c16e71154ed7d53a/lxml-6.0.0-cp313-cp313-win_arm64.whl", hash = "sha256:21db1ec5525780fd07251636eb5f7acb84003e9382c72c18c542a87c416ade03", size = 3672642, upload-time = "2025-06-26T16:27:09.888Z" },
+
{ url = "https://files.pythonhosted.org/packages/dc/04/a53941fb0d7c60eed08301942c70aa63650a59308d15e05eb823acbce41d/lxml-6.0.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:85b14a4689d5cff426c12eefe750738648706ea2753b20c2f973b2a000d3d261", size = 8407699, upload-time = "2025-06-26T16:27:28.167Z" },
+
{ url = "https://files.pythonhosted.org/packages/44/d2/e1d4526e903afebe147f858322f1c0b36e44969d5c87e5d243c23f81987f/lxml-6.0.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:f64ccf593916e93b8d36ed55401bb7fe9c7d5de3180ce2e10b08f82a8f397316", size = 4574678, upload-time = "2025-06-26T16:27:30.888Z" },
+
{ url = "https://files.pythonhosted.org/packages/61/aa/b0a8ee233c00f2f437dbb6e7bd2df115a996d8211b7d03f4ab029b8e3378/lxml-6.0.0-cp39-cp39-manylinux2010_i686.manylinux2014_i686.manylinux_2_12_i686.manylinux_2_17_i686.whl", hash = "sha256:b372d10d17a701b0945f67be58fae4664fd056b85e0ff0fbc1e6c951cdbc0512", size = 5292694, upload-time = "2025-06-26T16:27:34.037Z" },
+
{ url = "https://files.pythonhosted.org/packages/53/7f/e6f377489b2ac4289418b879c34ed664e5a1174b2a91590936ec4174e773/lxml-6.0.0-cp39-cp39-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:a674c0948789e9136d69065cc28009c1b1874c6ea340253db58be7622ce6398f", size = 5009177, upload-time = "2025-06-28T18:47:39.377Z" },
+
{ url = "https://files.pythonhosted.org/packages/c6/05/ae239e997374680741b768044545251a29abc21ada42248638dbed749a0a/lxml-6.0.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:edf6e4c8fe14dfe316939711e3ece3f9a20760aabf686051b537a7562f4da91a", size = 5163787, upload-time = "2025-06-28T18:47:42.452Z" },
+
{ url = "https://files.pythonhosted.org/packages/2a/da/4f27222570d008fd2386e19d6923af6e64c317ee6116bbb2b98247f98f31/lxml-6.0.0-cp39-cp39-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:048a930eb4572829604982e39a0c7289ab5dc8abc7fc9f5aabd6fbc08c154e93", size = 5075755, upload-time = "2025-06-26T16:27:36.611Z" },
+
{ url = "https://files.pythonhosted.org/packages/1f/65/12552caf7b3e3b9b9aba12349370dc53a36d4058e4ed482811f1d262deee/lxml-6.0.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:c0b5fa5eda84057a4f1bbb4bb77a8c28ff20ae7ce211588d698ae453e13c6281", size = 5297070, upload-time = "2025-06-26T16:27:39.232Z" },
+
{ url = "https://files.pythonhosted.org/packages/3e/6a/f053a8369fdf4e3b8127a6ffb079c519167e684e956a1281392c5c3679b6/lxml-6.0.0-cp39-cp39-manylinux_2_31_armv7l.whl", hash = "sha256:c352fc8f36f7e9727db17adbf93f82499457b3d7e5511368569b4c5bd155a922", size = 4779864, upload-time = "2025-06-26T16:27:41.713Z" },
+
{ url = "https://files.pythonhosted.org/packages/df/7b/b2a392ad34ce37a17d1cf3aec303e15125768061cf0e355a92d292d20d37/lxml-6.0.0-cp39-cp39-musllinux_1_2_aarch64.whl", hash = "sha256:8db5dc617cb937ae17ff3403c3a70a7de9df4852a046f93e71edaec678f721d0", size = 5122039, upload-time = "2025-06-26T16:27:44.252Z" },
+
{ url = "https://files.pythonhosted.org/packages/80/0e/6459ff8ae7d87188e1f99f11691d0f32831caa6429599c3b289de9f08b21/lxml-6.0.0-cp39-cp39-musllinux_1_2_armv7l.whl", hash = "sha256:2181e4b1d07dde53986023482673c0f1fba5178ef800f9ab95ad791e8bdded6a", size = 4805117, upload-time = "2025-06-26T16:27:46.769Z" },
+
{ url = "https://files.pythonhosted.org/packages/ca/78/4186f573805ff623d28a8736788a3b29eeaf589afdcf0233de2c9bb9fc50/lxml-6.0.0-cp39-cp39-musllinux_1_2_x86_64.whl", hash = "sha256:b3c98d5b24c6095e89e03d65d5c574705be3d49c0d8ca10c17a8a4b5201b72f5", size = 5322300, upload-time = "2025-06-26T16:27:49.278Z" },
+
{ url = "https://files.pythonhosted.org/packages/e8/97/352e07992901473529c8e19dbfdba6430ba6a37f6b46a4d0fa93321f8fee/lxml-6.0.0-cp39-cp39-win32.whl", hash = "sha256:04d67ceee6db4bcb92987ccb16e53bef6b42ced872509f333c04fb58a3315256", size = 3615832, upload-time = "2025-06-26T16:27:51.728Z" },
+
{ url = "https://files.pythonhosted.org/packages/71/93/8f3b880e2618e548fb0ca157349abb526d81cb4f01ef5ea3a0f22bd4d0df/lxml-6.0.0-cp39-cp39-win_amd64.whl", hash = "sha256:e0b1520ef900e9ef62e392dd3d7ae4f5fa224d1dd62897a792cf353eb20b6cae", size = 4038551, upload-time = "2025-06-26T16:27:54.193Z" },
+
{ url = "https://files.pythonhosted.org/packages/e7/8a/046cbf5b262dd2858c6e65833339100fd5f1c017b37b26bc47c92d4584d7/lxml-6.0.0-cp39-cp39-win_arm64.whl", hash = "sha256:e35e8aaaf3981489f42884b59726693de32dabfc438ac10ef4eb3409961fd402", size = 3684237, upload-time = "2025-06-26T16:27:57.117Z" },
+
{ url = "https://files.pythonhosted.org/packages/66/e1/2c22a3cff9e16e1d717014a1e6ec2bf671bf56ea8716bb64466fcf820247/lxml-6.0.0-pp310-pypy310_pp73-macosx_10_15_x86_64.whl", hash = "sha256:dbdd7679a6f4f08152818043dbb39491d1af3332128b3752c3ec5cebc0011a72", size = 3898804, upload-time = "2025-06-26T16:27:59.751Z" },
+
{ url = "https://files.pythonhosted.org/packages/2b/3a/d68cbcb4393a2a0a867528741fafb7ce92dac5c9f4a1680df98e5e53e8f5/lxml-6.0.0-pp310-pypy310_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:40442e2a4456e9910875ac12951476d36c0870dcb38a68719f8c4686609897c4", size = 4216406, upload-time = "2025-06-28T18:47:45.518Z" },
+
{ url = "https://files.pythonhosted.org/packages/15/8f/d9bfb13dff715ee3b2a1ec2f4a021347ea3caf9aba93dea0cfe54c01969b/lxml-6.0.0-pp310-pypy310_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:db0efd6bae1c4730b9c863fc4f5f3c0fa3e8f05cae2c44ae141cb9dfc7d091dc", size = 4326455, upload-time = "2025-06-28T18:47:48.411Z" },
+
{ url = "https://files.pythonhosted.org/packages/01/8b/fde194529ee8a27e6f5966d7eef05fa16f0567e4a8e8abc3b855ef6b3400/lxml-6.0.0-pp310-pypy310_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:9ab542c91f5a47aaa58abdd8ea84b498e8e49fe4b883d67800017757a3eb78e8", size = 4268788, upload-time = "2025-06-26T16:28:02.776Z" },
+
{ url = "https://files.pythonhosted.org/packages/99/a8/3b8e2581b4f8370fc9e8dc343af4abdfadd9b9229970fc71e67bd31c7df1/lxml-6.0.0-pp310-pypy310_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:013090383863b72c62a702d07678b658fa2567aa58d373d963cca245b017e065", size = 4411394, upload-time = "2025-06-26T16:28:05.179Z" },
+
{ url = "https://files.pythonhosted.org/packages/e7/a5/899a4719e02ff4383f3f96e5d1878f882f734377f10dfb69e73b5f223e44/lxml-6.0.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:c86df1c9af35d903d2b52d22ea3e66db8058d21dc0f59842ca5deb0595921141", size = 3517946, upload-time = "2025-06-26T16:28:07.665Z" },
+
{ url = "https://files.pythonhosted.org/packages/93/e3/ef14f1d23aea1dec1eccbe2c07a93b6d0be693fd9d5f248a47155e436701/lxml-6.0.0-pp39-pypy39_pp73-macosx_10_15_x86_64.whl", hash = "sha256:4337e4aec93b7c011f7ee2e357b0d30562edd1955620fdd4aeab6aacd90d43c5", size = 3892325, upload-time = "2025-06-26T16:28:10.024Z" },
+
{ url = "https://files.pythonhosted.org/packages/09/8a/1410b9e1ec43f606f9aac0661d09892509d86032e229711798906e1b5e7a/lxml-6.0.0-pp39-pypy39_pp73-manylinux2014_aarch64.manylinux_2_17_aarch64.whl", hash = "sha256:ae74f7c762270196d2dda56f8dd7309411f08a4084ff2dfcc0b095a218df2e06", size = 4210839, upload-time = "2025-06-28T18:47:50.768Z" },
+
{ url = "https://files.pythonhosted.org/packages/79/cb/6696ce0d1712c5ae94b18bdf225086a5fb04b23938ac4d2011b323b3860b/lxml-6.0.0-pp39-pypy39_pp73-manylinux2014_x86_64.manylinux_2_17_x86_64.whl", hash = "sha256:059c4cbf3973a621b62ea3132934ae737da2c132a788e6cfb9b08d63a0ef73f9", size = 4321235, upload-time = "2025-06-28T18:47:53.338Z" },
+
{ url = "https://files.pythonhosted.org/packages/f3/98/04997f61d720cf320a0daee66b3096e3a3b57453e15549c14b87058c2acd/lxml-6.0.0-pp39-pypy39_pp73-manylinux_2_27_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:17f090a9bc0ce8da51a5632092f98a7e7f84bca26f33d161a98b57f7fb0004ca", size = 4265071, upload-time = "2025-06-26T16:28:12.367Z" },
+
{ url = "https://files.pythonhosted.org/packages/e6/86/e5f6fa80154a5f5bf2c1e89d6265892299942edeb115081ca72afe7c7199/lxml-6.0.0-pp39-pypy39_pp73-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:9da022c14baeec36edfcc8daf0e281e2f55b950249a455776f0d1adeeada4734", size = 4406816, upload-time = "2025-06-26T16:28:14.744Z" },
+
{ url = "https://files.pythonhosted.org/packages/18/a6/ae69e0e6f5fb6293eb8cbfbf8a259e37d71608bbae3658a768dd26b69f3e/lxml-6.0.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:a55da151d0b0c6ab176b4e761670ac0e2667817a1e0dadd04a01d0561a219349", size = 3515499, upload-time = "2025-06-26T16:28:17.035Z" },
+
]
+
+
[[package]]
name = "markdown-it-py"
version = "3.0.0"
source = { registry = "https://pypi.org/simple" }
···
]
[[package]]
+
name = "markdownify"
+
version = "1.2.0"
+
source = { registry = "https://pypi.org/simple" }
+
dependencies = [
+
{ name = "beautifulsoup4" },
+
{ name = "six" },
+
]
+
sdist = { url = "https://files.pythonhosted.org/packages/83/1b/6f2697b51eaca81f08852fd2734745af15718fea10222a1d40f8a239c4ea/markdownify-1.2.0.tar.gz", hash = "sha256:f6c367c54eb24ee953921804dfe6d6575c5e5b42c643955e7242034435de634c", size = 18771, upload-time = "2025-08-09T17:44:15.302Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/6a/e2/7af643acb4cae0741dffffaa7f3f7c9e7ab4046724543ba1777c401d821c/markdownify-1.2.0-py3-none-any.whl", hash = "sha256:48e150a1c4993d4d50f282f725c0111bd9eb25645d41fa2f543708fd44161351", size = 15561, upload-time = "2025-08-09T17:44:14.074Z" },
+
]
+
+
[[package]]
name = "mdurl"
version = "0.1.2"
source = { registry = "https://pypi.org/simple" }
···
{ url = "https://files.pythonhosted.org/packages/0b/c7/d3654a790129684d0e8dc04707cb6d75633d7b102a962c6dc0f862c64c25/pendulum-3.1.0-pp39-pypy39_pp73-musllinux_1_1_x86_64.whl", hash = "sha256:e4cbd933a40c915ed5c41b083115cca15c7afa8179363b2a61db167c64fa0670", size = 526685, upload-time = "2025-04-19T14:02:31.523Z" },
{ url = "https://files.pythonhosted.org/packages/50/d9/4a166256386b7973e36ff44135e8d009f4afb25d6c72df5380ccfd6fbb89/pendulum-3.1.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:3363a470b5d67dbf8d9fd1bf77dcdbf720788bc3be4a10bdcd28ae5d7dbd26c4", size = 261170, upload-time = "2025-04-19T14:02:33.099Z" },
{ url = "https://files.pythonhosted.org/packages/6e/23/e98758924d1b3aac11a626268eabf7f3cf177e7837c28d47bf84c64532d0/pendulum-3.1.0-py3-none-any.whl", hash = "sha256:f9178c2a8e291758ade1e8dd6371b1d26d08371b4c7730a6e9a3ef8b16ebae0f", size = 111799, upload-time = "2025-04-19T14:02:34.739Z" },
+
]
+
+
[[package]]
+
name = "pip"
+
version = "25.2"
+
source = { registry = "https://pypi.org/simple" }
+
sdist = { url = "https://files.pythonhosted.org/packages/20/16/650289cd3f43d5a2fadfd98c68bd1e1e7f2550a1a5326768cddfbcedb2c5/pip-25.2.tar.gz", hash = "sha256:578283f006390f85bb6282dffb876454593d637f5d1be494b5202ce4877e71f2", size = 1840021, upload-time = "2025-07-30T21:50:15.401Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/b7/3f/945ef7ab14dc4f9d7f40288d2df998d1837ee0888ec3659c813487572faa/pip-25.2-py3-none-any.whl", hash = "sha256:6d67a2b4e7f14d8b31b8b52648866fa717f45a1eb70e83002f4331d07e953717", size = 1752557, upload-time = "2025-07-30T21:50:13.323Z" },
]
[[package]]
···
]
[[package]]
+
name = "requests"
+
version = "2.32.4"
+
source = { registry = "https://pypi.org/simple" }
+
dependencies = [
+
{ name = "certifi" },
+
{ name = "charset-normalizer" },
+
{ name = "idna" },
+
{ name = "urllib3" },
+
]
+
sdist = { url = "https://files.pythonhosted.org/packages/e1/0a/929373653770d8a0d7ea76c37de6e41f11eb07559b103b1c02cafb3f7cf8/requests-2.32.4.tar.gz", hash = "sha256:27d0316682c8a29834d3264820024b62a36942083d52caf2f14c0591336d3422", size = 135258, upload-time = "2025-06-09T16:43:07.34Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/7c/e4/56027c4a6b4ae70ca9de302488c5ca95ad4a39e190093d6c1a8ace08341b/requests-2.32.4-py3-none-any.whl", hash = "sha256:27babd3cda2a6d50b30443204ee89830707d396671944c998b5975b031ac2b2c", size = 64847, upload-time = "2025-06-09T16:43:05.728Z" },
+
]
+
+
[[package]]
name = "rich"
version = "14.0.0"
source = { registry = "https://pypi.org/simple" }
···
sdist = { url = "https://files.pythonhosted.org/packages/a2/87/a6771e1546d97e7e041b6ae58d80074f81b7d5121207425c964ddf5cfdbd/sniffio-1.3.1.tar.gz", hash = "sha256:f4324edc670a0f49750a81b895f35c3adb843cca46f0530f79fc1babb23789dc", size = 20372, upload-time = "2024-02-25T23:20:04.057Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/e9/44/75a9c9421471a6c4805dbf2356f7c181a29c1879239abab1ea2cc8f38b40/sniffio-1.3.1-py3-none-any.whl", hash = "sha256:2f6da418d1f1e0fddd844478f41680e794e6051915791a034ff65e5f100525a2", size = 10235, upload-time = "2024-02-25T23:20:01.196Z" },
+
]
+
+
[[package]]
+
name = "soupsieve"
+
version = "2.7"
+
source = { registry = "https://pypi.org/simple" }
+
sdist = { url = "https://files.pythonhosted.org/packages/3f/f4/4a80cd6ef364b2e8b65b15816a843c0980f7a5a2b4dc701fc574952aa19f/soupsieve-2.7.tar.gz", hash = "sha256:ad282f9b6926286d2ead4750552c8a6142bc4c783fd66b0293547c8fe6ae126a", size = 103418, upload-time = "2025-04-20T18:50:08.518Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/e7/9c/0e6afc12c269578be5c0c1c9f4b49a8d32770a080260c333ac04cc1c832d/soupsieve-2.7-py3-none-any.whl", hash = "sha256:6e60cc5c1ffaf1cebcc12e8188320b72071e922c2e897f737cadce79ad5d30c4", size = 36677, upload-time = "2025-04-20T18:50:07.196Z" },
]
[[package]]
···
{ name = "feedparser" },
{ name = "gitpython" },
{ name = "httpx" },
+
{ name = "importlib-metadata" },
+
{ name = "markdownify" },
{ name = "pendulum" },
{ name = "platformdirs" },
{ name = "pydantic" },
···
{ name = "pyyaml" },
{ name = "rich" },
{ name = "typer" },
+
{ name = "typesense" },
+
{ name = "zulip" },
+
{ name = "zulip-bots" },
]
[package.optional-dependencies]
···
{ name = "pytest-cov" },
{ name = "ruff" },
{ name = "types-pyyaml" },
+
]
+
+
[package.dev-dependencies]
+
dev = [
+
{ name = "mypy" },
+
{ name = "pytest" },
]
[package.metadata]
···
{ name = "feedparser", specifier = ">=6.0.11" },
{ name = "gitpython", specifier = ">=3.1.40" },
{ name = "httpx", specifier = ">=0.28.0" },
+
{ name = "importlib-metadata", specifier = ">=8.7.0" },
+
{ name = "markdownify", specifier = ">=1.2.0" },
{ name = "mypy", marker = "extra == 'dev'", specifier = ">=1.13.0" },
{ name = "pendulum", specifier = ">=3.0.0" },
{ name = "platformdirs", specifier = ">=4.0.0" },
···
{ name = "ruff", marker = "extra == 'dev'", specifier = ">=0.8.0" },
{ name = "typer", specifier = ">=0.15.0" },
{ name = "types-pyyaml", marker = "extra == 'dev'", specifier = ">=6.0.0" },
+
{ name = "typesense", specifier = ">=1.1.1" },
+
{ name = "zulip", specifier = ">=0.9.0" },
+
{ name = "zulip-bots", specifier = ">=0.9.0" },
]
provides-extras = ["dev"]
+
+
[package.metadata.requires-dev]
+
dev = [
+
{ name = "mypy", specifier = ">=1.17.0" },
+
{ name = "pytest", specifier = ">=8.4.1" },
+
]
[[package]]
name = "tomli"
···
]
[[package]]
+
name = "typesense"
+
version = "1.1.1"
+
source = { registry = "https://pypi.org/simple" }
+
dependencies = [
+
{ name = "requests" },
+
]
+
sdist = { url = "https://files.pythonhosted.org/packages/9b/2c/6f012a17934d50f73d20f1138b3bc42cfb7ec465052bd8e56c0dcf8ce92d/typesense-1.1.1.tar.gz", hash = "sha256:876280e5f2bb8a4a24ae427863ee8216d2e9e76cfe96e0a87a379e66078dc591", size = 45214, upload-time = "2025-05-20T18:13:32.865Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/1b/8f/6306446e5ce28ddddd8babf407597b9afa3fff521794fe2dcfb16f12e16a/typesense-1.1.1-py3-none-any.whl", hash = "sha256:633aeb26c24e17be654ea22f20d3f76f87c804f259d0a560b7e0ae817f24077a", size = 70604, upload-time = "2025-05-20T18:13:30.975Z" },
+
]
+
+
[[package]]
name = "typing-extensions"
version = "4.14.1"
source = { registry = "https://pypi.org/simple" }
···
sdist = { url = "https://files.pythonhosted.org/packages/95/32/1a225d6164441be760d75c2c42e2780dc0873fe382da3e98a2e1e48361e5/tzdata-2025.2.tar.gz", hash = "sha256:b60a638fcc0daffadf82fe0f57e53d06bdec2f36c4df66280ae79bce6bd6f2b9", size = 196380, upload-time = "2025-03-23T13:54:43.652Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/5c/23/c7abc0ca0a1526a0774eca151daeb8de62ec457e77262b66b359c3c7679e/tzdata-2025.2-py2.py3-none-any.whl", hash = "sha256:1a403fada01ff9221ca8044d701868fa132215d84beb92242d9acd2147f667a8", size = 347839, upload-time = "2025-03-23T13:54:41.845Z" },
+
]
+
+
[[package]]
+
name = "urllib3"
+
version = "2.5.0"
+
source = { registry = "https://pypi.org/simple" }
+
sdist = { url = "https://files.pythonhosted.org/packages/15/22/9ee70a2574a4f4599c47dd506532914ce044817c7752a79b6a51286319bc/urllib3-2.5.0.tar.gz", hash = "sha256:3fc47733c7e419d4bc3f6b3dc2b4f890bb743906a30d56ba4a5bfa4bbff92760", size = 393185, upload-time = "2025-06-18T14:07:41.644Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/a7/c2/fe1e52489ae3122415c51f387e221dd0773709bad6c6cdaa599e8a2c5185/urllib3-2.5.0-py3-none-any.whl", hash = "sha256:e6b01673c0fa6a13e374b50871808eb3bf7046c4b125b216f6bf1cc604cff0dc", size = 129795, upload-time = "2025-06-18T14:07:40.39Z" },
]
[[package]]
···
wheels = [
{ url = "https://files.pythonhosted.org/packages/f4/24/2a3e3df732393fed8b3ebf2ec078f05546de641fe1b667ee316ec1dcf3b7/webencodings-0.5.1-py2.py3-none-any.whl", hash = "sha256:a0af1213f3c2226497a97e2b3aa01a7e4bee4f403f95be16fc9acd2947514a78", size = 11774, upload-time = "2017-04-05T20:21:32.581Z" },
]
+
+
[[package]]
+
name = "zipp"
+
version = "3.23.0"
+
source = { registry = "https://pypi.org/simple" }
+
sdist = { url = "https://files.pythonhosted.org/packages/e3/02/0f2892c661036d50ede074e376733dca2ae7c6eb617489437771209d4180/zipp-3.23.0.tar.gz", hash = "sha256:a07157588a12518c9d4034df3fbbee09c814741a33ff63c05fa29d26a2404166", size = 25547, upload-time = "2025-06-08T17:06:39.4Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/2e/54/647ade08bf0db230bfea292f893923872fd20be6ac6f53b2b936ba839d75/zipp-3.23.0-py3-none-any.whl", hash = "sha256:071652d6115ed432f5ce1d34c336c0adfd6a884660d1e9712a256d3d3bd4b14e", size = 10276, upload-time = "2025-06-08T17:06:38.034Z" },
+
]
+
+
[[package]]
+
name = "zulip"
+
version = "0.9.0"
+
source = { registry = "https://pypi.org/simple" }
+
dependencies = [
+
{ name = "click", version = "8.1.8", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version < '3.10'" },
+
{ name = "click", version = "8.2.1", source = { registry = "https://pypi.org/simple" }, marker = "python_full_version >= '3.10'" },
+
{ name = "distro" },
+
{ name = "requests" },
+
{ name = "typing-extensions" },
+
]
+
sdist = { url = "https://files.pythonhosted.org/packages/7e/85/754c025bf7e5ff2622b89c555ff3e1ecc3dd501874745a7ec2c3b59fc743/zulip-0.9.0.tar.gz", hash = "sha256:7a14149e5d9e3fcc53b13e998719fd1f6ccb8289bc60fccbaa1aafcd0a9d0843", size = 134624, upload-time = "2023-11-15T00:28:39.338Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/db/ed/81e42dbfe0dd538f60514d0e4849b872d949a1caa7a2c80bbe6aa4c1bae9/zulip-0.9.0-py3-none-any.whl", hash = "sha256:a315db3e990c6b94aef323540b7f386485e8fc359dbd26af526c20dbe9068217", size = 289297, upload-time = "2023-11-15T00:28:33.172Z" },
+
]
+
+
[[package]]
+
name = "zulip-bots"
+
version = "0.9.0"
+
source = { registry = "https://pypi.org/simple" }
+
dependencies = [
+
{ name = "beautifulsoup4" },
+
{ name = "html2text" },
+
{ name = "importlib-metadata", marker = "python_full_version < '3.10'" },
+
{ name = "lxml" },
+
{ name = "pip" },
+
{ name = "typing-extensions" },
+
{ name = "zulip" },
+
]
+
sdist = { url = "https://files.pythonhosted.org/packages/a5/39/6e60bea336fbfd4ad55dbdbb5fbd6d62dc32b08ad240688f119d145a29b3/zulip_bots-0.9.0.tar.gz", hash = "sha256:94925a4bd7c3558bf0e0cc3e83021d6a2f2139824745081abaa605a3d012e37a", size = 2268775, upload-time = "2023-11-15T00:28:36.507Z" }
+
wheels = [
+
{ url = "https://files.pythonhosted.org/packages/e6/c9/c242abc63de86d1a20b02e5d8e507c38d4889b9c01f663a5b80eb050effd/zulip_bots-0.9.0-py3-none-any.whl", hash = "sha256:1c46b011002fdf375f27fbf0c17394149e77ea36b33aa762b58368db14229e37", size = 2317628, upload-time = "2023-11-15T00:28:26.312Z" },
+
]