# Netdata Zulip Bot - Development Instructions

This repository implements a production-ready Zulip bot that receives incoming webhook notifications from Netdata Cloud and posts the resulting notifications to a Zulip topic.

## Core Requirements

### Technology Stack

- **Language**: Python with `uv` package manager
- **Framework**: Python `zulip_bots` PyPI package for Zulip integration
- **Web Server**: FastAPI for webhook endpoint
- **Reverse Proxy**: Caddy for HTTPS and mutual TLS handling
- **Deployment**: Standalone service behind reverse proxy

### Netdata Integration

- **Webhook Format**: Follow the Netdata Cloud webhook notification format from:
  - URL: `https://raw.githubusercontent.com/netdata/netdata/refs/heads/master/integrations/cloud-notifications/metadata.yaml`
  - Section: `id: notify-cloud-webhook`
- **Notification Types**: Handle both alert notifications and reachability notifications

### Zulip Configuration

- **Configuration Source**: The `.zuliprc` file should contain:
  - Zulip site URL
  - Bot email and API key
  - Target stream/channel for posting messages
- **Topic Organization**: Messages should be posted to topics based on severity:
  - `critical`, `warning`, `clear` for alerts
  - `reachability` for host status changes
- **Message Format**: Rich markdown with:
  - Alert details and timestamps
  - Markdown-formatted alert URLs for easy access to Netdata Cloud

### Security Requirements

- **HTTP Only**: The bot service listens on HTTP internally
- **Reverse Proxy**: Caddy handles HTTPS with Let's Encrypt certificates
- **Mutual TLS**: Caddy validates Netdata's client certificates
  - Client certificate validation at the reverse proxy level
  - Netdata CA certificate configured in Caddyfile

### Service Architecture

- **Backend Service**: FastAPI bot listening on HTTP (port 8080)
- **Reverse Proxy**: Caddy handling HTTPS, Let's Encrypt, and mutual TLS
- **Webhook Endpoint**: `/webhook/netdata` for receiving notifications
- **Health Check**: `/health` endpoint for monitoring
- **Structured Logging**: JSON-structured logs for production monitoring

## Implementation Notes

### Configuration Management

- Support both `.zuliprc` file and environment variables
- Provide sample configuration files with `--create-config` flag
- Server configuration via environment variables:
  - `SERVER_HOST`: Bind address (default: 0.0.0.0)
  - `SERVER_PORT`: HTTP port (default: 8080)
- Reverse proxy configuration in `Caddyfile`

### Message Processing

1. Receive Netdata webhook POST request
2. Validate and parse JSON payload
3. Determine notification type (alert vs reachability)
4. Format message with appropriate emoji and markdown
5. Send to configured Zulip stream/topic

### Error Handling

- Validate all incoming payloads
- Log errors without exposing internal details
- Return appropriate HTTP status codes
- Implement retry logic for Zulip API failures

## Testing

Run tests with:

```bash
uv run python -m pytest tests/ -v
```

## Deployment

The service should be deployable via:

- Systemd service (see `examples/netdata-zulip-bot.service`)
- Docker container (see `Dockerfile` and `docker-compose.yml`)
- Automated setup script (`scripts/setup.sh`)
- Caddy reverse proxy configuration (`Caddyfile`)

### Reverse Proxy Setup

1. Install Caddy on your server
2. Update `Caddyfile` with your domain name
3. Place the Netdata CA certificate in `netdata-ca.pem`
4. Start both the bot service and Caddy

## Development Commands

- **Install dependencies**: `uv sync`
- **Create config samples**: `uv run netdata-zulip-bot --create-config`
- **Run tests**: `uv run python -m pytest tests/`
- **Start service**: `uv run netdata-zulip-bot`

## Important Reminders

- Always validate Netdata webhook payloads before processing
- Ensure Caddy and reverse proxy are properly configured before production deployment
- Test mutual TLS authentication with actual Netdata Cloud webhooks
- Monitor service logs for webhook processing errors
- Keep Zulip API credentials secure and never commit them to the repository
- Update the Netdata CA certificate in `netdata-ca.pem` as needed
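
To make the Message Processing steps and the severity-based topic organization more concrete, here is a minimal sketch of steps 2 to 4 (validate, classify, format). It is not the repository's implementation. The function names `classify` and `format_message` are hypothetical, and the payload field names (`severity`, `status.reachable`, `alarm`, `date`, `alarm_url`, `host`, `url`) are assumptions for illustration; check them against the `notify-cloud-webhook` section of the Netdata metadata file before relying on them.

```python
from typing import Any

# Topics named in the Zulip configuration section of this document.
ALERT_TOPICS = {"critical", "warning", "clear"}

# Illustrative emoji per topic; the choice of symbols is an assumption.
ICONS = {
    "critical": "🔴",
    "warning": "⚠️",
    "clear": "✅",
    "reachability": "📡",
}


def classify(payload: Any) -> str:
    """Return the target topic: a severity name or 'reachability'."""
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    # Assumed shape: reachability notifications carry a "status" object.
    if isinstance(payload.get("status"), dict):
        return "reachability"
    severity = str(payload.get("severity", "")).lower()
    if severity in ALERT_TOPICS:
        return severity
    raise ValueError(f"unrecognized notification (severity={severity!r})")


def format_message(payload: Any) -> tuple[str, str]:
    """Validate a payload and return (topic, markdown content)."""
    topic = classify(payload)
    icon = ICONS[topic]
    if topic == "reachability":
        host = payload.get("host", "unknown host")
        state = "reachable" if payload["status"].get("reachable") else "unreachable"
        lines = [f"{icon} **{host}** is {state}"]
        url = payload.get("url")
    else:
        alarm = payload.get("alarm", "unknown alert")
        lines = [f"{icon} **{topic.upper()}**: {alarm}"]
        if payload.get("date"):
            lines.append(f"- Time: {payload['date']}")
        url = payload.get("alarm_url")
    if url:
        # Markdown link for quick access to Netdata Cloud.
        lines.append(f"[View in Netdata Cloud]({url})")
    return topic, "\n".join(lines)
```

The returned pair maps directly onto step 5: the topic selects the Zulip topic and the content becomes the message body. A `ValueError` raised here would typically be translated by the webhook handler into a 4xx response, in line with the Error Handling notes.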