# Netdata Zulip Bot

*100% vibe coded, use at your peril*

A webhook service that receives notifications from Netdata Cloud and forwards them to Zulip channels. Features HTTPS with Let's Encrypt certificates and mutual TLS authentication for secure communication with Netdata Cloud.

## Features

- 🔐 **Automated SSL Certificates**: Built-in Let's Encrypt integration with automatic renewal
- 🤝 **Mutual TLS**: Secure authentication with Netdata Cloud
- 📊 **Rich Formatting**: Beautiful Zulip messages with emojis and markdown
- 🏷️ **Topic Organization**: Automatic topic routing by severity level
- 📝 **Structured Logging**: JSON-structured logs for monitoring
- ⚡ **High Performance**: FastAPI-based webhook endpoint
- 🚀 **Standalone**: No external dependencies like certbot required

## Quick Start

### 1. Install Dependencies

```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

### 2. Create Configuration

```bash
# Generate sample configuration files
netdata-zulip-bot --create-config

# Copy and customize
cp .zuliprc.sample ~/.zuliprc
```

### 3. Configure Zulip Settings

Edit `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

### 4. Set Server Environment Variables

```bash
export SERVER_DOMAIN=your-webhook-domain.com
export SERVER_PORT=8443
export SERVER_ENABLE_MTLS=true

# For automated SSL certificates (recommended)
export SERVER_AUTO_CERT=true
export SERVER_CERT_EMAIL=admin@example.com

# Set to true to use the Let's Encrypt staging server while testing (optional)
export SERVER_CERT_STAGING=false
```

### 5. Run the Service

```bash
# With automated SSL certificates
netdata-zulip-bot

# The bot will automatically:
# 1. Obtain SSL certificates from Let's Encrypt
# 2. Start the HTTPS server
# 3. Renew certificates before expiration
```

## Configuration

### Zulip Configuration

The bot supports two configuration methods:

#### Method 1: Zuliprc File (Recommended)

Create `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

#### Method 2: Environment Variables

```bash
export ZULIP_SITE=https://yourorg.zulipchat.com
export ZULIP_EMAIL=netdata-bot@yourorg.zulipchat.com
export ZULIP_API_KEY=your-api-key
export ZULIP_STREAM=netdata-alerts
```

Pass the `--env-config` flag to read these environment variables instead of the zuliprc file.

### Server Configuration

Set these environment variables:

- `SERVER_DOMAIN`: Your public domain (required)
- `SERVER_HOST`: Bind address (default: `0.0.0.0`)
- `SERVER_PORT`: HTTPS port (default: `8443`)
- `SERVER_ENABLE_MTLS`: Enable mutual TLS (default: `true`)

#### Automated SSL Configuration (Recommended)

- `SERVER_AUTO_CERT`: Enable automatic certificate management (default: `false`)
- `SERVER_CERT_EMAIL`: Email for Let's Encrypt account (required when `SERVER_AUTO_CERT` is `true`)
- `SERVER_CERT_PATH`: Directory for storing certificates (default: `./certs`)
- `SERVER_CERT_STAGING`: Use Let's Encrypt staging server for testing (default: `false`)
- `SERVER_ACME_PORT`: Port for ACME HTTP-01 challenge (default: `80`)

#### Manual SSL Configuration

If not using automated certificates:

- `SERVER_CERT_PATH`: Path to certificate directory
- Place `fullchain.pem` and `privkey.pem` in `{SERVER_CERT_PATH}/{SERVER_DOMAIN}/`

## Message Format

### Alert Notifications

Messages are posted to topics based on severity level:

- **Topic**: `critical`, `warning`, or `clear`
- **Format**: Rich markdown with alert details, timestamps, and links

Example:

```
🔴 **High CPU Usage**

**Space:** production
**Chart:** system.cpu
**Context:** cpu utilization
**Severity:** Critical
**Time:** 2024-01-15 14:30:00 UTC

**Details:** CPU usage has exceeded 90% for 5 minutes

**Summary:** Critical alert: High CPU usage detected

[View Alert](https://app.netdata.cloud/spaces/...)
```

### Reachability Notifications

Messages are posted to the `reachability` topic:

```
❌ **Host Unreachable**

**Host:** web-server-01
**Status:** ❌ Unreachable
**Severity:** Critical

**Summary:** Host web-server-01 is no longer reachable

[View Host](https://app.netdata.cloud/...)
```

## Deployment

### Systemd Service

Create `/etc/systemd/system/netdata-zulip-bot.service`:

```ini
[Unit]
Description=Netdata Zulip Bot
After=network.target

[Service]
Type=simple
User=netdata-bot
WorkingDirectory=/opt/netdata-zulip-bot
Environment=SERVER_DOMAIN=your-domain.com
ExecStart=/opt/netdata-zulip-bot/venv/bin/netdata-zulip-bot
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start:

```bash
sudo systemctl enable netdata-zulip-bot
sudo systemctl start netdata-zulip-bot
```

### Docker

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install -e .

EXPOSE 8443
CMD ["netdata-zulip-bot"]
```

## Security

### SSL Certificate Management

The bot includes fully automated SSL certificate management:

1. **Automatic Provisioning**: Obtains certificates from Let's Encrypt on first run
2. **Automatic Renewal**: Checks daily and renews certificates 30 days before expiration
3. **Zero Downtime**: Certificate renewal happens in the background
4. **ACME HTTP-01 Challenge**: Built-in challenge server (requires port 80 access)

### Mutual TLS Authentication

The service supports mutual TLS to authenticate Netdata Cloud webhooks:

1. **Server Certificate**: Automatically managed via built-in ACME client
2. **Client Verification**: Validates Netdata's client certificate
3. **CA Certificate**: Built-in Netdata CA certificate for client validation

### Webhook Endpoint Security

- HTTPS-only communication
- Request logging and monitoring
- Payload validation and sanitization
- Error handling without information disclosure

## Monitoring

The service provides structured JSON logging for easy monitoring:

```json
{
  "timestamp": "2024-01-15T14:30:00.000Z",
  "level": "info",
  "event": "Message sent to Zulip",
  "stream": "netdata-alerts",
  "topic": "critical",
  "message_id": 12345
}
```

### Health Check

```bash
curl -k https://your-domain.com:8443/health
```

Response:

```json
{
  "status": "healthy",
  "service": "netdata-zulip-bot"
}
```

## Development

### Running Tests

```bash
pytest
```

### Code Formatting

```bash
black .
ruff check .
```

### Local Development

For development, you can disable HTTPS and mTLS:

```bash
export SERVER_ENABLE_MTLS=false
# Use HTTP for testing (not recommended for production)
```

## Troubleshooting

### Common Issues

1. **Certificate Issues**
   - For automated certs: Ensure port 80 is accessible for ACME challenges
   - Domain must point to your server's IP address
   - Check that `SERVER_CERT_EMAIL` is set for auto-cert mode
   - Use `SERVER_CERT_STAGING=true` for testing to avoid rate limits

2. **Zulip Connection Failed**
   - Verify API credentials in zuliprc
   - Test connection with Zulip's API

3. **Webhook Not Receiving Data**
   - Check firewall settings for port 8443
   - Verify domain DNS resolution
   - Check Netdata Cloud webhook configuration

### Logs

View service logs:

```bash
sudo journalctl -u netdata-zulip-bot -f
```

## License

MIT License - see LICENSE file for details.
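
## Appendix: Topic Routing Sketch

The routing rule described under Message Format (alerts go to the `critical`, `warning`, or `clear` topic; reachability events go to the `reachability` topic) can be illustrated with a small function. This is only a minimal sketch and not the bot's actual code: the function name `topic_for`, the `kind` argument, and the constants are hypothetical names chosen for illustration, and the real implementation may key off different payload fields.

```python
# Hypothetical helper illustrating the README's routing rule.
# None of these names come from the bot's source code.

ALERT_TOPICS = ("critical", "warning", "clear")
REACHABILITY_TOPIC = "reachability"


def topic_for(kind: str, severity: str = "") -> str:
    """Return the Zulip topic for a notification.

    kind     -- "alert" or "reachability" (assumed notification types)
    severity -- alert severity; compared case-insensitively
    """
    if kind == "reachability":
        # All reachability events share a single topic.
        return REACHABILITY_TOPIC
    if kind == "alert":
        key = severity.strip().lower()
        if key in ALERT_TOPICS:
            # Alert topics are named after the severity itself.
            return key
        raise ValueError(f"unknown alert severity: {severity!r}")
    raise ValueError(f"unknown notification kind: {kind!r}")
```

Rejecting unknown severities with an exception, rather than falling back to a default topic, is a design choice of this sketch; a production service might prefer to log the event and route it to a catch-all topic instead.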