Netdata.cloud bot for Zulip
# Netdata Zulip Bot

*100% vibe coded, use at your peril*

A webhook service that receives notifications from Netdata Cloud and forwards them to Zulip channels. Features HTTPS with Let's Encrypt certificates and mutual TLS authentication for secure communication with Netdata Cloud.

## Features

- 🔐 **Automated SSL Certificates**: Built-in Let's Encrypt integration with automatic renewal
- 🤝 **Mutual TLS**: Secure authentication with Netdata Cloud
- 📊 **Rich Formatting**: Beautiful Zulip messages with emojis and markdown
- 🏷️ **Topic Organization**: Automatic topic routing by severity level
- 📝 **Structured Logging**: JSON-structured logs for monitoring
- **High Performance**: FastAPI-based webhook endpoint
- 🚀 **Standalone**: No external dependencies like certbot required

## Quick Start

### 1. Install Dependencies

```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

### 2. Create Configuration

```bash
# Generate sample configuration files
netdata-zulip-bot --create-config

# Copy and customize
cp .zuliprc.sample ~/.zuliprc
```

### 3. Configure Zulip Settings

Edit `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

### 4. Set Server Environment Variables

```bash
export SERVER_DOMAIN=your-webhook-domain.com
export SERVER_PORT=8443
export SERVER_ENABLE_MTLS=true

# For automated SSL certificates (recommended)
export SERVER_AUTO_CERT=true
export SERVER_CERT_EMAIL=admin@example.com
# Use staging for testing (optional)
export SERVER_CERT_STAGING=false
```

### 5. Run the Service

```bash
# With automated SSL certificates
netdata-zulip-bot

# The bot will automatically:
# 1. Obtain SSL certificates from Let's Encrypt
# 2. Start the HTTPS server
# 3. Renew certificates before expiration
```

## Configuration

### Zulip Configuration

The bot supports two configuration methods:

#### Method 1: Zuliprc File (Recommended)

Create `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

#### Method 2: Environment Variables

```bash
export ZULIP_SITE=https://yourorg.zulipchat.com
export ZULIP_EMAIL=netdata-bot@yourorg.zulipchat.com
export ZULIP_API_KEY=your-api-key
export ZULIP_STREAM=netdata-alerts
```

Use `--env-config` flag to use environment variables instead of zuliprc.

### Server Configuration

Set these environment variables:

- `SERVER_DOMAIN`: Your public domain (required)
- `SERVER_HOST`: Bind address (default: `0.0.0.0`)
- `SERVER_PORT`: HTTPS port (default: `8443`)
- `SERVER_ENABLE_MTLS`: Enable mutual TLS (default: `true`)

#### Automated SSL Configuration (Recommended)

- `SERVER_AUTO_CERT`: Enable automatic certificate management (default: `false`)
- `SERVER_CERT_EMAIL`: Email for Let's Encrypt account (required when auto_cert is true)
- `SERVER_CERT_PATH`: Directory for storing certificates (default: `./certs`)
- `SERVER_CERT_STAGING`: Use Let's Encrypt staging server for testing (default: `false`)
- `SERVER_ACME_PORT`: Port for ACME HTTP-01 challenge (default: `80`)

#### Manual SSL Configuration

If not using automated certificates:
- `SERVER_CERT_PATH`: Path to certificate directory
- Place `fullchain.pem` and `privkey.pem` in `{SERVER_CERT_PATH}/{SERVER_DOMAIN}/`

## Message Format

### Alert Notifications

Messages are posted to topics based on severity level:

- **Topic**: `critical`, `warning`, or `clear`
- **Format**: Rich markdown with alert details, timestamps, and links

Example:
```
🔴 **High CPU Usage**

**Space:** production
**Chart:** system.cpu
**Context:** cpu utilization
**Severity:** Critical
**Time:** 2024-01-15 14:30:00 UTC

**Details:** CPU usage has exceeded 90% for 5 minutes
**Summary:** Critical alert: High CPU usage detected

[View Alert](https://app.netdata.cloud/spaces/...)
```

### Reachability Notifications

Messages are posted to the `reachability` topic:

```
❌ **Host Unreachable**

**Host:** web-server-01
**Status:** ❌ Unreachable
**Severity:** Critical

**Summary:** Host web-server-01 is no longer reachable

[View Host](https://app.netdata.cloud/...)
```

## Deployment

### Systemd Service

Create `/etc/systemd/system/netdata-zulip-bot.service`:

```ini
[Unit]
Description=Netdata Zulip Bot
After=network.target

[Service]
Type=simple
User=netdata-bot
WorkingDirectory=/opt/netdata-zulip-bot
Environment=SERVER_DOMAIN=your-domain.com
ExecStart=/opt/netdata-zulip-bot/venv/bin/netdata-zulip-bot
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start:
```bash
sudo systemctl enable netdata-zulip-bot
sudo systemctl start netdata-zulip-bot
```

### Docker

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install -e .

EXPOSE 8443

CMD ["netdata-zulip-bot"]
```

## Security

### SSL Certificate Management

The bot includes fully automated SSL certificate management:

1. **Automatic Provisioning**: Obtains certificates from Let's Encrypt on first run
2. **Automatic Renewal**: Checks daily and renews certificates 30 days before expiration
3. **Zero Downtime**: Certificate renewal happens in the background
4. **ACME HTTP-01 Challenge**: Built-in challenge server (requires port 80 access)

### Mutual TLS Authentication

The service supports mutual TLS to authenticate Netdata Cloud webhooks:

1. **Server Certificate**: Automatically managed via built-in ACME client
2. **Client Verification**: Validates Netdata's client certificate
3. **CA Certificate**: Built-in Netdata CA certificate for client validation

### Webhook Endpoint Security

- HTTPS-only communication
- Request logging and monitoring
- Payload validation and sanitization
- Error handling without information disclosure

## Monitoring

The service provides structured JSON logging for easy monitoring:

```json
{
  "timestamp": "2024-01-15T14:30:00.000Z",
  "level": "info",
  "event": "Message sent to Zulip",
  "stream": "netdata-alerts",
  "topic": "critical",
  "message_id": 12345
}
```

### Health Check

```bash
curl -k https://your-domain.com:8443/health
```

Response:
```json
{
  "status": "healthy",
  "service": "netdata-zulip-bot"
}
```

## Development

### Running Tests

```bash
pytest
```

### Code Formatting

```bash
black .
ruff check .
```

### Local Development

For development, you can disable HTTPS and mTLS:

```bash
export SERVER_ENABLE_MTLS=false
# Use HTTP for testing (not recommended for production)
```

## Troubleshooting

### Common Issues

1. **Certificate Issues**
   - For automated certs: Ensure port 80 is accessible for ACME challenges
   - Domain must point to your server's IP address
   - Check `SERVER_CERT_EMAIL` is set for auto-cert mode
   - Use `SERVER_CERT_STAGING=true` for testing to avoid rate limits

2. **Zulip Connection Failed**
   - Verify API credentials in zuliprc
   - Test connection with Zulip's API

3. **Webhook Not Receiving Data**
   - Check firewall settings for port 8443
   - Verify domain DNS resolution
   - Check Netdata Cloud webhook configuration

### Logs

View service logs:
```bash
sudo journalctl -u netdata-zulip-bot -f
```

## License

MIT License - see LICENSE file for details.
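
## Appendix: Topic Routing Sketch

The Message Format section says alerts are posted to a `critical`, `warning`, or `clear` topic with a markdown body. The snippet below is a minimal sketch of how that routing could look, not the bot's actual implementation. The `Alert` dataclass, its field names, the `route_alert` function, and the emojis for `warning` and `clear` are all assumptions made for illustration; only the topic names, the 🔴 marker, and the body layout come from the example above.

```python
from dataclasses import dataclass

# Topic names come from the README; the warning/clear emojis are assumed.
SEVERITY_EMOJI = {
    "critical": "🔴",
    "warning": "🟡",  # assumption
    "clear": "🟢",    # assumption
}


@dataclass
class Alert:
    """Hypothetical alert model; the real payload fields may differ."""
    name: str
    space: str
    chart: str
    severity: str
    details: str


def route_alert(alert: Alert) -> tuple[str, str]:
    """Return (topic, markdown_body) for an alert.

    The topic is the lowercased severity, matching the
    critical / warning / clear topics described in the README.
    """
    topic = alert.severity.lower()
    if topic not in SEVERITY_EMOJI:
        raise ValueError(f"unknown severity: {alert.severity!r}")

    body = "\n".join(
        [
            f"{SEVERITY_EMOJI[topic]} **{alert.name}**",
            "",
            f"**Space:** {alert.space}",
            f"**Chart:** {alert.chart}",
            f"**Severity:** {alert.severity.capitalize()}",
            "",
            f"**Details:** {alert.details}",
        ]
    )
    return topic, body


if __name__ == "__main__":
    example = Alert(
        name="High CPU Usage",
        space="production",
        chart="system.cpu",
        severity="critical",
        details="CPU usage has exceeded 90% for 5 minutes",
    )
    topic, body = route_alert(example)
    print(f"topic: {topic}")
    print(body)
```

Rejecting unknown severities with a `ValueError` is one possible design choice; a real service might instead fall back to a default topic so that no notification is dropped.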