Netdata.cloud bot for Zulip
# Netdata Zulip Bot

A production-ready webhook service that receives notifications from Netdata Cloud and forwards them to Zulip channels. Designed to run behind a reverse proxy (like Caddy) that handles HTTPS and mutual TLS authentication.

## Features

- 🔗 **Reverse Proxy Ready**: HTTP service designed to run behind Caddy/nginx
- 🤝 **Mutual TLS Support**: When configured with reverse proxy
- 📊 **Rich Formatting**: Beautiful Zulip messages with emojis and markdown
- 🏷️ **Topic Organization**: Automatic topic routing by severity level
- 📝 **Structured Logging**: JSON-structured logs for monitoring
- **High Performance**: FastAPI-based webhook endpoint
- 🔧 **Flexible Configuration**: Support for .zuliprc files or environment variables
- **Webhook Verification**: Built-in Netdata challenge/response handling

## Quick Start

### 1. Install Dependencies

```bash
# Using uv (recommended)
uv sync
```

### 2. Create Configuration

```bash
# Generate sample configuration files
uv run netdata-zulip-bot --create-config

# Copy and customize
cp .zuliprc.sample ~/.zuliprc
cp .env.sample .env
```

### 3. Configure Zulip Settings

Edit `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

### 4. Configure Environment Variables

Edit `.env` file or set environment variables:

```bash
# Server configuration (HTTP only)
export SERVER_HOST=0.0.0.0
export SERVER_PORT=8080

# Required: Netdata webhook challenge secret
export SERVER_CHALLENGE_SECRET=your-challenge-secret-here

# Optional: Override Zulip stream
export ZULIP_STREAM=netdata-alerts
```

### 5. Run the Service

```bash
# Start the HTTP service
uv run netdata-zulip-bot

# Or with custom configuration
uv run netdata-zulip-bot --zuliprc /path/to/.zuliprc

# The service runs on HTTP (default: localhost:8080)
# Use a reverse proxy like Caddy for HTTPS and mutual TLS
```

## Configuration

### Zulip Configuration

The bot supports two configuration methods:

#### Method 1: Zuliprc File (Recommended)

Create `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

#### Method 2: Environment Variables

```bash
export ZULIP_SITE=https://yourorg.zulipchat.com
export ZULIP_EMAIL=netdata-bot@yourorg.zulipchat.com
export ZULIP_API_KEY=your-api-key
export ZULIP_STREAM=netdata-alerts
```

Use the `--env-config` flag to use environment variables instead of zuliprc:

```bash
uv run netdata-zulip-bot --env-config
```

### Server Configuration

Set these environment variables:

- `SERVER_HOST`: Bind address (default: `0.0.0.0`)
- `SERVER_PORT`: HTTP port (default: `8080`)
- `SERVER_CHALLENGE_SECRET`: Netdata webhook challenge secret (required)

### Reverse Proxy Setup

The bot is designed to run behind a reverse proxy that handles HTTPS and mutual TLS:

#### Using Caddy (Recommended)

1. Update `Caddyfile` with your domain name
2. Place Netdata CA certificate in `netdata-ca.pem`
3. Run both services:

```bash
# Start the bot
uv run netdata-zulip-bot &

# Start Caddy
caddy run --config Caddyfile
```

#### Using Docker Compose

```bash
docker-compose up -d
```

## Message Format

### Alert Notifications

Messages are posted to topics based on severity level:

- **Topic**: `critical`, `warning`, or `clear`
- **Format**: Rich markdown with alert details, timestamps, and links

Example:
```
🔴 **High CPU Usage**

**Space:** production
**Chart:** system.cpu
**Context:** cpu utilization
**Severity:** Critical
**Time:** 2024-01-15 14:30:00 UTC

**Details:** CPU usage has exceeded 90% for 5 minutes

**Summary:** Critical alert: High CPU usage detected

[View Alert](https://app.netdata.cloud/spaces/...)
```

### Reachability Notifications

Messages are posted to the `reachability` topic:

```
❌ **Host Unreachable**

**Host:** web-server-01
**Status:** ❌ Unreachable
**Severity:** Critical

**Summary:** Host web-server-01 is no longer reachable

[View Host](https://app.netdata.cloud/...)
```

## Deployment

### Systemd Service

See `examples/netdata-zulip-bot.service` for a complete systemd service configuration.

### Automated Setup

Use the provided setup script:

```bash
sudo ./scripts/setup.sh --domain your-domain.com --email admin@example.com
```

### Docker

The included `Dockerfile` and `docker-compose.yml` provide a complete setup with Caddy reverse proxy:

```bash
docker-compose up -d
```

## Security

### Architecture

The bot uses a security-focused architecture:

1. **HTTP Backend**: Simple HTTP service with no direct internet exposure
2. **Reverse Proxy**: Caddy handles HTTPS, certificates, and client authentication
3. **Mutual TLS**: Client certificate validation at the reverse proxy level

### Webhook Security

- **Challenge/Response**: Built-in Netdata webhook verification using HMAC-SHA256
- **Payload Validation**: Strict payload parsing and validation
- **Request Logging**: Comprehensive logging of all webhook requests
- **Error Handling**: Secure error responses without information disclosure

### SSL Certificate Management

SSL certificates are managed by the reverse proxy (Caddy):

1. **Automatic Provisioning**: Caddy obtains Let's Encrypt certificates
2. **Automatic Renewal**: Built-in certificate renewal
3. **Mutual TLS**: Client certificate validation using Netdata CA certificate

## Monitoring

The service provides structured JSON logging for easy monitoring:

```json
{
  "timestamp": "2024-01-15T14:30:00.000Z",
  "level": "info",
  "event": "Message sent to Zulip",
  "stream": "netdata-alerts",
  "topic": "critical",
  "message_id": 12345
}
```

### Health Check

```bash
# Direct HTTP check (backend service)
curl http://localhost:8080/health

# Through reverse proxy
curl https://your-domain.com/health
```

Response:
```json
{
  "status": "healthy",
  "service": "netdata-zulip-bot"
}
```

## Development

### Running Tests

```bash
uv run python -m pytest tests/ -v
```

### Code Formatting

```bash
uv run black .
uv run ruff check .
```

### Local Development

For development, you can run the HTTP service directly:

```bash
# Set required environment variables
export SERVER_CHALLENGE_SECRET=test-secret

# Run the service
uv run netdata-zulip-bot

# Test webhook endpoint
curl -X POST http://localhost:8080/webhook/netdata?crc_token=test123
```

## Troubleshooting

### Common Issues

1. **Configuration Issues**
   - Ensure `SERVER_CHALLENGE_SECRET` is set (required for Netdata webhook verification)
   - Verify `.zuliprc` file contains all required fields
   - Check that Zulip bot has permission to post to the configured stream

2. **Reverse Proxy Issues**
   - Ensure Caddy configuration uses correct domain name
   - Verify Netdata CA certificate is properly configured
   - Check that port 80 is accessible for Let's Encrypt challenges

3. **Webhook Not Receiving Data**
   - Verify Netdata Cloud webhook URL points to your reverse proxy
   - Check webhook challenge secret matches configuration
   - Review service logs for error messages

### Logs

View service logs:
```bash
sudo journalctl -u netdata-zulip-bot -f
```

## License

MIT License - see LICENSE file for details.
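
## Appendix: Challenge Response Sketch

The Webhook Security section mentions challenge/response verification using HMAC-SHA256. The sketch below shows one way such a response could be computed from the `crc_token` query parameter and `SERVER_CHALLENGE_SECRET`. It is an illustration only: the function name `build_challenge_response` is hypothetical, and the `response_token` field with its `sha256=` prefix and base64 encoding are assumptions based on the scheme described in Netdata's webhook documentation, not taken from this bot's source code.

```python
import base64
import hashlib
import hmac


def build_challenge_response(secret: str, crc_token: str) -> dict:
    """Return a challenge response for the given token (hypothetical helper).

    Assumption: the expected reply is a JSON object whose `response_token`
    is "sha256=" followed by the base64-encoded HMAC-SHA256 of the token,
    keyed with the shared challenge secret.
    """
    digest = hmac.new(
        secret.encode("utf-8"),
        msg=crc_token.encode("utf-8"),
        digestmod=hashlib.sha256,
    ).digest()
    token = base64.b64encode(digest).decode("ascii")
    return {"response_token": "sha256=" + token}


if __name__ == "__main__":
    # Values mirror the Local Development example above.
    print(build_challenge_response("test-secret", "test123"))
```

If the `response_token` returned by the running service for the same secret and token differs from this value, the service likely uses a different encoding, so treat the service's behavior as authoritative.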