Netdata.cloud bot for Zulip

update for release

+1 -1
CLAUDE.md
···
# Netdata Zulip Bot - Development Instructions
-
This repository implements a Zulip bot that receives incoming webhook notifications from Netdata Cloud and posts the resulting notifications to a Zulip topic.
+
This repository implements a production-ready Zulip bot that receives incoming webhook notifications from Netdata Cloud and posts the resulting notifications to a Zulip topic.
## Core Requirements
+21
LICENSE.md
···
+
MIT License
+
+
Copyright (c) 2025 Anil Madhavapeddy
+
+
Permission is hereby granted, free of charge, to any person obtaining a copy
+
of this software and associated documentation files (the "Software"), to deal
+
in the Software without restriction, including without limitation the rights
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+
copies of the Software, and to permit persons to whom the Software is
+
furnished to do so, subject to the following conditions:
+
+
The above copyright notice and this permission notice shall be included in all
+
copies or substantial portions of the Software.
+
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+
SOFTWARE.
+104 -103
README.md
···
# Netdata Zulip Bot
-
*100% vibe coded, use at your peril*
-
-
A webhook service that receives notifications from Netdata Cloud and forwards them to Zulip channels. Features HTTPS with Let's Encrypt certificates and mutual TLS authentication for secure communication with Netdata Cloud.
+
A production-ready webhook service that receives notifications from Netdata Cloud and forwards them to Zulip channels. Designed to run behind a reverse proxy (like Caddy) that handles HTTPS and mutual TLS authentication.
## Features
-
- 🔐 **Automated SSL Certificates**: Built-in Let's Encrypt integration with automatic renewal
-
- 🤝 **Mutual TLS**: Secure authentication with Netdata Cloud
+
- 🔗 **Reverse Proxy Ready**: HTTP service designed to run behind Caddy/nginx
+
- 🤝 **Mutual TLS Support**: Client certificate verification when configured at the reverse proxy
- 📊 **Rich Formatting**: Beautiful Zulip messages with emojis and markdown
- 🏷️ **Topic Organization**: Automatic topic routing by severity level
- 📝 **Structured Logging**: JSON-structured logs for monitoring
- ⚡ **High Performance**: FastAPI-based webhook endpoint
-
- 🚀 **Standalone**: No external dependencies like certbot required
+
- 🔧 **Flexible Configuration**: Support for .zuliprc files or environment variables
+
- ✅ **Webhook Verification**: Built-in Netdata challenge/response handling
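
As a rough illustration of the topic routing and formatting features listed above, here is a minimal sketch rather than the bot's actual code. The function name `build_zulip_request`, the severity names, and the emoji mapping are assumptions made for this example; the returned dict follows the request shape accepted by `zulip.Client.send_message`.

```python
# Hypothetical sketch: severity names, emoji, and topic naming are assumptions.
SEVERITY_EMOJI = {"critical": "🔴", "warning": "⚠️", "clear": "✅"}


def build_zulip_request(stream: str, severity: str, title: str, details: str) -> dict:
    """Build a stream message whose topic is the (lowercased) alert severity."""
    severity = severity.lower()
    emoji = SEVERITY_EMOJI.get(severity, "ℹ️")  # fallback for unknown severities
    content = f"{emoji} **{title}**\n**Details:** {details}"
    return {"type": "stream", "to": stream, "topic": severity, "content": content}


request = build_zulip_request(
    "netdata-alerts", "Critical", "High CPU usage", "CPU above 90%"
)
print(request["topic"])
```

Routing by severity keeps each alert level in its own Zulip topic, so critical alerts can be followed or muted independently of warnings.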
## Quick Start
···
```bash
# Using uv (recommended)
uv sync
-
-
# Or using pip
-
pip install -e .
```
### 2. Create Configuration
```bash
# Generate sample configuration files
-
netdata-zulip-bot --create-config
+
uv run netdata-zulip-bot --create-config
# Copy and customize
cp .zuliprc.sample ~/.zuliprc
+
cp .env.sample .env
```
### 3. Configure Zulip Settings
···
stream=netdata-alerts
```
-
### 4. Set Server Environment Variables
+
### 4. Configure Environment Variables
+
+
Edit the `.env` file or set environment variables:
```bash
-
export SERVER_DOMAIN=your-webhook-domain.com
-
export SERVER_PORT=8443
-
export SERVER_ENABLE_MTLS=true
+
# Server configuration (HTTP only)
+
export SERVER_HOST=0.0.0.0
+
export SERVER_PORT=8080
-
# For automated SSL certificates (recommended)
-
export SERVER_AUTO_CERT=true
-
export SERVER_CERT_EMAIL=admin@example.com
-
# Use staging for testing (optional)
-
export SERVER_CERT_STAGING=false
+
# Required: Netdata webhook challenge secret
+
export SERVER_CHALLENGE_SECRET=your-challenge-secret-here
+
+
# Optional: Override Zulip stream
+
export ZULIP_STREAM=netdata-alerts
```
### 5. Run the Service
```bash
-
# With automated SSL certificates
-
netdata-zulip-bot
+
# Start the HTTP service
+
uv run netdata-zulip-bot
-
# The bot will automatically:
-
# 1. Obtain SSL certificates from Let's Encrypt
-
# 2. Start the HTTPS server
-
# 3. Renew certificates before expiration
+
# Or with custom configuration
+
uv run netdata-zulip-bot --zuliprc /path/to/.zuliprc
+
+
# The service listens on plain HTTP (default port: 8080)
+
# Use a reverse proxy like Caddy for HTTPS and mutual TLS
```
## Configuration
···
export ZULIP_STREAM=netdata-alerts
```
-
Use `--env-config` flag to use environment variables instead of zuliprc.
+
Use the `--env-config` flag to use environment variables instead of zuliprc:
+
+
```bash
+
uv run netdata-zulip-bot --env-config
+
```
### Server Configuration
Set these environment variables:
-
- `SERVER_DOMAIN`: Your public domain (required)
- `SERVER_HOST`: Bind address (default: `0.0.0.0`)
-
- `SERVER_PORT`: HTTPS port (default: `8443`)
-
- `SERVER_ENABLE_MTLS`: Enable mutual TLS (default: `true`)
+
- `SERVER_PORT`: HTTP port (default: `8080`)
+
- `SERVER_CHALLENGE_SECRET`: Netdata webhook challenge secret (required)
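
The variables above could be loaded along these lines. This is a sketch under stated assumptions: `ServerConfig` and `load_server_config` are hypothetical names, not the bot's actual configuration classes; only the variable names and defaults are taken from the list above.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int
    challenge_secret: str


def load_server_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read the SERVER_* variables, applying the documented defaults."""
    secret = env.get("SERVER_CHALLENGE_SECRET")
    if not secret:
        # The secret has no default: fail fast instead of starting unverified.
        raise ValueError("SERVER_CHALLENGE_SECRET is required")
    return ServerConfig(
        host=env.get("SERVER_HOST", "0.0.0.0"),
        port=int(env.get("SERVER_PORT", "8080")),
        challenge_secret=secret,
    )


config = load_server_config({"SERVER_CHALLENGE_SECRET": "test-secret"})
print(config.host, config.port)
```

Passing the environment as a mapping makes the loader easy to exercise in tests without touching the real process environment.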
-
#### Automated SSL Configuration (Recommended)
+
### Reverse Proxy Setup
-
- `SERVER_AUTO_CERT`: Enable automatic certificate management (default: `false`)
-
- `SERVER_CERT_EMAIL`: Email for Let's Encrypt account (required when auto_cert is true)
-
- `SERVER_CERT_PATH`: Directory for storing certificates (default: `./certs`)
-
- `SERVER_CERT_STAGING`: Use Let's Encrypt staging server for testing (default: `false`)
-
- `SERVER_ACME_PORT`: Port for ACME HTTP-01 challenge (default: `80`)
+
The bot is designed to run behind a reverse proxy that handles HTTPS and mutual TLS:
-
#### Manual SSL Configuration
+
#### Using Caddy (Recommended)
-
If not using automated certificates:
-
- `SERVER_CERT_PATH`: Path to certificate directory
-
- Place `fullchain.pem` and `privkey.pem` in `{SERVER_CERT_PATH}/{SERVER_DOMAIN}/`
+
1. Update `Caddyfile` with your domain name
+
2. Place the Netdata CA certificate in `netdata-ca.pem`
+
3. Run both services:
+
+
```bash
+
# Start the bot
+
uv run netdata-zulip-bot &
+
+
# Start Caddy
+
caddy run --config Caddyfile
+
```
+
+
#### Using Docker Compose
+
+
```bash
+
docker-compose up -d
+
```
## Message Format
···
**Time:** 2024-01-15 14:30:00 UTC
**Details:** CPU usage has exceeded 90% for 5 minutes
+
**Summary:** Critical alert: High CPU usage detected
[View Alert](https://app.netdata.cloud/spaces/...)
···
### Systemd Service
-
Create `/etc/systemd/system/netdata-zulip-bot.service`:
+
See `examples/netdata-zulip-bot.service` for a complete systemd service configuration.
-
```ini
-
[Unit]
-
Description=Netdata Zulip Bot
-
After=network.target
+
### Automated Setup
-
[Service]
-
Type=simple
-
User=netdata-bot
-
WorkingDirectory=/opt/netdata-zulip-bot
-
Environment=SERVER_DOMAIN=your-domain.com
-
ExecStart=/opt/netdata-zulip-bot/venv/bin/netdata-zulip-bot
-
Restart=always
-
RestartSec=5
+
Use the provided setup script:
-
[Install]
-
WantedBy=multi-user.target
-
```
-
-
Enable and start:
```bash
-
sudo systemctl enable netdata-zulip-bot
-
sudo systemctl start netdata-zulip-bot
+
sudo ./scripts/setup.sh --domain your-domain.com --email admin@example.com
```
### Docker
-
```dockerfile
-
FROM python:3.11-slim
+
The included `Dockerfile` and `docker-compose.yml` provide a complete setup with a Caddy reverse proxy:
-
WORKDIR /app
-
COPY . .
-
RUN pip install -e .
-
-
EXPOSE 8443
-
-
CMD ["netdata-zulip-bot"]
+
```bash
+
docker-compose up -d
```
## Security
-
### SSL Certificate Management
+
### Architecture
-
The bot includes fully automated SSL certificate management:
+
The bot uses a security-focused architecture:
-
1. **Automatic Provisioning**: Obtains certificates from Let's Encrypt on first run
-
2. **Automatic Renewal**: Checks daily and renews certificates 30 days before expiration
-
3. **Zero Downtime**: Certificate renewal happens in the background
-
4. **ACME HTTP-01 Challenge**: Built-in challenge server (requires port 80 access)
+
1. **HTTP Backend**: Simple HTTP service with no direct internet exposure
+
2. **Reverse Proxy**: Caddy handles HTTPS, certificates, and client authentication
+
3. **Mutual TLS**: Client certificate validation at the reverse proxy level
-
### Mutual TLS Authentication
+
### Webhook Security
-
The service supports mutual TLS to authenticate Netdata Cloud webhooks:
+
- **Challenge/Response**: Built-in Netdata webhook verification using HMAC-SHA256
+
- **Payload Validation**: Strict payload parsing and validation
+
- **Request Logging**: Comprehensive logging of all webhook requests
+
- **Error Handling**: Secure error responses without information disclosure
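
The challenge/response step can be sketched as follows. This assumes the scheme described in Netdata's webhook documentation: the response token is the base64-encoded HMAC-SHA256 of the `crc_token`, keyed with the challenge secret and prefixed with `sha256=`. The function name `challenge_response` is hypothetical and is not taken from this repository.

```python
import base64
import hashlib
import hmac


def challenge_response(secret: str, crc_token: str) -> dict:
    """Compute the reply to a Netdata crc_token challenge (assumed scheme)."""
    digest = hmac.new(
        key=secret.encode("utf-8"),
        msg=crc_token.encode("utf-8"),
        digestmod=hashlib.sha256,
    ).digest()
    token = base64.b64encode(digest).decode("ascii")
    return {"response_token": "sha256=" + token}


# Example using the placeholder values from the local development section.
print(challenge_response("test-secret", "test123"))
```

Because the token depends on the shared secret, a mismatch between `SERVER_CHALLENGE_SECRET` and the value configured in Netdata Cloud would make verification fail.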
-
1. **Server Certificate**: Automatically managed via built-in ACME client
-
2. **Client Verification**: Validates Netdata's client certificate
-
3. **CA Certificate**: Built-in Netdata CA certificate for client validation
+
### SSL Certificate Management
-
### Webhook Endpoint Security
+
SSL certificates are managed by the reverse proxy (Caddy):
-
- HTTPS-only communication
-
- Request logging and monitoring
-
- Payload validation and sanitization
-
- Error handling without information disclosure
+
1. **Automatic Provisioning**: Caddy obtains Let's Encrypt certificates
+
2. **Automatic Renewal**: Built-in certificate renewal
+
3. **Mutual TLS**: Client certificate validation using Netdata CA certificate
## Monitoring
···
### Health Check
```bash
-
curl -k https://your-domain.com:8443/health
+
# Direct HTTP check (backend service)
+
curl http://localhost:8080/health
+
+
# Through reverse proxy
+
curl https://your-domain.com/health
```
Response:
···
### Running Tests
```bash
-
pytest
+
uv run python -m pytest tests/ -v
```
### Code Formatting
```bash
-
black .
-
ruff check .
+
uv run black .
+
uv run ruff check .
```
### Local Development
-
For development, you can disable HTTPS and mTLS:
+
For development, you can run the HTTP service directly:
```bash
-
export SERVER_ENABLE_MTLS=false
-
# Use HTTP for testing (not recommended for production)
+
# Set required environment variables
+
export SERVER_CHALLENGE_SECRET=test-secret
+
+
# Run the service
+
uv run netdata-zulip-bot
+
+
# Test webhook endpoint
+
curl -X POST "http://localhost:8080/webhook/netdata?crc_token=test123"
```
## Troubleshooting
### Common Issues
-
1. **Certificate Issues**
-
- For automated certs: Ensure port 80 is accessible for ACME challenges
-
- Domain must point to your server's IP address
-
- Check `SERVER_CERT_EMAIL` is set for auto-cert mode
-
- Use `SERVER_CERT_STAGING=true` for testing to avoid rate limits
+
1. **Configuration Issues**
+
- Ensure `SERVER_CHALLENGE_SECRET` is set (required for Netdata webhook verification)
+
- Verify the `.zuliprc` file contains all required fields
+
- Check that the Zulip bot has permission to post to the configured stream
-
2. **Zulip Connection Failed**
-
- Verify API credentials in zuliprc
-
- Test connection with Zulip's API
+
2. **Reverse Proxy Issues**
+
- Ensure the Caddy configuration uses the correct domain name
+
- Verify the Netdata CA certificate is properly configured
+
- Check that port 80 is accessible for Let's Encrypt challenges
3. **Webhook Not Receiving Data**
-
- Check firewall settings for port 8443
-
- Verify domain DNS resolution
-
- Check Netdata Cloud webhook configuration
+
- Verify the Netdata Cloud webhook URL points to your reverse proxy
+
- Check that the webhook challenge secret matches the configured `SERVER_CHALLENGE_SECRET`
+
- Review service logs for error messages
### Logs
···
## License
-
MIT License - see LICENSE file for details.
+
MIT License - see LICENSE file for details.
+1 -1
pyproject.toml
···
version = "0.1.0"
description = "Zulip bot for receiving Netdata Cloud webhook notifications"
authors = [
-
{name = "Your Name", email = "your.email@example.com"}
+
{name = "Anil Madhavapeddy", email = "anil@recoil.org"}
]
readme = "README.md"
requires-python = ">=3.11"