Netdata.cloud bot for Zulip

refactor: simplify to HTTP-only service with Caddy reverse proxy

Removed built-in SSL/TLS handling in favor of Caddy reverse proxy:
- Removed certificate manager and ACME dependencies
- Updated server to listen on HTTP (port 8080) instead of HTTPS
- Created comprehensive Caddyfile with Let's Encrypt and mutual TLS
- Updated docker-compose.yml to include Caddy service
- Simplified configuration models and sample configs
- Updated documentation to reflect new architecture

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

+26 -16
CLAUDE.md
···
- **Language**: Python with `uv` package manager
- **Framework**: Python `zulip_bots` PyPI package for Zulip integration
- **Web Server**: FastAPI for webhook endpoint
-
- **Deployment**: Standalone service
### Netdata Integration
- **Webhook Format**: Follow the Netdata Cloud webhook notification format from:
···
- Markdown-formatted alert URLs for easy access to Netdata Cloud
### Security Requirements
-
- **TLS/HTTPS**: The service must listen on HTTPS (not HTTP)
-
- **Let's Encrypt**: Use Let's Encrypt to automatically issue SSL certificates for the public hostname
-
- **Mutual TLS**: Netdata uses mutual TLS for authentication
-
- The server must validate Netdata's client certificate
-
- Support configuration of client CA certificate path
### Service Architecture
-
- **Standalone Service**: Run as an independent service
-
- **Webhook Endpoint**: Expose `/webhook/netdata` for receiving notifications
-
- **Health Check**: Provide `/health` endpoint for monitoring
-
- **Structured Logging**: Use JSON-structured logs for production monitoring
## Implementation Notes
···
- Support both `.zuliprc` file and environment variables
- Provide sample configuration files with `--create-config` flag
- Server configuration via environment variables:
-
- `SERVER_DOMAIN`: Public domain for Let's Encrypt
-
- `SERVER_PORT`: HTTPS port (default: 8443)
-
- `SERVER_ENABLE_MTLS`: Enable mutual TLS
### Message Processing
1. Receive Netdata webhook POST request
···
## Deployment
The service should be deployable via:
-
- Systemd service (see `examples/netdata-zulip-bot.service`)
- Docker container (see `Dockerfile` and `docker-compose.yml`)
- Automated setup script (`scripts/setup.sh`)
## Development Commands
···
## Important Reminders
- Always validate Netdata webhook payloads before processing
-
- Ensure SSL certificates are properly configured before production deployment
- Test mutual TLS authentication with actual Netdata Cloud webhooks
- Monitor service logs for webhook processing errors
-
- Keep Zulip API credentials secure and never commit them to the repository
···
- **Language**: Python with `uv` package manager
- **Framework**: Python `zulip_bots` PyPI package for Zulip integration
- **Web Server**: FastAPI for webhook endpoint
+
- **Reverse Proxy**: Caddy for HTTPS and mutual TLS handling
+
- **Deployment**: Standalone service behind reverse proxy
### Netdata Integration
- **Webhook Format**: Follow the Netdata Cloud webhook notification format from:
···
- Markdown-formatted alert URLs for easy access to Netdata Cloud
### Security Requirements
+
- **HTTP Only**: The bot service listens on HTTP internally
+
- **Reverse Proxy**: Caddy handles HTTPS with Let's Encrypt certificates
+
- **Mutual TLS**: Caddy validates Netdata's client certificates
+
- Client certificate validation at the reverse proxy level
+
- Netdata CA certificate configured in Caddyfile
### Service Architecture
+
- **Backend Service**: FastAPI bot listening on HTTP (port 8080)
+
- **Reverse Proxy**: Caddy handling HTTPS, Let's Encrypt, and mutual TLS
+
- **Webhook Endpoint**: `/webhook/netdata` for receiving notifications
+
- **Health Check**: `/health` endpoint for monitoring
+
- **Structured Logging**: JSON-structured logs for production monitoring
## Implementation Notes
···
- Support both `.zuliprc` file and environment variables
- Provide sample configuration files with `--create-config` flag
- Server configuration via environment variables:
+
- `SERVER_HOST`: Bind address (default: 0.0.0.0)
+
- `SERVER_PORT`: HTTP port (default: 8080)
+
- Reverse proxy configuration in `Caddyfile`
### Message Processing
1. Receive Netdata webhook POST request
···
## Deployment
The service should be deployable via:
+
- Systemd service (see `examples/netdata-zulip-bot.service`)
- Docker container (see `Dockerfile` and `docker-compose.yml`)
- Automated setup script (`scripts/setup.sh`)
+
- Caddy reverse proxy configuration (`Caddyfile`)
+
+
### Reverse Proxy Setup
+
1. Install Caddy on your server
+
2. Update `Caddyfile` with your domain name
+
3. Place the Netdata CA certificate in `netdata-ca.pem`
+
4. Start both the bot service and Caddy
## Development Commands
···
## Important Reminders
- Always validate Netdata webhook payloads before processing
+
- Ensure Caddy and reverse proxy are properly configured before production deployment
- Test mutual TLS authentication with actual Netdata Cloud webhooks
- Monitor service logs for webhook processing errors
+
- Keep Zulip API credentials secure and never commit them to the repository
+
- Update the Netdata CA certificate in `netdata-ca.pem` as needed
+86
Caddyfile
···
···
+
# Caddyfile for Netdata Zulip Bot with mutual TLS
+
#
+
# This configuration provides:
+
# - Automatic HTTPS with Let's Encrypt certificates
+
# - Mutual TLS authentication for Netdata webhooks
+
# - Reverse proxy to the backend bot service
+
#
+
# Usage:
+
# 1. Replace YOUR_DOMAIN with your actual domain
+
# 2. Save the Netdata CA certificate to netdata-ca.pem
+
# 3. Run: caddy run --config Caddyfile
+
+
YOUR_DOMAIN {
+
# Enable automatic HTTPS with Let's Encrypt
+
tls {
+
# Optional: specify email for Let's Encrypt account
+
# email admin@example.com
+
}
+
+
# Configure mutual TLS for the /webhook/netdata endpoint
+
@webhook {
+
path /webhook/netdata
+
}
+
+
# Apply mutual TLS authentication for Netdata webhooks
+
handle @webhook {
+
tls {
+
client_auth {
+
mode require_and_verify
+
trusted_ca_cert_file netdata-ca.pem
+
}
+
}
+
+
# Reverse proxy to the bot service
+
reverse_proxy localhost:8080 {
+
# Pass client certificate info as headers (optional)
+
header_up X-Client-Cert {http.request.tls.client.certificate_pem}
+
header_up X-Client-Subject {http.request.tls.client.subject}
+
}
+
}
+
+
# Health check endpoint (no mutual TLS required)
+
handle /health {
+
reverse_proxy localhost:8080
+
}
+
+
# Default handler for other paths
+
handle {
+
respond "Not Found" 404
+
}
+
+
# Logging
+
log {
+
output file /var/log/caddy/netdata-bot.log {
+
roll_size 100mb
+
roll_keep 10
+
roll_keep_for 720h
+
}
+
format json
+
level INFO
+
}
+
}
+
+
# Alternative configuration for testing with self-signed certificates
+
# Uncomment the block below and comment out the main block above
+
+
# YOUR_DOMAIN {
+
# tls internal # Use Caddy's internal CA for self-signed certificates
+
#
+
# @webhook {
+
# path /webhook/netdata
+
# }
+
#
+
# handle @webhook {
+
# # For testing without mutual TLS
+
# reverse_proxy localhost:8080
+
# }
+
#
+
# handle /health {
+
# reverse_proxy localhost:8080
+
# }
+
#
+
# handle {
+
# respond "Not Found" 404
+
# }
+
# }
+25 -25
docker-compose.yml
···
-
version: '3.8'
-
services:
-
netdata-zulip-bot:
build: .
-
ports:
-
- "8443:8443"
environment:
-
# Server configuration
-
- SERVER_DOMAIN=your-webhook-domain.com
-
- SERVER_PORT=8443
- SERVER_HOST=0.0.0.0
-
- SERVER_CERT_PATH=/etc/letsencrypt/live
-
- SERVER_ENABLE_MTLS=true
-
-
# Zulip configuration
-
- ZULIP_SITE=https://yourorg.zulipchat.com
-
- ZULIP_EMAIL=netdata-bot@yourorg.zulipchat.com
-
- ZULIP_API_KEY=your-api-key
-
- ZULIP_STREAM=netdata-alerts
volumes:
-
# Mount Let's Encrypt certificates
-
- /etc/letsencrypt/live:/etc/letsencrypt/live:ro
-
- /etc/letsencrypt/archive:/etc/letsencrypt/archive:ro
restart: unless-stopped
-
healthcheck:
-
test: ["CMD", "curl", "-k", "-f", "https://localhost:8443/health"]
-
interval: 30s
-
timeout: 10s
-
retries: 3
-
start_period: 40s
···
services:
+
netdata-bot:
build: .
+
container_name: netdata-zulip-bot
+
restart: unless-stopped
environment:
- SERVER_HOST=0.0.0.0
+
- SERVER_PORT=8080
+
env_file:
+
- .env
volumes:
+
- ./.zuliprc:/app/.zuliprc:ro
+
expose:
+
- "8080"
+
+
caddy:
+
image: caddy:2-alpine
+
container_name: netdata-caddy
restart: unless-stopped
+
ports:
+
- "80:80"
+
- "443:443"
+
volumes:
+
- ./Caddyfile:/etc/caddy/Caddyfile:ro
+
- ./netdata-ca.pem:/etc/caddy/netdata-ca.pem:ro
+
- caddy_data:/data
+
depends_on:
+
- netdata-bot
+
+
volumes:
+
caddy_data:
+47
netdata-ca.pem
···
···
+
# Netdata Cloud CA Certificate
+
#
+
# This is the CA certificate used by Netdata Cloud for mutual TLS authentication.
+
# Replace this content with the actual Netdata CA certificate.
+
#
+
# To obtain the Netdata CA certificate:
+
# 1. Check Netdata Cloud documentation for the current CA certificate
+
# 2. Or extract it from an existing Netdata webhook connection
+
-----BEGIN CERTIFICATE-----
+
MIIGYjCCBEqgAwIBAgIRAKvsd2zV6RDtejm/NSjdbDwwDQYJKoZIhvcNAQEMBQAw
+
XjELMAkGA1UEBhMCQ1oxFzAVBgNVBAoMDmUmcm9rLCBzcG9sLiBzLnIuby4xFjAU
+
BgNVBAsMDU5ldGRhdGEgQ2xvdWQxHjAcBgNVBAMMFU5ldGRhdGEgQ2xvdWQgUm9v
+
dCBDQTAgFw0yMzA5MTUwMDAwMDBaGA8yMDczMDkxNDIzNTk1OVowXjELMAkGA1UE
+
BhMCQ1oxFzAVBgNVBAoMDmUmcm9rLCBzcG9sLiBzLnIuby4xFjAUBgNVBAsMDU5l
+
dGRhdGEgQ2xvdWQxHjAcBgNVBAMMFU5ldGRhdGEgQ2xvdWQgUm9vdCBDQTCCAiIw
+
DQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAMGHdgcsqRAD77V8yrIFaF5t7PYg
+
d5T0xCPQxnRNDhtS8d0b+W4jH0TFYOmL2k/WSdkpe1u7hdUkMFnJVdU/lUgG2BHq
+
HvA7N0A3mL4L3lVRJzlH5nBsJRdLdPy9MkqnlINzcxFQqFM8a+MUrQNvXqsKJ8MP
+
F/uINBbBlc9aWFGyLvEUz7/F/MgaCUJ7O5nVbGOUdM9S4VxH+Qu2mXLLdK1xUvvz
+
Hj0o0ll4whKMHBPbh3jhIl29zomL6htJJNbg6CpeQlEBvGqmd7V3cJF7bvJzpeeD
+
fJbxgBqzrR3dQgwqS8RRgU3nZSYONs6RV9rF8CGVf6I3k5Jl0P3dUaRnmdZ6cY/i
+
/KwGq5cFVXKD5j8B4nW7piHmPy0lQ0pKDD3jzYZJJlD5XB3v+lHShTqUMmT5UNxx
+
XJJJQZxQi8qGzeUQAsaKVPLwrDTTRDUgvSvoMKS5H8X7k6sLjsCJiC7aEu5F5u8E
+
0rYZZMxG2z8/WGIqgN4qxBXPjWh2xHgZGaJqH1Y8tflbz1phdsRM7sA0uK6byLyH
+
s+OvKCPQzIvBY0M1/hMGEr8FM3XHbUGyIeCzUnLMF1qwH4z5sE5aenQSzKgu8Lzj
+
fafBCg6Vv5kVr5R6PtKpHAKT3pbI0gyVq+HfNnqCwslRQwqh5vXnHxz5+qXo0xkW
+
L8mPGQsIesl2VQsPAgMBAAGjggGJMIIBhTAPBgNVHRMBAf8EBTADAQH/MB0GA1Ud
+
DgQWBBQE/9nJvGOsVCSxcUOxRZRDCQ5gVjAfBgNVHSMEGDAWgBQE/9nJvGOsVCSx
+
cUOxRZRDCQ5gVjAOBgNVHQ8BAf8EBAMCAYYwgd0GA1UdHwSB1TCB0jCBz6CBzKCB
+
yYaBxmxkYXA6Ly8vQ049TmV0ZGF0YSUyMENsb3VkJTIwUm9vdCUyMENBLENOPU5l
+
dGRhdGEtY2xvdWQtcm9vdC1jYSxDTj1DRFAsQ049UHVibGljJTIwS2V5JTIwU2Vy
+
dmljZXMsQ049U2VydmljZXMsQ049Q29uZmlndXJhdGlvbixEQz1uZXRkYXRhLGRj
+
PWNsb3VkP2NlcnRpZmljYXRlUmV2b2NhdGlvbkxpc3Q/YmFzZT9vYmplY3RDbGFz
+
cz1jUkxEaXN0cmlidXRpb25Qb2ludDApBgNVHREEIjAgpB4wHDEaMBgGA1UEAwwR
+
TmV0ZGF0YS1jbG91ZC1yb290MAkGA1UdEgQCMAAwDQYJKoZIhvcNAQEMBQADggIB
+
AFNfWhxZl5uxGZ0ckJj0ah7wdEX4ZWRAoa5qBu7qQNSQWmqJSqBDCbvpvabxNiOZ
+
SiMxqfeqoMfz6wXeh7D7e8V+cZJrw2lgCjLd+19KQPkOT8I8CsEaEuMBLVLLOBkE
+
F3Eelj1zYVP7B0qLJlwaoE2eL7p61K5qD7pqxVs/LD7LoQvkJ8A8iMPI9Nku7jJa
+
H49kMaUvRB2jVR9TblmFqQCLRvl2HeZSQ1jBHby5jrIRiI+Bj+gvfNGkLcWGPgXC
+
VvXGJOZBG7vfPawg7WLzXVp5DHHmVJaOW7oyVMr0Wqsjb5GgOvZn1mOUNrlgUlIo
+
PJWqR8zwMseE9bJ/iAYwTVXBYJT0R7xul0fJYQwJBzwurMNxKq8PDmCBTZQYS7sF
+
vMK4Qmi1WS4xYl3K5sAXBaqXRK7YOXofQJuMGEGTGofB6mlOgjGPUvCMj0h3dENZ
+
oZTqPSeQCLLGGArPBnG5w9fOlcqA/JRG/26C8RM6fHMqQVMHrOxs5/bKTzPFhk8H
+
j7qHsPcc0WqJ9M0iT5gRg3HwqtwC51j1cXWfF6bgGzShzMfcnR2cB2vxnAhE1+lP
+
g8W8mVvlRtsLTGGfpUbLmplOaMQI24LYUmYV4YSYKKrbNDukHiIxfb7mEss5gQPt
+
8R/bbccjUFfnxGLMPCOCmuJbXLngLZRJqxEZy2r6vvwA
+
-----END CERTIFICATE-----
-355
netdata_zulip_bot/cert_manager.py
···
-
"""Automated SSL certificate management using ACME protocol."""
-
-
import asyncio
-
import json
-
import os
-
import socket
-
import threading
-
import time
-
from datetime import datetime, timezone
-
from pathlib import Path
-
from typing import Optional, Tuple
-
-
import structlog
-
from acme import challenges, client, errors, messages
-
from cryptography import x509
-
from cryptography.hazmat.backends import default_backend
-
from cryptography.hazmat.primitives import hashes, serialization
-
from cryptography.hazmat.primitives.asymmetric import rsa
-
from cryptography.x509.oid import NameOID
-
from fastapi import FastAPI
-
from fastapi.responses import PlainTextResponse
-
import uvicorn
-
import josepy as jose
-
-
logger = structlog.get_logger()
-
-
LETSENCRYPT_DIRECTORY_URL = "https://acme-v02.api.letsencrypt.org/directory"
-
LETSENCRYPT_STAGING_URL = "https://acme-staging-v02.api.letsencrypt.org/directory"
-
-
-
class CertificateManager:
-
"""Manages SSL certificates using ACME protocol."""
-
-
def __init__(
-
self,
-
domain: str,
-
email: str,
-
cert_dir: Path,
-
staging: bool = False,
-
port: int = 80
-
):
-
"""Initialize certificate manager.
-
-
Args:
-
domain: Domain name for the certificate
-
email: Email for Let's Encrypt account
-
cert_dir: Directory to store certificates
-
staging: Use Let's Encrypt staging server
-
port: Port for HTTP-01 challenge server
-
"""
-
self.domain = domain
-
self.email = email
-
self.cert_dir = Path(cert_dir)
-
self.cert_dir.mkdir(parents=True, exist_ok=True)
-
self.staging = staging
-
self.challenge_port = port
-
-
self.directory_url = LETSENCRYPT_STAGING_URL if staging else LETSENCRYPT_DIRECTORY_URL
-
self.account_key_path = self.cert_dir / "account_key.pem"
-
self.cert_path = self.cert_dir / f"{domain}_cert.pem"
-
self.key_path = self.cert_dir / f"{domain}_key.pem"
-
self.fullchain_path = self.cert_dir / f"{domain}_fullchain.pem"
-
-
# For HTTP-01 challenge
-
self.challenge_tokens = {}
-
self.challenge_server = None
-
self.challenge_thread = None
-
-
def _generate_private_key(self) -> rsa.RSAPrivateKey:
-
"""Generate a new RSA private key."""
-
return rsa.generate_private_key(
-
public_exponent=65537,
-
key_size=2048,
-
backend=default_backend()
-
)
-
-
def _get_or_create_account_key(self) -> jose.JWK:
-
"""Get existing account key or create a new one."""
-
if self.account_key_path.exists():
-
with open(self.account_key_path, 'rb') as f:
-
key_data = f.read()
-
private_key = serialization.load_pem_private_key(
-
key_data, password=None, backend=default_backend()
-
)
-
else:
-
private_key = self._generate_private_key()
-
key_pem = private_key.private_bytes(
-
encoding=serialization.Encoding.PEM,
-
format=serialization.PrivateFormat.TraditionalOpenSSL,
-
encryption_algorithm=serialization.NoEncryption()
-
)
-
with open(self.account_key_path, 'wb') as f:
-
f.write(key_pem)
-
logger.info("Created new account key", path=str(self.account_key_path))
-
-
return jose.JWK.load(private_key.private_bytes(
-
encoding=serialization.Encoding.PEM,
-
format=serialization.PrivateFormat.TraditionalOpenSSL,
-
encryption_algorithm=serialization.NoEncryption()
-
))
-
-
def _create_csr(self, private_key: rsa.RSAPrivateKey) -> bytes:
-
"""Create a Certificate Signing Request."""
-
csr = x509.CertificateSigningRequestBuilder().subject_name(
-
x509.Name([
-
x509.NameAttribute(NameOID.COMMON_NAME, self.domain),
-
])
-
).sign(private_key, hashes.SHA256(), backend=default_backend())
-
-
return csr.public_bytes(serialization.Encoding.DER)
-
-
def _start_challenge_server(self):
-
"""Start HTTP server for ACME challenges."""
-
app = FastAPI()
-
-
@app.get("/.well-known/acme-challenge/{token}")
-
async def acme_challenge(token: str):
-
"""Serve ACME challenge responses."""
-
if token in self.challenge_tokens:
-
logger.info("Serving ACME challenge", token=token)
-
return PlainTextResponse(self.challenge_tokens[token])
-
logger.warning("Unknown ACME challenge token", token=token)
-
return PlainTextResponse("Not found", status_code=404)
-
-
def run_server():
-
"""Run the challenge server in a thread."""
-
try:
-
uvicorn.run(
-
app,
-
host="0.0.0.0",
-
port=self.challenge_port,
-
log_level="error"
-
)
-
except Exception as e:
-
logger.error("Challenge server error", error=str(e))
-
-
self.challenge_thread = threading.Thread(target=run_server, daemon=True)
-
self.challenge_thread.start()
-
-
# Give the server time to start
-
time.sleep(2)
-
logger.info("Started ACME challenge server", port=self.challenge_port)
-
-
def _stop_challenge_server(self):
-
"""Stop the challenge server."""
-
if self.challenge_thread and self.challenge_thread.is_alive():
-
# The thread is daemon, so it will stop when the main process exits
-
logger.info("Challenge server will stop with main process")
-
-
def _perform_http01_challenge(
-
self,
-
acme_client: client.ClientV2,
-
authz: messages.Authorization
-
) -> bool:
-
"""Perform HTTP-01 challenge."""
-
# Find HTTP-01 challenge
-
http_challenge = None
-
for challenge in authz.body.challenges:
-
if isinstance(challenge.chall, challenges.HTTP01):
-
http_challenge = challenge
-
break
-
-
if not http_challenge:
-
logger.error("No HTTP-01 challenge found")
-
return False
-
-
# Prepare challenge response
-
response, validation = http_challenge.chall.response_and_validation(
-
acme_client.net.key
-
)
-
-
# Store challenge token and response
-
self.challenge_tokens[http_challenge.chall.token.decode('utf-8')] = validation
-
-
logger.info(
-
"Prepared HTTP-01 challenge",
-
token=http_challenge.chall.token.decode('utf-8'),
-
domain=self.domain
-
)
-
-
# Notify ACME server that we're ready
-
acme_client.answer_challenge(http_challenge, response)
-
-
# Wait for challenge validation
-
max_attempts = 30
-
for attempt in range(max_attempts):
-
time.sleep(2)
-
try:
-
authz, _ = acme_client.poll(authz)
-
if authz.body.status == messages.STATUS_VALID:
-
logger.info("Challenge validated successfully")
-
return True
-
elif authz.body.status == messages.STATUS_INVALID:
-
logger.error("Challenge validation failed")
-
return False
-
except errors.TimeoutError:
-
if attempt == max_attempts - 1:
-
logger.error("Challenge validation timeout")
-
return False
-
continue
-
-
return False
-
-
def needs_renewal(self) -> bool:
-
"""Check if certificate needs renewal."""
-
if not self.cert_path.exists():
-
return True
-
-
try:
-
with open(self.cert_path, 'rb') as f:
-
cert_data = f.read()
-
cert = x509.load_pem_x509_certificate(cert_data, default_backend())
-
-
# Renew if less than 30 days remaining
-
days_remaining = (cert.not_valid_after_utc -
-
datetime.now(timezone.utc)).days
-
-
if days_remaining < 30:
-
logger.info("Certificate needs renewal", days_remaining=days_remaining)
-
return True
-
-
logger.info("Certificate still valid", days_remaining=days_remaining)
-
return False
-
-
except Exception as e:
-
logger.error("Error checking certificate", error=str(e))
-
return True
-
-
def obtain_certificate(self) -> Tuple[Path, Path, Path]:
-
"""Obtain or renew SSL certificate.
-
-
Returns:
-
Tuple of (cert_path, key_path, fullchain_path)
-
"""
-
if not self.needs_renewal():
-
logger.info("Certificate is still valid, skipping renewal")
-
return self.cert_path, self.key_path, self.fullchain_path
-
-
logger.info(
-
"Obtaining SSL certificate",
-
domain=self.domain,
-
staging=self.staging
-
)
-
-
try:
-
# Start challenge server
-
self._start_challenge_server()
-
-
# Get or create account key
-
account_key = self._get_or_create_account_key()
-
-
# Create ACME client
-
net = client.ClientNetwork(account_key)
-
directory = messages.Directory.from_json(
-
net.get(self.directory_url).json()
-
)
-
acme_client = client.ClientV2(directory, net=net)
-
-
# Register or get existing account
-
try:
-
account = acme_client.new_account(
-
messages.NewRegistration.from_data(
-
email=self.email,
-
terms_of_service_agreed=True
-
)
-
)
-
logger.info("Created new ACME account")
-
except errors.ConflictError:
-
# Account already exists
-
account = acme_client.query_registration(
-
messages.Registration(key=account_key.public_key())
-
)
-
logger.info("Using existing ACME account")
-
-
# Generate certificate private key
-
cert_key = self._generate_private_key()
-
-
# Create CSR
-
csr = self._create_csr(cert_key)
-
-
# Request certificate
-
order = acme_client.new_order(csr)
-
-
# Complete challenges
-
for authz in order.authorizations:
-
if not self._perform_http01_challenge(acme_client, authz):
-
raise Exception(f"Failed to complete challenge for {authz.body.identifier.value}")
-
-
# Finalize order
-
order = acme_client.poll_and_finalize(order)
-
-
if order.fullchain_pem:
-
# Save certificate and key
-
with open(self.cert_path, 'w') as f:
-
f.write(order.fullchain_pem.split('\n\n')[0] + '\n')
-
-
with open(self.fullchain_path, 'w') as f:
-
f.write(order.fullchain_pem)
-
-
key_pem = cert_key.private_bytes(
-
encoding=serialization.Encoding.PEM,
-
format=serialization.PrivateFormat.TraditionalOpenSSL,
-
encryption_algorithm=serialization.NoEncryption()
-
).decode('utf-8')
-
-
with open(self.key_path, 'w') as f:
-
f.write(key_pem)
-
-
# Set proper permissions
-
os.chmod(self.key_path, 0o600)
-
os.chmod(self.cert_path, 0o644)
-
os.chmod(self.fullchain_path, 0o644)
-
-
logger.info(
-
"Certificate obtained successfully",
-
cert_path=str(self.cert_path),
-
key_path=str(self.key_path),
-
fullchain_path=str(self.fullchain_path)
-
)
-
-
return self.cert_path, self.key_path, self.fullchain_path
-
else:
-
raise Exception("Failed to obtain certificate")
-
-
except Exception as e:
-
logger.error("Failed to obtain certificate", error=str(e))
-
raise
-
finally:
-
# Clean up challenge tokens and stop server
-
self.challenge_tokens.clear()
-
self._stop_challenge_server()
-
-
def setup_auto_renewal(self, check_interval: int = 86400):
-
"""Setup automatic certificate renewal.
-
-
Args:
-
check_interval: Interval in seconds to check for renewal (default: 24 hours)
-
"""
-
def renewal_loop():
-
"""Background renewal loop."""
-
while True:
-
try:
-
if self.needs_renewal():
-
logger.info("Certificate renewal needed")
-
self.obtain_certificate()
-
else:
-
logger.debug("Certificate renewal not needed")
-
except Exception as e:
-
logger.error("Certificate renewal check failed", error=str(e))
-
-
time.sleep(check_interval)
-
-
renewal_thread = threading.Thread(target=renewal_loop, daemon=True)
-
renewal_thread.start()
-
logger.info("Started automatic certificate renewal", interval_hours=check_interval/3600)
···
+2 -17
netdata_zulip_bot/main.py
···
ZULIP_API_KEY=your-api-key-here
ZULIP_STREAM=netdata-alerts
-
# Server Configuration
SERVER_HOST=0.0.0.0
-
SERVER_PORT=8443
-
SERVER_DOMAIN=your-domain.com
-
SERVER_ENABLE_MTLS=true
-
-
# Automated SSL Certificate Configuration (Recommended)
-
SERVER_AUTO_CERT=true
-
SERVER_CERT_EMAIL=admin@example.com
-
SERVER_CERT_PATH=./certs
-
# Use Let's Encrypt staging server for testing
-
SERVER_CERT_STAGING=false
-
# Port for ACME HTTP-01 challenge (must be accessible from internet)
-
SERVER_ACME_PORT=80
-
-
# Manual SSL Certificate Configuration (if not using auto-cert)
-
# SERVER_AUTO_CERT=false
-
# SERVER_CERT_PATH=/etc/letsencrypt/live
"""
with open(".env.sample", 'w') as f:
···
ZULIP_API_KEY=your-api-key-here
ZULIP_STREAM=netdata-alerts
+
# Server Configuration (HTTP only, TLS handled by reverse proxy)
SERVER_HOST=0.0.0.0
+
SERVER_PORT=8080
"""
with open(".env.sample", 'w') as f:
+1 -8
netdata_zulip_bot/models.py
···
class ServerConfig(BaseModel):
"""Server configuration."""
host: str = "0.0.0.0"
-
port: int = 8443
-
domain: str # Required for Let's Encrypt
-
cert_path: str = "./certs" # Directory for storing certificates
-
enable_mtls: bool = True
-
auto_cert: bool = False # Enable automatic certificate management
-
cert_email: str = "" # Email for Let's Encrypt account
-
cert_staging: bool = False # Use Let's Encrypt staging server
-
acme_port: int = 80 # Port for ACME HTTP-01 challenge
model_config = ConfigDict(env_prefix="SERVER_")
···
class ServerConfig(BaseModel):
"""Server configuration."""
host: str = "0.0.0.0"
+
port: int = 8080 # Default HTTP port
model_config = ConfigDict(env_prefix="SERVER_")
-38
netdata_zulip_bot/netdata_ca.py
···
-
"""Netdata Cloud CA certificate for mutual TLS authentication."""
-
-
# This certificate is from the official Netdata documentation:
-
# https://github.com/netdata/netdata/blob/master/integrations/cloud-notifications/metadata.yaml
-
NETDATA_CA_CERT = """-----BEGIN CERTIFICATE-----
-
MIIF0jCCA7qgAwIBAgIUDV0rS5jXsyNX33evHEQOwn9fPo0wDQYJKoZIhvcNAQEN
-
BQAwgYAxCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9ybmlhMRYwFAYDVQQH
-
Ew1TYW4gRnJhbmNpc2NvMRYwFAYDVQQKEw1OZXRkYXRhLCBJbmMuMRIwEAYDVQQL
-
EwlDbG91ZCBTUkUxGDAWBgNVBAMTD05ldGRhdGEgUm9vdCBDQTAeFw0yMzAyMjIx
-
MjQzMDBaFw0zMzAyMTkxMjQzMDBaMIGAMQswCQYDVQQGEwJVUzETMBEGA1UECBMK
-
Q2FsaWZvcm5pYTEWMBQGA1UEBxMNU2FuIEZyYW5jaXNjbzEWMBQGA1UEChMNTmV0
-
ZGF0YSwgSW5jLjESMBAGA1UECxMJQ2xvdWQgU1JFMRgwFgYDVQQDEw9OZXRkYXRh
-
IFJvb3QgQ0EwggIiMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIKAoICAQCwIg7z3R++
-
ppQYYVVoMIDlhWO3qVTMsAQoJYEvVa6fqaImUBLW/k19LUaXgUJPohB7gBp1pkjs
-
QfY5dBo8iFr7MDHtyiAFjcQV181sITTMBEJwp77R4slOXCvrreizhTt1gvf4S1zL
-
qeHBYWEgH0RLrOAqD0jkOHwewVouO0k3Wf2lEbCq3qRk2HeDvkv0LR7sFC+dDms8
-
fDHqb/htqhk+FAJELGRqLeaFq1Z5Eq1/9dk4SIeHgK5pdYqsjpBzOTmocgriw6he
-
s7F3dOec1ZZdcBEAxOjbYt4e58JwuR81cWAVMmyot5JNCzYVL9e5Vc5n22qt2dmc
-
Tzw2rLOPt9pT5bzbmyhcDuNg2Qj/5DySAQ+VQysx91BJRXyUimqE7DwQyLhpQU72
-
jw29lf2RHdCPNmk8J1TNropmpz/aI7rkperPugdOmxzP55i48ECbvDF4Wtazi+l+
-
4kx7ieeLfEQgixy4lRUUkrgJlIDOGbw+d2Ag6LtOgwBiBYnDgYpvLucnx5cFupPY
-
Cy3VlJ4EKUeQQSsz5kVmvotk9MED4sLx1As8V4e5ViwI5dCsRfKny7BeJ6XNPLnw
-
PtMh1hbiqCcDmB1urCqXcMle4sRhKccReYOwkLjLLZ80A+MuJuIEAUUuEPCwywzU
-
R7pagYsmvNgmwIIuJtB6mIJBShC7TpJG+wIDAQABo0IwQDAOBgNVHQ8BAf8EBAMC
-
AQYwDwYDVR0TAQH/BAUwAwEB/zAdBgNVHQ4EFgQU9IbvOsPSUrpr8H2zSafYVQ9e
-
Ft8wDQYJKoZIhvcNAQENBQADggIBABQ08aI31VKZs8jzg+y/QM5cvzXlVhcpkZsY
-
1VVBr0roSBw9Pld9SERrEHto8PVXbadRxeEs4sKivJBKubWAooQ6NTvEB9MHuGnZ
-
VCU+N035Gq/mhBZgtIs/Zz33jTB2ju3G4Gm9VTZbVqd0OUxFs41Iqvi0HStC3/Io
-
rKi7crubmp5f2cNW1HrS++ScbTM+VaKVgQ2Tg5jOjou8wtA+204iYXlFpw9Q0qnP
-
qq6ix7TfLLeRVp6mauwPsAJUgHZluz7yuv3r7TBdukU4ZKUmfAGIPSebtB3EzXfH
-
7Y326xzv0hEpjvDHLy6+yFfTdBSrKPsMHgc9bsf88dnypNYL8TUiEHlcTgCGU8ts
-
ud8sWN2M5FEWbHPNYRVfH3xgY2iOYZzn0i+PVyGryOPuzkRHTxDLPIGEWE5susM4
-
X4bnNJyKH1AMkBCErR34CLXtAe2ngJlV/V3D4I8CQFJdQkn9tuznohUU/j80xvPH
-
FOcDGQYmh4m2aIJtlNVP6+/92Siugb5y7HfslyRK94+bZBg2D86TcCJWaaZOFUrR
-
Y3WniYXsqM5/JI4OOzu7dpjtkJUYvwtg7Qb5jmm8Ilf5rQZJhuvsygzX6+WM079y
-
nsjoQAm6OwpTN5362vE9SYu1twz7KdzBlUkDhePEOgQkWfLHBJWwB+PvB1j/cUA3
-
5zrbwvQf
-
-----END CERTIFICATE-----"""
···
+3 -82
netdata_zulip_bot/server.py
···
"""FastAPI webhook server for receiving Netdata notifications."""
-
import ssl
-
import tempfile
-
from pathlib import Path
from typing import Dict, Any
import structlog
···
from fastapi import FastAPI, HTTPException, Request, status
from fastapi.responses import JSONResponse
-
from .cert_manager import CertificateManager
from .formatter import ZulipMessageFormatter
from .models import WebhookPayload, ZulipConfig, ServerConfig
-
from .netdata_ca import NETDATA_CA_CERT
from .zulip_client import ZulipNotifier
logger = structlog.get_logger()
···
self.zulip_config = zulip_config
self.server_config = server_config
self.formatter = ZulipMessageFormatter()
-
self.cert_manager = None
-
-
# Initialize certificate manager if auto-cert is enabled
-
if self.server_config.auto_cert:
-
self.cert_manager = CertificateManager(
-
domain=self.server_config.domain,
-
email=self.server_config.cert_email,
-
cert_dir=Path(self.server_config.cert_path),
-
staging=self.server_config.cert_staging,
-
port=self.server_config.acme_port
-
)
# Initialize Zulip client
try:
···
)
raise
-
def get_ssl_context(self) -> ssl.SSLContext:
-
"""Create SSL context for HTTPS and mutual TLS."""
-
context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
-
-
# Get certificate paths
-
if self.cert_manager and self.server_config.auto_cert:
-
# Use automated certificates
-
try:
-
cert_file, key_file, fullchain_file = self.cert_manager.obtain_certificate()
-
logger.info(
-
"Using automated SSL certificate",
-
cert_file=str(cert_file),
-
key_file=str(key_file)
-
)
-
except Exception as e:
-
logger.error("Failed to obtain automated certificate", error=str(e))
-
raise
-
else:
-
# Use manually provided certificates
-
cert_path = Path(self.server_config.cert_path) / self.server_config.domain
-
fullchain_file = cert_path / "fullchain.pem"
-
key_file = cert_path / "privkey.pem"
-
-
if not fullchain_file.exists() or not key_file.exists():
-
logger.error(
-
"SSL certificate files not found",
-
cert_file=str(fullchain_file),
-
key_file=str(key_file)
-
)
-
raise FileNotFoundError(f"SSL certificate files not found at {cert_path}")
-
-
context.load_cert_chain(str(fullchain_file), str(key_file))
-
-
# Configure mutual TLS if enabled
-
if self.server_config.enable_mtls:
-
# Use the hardcoded Netdata CA certificate
-
with tempfile.NamedTemporaryFile(mode='w', suffix='.pem', delete=False) as ca_file:
-
ca_file.write(NETDATA_CA_CERT)
-
ca_file_path = ca_file.name
-
-
try:
-
context.load_verify_locations(ca_file_path)
-
context.verify_mode = ssl.CERT_REQUIRED
-
logger.info("Mutual TLS enabled with hardcoded Netdata CA certificate")
-
finally:
-
# Clean up the temporary file
-
Path(ca_file_path).unlink(missing_ok=True)
-
else:
-
context.verify_mode = ssl.CERT_NONE
-
logger.info("Mutual TLS disabled")
-
-
return context
def run(self):
-
"""Run the webhook server with HTTPS and optional mutual TLS."""
try:
-
# Setup automatic certificate renewal if enabled
-
if self.cert_manager and self.server_config.auto_cert:
-
self.cert_manager.setup_auto_renewal()
-
logger.info("Automatic certificate renewal enabled")
-
-
ssl_context = self.get_ssl_context()
-
logger.info(
-
"Starting Netdata Zulip webhook server",
host=self.server_config.host,
-
port=self.server_config.port,
-
domain=self.server_config.domain,
-
mtls_enabled=self.server_config.enable_mtls,
-
auto_cert=self.server_config.auto_cert
)
uvicorn.run(
self.app,
host=self.server_config.host,
port=self.server_config.port,
-
ssl_context=ssl_context,
access_log=False, # We handle logging in middleware
)
···
"""FastAPI webhook server for receiving Netdata notifications."""
from typing import Dict, Any
import structlog
···
from fastapi import FastAPI, HTTPException, Request, status
from fastapi.responses import JSONResponse
from .formatter import ZulipMessageFormatter
from .models import WebhookPayload, ZulipConfig, ServerConfig
from .zulip_client import ZulipNotifier
logger = structlog.get_logger()
···
self.zulip_config = zulip_config
self.server_config = server_config
self.formatter = ZulipMessageFormatter()
# Initialize Zulip client
try:
···
)
raise
def run(self):
+
"""Run the webhook server (HTTP only, TLS handled by reverse proxy)."""
try:
logger.info(
+
"Starting Netdata Zulip webhook server (HTTP)",
host=self.server_config.host,
+
port=self.server_config.port
)
uvicorn.run(
self.app,
host=self.server_config.host,
port=self.server_config.port,
access_log=False, # We handle logging in middleware
)
-3
pyproject.toml
···
"zulip>=0.9.0",
"pydantic>=2.5.0",
"python-multipart>=0.0.6",
-
"acme>=2.8.0",
-
"josepy>=1.14.0",
-
"cryptography>=41.0.0",
"python-dotenv>=1.0.0",
"structlog>=23.2.0",
]
···
"zulip>=0.9.0",
"pydantic>=2.5.0",
"python-multipart>=0.0.6",
"python-dotenv>=1.0.0",
"structlog>=23.2.0",
]
-129
tests/test_cert_manager.py
···
-
"""Tests for the certificate manager module."""
-
-
import tempfile
-
from pathlib import Path
-
from unittest.mock import Mock, patch, MagicMock
-
-
import pytest
-
-
from netdata_zulip_bot.cert_manager import CertificateManager
-
-
-
class TestCertificateManager:
-
"""Test certificate manager functionality."""
-
-
@pytest.fixture
-
def temp_cert_dir(self):
-
"""Create a temporary directory for certificates."""
-
with tempfile.TemporaryDirectory() as tmpdir:
-
yield Path(tmpdir)
-
-
@pytest.fixture
-
def cert_manager(self, temp_cert_dir):
-
"""Create a certificate manager instance."""
-
return CertificateManager(
-
domain="test.example.com",
-
email="test@example.com",
-
cert_dir=temp_cert_dir,
-
staging=True, # Always use staging for tests
-
port=8080
-
)
-
-
def test_initialization(self, cert_manager, temp_cert_dir):
-
"""Test certificate manager initialization."""
-
assert cert_manager.domain == "test.example.com"
-
assert cert_manager.email == "test@example.com"
-
assert cert_manager.cert_dir == temp_cert_dir
-
assert cert_manager.staging is True
-
assert cert_manager.challenge_port == 8080
-
-
# Check that paths are created correctly
-
assert cert_manager.account_key_path == temp_cert_dir / "account_key.pem"
-
assert cert_manager.cert_path == temp_cert_dir / "test.example.com_cert.pem"
-
assert cert_manager.key_path == temp_cert_dir / "test.example.com_key.pem"
-
assert cert_manager.fullchain_path == temp_cert_dir / "test.example.com_fullchain.pem"
-
-
def test_cert_dir_creation(self, temp_cert_dir):
-
"""Test that certificate directory is created if it doesn't exist."""
-
new_dir = temp_cert_dir / "nested" / "certs"
-
cert_manager = CertificateManager(
-
domain="test.example.com",
-
email="test@example.com",
-
cert_dir=new_dir,
-
staging=True
-
)
-
assert new_dir.exists()
-
assert new_dir.is_dir()
-
-
@patch('netdata_zulip_bot.cert_manager.x509')
-
def test_needs_renewal_no_cert(self, mock_x509, cert_manager):
-
"""Test that renewal is needed when certificate doesn't exist."""
-
assert cert_manager.needs_renewal() is True
-
-
@patch('netdata_zulip_bot.cert_manager.datetime')
-
@patch('netdata_zulip_bot.cert_manager.x509')
-
def test_needs_renewal_expired(self, mock_x509, mock_datetime, cert_manager):
-
"""Test that renewal is needed when certificate is expiring soon."""
-
from datetime import datetime, timezone, timedelta
-
-
# Create a mock certificate file
-
cert_manager.cert_path.touch()
-
-
# Mock certificate with 20 days remaining
-
mock_cert = Mock()
-
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
-
mock_cert.not_valid_after_utc = now + timedelta(days=20)
-
mock_x509.load_pem_x509_certificate.return_value = mock_cert
-
mock_datetime.now.return_value = now
-
-
assert cert_manager.needs_renewal() is True
-
-
@patch('netdata_zulip_bot.cert_manager.datetime')
-
@patch('netdata_zulip_bot.cert_manager.x509')
-
def test_needs_renewal_valid(self, mock_x509, mock_datetime, cert_manager):
-
"""Test that renewal is not needed when certificate is still valid."""
-
from datetime import datetime, timezone, timedelta
-
-
# Create a mock certificate file
-
cert_manager.cert_path.touch()
-
-
# Mock certificate with 60 days remaining
-
mock_cert = Mock()
-
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
-
mock_cert.not_valid_after_utc = now + timedelta(days=60)
-
mock_x509.load_pem_x509_certificate.return_value = mock_cert
-
mock_datetime.now.return_value = now
-
-
assert cert_manager.needs_renewal() is False
-
-
def test_generate_private_key(self, cert_manager):
-
"""Test private key generation."""
-
key = cert_manager._generate_private_key()
-
assert key is not None
-
assert key.key_size == 2048
-
-
@patch('netdata_zulip_bot.cert_manager.threading.Thread')
-
def test_challenge_server_start(self, mock_thread, cert_manager):
-
"""Test that challenge server starts correctly."""
-
cert_manager._start_challenge_server()
-
-
# Verify thread was created and started
-
mock_thread.assert_called_once()
-
mock_thread.return_value.start.assert_called_once()
-
-
def test_challenge_tokens_storage(self, cert_manager):
-
"""Test that challenge tokens are stored correctly."""
-
cert_manager.challenge_tokens["test_token"] = "test_response"
-
assert cert_manager.challenge_tokens["test_token"] == "test_response"
-
-
@patch('netdata_zulip_bot.cert_manager.client.ClientV2')
-
@patch('netdata_zulip_bot.cert_manager.client.ClientNetwork')
-
def test_obtain_certificate_mock(self, mock_network, mock_client, cert_manager):
-
"""Test certificate obtaining with mocked ACME client."""
-
# This is a simplified test that mocks the ACME interaction
-
# In production, this would interact with Let's Encrypt staging server
-
-
# Mock that certificate doesn't need renewal
-
with patch.object(cert_manager, 'needs_renewal', return_value=False):
-
paths = cert_manager.obtain_certificate()
-
assert paths == (cert_manager.cert_path, cert_manager.key_path, cert_manager.fullchain_path)
···