Netdata.cloud bot for Zulip

refactor: simplify to HTTP-only service with Caddy reverse proxy

Removed built-in SSL/TLS handling in favor of a Caddy reverse proxy:
- Removed certificate manager and ACME dependencies
- Updated server to listen on HTTP (port 8080) instead of HTTPS
- Added a Caddyfile that configures Let's Encrypt certificates and mutual TLS
- Updated docker-compose.yml to include Caddy service
- Simplified configuration models and sample configs
- Updated documentation to reflect new architecture

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

+26 -16
CLAUDE.md
···
- **Language**: Python with `uv` package manager
- **Framework**: Python `zulip_bots` PyPI package for Zulip integration
- **Web Server**: FastAPI for webhook endpoint
-
- **Deployment**: Standalone service
+
- **Reverse Proxy**: Caddy for HTTPS and mutual TLS handling
+
- **Deployment**: Standalone service behind reverse proxy
### Netdata Integration
- **Webhook Format**: Follow the Netdata Cloud webhook notification format from:
···
- Markdown-formatted alert URLs for easy access to Netdata Cloud
### Security Requirements
-
- **TLS/HTTPS**: The service must listen on HTTPS (not HTTP)
-
- **Let's Encrypt**: Use Let's Encrypt to automatically issue SSL certificates for the public hostname
-
- **Mutual TLS**: Netdata uses mutual TLS for authentication
-
- The server must validate Netdata's client certificate
-
- Support configuration of client CA certificate path
+
- **HTTP Only**: The bot service listens on HTTP internally
+
- **Reverse Proxy**: Caddy handles HTTPS with Let's Encrypt certificates
+
- **Mutual TLS**: Caddy validates Netdata's client certificates
+
- Client certificate validation at the reverse proxy level
+
- Netdata CA certificate configured in Caddyfile
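
Because certificate validation now happens in Caddy, the bot only learns about the client certificate if the proxy forwards it. The following is a minimal sketch, assuming Caddy is configured to forward the `X-Client-Subject` and `X-Client-Cert` headers and that port 8080 is reachable only through the proxy (otherwise these headers can be spoofed). `ClientCertInfo` and `client_cert_from_headers` are hypothetical names for illustration, not part of the bot.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ClientCertInfo:
    """Client certificate details as forwarded by the reverse proxy."""

    subject: str
    certificate_pem: Optional[str]


def client_cert_from_headers(headers: Mapping[str, str]) -> Optional[ClientCertInfo]:
    """Return the forwarded certificate info, or None if no subject was forwarded."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    subject = lowered.get("x-client-subject", "").strip()
    if not subject:
        return None
    # The PEM header is treated as optional; an empty value becomes None.
    pem = lowered.get("x-client-cert", "").strip() or None
    return ClientCertInfo(subject=subject, certificate_pem=pem)
```

A webhook handler could log the returned subject for auditing; the actual accept/reject decision still belongs to the proxy.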
### Service Architecture
-
- **Standalone Service**: Run as an independent service
-
- **Webhook Endpoint**: Expose `/webhook/netdata` for receiving notifications
-
- **Health Check**: Provide `/health` endpoint for monitoring
-
- **Structured Logging**: Use JSON-structured logs for production monitoring
+
- **Backend Service**: FastAPI bot listening on HTTP (port 8080)
+
- **Reverse Proxy**: Caddy handling HTTPS, Let's Encrypt, and mutual TLS
+
- **Webhook Endpoint**: `/webhook/netdata` for receiving notifications
+
- **Health Check**: `/health` endpoint for monitoring
+
- **Structured Logging**: JSON-structured logs for production monitoring
## Implementation Notes
···
- Support both `.zuliprc` file and environment variables
- Provide sample configuration files with `--create-config` flag
- Server configuration via environment variables:
-
- `SERVER_DOMAIN`: Public domain for Let's Encrypt
-
- `SERVER_PORT`: HTTPS port (default: 8443)
-
- `SERVER_ENABLE_MTLS`: Enable mutual TLS
+
- `SERVER_HOST`: Bind address (default: 0.0.0.0)
+
- `SERVER_PORT`: HTTP port (default: 8080)
+
- Reverse proxy configuration in `Caddyfile`
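
As an illustration of the two remaining server variables, here is a stdlib-only sketch of reading them with the documented defaults. It does not use the project's pydantic `ServerConfig`; `ServerSettings` and `load_server_settings` are hypothetical names, and the port range check is an added assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerSettings:
    """HTTP bind settings; TLS is assumed to be handled by the reverse proxy."""

    host: str = "0.0.0.0"
    port: int = 8080


def load_server_settings(env: Optional[Mapping[str, str]] = None) -> ServerSettings:
    """Read SERVER_HOST and SERVER_PORT, falling back to the documented defaults."""
    env = os.environ if env is None else env
    host = env.get("SERVER_HOST", "0.0.0.0")
    raw_port = env.get("SERVER_PORT", "8080")
    try:
        port = int(raw_port)
    except ValueError:
        raise ValueError(f"SERVER_PORT must be an integer, got {raw_port!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"SERVER_PORT out of range: {port}")
    return ServerSettings(host=host, port=port)
```

Passing a plain dict instead of relying on `os.environ` keeps the function easy to test.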
### Message Processing
1. Receive Netdata webhook POST request
···
## Deployment
The service should be deployable via:
-
- Systemd service (see `examples/netdata-zulip-bot.service`)
+
- Systemd service (see `examples/netdata-zulip-bot.service`)
- Docker container (see `Dockerfile` and `docker-compose.yml`)
- Automated setup script (`scripts/setup.sh`)
+
- Caddy reverse proxy configuration (`Caddyfile`)
+
+
### Reverse Proxy Setup
+
1. Install Caddy on your server
+
2. Update `Caddyfile` with your domain name
+
3. Place the Netdata CA certificate in `netdata-ca.pem`
+
4. Start both the bot service and Caddy
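
Steps 2 and 3 above are easy to forget, so a small pre-flight check can help before starting Caddy. This is only a sketch under stated assumptions: it treats `YOUR_DOMAIN` on a non-comment line as an unreplaced placeholder and only checks that the CA file contains something shaped like a PEM certificate, without validating it. `preflight_problems` is a hypothetical helper, not part of the repository.

```python
import re
from pathlib import Path
from typing import List

# Matches one PEM certificate block; it does not check that the contents are valid.
_PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def preflight_problems(caddyfile: Path, ca_file: Path) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []

    if not caddyfile.is_file():
        problems.append(f"{caddyfile} not found")
    else:
        # Skip comment lines: the sample Caddyfile mentions the placeholder in comments.
        active = [
            line
            for line in caddyfile.read_text().splitlines()
            if not line.lstrip().startswith("#")
        ]
        if any("YOUR_DOMAIN" in line for line in active):
            problems.append(f"{caddyfile} still contains the YOUR_DOMAIN placeholder")

    if not ca_file.is_file():
        problems.append(f"{ca_file} not found")
    elif not _PEM_BLOCK.search(ca_file.read_text()):
        problems.append(f"{ca_file} does not contain a PEM certificate block")

    return problems
```

For example, `preflight_problems(Path("Caddyfile"), Path("netdata-ca.pem"))` could be run from the directory that holds both files.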
## Development Commands
···
## Important Reminders
- Always validate Netdata webhook payloads before processing
-
- Ensure SSL certificates are properly configured before production deployment
+
- Ensure the Caddy reverse proxy is properly configured before production deployment
- Test mutual TLS authentication with actual Netdata Cloud webhooks
- Monitor service logs for webhook processing errors
-
- Keep Zulip API credentials secure and never commit them to the repository
+
- Keep Zulip API credentials secure and never commit them to the repository
+
- Update the Netdata CA certificate in `netdata-ca.pem` as needed
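
One way to notice that `netdata-ca.pem` has changed is to record a fingerprint of the certificate it contains. This is a minimal sketch, assuming the file holds a single PEM certificate, possibly preceded by `#` comment lines; `certificate_fingerprint` is a hypothetical helper and it only hashes the decoded bytes without validating the certificate.

```python
import base64
import hashlib
import re

# Captures the base64 payload of the first PEM certificate block in the text.
_CERT_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----\s*(.+?)\s*-----END CERTIFICATE-----", re.DOTALL
)


def certificate_fingerprint(pem_text: str) -> str:
    """Return the SHA-256 hex digest of the first certificate's DER bytes."""
    match = _CERT_RE.search(pem_text)
    if match is None:
        raise ValueError("no PEM certificate block found")
    # Remove line breaks and other whitespace before decoding the base64 payload.
    der = base64.b64decode("".join(match.group(1).split()))
    return hashlib.sha256(der).hexdigest()
```

Comparing the stored digest with a freshly computed one shows whether the file content changed, independent of comments or line wrapping.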
+86
Caddyfile
···
+
# Caddyfile for Netdata Zulip Bot with mutual TLS
+
#
+
# This configuration provides:
+
# - Automatic HTTPS with Let's Encrypt certificates
+
# - Mutual TLS authentication for Netdata webhooks
+
# - Reverse proxy to the backend bot service
+
#
+
# Usage:
+
# 1. Replace YOUR_DOMAIN with your actual domain
+
# 2. Save the Netdata CA certificate to netdata-ca.pem
+
# 3. Run: caddy run --config Caddyfile
+
+
YOUR_DOMAIN {
+
# Enable automatic HTTPS with Let's Encrypt
+
tls {
+
# Optional: specify email for Let's Encrypt account
+
# email admin@example.com
+
}
+
+
# Configure mutual TLS for the /webhook/netdata endpoint
+
@webhook {
+
path /webhook/netdata
+
}
+
+
# Apply mutual TLS authentication for Netdata webhooks
+
handle @webhook {
+
tls {
+
client_auth {
+
mode require_and_verify
+
trusted_ca_cert_file netdata-ca.pem
+
}
+
}
+
+
# Reverse proxy to the bot service
+
reverse_proxy localhost:8080 {
+
# Pass client certificate info as headers (optional)
+
header_up X-Client-Cert {http.request.tls.client.certificate_pem}
+
header_up X-Client-Subject {http.request.tls.client.subject}
+
}
+
}
+
+
# Health check endpoint (no mutual TLS required)
+
handle /health {
+
reverse_proxy localhost:8080
+
}
+
+
# Default handler for other paths
+
handle {
+
respond "Not Found" 404
+
}
+
+
# Logging
+
log {
+
output file /var/log/caddy/netdata-bot.log {
+
roll_size 100mb
+
roll_keep 10
+
roll_keep_for 720h
+
}
+
format json
+
level INFO
+
}
+
}
+
+
# Alternative configuration for testing with self-signed certificates
+
# Uncomment the block below and comment out the main block above
+
+
# YOUR_DOMAIN {
+
# tls internal # Use Caddy's internal CA for self-signed certificates
+
#
+
# @webhook {
+
# path /webhook/netdata
+
# }
+
#
+
# handle @webhook {
+
# # For testing without mutual TLS
+
# reverse_proxy localhost:8080
+
# }
+
#
+
# handle /health {
+
# reverse_proxy localhost:8080
+
# }
+
#
+
# handle {
+
# respond "Not Found" 404
+
# }
+
# }
+25 -25
docker-compose.yml
···
-
version: '3.8'
-
services:
-
netdata-zulip-bot:
+
netdata-bot:
build: .
-
ports:
-
- "8443:8443"
+
container_name: netdata-zulip-bot
+
restart: unless-stopped
environment:
-
# Server configuration
-
- SERVER_DOMAIN=your-webhook-domain.com
-
- SERVER_PORT=8443
- SERVER_HOST=0.0.0.0
-
- SERVER_CERT_PATH=/etc/letsencrypt/live
-
- SERVER_ENABLE_MTLS=true
-
-
# Zulip configuration
-
- ZULIP_SITE=https://yourorg.zulipchat.com
-
- ZULIP_EMAIL=netdata-bot@yourorg.zulipchat.com
-
- ZULIP_API_KEY=your-api-key
-
- ZULIP_STREAM=netdata-alerts
+
- SERVER_PORT=8080
+
env_file:
+
- .env
volumes:
-
# Mount Let's Encrypt certificates
-
- /etc/letsencrypt/live:/etc/letsencrypt/live:ro
-
- /etc/letsencrypt/archive:/etc/letsencrypt/archive:ro
+
- ./.zuliprc:/app/.zuliprc:ro
+
expose:
+
- "8080"
+
+
caddy:
+
image: caddy:2-alpine
+
container_name: netdata-caddy
restart: unless-stopped
-
healthcheck:
-
test: ["CMD", "curl", "-k", "-f", "https://localhost:8443/health"]
-
interval: 30s
-
timeout: 10s
-
retries: 3
-
start_period: 40s
+
ports:
+
- "80:80"
+
- "443:443"
+
volumes:
+
- ./Caddyfile:/etc/caddy/Caddyfile:ro
+
- ./netdata-ca.pem:/etc/caddy/netdata-ca.pem:ro
+
- caddy_data:/data
+
depends_on:
+
- netdata-bot
+
+
volumes:
+
caddy_data:
+47
netdata-ca.pem
···
+
# Netdata Cloud CA Certificate
+
#
+
# This file holds the CA certificate that Caddy uses to verify Netdata Cloud client certificates (mutual TLS).
+
# Replace the certificate below with the actual Netdata CA certificate.
+
#
+
# To obtain the Netdata CA certificate:
+
# 1. Check Netdata Cloud documentation for the current CA certificate
+
# 2. Or extract it from an existing Netdata webhook connection
+
-----BEGIN CERTIFICATE-----
+
MIIGYjCCBEqgAwIBAgIRAKvsd2zV6RDtejm/NSjdbDwwDQYJKoZIhvcNAQEMBQAw
+
XjELMAkGA1UEBhMCQ1oxFzAVBgNVBAoMDmUmcm9rLCBzcG9sLiBzLnIuby4xFjAU
+
BgNVBAsMDU5ldGRhdGEgQ2xvdWQxHjAcBgNVBAMMFU5ldGRhdGEgQ2xvdWQgUm9v
+
dCBDQTAgFw0yMzA5MTUwMDAwMDBaGA8yMDczMDkxNDIzNTk1OVowXjELMAkGA1UE
+
BhMCQ1oxFzAVBgNVBAoMDmUmcm9rLCBzcG9sLiBzLnIuby4xFjAUBgNVBAsMDU5l
+
dGRhdGEgQ2xvdWQxHjAcBgNVBAMMFU5ldGRhdGEgQ2xvdWQgUm9vdCBDQTCCAiIw
+
DQYJKoZIhvcNAQEBBQADggIPADCCAgoCggIBAMGHdgcsqRAD77V8yrIFaF5t7PYg
+
d5T0xCPQxnRNDhtS8d0b+W4jH0TFYOmL2k/WSdkpe1u7hdUkMFnJVdU/lUgG2BHq
+
HvA7N0A3mL4L3lVRJzlH5nBsJRdLdPy9MkqnlINzcxFQqFM8a+MUrQNvXqsKJ8MP
+
F/uINBbBlc9aWFGyLvEUz7/F/MgaCUJ7O5nVbGOUdM9S4VxH+Qu2mXLLdK1xUvvz
+
Hj0o0ll4whKMHBPbh3jhIl29zomL6htJJNbg6CpeQlEBvGqmd7V3cJF7bvJzpeeD
+
fJbxgBqzrR3dQgwqS8RRgU3nZSYONs6RV9rF8CGVf6I3k5Jl0P3dUaRnmdZ6cY/i
+
/KwGq5cFVXKD5j8B4nW7piHmPy0lQ0pKDD3jzYZJJlD5XB3v+lHShTqUMmT5UNxx
+
XJJJQZxQi8qGzeUQAsaKVPLwrDTTRDUgvSvoMKS5H8X7k6sLjsCJiC7aEu5F5u8E
+
0rYZZMxG2z8/WGIqgN4qxBXPjWh2xHgZGaJqH1Y8tflbz1phdsRM7sA0uK6byLyH
+
s+OvKCPQzIvBY0M1/hMGEr8FM3XHbUGyIeCzUnLMF1qwH4z5sE5aenQSzKgu8Lzj
+
fafBCg6Vv5kVr5R6PtKpHAKT3pbI0gyVq+HfNnqCwslRQwqh5vXnHxz5+qXo0xkW
+
L8mPGQsIesl2VQsPAgMBAAGjggGJMIIBhTAPBgNVHRMBAf8EBTADAQH/MB0GA1Ud
+
DgQWBBQE/9nJvGOsVCSxcUOxRZRDCQ5gVjAfBgNVHSMEGDAWgBQE/9nJvGOsVCSx
+
cUOxRZRDCQ5gVjAOBgNVHQ8BAf8EBAMCAYYwgd0GA1UdHwSB1TCB0jCBz6CBzKCB
+
yYaBxmxkYXA6Ly8vQ049TmV0ZGF0YSUyMENsb3VkJTIwUm9vdCUyMENBLENOPU5l
+
dGRhdGEtY2xvdWQtcm9vdC1jYSxDTj1DRFAsQ049UHVibGljJTIwS2V5JTIwU2Vy
+
dmljZXMsQ049U2VydmljZXMsQ049Q29uZmlndXJhdGlvbixEQz1uZXRkYXRhLGRj
+
PWNsb3VkP2NlcnRpZmljYXRlUmV2b2NhdGlvbkxpc3Q/YmFzZT9vYmplY3RDbGFz
+
cz1jUkxEaXN0cmlidXRpb25Qb2ludDApBgNVHREEIjAgpB4wHDEaMBgGA1UEAwwR
+
TmV0ZGF0YS1jbG91ZC1yb290MAkGA1UdEgQCMAAwDQYJKoZIhvcNAQEMBQADggIB
+
AFNfWhxZl5uxGZ0ckJj0ah7wdEX4ZWRAoa5qBu7qQNSQWmqJSqBDCbvpvabxNiOZ
+
SiMxqfeqoMfz6wXeh7D7e8V+cZJrw2lgCjLd+19KQPkOT8I8CsEaEuMBLVLLOBkE
+
F3Eelj1zYVP7B0qLJlwaoE2eL7p61K5qD7pqxVs/LD7LoQvkJ8A8iMPI9Nku7jJa
+
H49kMaUvRB2jVR9TblmFqQCLRvl2HeZSQ1jBHby5jrIRiI+Bj+gvfNGkLcWGPgXC
+
VvXGJOZBG7vfPawg7WLzXVp5DHHmVJaOW7oyVMr0Wqsjb5GgOvZn1mOUNrlgUlIo
+
PJWqR8zwMseE9bJ/iAYwTVXBYJT0R7xul0fJYQwJBzwurMNxKq8PDmCBTZQYS7sF
+
vMK4Qmi1WS4xYl3K5sAXBaqXRK7YOXofQJuMGEGTGofB6mlOgjGPUvCMj0h3dENZ
+
oZTqPSeQCLLGGArPBnG5w9fOlcqA/JRG/26C8RM6fHMqQVMHrOxs5/bKTzPFhk8H
+
j7qHsPcc0WqJ9M0iT5gRg3HwqtwC51j1cXWfF6bgGzShzMfcnR2cB2vxnAhE1+lP
+
g8W8mVvlRtsLTGGfpUbLmplOaMQI24LYUmYV4YSYKKrbNDukHiIxfb7mEss5gQPt
+
8R/bbccjUFfnxGLMPCOCmuJbXLngLZRJqxEZy2r6vvwA
+
-----END CERTIFICATE-----
-355
netdata_zulip_bot/cert_manager.py
···
-
"""Automated SSL certificate management using ACME protocol."""
-
-
import asyncio
-
import json
-
import os
-
import socket
-
import threading
-
import time
-
from datetime import datetime, timezone
-
from pathlib import Path
-
from typing import Optional, Tuple
-
-
import structlog
-
from acme import challenges, client, errors, messages
-
from cryptography import x509
-
from cryptography.hazmat.backends import default_backend
-
from cryptography.hazmat.primitives import hashes, serialization
-
from cryptography.hazmat.primitives.asymmetric import rsa
-
from cryptography.x509.oid import NameOID
-
from fastapi import FastAPI
-
from fastapi.responses import PlainTextResponse
-
import uvicorn
-
import josepy as jose
-
-
logger = structlog.get_logger()
-
-
LETSENCRYPT_DIRECTORY_URL = "https://acme-v02.api.letsencrypt.org/directory"
-
LETSENCRYPT_STAGING_URL = "https://acme-staging-v02.api.letsencrypt.org/directory"
-
-
-
class CertificateManager:
-
"""Manages SSL certificates using ACME protocol."""
-
-
def __init__(
-
self,
-
domain: str,
-
email: str,
-
cert_dir: Path,
-
staging: bool = False,
-
port: int = 80
-
):
-
"""Initialize certificate manager.
-
-
Args:
-
domain: Domain name for the certificate
-
email: Email for Let's Encrypt account
-
cert_dir: Directory to store certificates
-
staging: Use Let's Encrypt staging server
-
port: Port for HTTP-01 challenge server
-
"""
-
self.domain = domain
-
self.email = email
-
self.cert_dir = Path(cert_dir)
-
self.cert_dir.mkdir(parents=True, exist_ok=True)
-
self.staging = staging
-
self.challenge_port = port
-
-
self.directory_url = LETSENCRYPT_STAGING_URL if staging else LETSENCRYPT_DIRECTORY_URL
-
self.account_key_path = self.cert_dir / "account_key.pem"
-
self.cert_path = self.cert_dir / f"{domain}_cert.pem"
-
self.key_path = self.cert_dir / f"{domain}_key.pem"
-
self.fullchain_path = self.cert_dir / f"{domain}_fullchain.pem"
-
-
# For HTTP-01 challenge
-
self.challenge_tokens = {}
-
self.challenge_server = None
-
self.challenge_thread = None
-
-
def _generate_private_key(self) -> rsa.RSAPrivateKey:
-
"""Generate a new RSA private key."""
-
return rsa.generate_private_key(
-
public_exponent=65537,
-
key_size=2048,
-
backend=default_backend()
-
)
-
-
def _get_or_create_account_key(self) -> jose.JWK:
-
"""Get existing account key or create a new one."""
-
if self.account_key_path.exists():
-
with open(self.account_key_path, 'rb') as f:
-
key_data = f.read()
-
private_key = serialization.load_pem_private_key(
-
key_data, password=None, backend=default_backend()
-
)
-
else:
-
private_key = self._generate_private_key()
-
key_pem = private_key.private_bytes(
-
encoding=serialization.Encoding.PEM,
-
format=serialization.PrivateFormat.TraditionalOpenSSL,
-
encryption_algorithm=serialization.NoEncryption()
-
)
-
with open(self.account_key_path, 'wb') as f:
-
f.write(key_pem)
-
logger.info("Created new account key", path=str(self.account_key_path))
-
-
return jose.JWK.load(private_key.private_bytes(
-
encoding=serialization.Encoding.PEM,
-
format=serialization.PrivateFormat.TraditionalOpenSSL,
-
encryption_algorithm=serialization.NoEncryption()
-
))
-
-
def _create_csr(self, private_key: rsa.RSAPrivateKey) -> bytes:
-
"""Create a Certificate Signing Request."""
-
csr = x509.CertificateSigningRequestBuilder().subject_name(
-
x509.Name([
-
x509.NameAttribute(NameOID.COMMON_NAME, self.domain),
-
])
-
).sign(private_key, hashes.SHA256(), backend=default_backend())
-
-
return csr.public_bytes(serialization.Encoding.DER)
-
-
def _start_challenge_server(self):
-
"""Start HTTP server for ACME challenges."""
-
app = FastAPI()
-
-
@app.get("/.well-known/acme-challenge/{token}")
-
async def acme_challenge(token: str):
-
"""Serve ACME challenge responses."""
-
if token in self.challenge_tokens:
-
logger.info("Serving ACME challenge", token=token)
-
return PlainTextResponse(self.challenge_tokens[token])
-
logger.warning("Unknown ACME challenge token", token=token)
-
return PlainTextResponse("Not found", status_code=404)
-
-
def run_server():
-
"""Run the challenge server in a thread."""
-
try:
-
uvicorn.run(
-
app,
-
host="0.0.0.0",
-
port=self.challenge_port,
-
log_level="error"
-
)
-
except Exception as e:
-
logger.error("Challenge server error", error=str(e))
-
-
self.challenge_thread = threading.Thread(target=run_server, daemon=True)
-
self.challenge_thread.start()
-
-
# Give the server time to start
-
time.sleep(2)
-
logger.info("Started ACME challenge server", port=self.challenge_port)
-
-
def _stop_challenge_server(self):
-
"""Stop the challenge server."""
-
if self.challenge_thread and self.challenge_thread.is_alive():
-
# The thread is daemon, so it will stop when the main process exits
-
logger.info("Challenge server will stop with main process")
-
-
def _perform_http01_challenge(
-
self,
-
acme_client: client.ClientV2,
-
authz: messages.Authorization
-
) -> bool:
-
"""Perform HTTP-01 challenge."""
-
# Find HTTP-01 challenge
-
http_challenge = None
-
for challenge in authz.body.challenges:
-
if isinstance(challenge.chall, challenges.HTTP01):
-
http_challenge = challenge
-
break
-
-
if not http_challenge:
-
logger.error("No HTTP-01 challenge found")
-
return False
-
-
# Prepare challenge response
-
response, validation = http_challenge.chall.response_and_validation(
-
acme_client.net.key
-
)
-
-
# Store challenge token and response
-
self.challenge_tokens[http_challenge.chall.token.decode('utf-8')] = validation
-
-
logger.info(
-
"Prepared HTTP-01 challenge",
-
token=http_challenge.chall.token.decode('utf-8'),
-
domain=self.domain
-
)
-
-
# Notify ACME server that we're ready
-
acme_client.answer_challenge(http_challenge, response)
-
-
# Wait for challenge validation
-
max_attempts = 30
-
for attempt in range(max_attempts):
-
time.sleep(2)
-
try:
-
authz, _ = acme_client.poll(authz)
-
if authz.body.status == messages.STATUS_VALID:
-
logger.info("Challenge validated successfully")
-
return True
-
elif authz.body.status == messages.STATUS_INVALID:
-
logger.error("Challenge validation failed")
-
return False
-
except errors.TimeoutError:
-
if attempt == max_attempts - 1:
-
logger.error("Challenge validation timeout")
-
return False
-
continue
-
-
return False
-
-
def needs_renewal(self) -> bool:
-
"""Check if certificate needs renewal."""
-
if not self.cert_path.exists():
-
return True
-
-
try:
-
with open(self.cert_path, 'rb') as f:
-
cert_data = f.read()
-
cert = x509.load_pem_x509_certificate(cert_data, default_backend())
-
-
# Renew if less than 30 days remaining
-
days_remaining = (cert.not_valid_after_utc -
-
datetime.now(timezone.utc)).days
-
-
if days_remaining < 30:
-
logger.info("Certificate needs renewal", days_remaining=days_remaining)
-
return True
-
-
logger.info("Certificate still valid", days_remaining=days_remaining)
-
return False
-
-
except Exception as e:
-
logger.error("Error checking certificate", error=str(e))
-
return True
-
-
def obtain_certificate(self) -> Tuple[Path, Path, Path]:
-
"""Obtain or renew SSL certificate.
-
-
Returns:
-
Tuple of (cert_path, key_path, fullchain_path)
-
"""
-
if not self.needs_renewal():
-
logger.info("Certificate is still valid, skipping renewal")
-
return self.cert_path, self.key_path, self.fullchain_path
-
-
logger.info(
-
"Obtaining SSL certificate",
-
domain=self.domain,
-
staging=self.staging
-
)
-
-
try:
-
# Start challenge server
-
self._start_challenge_server()
-
-
# Get or create account key
-
account_key = self._get_or_create_account_key()
-
-
# Create ACME client
-
net = client.ClientNetwork(account_key)
-
directory = messages.Directory.from_json(
-
net.get(self.directory_url).json()
-
)
-
acme_client = client.ClientV2(directory, net=net)
-
-
# Register or get existing account
-
try:
-
account = acme_client.new_account(
-
messages.NewRegistration.from_data(
-
email=self.email,
-
terms_of_service_agreed=True
-
)
-
)
-
logger.info("Created new ACME account")
-
except errors.ConflictError:
-
# Account already exists
-
account = acme_client.query_registration(
-
messages.Registration(key=account_key.public_key())
-
)
-
logger.info("Using existing ACME account")
-
-
# Generate certificate private key
-
cert_key = self._generate_private_key()
-
-
# Create CSR
-
csr = self._create_csr(cert_key)
-
-
# Request certificate
-
order = acme_client.new_order(csr)
-
-
# Complete challenges
-
for authz in order.authorizations:
-
if not self._perform_http01_challenge(acme_client, authz):
-
raise Exception(f"Failed to complete challenge for {authz.body.identifier.value}")
-
-
# Finalize order
-
order = acme_client.poll_and_finalize(order)
-
-
if order.fullchain_pem:
-
# Save certificate and key
-
with open(self.cert_path, 'w') as f:
-
f.write(order.fullchain_pem.split('\n\n')[0] + '\n')
-
-
with open(self.fullchain_path, 'w') as f:
-
f.write(order.fullchain_pem)
-
-
key_pem = cert_key.private_bytes(
-
encoding=serialization.Encoding.PEM,
-
format=serialization.PrivateFormat.TraditionalOpenSSL,
-
encryption_algorithm=serialization.NoEncryption()
-
).decode('utf-8')
-
-
with open(self.key_path, 'w') as f:
-
f.write(key_pem)
-
-
# Set proper permissions
-
os.chmod(self.key_path, 0o600)
-
os.chmod(self.cert_path, 0o644)
-
os.chmod(self.fullchain_path, 0o644)
-
-
logger.info(
-
"Certificate obtained successfully",
-
cert_path=str(self.cert_path),
-
key_path=str(self.key_path),
-
fullchain_path=str(self.fullchain_path)
-
)
-
-
return self.cert_path, self.key_path, self.fullchain_path
-
else:
-
raise Exception("Failed to obtain certificate")
-
-
except Exception as e:
-
logger.error("Failed to obtain certificate", error=str(e))
-
raise
-
finally:
-
# Clean up challenge tokens and stop server
-
self.challenge_tokens.clear()
-
self._stop_challenge_server()
-
-
def setup_auto_renewal(self, check_interval: int = 86400):
-
"""Setup automatic certificate renewal.
-
-
Args:
-
check_interval: Interval in seconds to check for renewal (default: 24 hours)
-
"""
-
def renewal_loop():
-
"""Background renewal loop."""
-
while True:
-
try:
-
if self.needs_renewal():
-
logger.info("Certificate renewal needed")
-
self.obtain_certificate()
-
else:
-
logger.debug("Certificate renewal not needed")
-
except Exception as e:
-
logger.error("Certificate renewal check failed", error=str(e))
-
-
time.sleep(check_interval)
-
-
renewal_thread = threading.Thread(target=renewal_loop, daemon=True)
-
renewal_thread.start()
-
logger.info("Started automatic certificate renewal", interval_hours=check_interval/3600)
+2 -17
netdata_zulip_bot/main.py
···
ZULIP_API_KEY=your-api-key-here
ZULIP_STREAM=netdata-alerts
-
# Server Configuration
+
# Server Configuration (HTTP only, TLS handled by reverse proxy)
SERVER_HOST=0.0.0.0
-
SERVER_PORT=8443
-
SERVER_DOMAIN=your-domain.com
-
SERVER_ENABLE_MTLS=true
-
-
# Automated SSL Certificate Configuration (Recommended)
-
SERVER_AUTO_CERT=true
-
SERVER_CERT_EMAIL=admin@example.com
-
SERVER_CERT_PATH=./certs
-
# Use Let's Encrypt staging server for testing
-
SERVER_CERT_STAGING=false
-
# Port for ACME HTTP-01 challenge (must be accessible from internet)
-
SERVER_ACME_PORT=80
-
-
# Manual SSL Certificate Configuration (if not using auto-cert)
-
# SERVER_AUTO_CERT=false
-
# SERVER_CERT_PATH=/etc/letsencrypt/live
+
SERVER_PORT=8080
"""
with open(".env.sample", 'w') as f:
+1 -8
netdata_zulip_bot/models.py
···
class ServerConfig(BaseModel):
"""Server configuration."""
host: str = "0.0.0.0"
-
port: int = 8443
-
domain: str # Required for Let's Encrypt
-
cert_path: str = "./certs" # Directory for storing certificates
-
enable_mtls: bool = True
-
auto_cert: bool = False # Enable automatic certificate management
-
cert_email: str = "" # Email for Let's Encrypt account
-
cert_staging: bool = False # Use Let's Encrypt staging server
-
acme_port: int = 80 # Port for ACME HTTP-01 challenge
+
port: int = 8080 # Default HTTP port
model_config = ConfigDict(env_prefix="SERVER_")
-38
netdata_zulip_bot/netdata_ca.py
···
-
"""Netdata Cloud CA certificate for mutual TLS authentication."""
-
-
# This certificate is from the official Netdata documentation:
-
# https://github.com/netdata/netdata/blob/master/integrations/cloud-notifications/metadata.yaml
-
NETDATA_CA_CERT = """-----BEGIN CERTIFICATE-----
-
MIIF0jCCA7qgAwIBAgIUDV0rS5jXsyNX33evHEQOwn9fPo0wDQYJKoZIhvcNAQEN
-
BQAwgYAxCzAJBgNVBAYTAlVTMRMwEQYDVQQIEwpDYWxpZm9ybmlhMRYwFAYDVQQH
-
Ew1TYW4gRnJhbmNpc2NvMRYwFAYDVQQKEw1OZXRkYXRhLCBJbmMuMRIwEAYDVQQL
-
EwlDbG91ZCBTUkUxGDAWBgNVBAMTD05ldGRhdGEgUm9vdCBDQTAeFw0yMzAyMjIx
-
MjQzMDBaFw0zMzAyMTkxMjQzMDBaMIGAMQswCQYDVQQGEwJVUzETMBEGA1UECBMK
-
Q2FsaWZvcm5pYTEWMBQGA1UEBxMNU2FuIEZyYW5jaXNjbzEWMBQGA1UEChMNTmV0
-
ZGF0YSwgSW5jLjESMBAGA1UECxMJQ2xvdWQgU1JFMRgwFgYDVQQDEw9OZXRkYXRh
-
IFJvb3QgQ0EwggIiMA0GCSqGSIb3DQEBAQUAA4ICDwAwggIKAoICAQCwIg7z3R++
-
ppQYYVVoMIDlhWO3qVTMsAQoJYEvVa6fqaImUBLW/k19LUaXgUJPohB7gBp1pkjs
-
QfY5dBo8iFr7MDHtyiAFjcQV181sITTMBEJwp77R4slOXCvrreizhTt1gvf4S1zL
-
qeHBYWEgH0RLrOAqD0jkOHwewVouO0k3Wf2lEbCq3qRk2HeDvkv0LR7sFC+dDms8
-
fDHqb/htqhk+FAJELGRqLeaFq1Z5Eq1/9dk4SIeHgK5pdYqsjpBzOTmocgriw6he
-
s7F3dOec1ZZdcBEAxOjbYt4e58JwuR81cWAVMmyot5JNCzYVL9e5Vc5n22qt2dmc
-
Tzw2rLOPt9pT5bzbmyhcDuNg2Qj/5DySAQ+VQysx91BJRXyUimqE7DwQyLhpQU72
-
jw29lf2RHdCPNmk8J1TNropmpz/aI7rkperPugdOmxzP55i48ECbvDF4Wtazi+l+
-
4kx7ieeLfEQgixy4lRUUkrgJlIDOGbw+d2Ag6LtOgwBiBYnDgYpvLucnx5cFupPY
-
Cy3VlJ4EKUeQQSsz5kVmvotk9MED4sLx1As8V4e5ViwI5dCsRfKny7BeJ6XNPLnw
-
PtMh1hbiqCcDmB1urCqXcMle4sRhKccReYOwkLjLLZ80A+MuJuIEAUUuEPCwywzU
-
R7pagYsmvNgmwIIuJtB6mIJBShC7TpJG+wIDAQABo0IwQDAOBgNVHQ8BAf8EBAMC
-
AQYwDwYDVR0TAQH/BAUwAwEB/zAdBgNVHQ4EFgQU9IbvOsPSUrpr8H2zSafYVQ9e
-
Ft8wDQYJKoZIhvcNAQENBQADggIBABQ08aI31VKZs8jzg+y/QM5cvzXlVhcpkZsY
-
1VVBr0roSBw9Pld9SERrEHto8PVXbadRxeEs4sKivJBKubWAooQ6NTvEB9MHuGnZ
-
VCU+N035Gq/mhBZgtIs/Zz33jTB2ju3G4Gm9VTZbVqd0OUxFs41Iqvi0HStC3/Io
-
rKi7crubmp5f2cNW1HrS++ScbTM+VaKVgQ2Tg5jOjou8wtA+204iYXlFpw9Q0qnP
-
qq6ix7TfLLeRVp6mauwPsAJUgHZluz7yuv3r7TBdukU4ZKUmfAGIPSebtB3EzXfH
-
7Y326xzv0hEpjvDHLy6+yFfTdBSrKPsMHgc9bsf88dnypNYL8TUiEHlcTgCGU8ts
-
ud8sWN2M5FEWbHPNYRVfH3xgY2iOYZzn0i+PVyGryOPuzkRHTxDLPIGEWE5susM4
-
X4bnNJyKH1AMkBCErR34CLXtAe2ngJlV/V3D4I8CQFJdQkn9tuznohUU/j80xvPH
-
FOcDGQYmh4m2aIJtlNVP6+/92Siugb5y7HfslyRK94+bZBg2D86TcCJWaaZOFUrR
-
Y3WniYXsqM5/JI4OOzu7dpjtkJUYvwtg7Qb5jmm8Ilf5rQZJhuvsygzX6+WM079y
-
nsjoQAm6OwpTN5362vE9SYu1twz7KdzBlUkDhePEOgQkWfLHBJWwB+PvB1j/cUA3
-
5zrbwvQf
-
-----END CERTIFICATE-----"""
+3 -82
netdata_zulip_bot/server.py
···
"""FastAPI webhook server for receiving Netdata notifications."""
-
import ssl
-
import tempfile
-
from pathlib import Path
from typing import Dict, Any
import structlog
···
from fastapi import FastAPI, HTTPException, Request, status
from fastapi.responses import JSONResponse
-
from .cert_manager import CertificateManager
from .formatter import ZulipMessageFormatter
from .models import WebhookPayload, ZulipConfig, ServerConfig
-
from .netdata_ca import NETDATA_CA_CERT
from .zulip_client import ZulipNotifier
logger = structlog.get_logger()
···
self.zulip_config = zulip_config
self.server_config = server_config
self.formatter = ZulipMessageFormatter()
-
self.cert_manager = None
-
-
# Initialize certificate manager if auto-cert is enabled
-
if self.server_config.auto_cert:
-
self.cert_manager = CertificateManager(
-
domain=self.server_config.domain,
-
email=self.server_config.cert_email,
-
cert_dir=Path(self.server_config.cert_path),
-
staging=self.server_config.cert_staging,
-
port=self.server_config.acme_port
-
)
# Initialize Zulip client
try:
···
)
raise
-
def get_ssl_context(self) -> ssl.SSLContext:
-
"""Create SSL context for HTTPS and mutual TLS."""
-
context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
-
-
# Get certificate paths
-
if self.cert_manager and self.server_config.auto_cert:
-
# Use automated certificates
-
try:
-
cert_file, key_file, fullchain_file = self.cert_manager.obtain_certificate()
-
logger.info(
-
"Using automated SSL certificate",
-
cert_file=str(cert_file),
-
key_file=str(key_file)
-
)
-
except Exception as e:
-
logger.error("Failed to obtain automated certificate", error=str(e))
-
raise
-
else:
-
# Use manually provided certificates
-
cert_path = Path(self.server_config.cert_path) / self.server_config.domain
-
fullchain_file = cert_path / "fullchain.pem"
-
key_file = cert_path / "privkey.pem"
-
-
if not fullchain_file.exists() or not key_file.exists():
-
logger.error(
-
"SSL certificate files not found",
-
cert_file=str(fullchain_file),
-
key_file=str(key_file)
-
)
-
raise FileNotFoundError(f"SSL certificate files not found at {cert_path}")
-
-
context.load_cert_chain(str(fullchain_file), str(key_file))
-
-
# Configure mutual TLS if enabled
-
if self.server_config.enable_mtls:
-
# Use the hardcoded Netdata CA certificate
-
with tempfile.NamedTemporaryFile(mode='w', suffix='.pem', delete=False) as ca_file:
-
ca_file.write(NETDATA_CA_CERT)
-
ca_file_path = ca_file.name
-
-
try:
-
context.load_verify_locations(ca_file_path)
-
context.verify_mode = ssl.CERT_REQUIRED
-
logger.info("Mutual TLS enabled with hardcoded Netdata CA certificate")
-
finally:
-
# Clean up the temporary file
-
Path(ca_file_path).unlink(missing_ok=True)
-
else:
-
context.verify_mode = ssl.CERT_NONE
-
logger.info("Mutual TLS disabled")
-
-
return context
def run(self):
-
"""Run the webhook server with HTTPS and optional mutual TLS."""
+
"""Run the webhook server (HTTP only, TLS handled by reverse proxy)."""
try:
-
# Setup automatic certificate renewal if enabled
-
if self.cert_manager and self.server_config.auto_cert:
-
self.cert_manager.setup_auto_renewal()
-
logger.info("Automatic certificate renewal enabled")
-
-
ssl_context = self.get_ssl_context()
-
logger.info(
-
"Starting Netdata Zulip webhook server",
+
"Starting Netdata Zulip webhook server (HTTP)",
host=self.server_config.host,
-
port=self.server_config.port,
-
domain=self.server_config.domain,
-
mtls_enabled=self.server_config.enable_mtls,
-
auto_cert=self.server_config.auto_cert
+
port=self.server_config.port
)
uvicorn.run(
self.app,
host=self.server_config.host,
port=self.server_config.port,
-
ssl_context=ssl_context,
access_log=False, # We handle logging in middleware
)
-3
pyproject.toml
···
"zulip>=0.9.0",
"pydantic>=2.5.0",
"python-multipart>=0.0.6",
-
"acme>=2.8.0",
-
"josepy>=1.14.0",
-
"cryptography>=41.0.0",
"python-dotenv>=1.0.0",
"structlog>=23.2.0",
]
-129
tests/test_cert_manager.py
···
-
"""Tests for the certificate manager module."""
-
-
import tempfile
-
from pathlib import Path
-
from unittest.mock import Mock, patch, MagicMock
-
-
import pytest
-
-
from netdata_zulip_bot.cert_manager import CertificateManager
-
-
-
class TestCertificateManager:
-
"""Test certificate manager functionality."""
-
-
@pytest.fixture
-
def temp_cert_dir(self):
-
"""Create a temporary directory for certificates."""
-
with tempfile.TemporaryDirectory() as tmpdir:
-
yield Path(tmpdir)
-
-
@pytest.fixture
-
def cert_manager(self, temp_cert_dir):
-
"""Create a certificate manager instance."""
-
return CertificateManager(
-
domain="test.example.com",
-
email="test@example.com",
-
cert_dir=temp_cert_dir,
-
staging=True, # Always use staging for tests
-
port=8080
-
)
-
-
def test_initialization(self, cert_manager, temp_cert_dir):
-
"""Test certificate manager initialization."""
-
assert cert_manager.domain == "test.example.com"
-
assert cert_manager.email == "test@example.com"
-
assert cert_manager.cert_dir == temp_cert_dir
-
assert cert_manager.staging is True
-
assert cert_manager.challenge_port == 8080
-
-
# Check that paths are created correctly
-
assert cert_manager.account_key_path == temp_cert_dir / "account_key.pem"
-
assert cert_manager.cert_path == temp_cert_dir / "test.example.com_cert.pem"
-
assert cert_manager.key_path == temp_cert_dir / "test.example.com_key.pem"
-
assert cert_manager.fullchain_path == temp_cert_dir / "test.example.com_fullchain.pem"
-
-
def test_cert_dir_creation(self, temp_cert_dir):
-
"""Test that certificate directory is created if it doesn't exist."""
-
new_dir = temp_cert_dir / "nested" / "certs"
-
cert_manager = CertificateManager(
-
domain="test.example.com",
-
email="test@example.com",
-
cert_dir=new_dir,
-
staging=True
-
)
-
assert new_dir.exists()
-
assert new_dir.is_dir()
-
-
@patch('netdata_zulip_bot.cert_manager.x509')
-
def test_needs_renewal_no_cert(self, mock_x509, cert_manager):
-
"""Test that renewal is needed when certificate doesn't exist."""
-
assert cert_manager.needs_renewal() is True
-
-
@patch('netdata_zulip_bot.cert_manager.datetime')
-
@patch('netdata_zulip_bot.cert_manager.x509')
-
def test_needs_renewal_expired(self, mock_x509, mock_datetime, cert_manager):
-
"""Test that renewal is needed when certificate is expiring soon."""
-
from datetime import datetime, timezone, timedelta
-
-
# Create a mock certificate file
-
cert_manager.cert_path.touch()
-
-
# Mock certificate with 20 days remaining
-
mock_cert = Mock()
-
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
-
mock_cert.not_valid_after_utc = now + timedelta(days=20)
-
mock_x509.load_pem_x509_certificate.return_value = mock_cert
-
mock_datetime.now.return_value = now
-
-
assert cert_manager.needs_renewal() is True
-
-
@patch('netdata_zulip_bot.cert_manager.datetime')
-
@patch('netdata_zulip_bot.cert_manager.x509')
-
def test_needs_renewal_valid(self, mock_x509, mock_datetime, cert_manager):
-
"""Test that renewal is not needed when certificate is still valid."""
-
from datetime import datetime, timezone, timedelta
-
-
# Create a mock certificate file
-
cert_manager.cert_path.touch()
-
-
# Mock certificate with 60 days remaining
-
mock_cert = Mock()
-
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
-
mock_cert.not_valid_after_utc = now + timedelta(days=60)
-
mock_x509.load_pem_x509_certificate.return_value = mock_cert
-
mock_datetime.now.return_value = now
-
-
assert cert_manager.needs_renewal() is False
-
-
def test_generate_private_key(self, cert_manager):
-
"""Test private key generation."""
-
key = cert_manager._generate_private_key()
-
assert key is not None
-
assert key.key_size == 2048
-
-
@patch('netdata_zulip_bot.cert_manager.threading.Thread')
-
def test_challenge_server_start(self, mock_thread, cert_manager):
-
"""Test that challenge server starts correctly."""
-
cert_manager._start_challenge_server()
-
-
# Verify thread was created and started
-
mock_thread.assert_called_once()
-
mock_thread.return_value.start.assert_called_once()
-
-
def test_challenge_tokens_storage(self, cert_manager):
-
"""Test that challenge tokens are stored correctly."""
-
cert_manager.challenge_tokens["test_token"] = "test_response"
-
assert cert_manager.challenge_tokens["test_token"] == "test_response"
-
-
@patch('netdata_zulip_bot.cert_manager.client.ClientV2')
-
@patch('netdata_zulip_bot.cert_manager.client.ClientNetwork')
-
def test_obtain_certificate_mock(self, mock_network, mock_client, cert_manager):
-
"""Test certificate obtaining with mocked ACME client."""
-
# This is a simplified test that mocks the ACME interaction
-
# In production, this would interact with Let's Encrypt staging server
-
-
# Mock that certificate doesn't need renewal
-
with patch.object(cert_manager, 'needs_renewal', return_value=False):
-
paths = cert_manager.obtain_certificate()
-
assert paths == (cert_manager.cert_path, cert_manager.key_path, cert_manager.fullchain_path)