Netdata.cloud bot for Zulip

Netdata Zulip Bot - Development Instructions

This repository implements a production-ready Zulip bot that receives webhook notifications from Netdata Cloud and posts them to Zulip topics.

Core Requirements

Technology Stack

  • Language: Python with uv package manager
  • Framework: Python zulip_bots PyPI package for Zulip integration
  • Web Server: FastAPI for webhook endpoint
  • Reverse Proxy: Caddy for HTTPS and mutual TLS handling
  • Deployment: Standalone service behind reverse proxy

Netdata Integration

  • Webhook Format: Follow the Netdata Cloud webhook notification format from:
    • URL: https://raw.githubusercontent.com/netdata/netdata/refs/heads/master/integrations/cloud-notifications/metadata.yaml
    • Section: id: notify-cloud-webhook
  • Notification Types: Handle both alert notifications and reachability notifications

Zulip Configuration

  • Configuration Source: The .zuliprc file should contain:
    • Zulip site URL
    • Bot email and API key
    • Target stream/channel for posting messages
  • Topic Organization: Messages should be posted to topics based on severity:
    • critical, warning, clear for alerts
    • reachability for host status changes
  • Message Format: Rich markdown with:
    • Alert details and timestamps
    • Markdown-formatted alert URLs for easy access to Netdata Cloud
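
The topic and message rules above can be sketched roughly as follows. This is an illustration under assumptions, not the bot's actual code: the function names `topic_for` and `format_alert`, their parameters, and the exact markdown layout are hypothetical, and rejecting unknown severities is only one possible policy.

```python
from datetime import datetime, timezone

# Topic names are taken from the list above.
ALERT_TOPICS = {"critical", "warning", "clear"}
REACHABILITY_TOPIC = "reachability"


def topic_for(severity: str, is_reachability: bool) -> str:
    """Pick the Zulip topic for a notification."""
    if is_reachability:
        return REACHABILITY_TOPIC
    topic = severity.strip().lower()
    if topic not in ALERT_TOPICS:
        # Assumption: unknown severities are rejected instead of being rerouted.
        raise ValueError(f"unsupported severity: {severity!r}")
    return topic


def format_alert(name: str, severity: str, when: datetime, info: str, url: str) -> str:
    """Render an alert as markdown with a timestamp and a link to Netdata Cloud."""
    timestamp = when.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    return "\n".join([
        f"**{name}** is **{severity.upper()}**",
        f"*Time:* {timestamp}",
        f"*Details:* {info}",
        f"[View in Netdata Cloud]({url})",
    ])
```

A caller would pass the severity and alert URL taken from the webhook payload and post the returned string to the topic returned by `topic_for`.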

Security Requirements

  • HTTP Only: The bot service itself listens on plain HTTP; TLS is terminated at the reverse proxy
  • Reverse Proxy: Caddy handles HTTPS with Let's Encrypt certificates
  • Mutual TLS: Caddy validates Netdata's client certificates
    • Client certificate validation at the reverse proxy level
    • Netdata CA certificate configured in Caddyfile

Service Architecture

  • Backend Service: FastAPI bot listening on HTTP (port 8080)
  • Reverse Proxy: Caddy handling HTTPS, Let's Encrypt, and mutual TLS
  • Webhook Endpoint: /webhook/netdata for receiving notifications
  • Health Check: /health endpoint for monitoring
  • Structured Logging: JSON-structured logs for production monitoring
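
One way to get JSON-structured logs using only the standard library is a custom `logging.Formatter`. The sketch below is a minimal example, not the project's logging setup: the logger name `netdata_zulip_bot` and the chosen field names are assumptions.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the traceback in the log entry, not in the HTTP response.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach the JSON formatter to a stdout handler on the bot's logger."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("netdata_zulip_bot")  # hypothetical logger name
    logger.handlers = [handler]
    logger.setLevel(level)
    logger.propagate = False
    return logger
```

Writing one JSON object per line keeps the output easy to ingest with journald, Docker log drivers, or a log shipper.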

Implementation Notes

Configuration Management

  • Support both .zuliprc file and environment variables
  • Provide sample configuration files with --create-config flag
  • Server configuration via environment variables:
    • SERVER_HOST: Bind address (default: 0.0.0.0)
    • SERVER_PORT: HTTP port (default: 8080)
  • Reverse proxy configuration in Caddyfile
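
A minimal sketch of how these settings could be loaded is shown below. `SERVER_HOST`, `SERVER_PORT`, and their defaults come from the list above, and the `[api]` section with `site`, `email`, and `key` is the usual `.zuliprc` layout. Everything else is an assumption for illustration: the `ZULIP_*` variable names, the `stream` key, and the rule that environment variables override file values.

```python
import configparser
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerConfig:
    host: str
    port: int


def load_server_config(env=None) -> ServerConfig:
    """Read the bind address and port from the environment, with defaults."""
    env = os.environ if env is None else env
    return ServerConfig(
        host=env.get("SERVER_HOST", "0.0.0.0"),
        port=int(env.get("SERVER_PORT", "8080")),
    )


def load_zulip_config(zuliprc_text: str, env=None) -> dict:
    """Merge .zuliprc contents with environment overrides (assumed precedence)."""
    env = os.environ if env is None else env
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(zuliprc_text)
    api = parser["api"] if parser.has_section("api") else {}
    # Hypothetical environment variable names, one per setting.
    overrides = {
        "site": "ZULIP_SITE",
        "email": "ZULIP_EMAIL",
        "key": "ZULIP_API_KEY",
        "stream": "ZULIP_STREAM",
    }
    config = {name: env.get(var) or api.get(name) for name, var in overrides.items()}
    missing = sorted(name for name, value in config.items() if not value)
    if missing:
        raise ValueError(f"missing Zulip settings: {', '.join(missing)}")
    return config
```

Failing fast on missing settings at startup avoids discovering a broken configuration only when the first webhook arrives.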

Message Processing

  1. Receive Netdata webhook POST request
  2. Validate and parse JSON payload
  3. Determine notification type (alert vs reachability)
  4. Format message with appropriate emoji and markdown
  5. Send to configured Zulip stream/topic
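
The five steps above can be sketched as a small pipeline. This is an illustration only: the payload field names (`alert`, `severity`, `message`, `host`, `status.reachable`) are assumptions that should be checked against the `notify-cloud-webhook` section of the Netdata metadata file, and `send` is a hypothetical callable standing in for the Zulip client.

```python
import json
from typing import Callable

# Emoji per alert severity (an illustrative choice).
EMOJI = {"critical": "🔴", "warning": "⚠️", "clear": "✅"}


class InvalidPayload(ValueError):
    """Raised when a webhook body is not a usable notification."""


def parse_payload(raw: bytes) -> dict:
    """Step 2: validate and parse the JSON body."""
    try:
        payload = json.loads(raw)
    except ValueError as exc:  # covers JSONDecodeError and UnicodeDecodeError
        raise InvalidPayload("body is not valid JSON") from exc
    if not isinstance(payload, dict):
        raise InvalidPayload("payload must be a JSON object")
    return payload


def classify(payload: dict) -> str:
    """Step 3: decide between alert and reachability (assumed field names)."""
    if isinstance(payload.get("status"), dict):
        return "reachability"
    if "alert" in payload and "severity" in payload:
        return "alert"
    raise InvalidPayload("unrecognised notification type")


def process_webhook(raw: bytes, send: Callable[[str, str], None]) -> str:
    """Steps 2 to 5: parse, classify, format, and send. Returns the topic used."""
    payload = parse_payload(raw)
    if classify(payload) == "reachability":
        topic = "reachability"
        state = "reachable" if payload["status"].get("reachable") else "unreachable"
        body = f"📡 **{payload.get('host', 'unknown host')}** is {state}"
    else:
        topic = str(payload["severity"]).lower()
        if topic not in EMOJI:
            raise InvalidPayload(f"unsupported severity: {topic}")
        body = f"{EMOJI[topic]} **{payload['alert']}**: {payload.get('message', '')}"
    send(topic, body)
    return topic
```

Passing `send` in as a parameter keeps the pipeline testable without a live Zulip server; in the webhook handler, an `InvalidPayload` would typically map to a 4xx response.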

Error Handling

  • Validate all incoming payloads
  • Log errors server-side without exposing internal details in HTTP responses
  • Return appropriate HTTP status codes
  • Implement retry logic for Zulip API failures
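
Retry logic for Zulip API failures could look like the sketch below, which uses exponential backoff. The helper name `retry`, the default attempt count, and the delays are assumptions, not values from the project. Note that the `zulip` Python client commonly reports failures through the `result` field of the returned dictionary rather than by raising, so the wrapped call may need to raise when that field is not `"success"`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call` until it succeeds or `attempts` tries have failed.

    The wait doubles after each failure: base_delay, 2 * base_delay, ...
    The last exception is re-raised once the attempts are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter exists so that tests can substitute a recorder for `time.sleep` and run without real delays.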

Testing

Run tests with:

uv run python -m pytest tests/ -v

Deployment

The service should be deployable via:

  • Systemd service (see examples/netdata-zulip-bot.service)
  • Docker container (see Dockerfile and docker-compose.yml)
  • Automated setup script (scripts/setup.sh)
  • Caddy reverse proxy configuration (Caddyfile)

Reverse Proxy Setup

  1. Install Caddy on your server
  2. Update Caddyfile with your domain name
  3. Place the Netdata CA certificate in netdata-ca.pem
  4. Start both the bot service and Caddy

Development Commands

  • Install dependencies: uv sync
  • Create config samples: uv run netdata-zulip-bot --create-config
  • Run tests: uv run python -m pytest tests/
  • Start service: uv run netdata-zulip-bot

Important Reminders

  • Always validate Netdata webhook payloads before processing
  • Ensure the Caddy reverse proxy is properly configured before production deployment
  • Test mutual TLS authentication with actual Netdata Cloud webhooks
  • Monitor service logs for webhook processing errors
  • Keep Zulip API credentials secure and never commit them to the repository
  • Update the Netdata CA certificate in netdata-ca.pem as needed