Netdata.cloud bot for Zulip
# Netdata Zulip Bot

*100% vibe coded, use at your peril*

A webhook service that receives notifications from Netdata Cloud and forwards them to Zulip channels. Features HTTPS with Let's Encrypt certificates and mutual TLS authentication for secure communication with Netdata Cloud.

## Features

- 🔐 **Automated SSL Certificates**: Built-in Let's Encrypt integration with automatic renewal
- 🤝 **Mutual TLS**: Secure authentication with Netdata Cloud
- 📊 **Rich Formatting**: Beautiful Zulip messages with emojis and markdown
- 🏷️ **Topic Organization**: Automatic topic routing by severity level
- 📝 **Structured Logging**: JSON-structured logs for monitoring
- ⚡ **High Performance**: FastAPI-based webhook endpoint
- 🚀 **Standalone**: No external dependencies like certbot required

## Quick Start

### 1. Install Dependencies

```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

### 2. Create Configuration

```bash
# Generate sample configuration files
netdata-zulip-bot --create-config

# Copy and customize
cp .zuliprc.sample ~/.zuliprc
```

### 3. Configure Zulip Settings

Edit `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

### 4. Set Server Environment Variables

```bash
export SERVER_DOMAIN=your-webhook-domain.com
export SERVER_PORT=8443
export SERVER_ENABLE_MTLS=true

# For automated SSL certificates (recommended)
export SERVER_AUTO_CERT=true
export SERVER_CERT_EMAIL=admin@example.com
# Set to true to use the Let's Encrypt staging server while testing (optional)
export SERVER_CERT_STAGING=false
```

### 5. Run the Service

```bash
# With automated SSL certificates
netdata-zulip-bot

# The bot will automatically:
# 1. Obtain SSL certificates from Let's Encrypt
# 2. Start the HTTPS server
# 3. Renew certificates before expiration
```

## Configuration

### Zulip Configuration

The bot supports two configuration methods:

#### Method 1: Zuliprc File (Recommended)

Create `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

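As a rough illustration of how a file in this format can be read, the sketch below parses the `[api]` section with Python's standard `configparser`. The helper name `load_zuliprc`, the list of required keys, and the returned dictionary shape are assumptions made for this example, not the bot's actual loader.

```python
import configparser
from pathlib import Path

# Keys shown in the sample zuliprc above.
REQUIRED_KEYS = ("site", "email", "key", "stream")


def load_zuliprc(path: str = "~/.zuliprc") -> dict[str, str]:
    """Read the [api] section of a zuliprc file (hypothetical helper)."""
    # Disable interpolation so a '%' in a value is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    # read() silently skips missing files, so check its return value.
    if not parser.read(Path(path).expanduser()):
        raise FileNotFoundError(f"zuliprc not found: {path}")
    if not parser.has_section("api"):
        raise ValueError("zuliprc is missing the [api] section")
    api = parser["api"]
    missing = [key for key in REQUIRED_KEYS if not api.get(key)]
    if missing:
        raise ValueError(f"zuliprc is missing keys: {', '.join(missing)}")
    return {key: api[key] for key in REQUIRED_KEYS}
```

Failing early on a missing key gives a clearer error than a rejected API call later on.
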
#### Method 2: Environment Variables

```bash
export ZULIP_SITE=https://yourorg.zulipchat.com
export ZULIP_EMAIL=netdata-bot@yourorg.zulipchat.com
export ZULIP_API_KEY=your-api-key
export ZULIP_STREAM=netdata-alerts
```

Pass the `--env-config` flag to read the configuration from these environment variables instead of the zuliprc file.

### Server Configuration

Set these environment variables:

- `SERVER_DOMAIN`: Your public domain (required)
- `SERVER_HOST`: Bind address (default: `0.0.0.0`)
- `SERVER_PORT`: HTTPS port (default: `8443`)
- `SERVER_ENABLE_MTLS`: Enable mutual TLS (default: `true`)

#### Automated SSL Configuration (Recommended)

- `SERVER_AUTO_CERT`: Enable automatic certificate management (default: `false`)
- `SERVER_CERT_EMAIL`: Email for the Let's Encrypt account (required when `SERVER_AUTO_CERT` is `true`)
- `SERVER_CERT_PATH`: Directory for storing certificates (default: `./certs`)
- `SERVER_CERT_STAGING`: Use the Let's Encrypt staging server for testing (default: `false`)
- `SERVER_ACME_PORT`: Port for the ACME HTTP-01 challenge (default: `80`)

#### Manual SSL Configuration

If you are not using automated certificates:

- `SERVER_CERT_PATH`: Path to the certificate directory
- Place `fullchain.pem` and `privkey.pem` in `{SERVER_CERT_PATH}/{SERVER_DOMAIN}/`

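The variables and defaults listed above can be gathered into one settings object. The following is a minimal sketch, assuming the documented defaults; the names `ServerSettings`, `load_server_settings`, and `_flag`, and the set of strings treated as true, are hypothetical and may differ from the bot's own code.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


def _flag(value: str) -> bool:
    # The accepted truthy spellings are an assumption for this sketch.
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ServerSettings:
    domain: str
    host: str
    port: int
    enable_mtls: bool
    auto_cert: bool
    cert_email: Optional[str]
    cert_path: Path
    cert_staging: bool
    acme_port: int

    @property
    def fullchain(self) -> Path:
        # Layout from the manual SSL section: {SERVER_CERT_PATH}/{SERVER_DOMAIN}/
        return self.cert_path / self.domain / "fullchain.pem"

    @property
    def privkey(self) -> Path:
        return self.cert_path / self.domain / "privkey.pem"


def load_server_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Build settings from environment variables, applying the documented defaults."""
    domain = env.get("SERVER_DOMAIN")
    if not domain:
        raise ValueError("SERVER_DOMAIN is required")
    auto_cert = _flag(env.get("SERVER_AUTO_CERT", "false"))
    cert_email = env.get("SERVER_CERT_EMAIL")
    if auto_cert and not cert_email:
        raise ValueError("SERVER_CERT_EMAIL is required when SERVER_AUTO_CERT is true")
    return ServerSettings(
        domain=domain,
        host=env.get("SERVER_HOST", "0.0.0.0"),
        port=int(env.get("SERVER_PORT", "8443")),
        enable_mtls=_flag(env.get("SERVER_ENABLE_MTLS", "true")),
        auto_cert=auto_cert,
        cert_email=cert_email,
        cert_path=Path(env.get("SERVER_CERT_PATH", "./certs")),
        cert_staging=_flag(env.get("SERVER_CERT_STAGING", "false")),
        acme_port=int(env.get("SERVER_ACME_PORT", "80")),
    )
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dictionary instead of the real process environment.
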
## Message Format

### Alert Notifications

Messages are posted to topics based on severity level:

- **Topic**: `critical`, `warning`, or `clear`
- **Format**: Rich markdown with alert details, timestamps, and links

Example:
```
🔴 **High CPU Usage**

**Space:** production
**Chart:** system.cpu
**Context:** cpu utilization
**Severity:** Critical
**Time:** 2024-01-15 14:30:00 UTC

**Details:** CPU usage has exceeded 90% for 5 minutes
**Summary:** Critical alert: High CPU usage detected

[View Alert](https://app.netdata.cloud/spaces/...)
```

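The routing and formatting described above can be sketched as two small functions. This is an illustration only: the function names, the dictionary keys (`name`, `space`, `chart`, `severity`, `time`, `summary`, `url`), and the emoji used for `warning` and `clear` are assumptions, and the real Netdata Cloud payload may use different field names.

```python
# Emoji for "warning" and "clear" are assumptions; only the critical one
# appears in the example message.
SEVERITY_EMOJI = {"critical": "🔴", "warning": "🟡", "clear": "🟢"}


def topic_for(severity: str) -> str:
    """Route an alert to the topic named after its severity."""
    topic = severity.strip().lower()
    if topic not in SEVERITY_EMOJI:
        raise ValueError(f"unknown severity: {severity!r}")
    return topic


def format_alert(alert: dict) -> str:
    """Render an alert as Zulip markdown; the dict keys are hypothetical."""
    emoji = SEVERITY_EMOJI[topic_for(alert["severity"])]
    lines = [
        f"{emoji} **{alert['name']}**",
        "",
        f"**Space:** {alert['space']}",
        f"**Chart:** {alert['chart']}",
        f"**Severity:** {alert['severity'].capitalize()}",
        f"**Time:** {alert['time']}",
        "",
        f"**Summary:** {alert['summary']}",
        "",
        f"[View Alert]({alert['url']})",
    ]
    return "\n".join(lines)
```

Rejecting unknown severities keeps unexpected values from silently creating new topics in the stream.
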
### Reachability Notifications

Messages are posted to the `reachability` topic:

```
❌ **Host Unreachable**

**Host:** web-server-01
**Status:** ❌ Unreachable
**Severity:** Critical

**Summary:** Host web-server-01 is no longer reachable

[View Host](https://app.netdata.cloud/...)
```

## Deployment

### Systemd Service

Create `/etc/systemd/system/netdata-zulip-bot.service`:

```ini
[Unit]
Description=Netdata Zulip Bot
After=network.target

[Service]
Type=simple
User=netdata-bot
WorkingDirectory=/opt/netdata-zulip-bot
Environment=SERVER_DOMAIN=your-domain.com
ExecStart=/opt/netdata-zulip-bot/venv/bin/netdata-zulip-bot
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start:
```bash
sudo systemctl enable netdata-zulip-bot
sudo systemctl start netdata-zulip-bot
```

### Docker

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install -e .

EXPOSE 8443

CMD ["netdata-zulip-bot"]
```

## Security

### SSL Certificate Management

The bot includes fully automated SSL certificate management:

1. **Automatic Provisioning**: Obtains certificates from Let's Encrypt on first run
2. **Automatic Renewal**: Checks daily and renews certificates 30 days before expiration
3. **Zero Downtime**: Certificate renewal happens in the background
4. **ACME HTTP-01 Challenge**: Built-in challenge server (requires port 80 access)

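The renewal rule above (renew once a certificate is within 30 days of expiring) reduces to a date comparison. A minimal sketch, assuming the certificate's expiry time is already available as a timezone-aware `datetime`; the name `needs_renewal` is hypothetical and the bot's real check may be structured differently.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Renewal window taken from the list above.
RENEWAL_WINDOW = timedelta(days=30)


def needs_renewal(not_after: datetime, now: Optional[datetime] = None) -> bool:
    """Return True when the certificate expires within the renewal window."""
    if now is None:
        now = datetime.now(timezone.utc)
    return not_after - now <= RENEWAL_WINDOW
```

A daily job could call this once per certificate and request a new one whenever it returns `True`. An already expired certificate also returns `True`, because the remaining time is negative.
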
### Mutual TLS Authentication

The service supports mutual TLS to authenticate Netdata Cloud webhooks:

1. **Server Certificate**: Automatically managed via built-in ACME client
2. **Client Verification**: Validates Netdata's client certificate
3. **CA Certificate**: Built-in Netdata CA certificate for client validation

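For readers unfamiliar with mutual TLS, the sketch below shows how a server-side context that requires a client certificate can be built with Python's standard `ssl` module. It is a generic illustration, not the bot's implementation: the function name `build_mtls_context` and its parameters are assumptions, and the file paths are optional only so that the context can be inspected without real certificates.

```python
import ssl
from typing import Optional


def build_mtls_context(
    certfile: Optional[str] = None,
    keyfile: Optional[str] = None,
    client_ca: Optional[str] = None,
) -> ssl.SSLContext:
    """Create a server TLS context that rejects clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents
    # a certificate signed by a trusted CA.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if certfile and keyfile:
        # Server identity, e.g. fullchain.pem and privkey.pem.
        ctx.load_cert_chain(certfile=certfile, keyfile=keyfile)
    if client_ca:
        # CA bundle used to validate the client certificate.
        ctx.load_verify_locations(cafile=client_ca)
    return ctx
```

In a real deployment all three paths would be supplied; without a CA bundle no client certificate can be validated and every handshake fails.
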
### Webhook Endpoint Security

- HTTPS-only communication
- Request logging and monitoring
- Payload validation and sanitization
- Error handling without information disclosure

## Monitoring

The service provides structured JSON logging for easy monitoring:

```json
{
  "timestamp": "2024-01-15T14:30:00.000Z",
  "level": "info",
  "event": "Message sent to Zulip",
  "stream": "netdata-alerts",
  "topic": "critical",
  "message_id": 12345
}
```

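Because each log record is a JSON object, log lines can be filtered with a few lines of Python. The sketch below assumes one JSON object per line and the `level` field shown above; the helper name `filter_logs` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def filter_logs(lines: Iterable[str], level: str) -> Iterator[dict]:
    """Yield parsed log records whose "level" field equals the given level."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(record, dict) and record.get("level") == level:
            yield record
```

For example, `list(filter_logs(open("bot.log"), "error"))` would collect all error records from a file named `bot.log` (a placeholder path).
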
### Health Check

```bash
curl -k https://your-domain.com:8443/health
```

Response:
```json
{
  "status": "healthy",
  "service": "netdata-zulip-bot"
}
```

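The same check can be scripted for use in a monitoring job. This sketch uses only the standard library and, like `curl -k`, skips certificate verification, so it is suitable for testing only. The function names are hypothetical, and the check assumes the response body shown above.

```python
import json
import ssl
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True if the response body reports a healthy status."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def fetch_health(url: str, timeout: float = 5.0) -> bool:
    """Request the health endpoint and evaluate the response."""
    # Equivalent of `curl -k`: do not verify the server certificate.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status == 200 and is_healthy(resp.read().decode("utf-8"))
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False
```

A call such as `fetch_health("https://your-domain.com:8443/health")` returns a boolean that a cron job or probe can act on.
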
## Development

### Running Tests

```bash
pytest
```

### Code Formatting

```bash
black .
ruff check .
```

### Local Development

For development, you can disable HTTPS and mTLS:

```bash
export SERVER_ENABLE_MTLS=false
# Use HTTP for testing (not recommended for production)
```

## Troubleshooting

### Common Issues

1. **Certificate Issues**
   - For automated certs: Ensure port 80 is accessible for ACME challenges
   - Domain must point to your server's IP address
   - Check `SERVER_CERT_EMAIL` is set for auto-cert mode
   - Use `SERVER_CERT_STAGING=true` for testing to avoid rate limits

2. **Zulip Connection Failed**
   - Verify API credentials in zuliprc
   - Test connection with Zulip's API

3. **Webhook Not Receiving Data**
   - Check firewall settings for port 8443
   - Verify domain DNS resolution
   - Check Netdata Cloud webhook configuration

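The DNS and port checks from the list above can be run from any machine with Python installed. The sketch below is a generic connectivity probe using the standard `socket` module; the helper names `resolve` and `port_open` are hypothetical and not part of the bot.

```python
import socket


def resolve(domain: str) -> list[str]:
    """Return the IP addresses a domain resolves to, or an empty list on failure."""
    try:
        infos = socket.getaddrinfo(domain, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry ends with a sockaddr tuple whose first item is the address.
    return sorted({info[4][0] for info in infos})


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run from outside your network, `resolve("your-webhook-domain.com")` should list your server's address, and `port_open` should succeed for port 8443 (and for port 80 when automated certificates are enabled).
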
### Logs

View service logs:
```bash
sudo journalctl -u netdata-zulip-bot -f
```

## License

MIT License - see LICENSE file for details.