# Netdata Zulip Bot

*100% vibe coded, use at your peril*

A webhook service that receives notifications from Netdata Cloud and forwards them to Zulip channels. Features HTTPS with Let's Encrypt certificates and mutual TLS authentication for secure communication with Netdata Cloud.

## Features

- 🔐 **HTTPS with Let's Encrypt**: Automatic SSL certificate management
- 🤝 **Mutual TLS**: Secure authentication with Netdata Cloud
- 📊 **Rich Formatting**: Beautiful Zulip messages with emojis and markdown
- 🏷️ **Topic Organization**: Automatic topic routing by severity level
- 📝 **Structured Logging**: JSON-structured logs for monitoring
- ⚡ **High Performance**: FastAPI-based webhook endpoint

## Quick Start

### 1. Install Dependencies

```bash
# Using uv (recommended)
uv sync

# Or using pip
pip install -e .
```

### 2. Create Configuration

```bash
# Generate sample configuration files
netdata-zulip-bot --create-config

# Copy and customize
cp .zuliprc.sample ~/.zuliprc
```

### 3. Configure Zulip Settings

Edit `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

### 4. Set Server Environment Variables

```bash
export SERVER_DOMAIN=your-webhook-domain.com
export SERVER_PORT=8443
export SERVER_ENABLE_MTLS=true
```

### 5. Setup SSL Certificate

```bash
# Install certbot and obtain certificate
sudo certbot certonly --standalone -d your-webhook-domain.com

# Ensure certificate files are accessible
sudo chown -R $USER:$USER /etc/letsencrypt/live/your-webhook-domain.com/
```
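
Before starting the service, it can help to confirm that the certificate files are readable by the account that will run the bot. The sketch below assumes the standard certbot layout (`fullchain.pem` and `privkey.pem` under `/etc/letsencrypt/live/<domain>/`); which files the bot actually loads is an assumption, and `missing_cert_files` is a hypothetical helper, not part of the bot.

```python
import os
from pathlib import Path

# Standard certbot file names. Whether the bot reads exactly these two
# files is an assumption made for this sketch.
REQUIRED_FILES = ("fullchain.pem", "privkey.pem")


def missing_cert_files(domain: str, cert_root: str = "/etc/letsencrypt/live") -> list:
    """Return the required certificate files that are absent or unreadable."""
    cert_dir = Path(cert_root) / domain
    problems = []
    for name in REQUIRED_FILES:
        path = cert_dir / name
        # os.path.isfile and os.access both follow symlinks and return False
        # (rather than raising) when a parent directory is not accessible.
        if not (os.path.isfile(path) and os.access(path, os.R_OK)):
            problems.append(str(path))
    return problems
```

Calling `missing_cert_files("your-webhook-domain.com")` as the service user returns an empty list when both files are readable, and otherwise lists the paths to investigate.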

### 6. Run the Service

```bash
netdata-zulip-bot
```

## Configuration

### Zulip Configuration

The bot supports two configuration methods:

#### Method 1: Zuliprc File (Recommended)

Create `~/.zuliprc`:

```ini
[api]
site=https://yourorg.zulipchat.com
email=netdata-bot@yourorg.zulipchat.com
key=your-zulip-api-key
stream=netdata-alerts
```

#### Method 2: Environment Variables

```bash
export ZULIP_SITE=https://yourorg.zulipchat.com
export ZULIP_EMAIL=netdata-bot@yourorg.zulipchat.com
export ZULIP_API_KEY=your-api-key
export ZULIP_STREAM=netdata-alerts
```

Pass the `--env-config` flag to read the settings from these environment variables instead of the zuliprc file.
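
To illustrate how the two methods relate, here is a minimal sketch that resolves the same four settings from either source using only the standard library. The key and variable names come from the examples above; `ZulipSettings` and `load_zulip_settings` are hypothetical names, and the bot's real loader may behave differently.

```python
import configparser
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional


@dataclass(frozen=True)
class ZulipSettings:
    site: str
    email: str
    api_key: str
    stream: str


def load_zulip_settings(
    use_env: bool = False,
    zuliprc: str = "~/.zuliprc",
    environ: Optional[Mapping[str, str]] = None,
) -> ZulipSettings:
    """Read settings from ZULIP_* variables or from the [api] section of a zuliprc."""
    if use_env:
        env = os.environ if environ is None else environ
        try:
            return ZulipSettings(
                site=env["ZULIP_SITE"],
                email=env["ZULIP_EMAIL"],
                api_key=env["ZULIP_API_KEY"],
                stream=env["ZULIP_STREAM"],
            )
        except KeyError as exc:
            raise ValueError(f"missing environment variable: {exc.args[0]}") from exc

    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    path = Path(zuliprc).expanduser()
    if not parser.read(path):
        raise FileNotFoundError(f"cannot read {path}")
    try:
        api = parser["api"]
        return ZulipSettings(
            site=api["site"],
            email=api["email"],
            api_key=api["key"],
            stream=api["stream"],
        )
    except KeyError as exc:
        raise ValueError(f"missing zuliprc entry: {exc.args[0]}") from exc
```

Here `use_env=True` plays the role of the `--env-config` flag; failing early on a missing key gives a clearer error than a failed API call later.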

### Server Configuration

Set these environment variables:

- `SERVER_DOMAIN`: Your public domain (required for Let's Encrypt)
- `SERVER_HOST`: Bind address (default: `0.0.0.0`)
- `SERVER_PORT`: HTTPS port (default: `8443`)
- `SERVER_CERT_PATH`: Certificate path (default: `/etc/letsencrypt/live`)
- `SERVER_ENABLE_MTLS`: Enable mutual TLS (default: `true`)
- `SERVER_CLIENT_CA_PATH`: Client CA certificate for mTLS validation
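
The variables and defaults listed above could be collected into a single settings object, as in the sketch below. `ServerSettings` and `load_server_settings` are hypothetical names, and the set of strings accepted as "true" is an assumption; the bot's own parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Strings treated as "true" for SERVER_ENABLE_MTLS (an assumption).
_TRUE_VALUES = {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class ServerSettings:
    domain: str
    host: str = "0.0.0.0"
    port: int = 8443
    cert_path: str = "/etc/letsencrypt/live"
    enable_mtls: bool = True
    client_ca_path: Optional[str] = None


def load_server_settings(environ: Optional[Mapping[str, str]] = None) -> ServerSettings:
    """Build ServerSettings from SERVER_* variables, applying the documented defaults."""
    env = os.environ if environ is None else environ
    domain = env.get("SERVER_DOMAIN")
    if not domain:
        raise ValueError("SERVER_DOMAIN is required")
    return ServerSettings(
        domain=domain,
        host=env.get("SERVER_HOST", "0.0.0.0"),
        port=int(env.get("SERVER_PORT", "8443")),
        cert_path=env.get("SERVER_CERT_PATH", "/etc/letsencrypt/live"),
        enable_mtls=env.get("SERVER_ENABLE_MTLS", "true").strip().lower() in _TRUE_VALUES,
        # An empty string is treated the same as an unset variable.
        client_ca_path=env.get("SERVER_CLIENT_CA_PATH") or None,
    )
```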

## Message Format

### Alert Notifications

Messages are posted to topics based on severity level:

- **Topic**: `critical`, `warning`, or `clear`
- **Format**: Rich markdown with alert details, timestamps, and links

Example:
```
🔴 **High CPU Usage**

**Space:** production
**Chart:** system.cpu
**Context:** cpu utilization
**Severity:** Critical
**Time:** 2024-01-15 14:30:00 UTC

**Details:** CPU usage has exceeded 90% for 5 minutes
**Summary:** Critical alert: High CPU usage detected

[View Alert](https://app.netdata.cloud/spaces/...)
```
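
The topic routing and message layout above can be sketched as follows. The `Alert` fields are hypothetical names chosen to match the example message, not the Netdata Cloud payload schema, and only the 🔴 emoji is taken from the example; the warning and clear emojis are assumptions.

```python
from dataclasses import dataclass

# Only the critical emoji appears in the example; the other two are assumed.
SEVERITY_EMOJI = {"critical": "🔴", "warning": "🟡", "clear": "🟢"}


@dataclass
class Alert:
    name: str
    space: str
    chart: str
    context: str
    severity: str
    time: str
    details: str
    summary: str
    url: str


def topic_for(alert: Alert) -> str:
    """Map an alert to its Zulip topic: the lower-cased severity."""
    severity = alert.severity.lower()
    if severity not in SEVERITY_EMOJI:
        raise ValueError(f"unknown severity: {alert.severity!r}")
    return severity


def format_alert(alert: Alert) -> str:
    """Render the Zulip markdown body for an alert."""
    topic = topic_for(alert)
    lines = [
        f"{SEVERITY_EMOJI[topic]} **{alert.name}**",
        "",
        f"**Space:** {alert.space}",
        f"**Chart:** {alert.chart}",
        f"**Context:** {alert.context}",
        f"**Severity:** {alert.severity.capitalize()}",
        f"**Time:** {alert.time}",
        "",
        f"**Details:** {alert.details}",
        f"**Summary:** {alert.summary}",
        "",
        f"[View Alert]({alert.url})",
    ]
    return "\n".join(lines)
```

Rejecting unknown severities keeps unexpected payloads from silently creating new topics; a real service might instead route them to a fallback topic.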

### Reachability Notifications

Messages are posted to the `reachability` topic:

```
❌ **Host Unreachable**

**Host:** web-server-01
**Status:** ❌ Unreachable
**Severity:** Critical

**Summary:** Host web-server-01 is no longer reachable

[View Host](https://app.netdata.cloud/...)
```
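
A reachability message with this layout might be built as in the sketch below. The function name and parameters are hypothetical, and the wording and ✅ emoji used for the reachable case are assumptions, since the example only shows an unreachable host.

```python
REACHABILITY_TOPIC = "reachability"


def format_reachability(host: str, reachable: bool, severity: str, summary: str, url: str) -> tuple:
    """Return (topic, markdown body) for a host reachability change."""
    # The reachable-case emoji and labels are assumed, not taken from the bot.
    icon = "✅" if reachable else "❌"
    state = "Reachable" if reachable else "Unreachable"
    lines = [
        f"{icon} **Host {state}**",
        "",
        f"**Host:** {host}",
        f"**Status:** {icon} {state}",
        f"**Severity:** {severity}",
        "",
        f"**Summary:** {summary}",
        "",
        f"[View Host]({url})",
    ]
    return REACHABILITY_TOPIC, "\n".join(lines)
```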

## Deployment

### Systemd Service

Create `/etc/systemd/system/netdata-zulip-bot.service`:

```ini
[Unit]
Description=Netdata Zulip Bot
After=network.target

[Service]
Type=simple
User=netdata-bot
WorkingDirectory=/opt/netdata-zulip-bot
Environment=SERVER_DOMAIN=your-domain.com
ExecStart=/opt/netdata-zulip-bot/venv/bin/netdata-zulip-bot
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

Enable and start:
```bash
sudo systemctl enable netdata-zulip-bot
sudo systemctl start netdata-zulip-bot
```

### Docker

```dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install -e .

EXPOSE 8443

CMD ["netdata-zulip-bot"]
```

## Security

### Mutual TLS Authentication

The service supports mutual TLS to authenticate Netdata Cloud webhooks:

1. **Server Certificate**: Automatically managed by Let's Encrypt
2. **Client Verification**: Validates Netdata's client certificate
3. **CA Certificate**: Configure `SERVER_CLIENT_CA_PATH` to validate client certs
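
The three pieces above map onto Python's standard `ssl` module roughly as shown below. This is a sketch of the general technique, not the bot's actual code: `build_server_context` is a hypothetical helper, and the `fullchain.pem` and `privkey.pem` file names assume the usual certbot layout.

```python
import ssl
from pathlib import Path
from typing import Optional


def build_server_context(
    cert_dir: Path,
    client_ca: Optional[Path] = None,
    enable_mtls: bool = True,
) -> ssl.SSLContext:
    """Create a server-side TLS context, optionally requiring client certificates."""
    if enable_mtls and client_ca is None:
        raise ValueError("a client CA certificate is required when mTLS is enabled")

    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_2

    # 1. Server certificate: the chain and key issued by Let's Encrypt.
    context.load_cert_chain(
        certfile=str(cert_dir / "fullchain.pem"),
        keyfile=str(cert_dir / "privkey.pem"),
    )

    if enable_mtls:
        # 2. Client verification: reject handshakes without a valid client cert.
        context.verify_mode = ssl.CERT_REQUIRED
        # 3. CA certificate: trust only client certs signed by this CA.
        context.load_verify_locations(cafile=str(client_ca))

    return context
```

With `CERT_REQUIRED`, a client that presents no certificate, or one not signed by the configured CA, fails during the TLS handshake before any request reaches the webhook handler.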

### Webhook Endpoint Security

- HTTPS-only communication
- Request logging and monitoring
- Payload validation and sanitization
- Error handling without information disclosure

## Monitoring

The service provides structured JSON logging for easy monitoring:

```json
{
  "timestamp": "2024-01-15T14:30:00.000Z",
  "level": "info",
  "event": "Message sent to Zulip",
  "stream": "netdata-alerts",
  "topic": "critical",
  "message_id": 12345
}
```
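
Log lines of this shape can be produced with the standard `logging` module, as in the sketch below. Which logging library the bot uses is not stated here, so `JsonFormatter` and the `fields` convention for extra keys are assumptions made for illustration.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": created.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
            "level": record.levelname.lower(),
            "event": record.getMessage(),
        }
        # Extra key/value pairs are passed via logger.info(..., extra={"fields": {...}}).
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def make_logger(name: str = "netdata-zulip-bot") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `make_logger().info("Message sent to Zulip", extra={"fields": {"stream": "netdata-alerts", "topic": "critical", "message_id": 12345}})` then emits a single JSON line with the same keys as the example above.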

### Health Check

```bash
curl -k https://your-domain.com:8443/health
```

Response:
```json
{
  "status": "healthy",
  "service": "netdata-zulip-bot"
}
```
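
For scripted monitoring, the same check can be done from Python with the standard library. This is a minimal sketch: `fetch_health` and `is_healthy` are hypothetical helpers, and `verify_tls=False` mirrors `curl -k`, which should only be used for testing.

```python
import json
import ssl
import urllib.request


def is_healthy(payload: dict) -> bool:
    """Interpret the health response shown above."""
    return payload.get("status") == "healthy"


def fetch_health(url: str, verify_tls: bool = True, timeout: float = 5.0) -> dict:
    """GET the health endpoint and return the decoded JSON body."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Equivalent of `curl -k`: skip certificate and hostname checks.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(url, timeout=timeout, context=context) as response:
        return json.load(response)
```

For example, `is_healthy(fetch_health("https://your-domain.com:8443/health"))` returns `True` when the service reports `"status": "healthy"`. If the server requires a client certificate on every path, the context would also need one loaded via `load_cert_chain`.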

## Development

### Running Tests

```bash
pytest
```

### Code Formatting

```bash
black .
ruff check .
```

### Local Development

For development, you can disable HTTPS and mTLS:

```bash
export SERVER_ENABLE_MTLS=false
# Use HTTP for testing (not recommended for production)
```

## Troubleshooting

### Common Issues

1. **Certificate Not Found**
   - Ensure Let's Encrypt certificates exist at `/etc/letsencrypt/live/your-domain.com/`
   - Check file permissions

2. **Zulip Connection Failed**
   - Verify API credentials in zuliprc
   - Test connection with Zulip's API

3. **Webhook Not Receiving Data**
   - Check firewall settings for port 8443
   - Verify domain DNS resolution
   - Check Netdata Cloud webhook configuration
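
For the "Test connection with Zulip's API" step, one option is to call Zulip's `GET /api/v1/users/me` endpoint with HTTP basic authentication (bot email plus API key). The sketch below uses only the standard library; `build_profile_request` and `check_credentials` are hypothetical helper names.

```python
import base64
import json
import urllib.request


def build_profile_request(site: str, email: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated request for the bot's own profile."""
    token = base64.b64encode(f"{email}:{api_key}".encode()).decode()
    return urllib.request.Request(
        f"{site.rstrip('/')}/api/v1/users/me",
        headers={"Authorization": f"Basic {token}"},
    )


def check_credentials(site: str, email: str, api_key: str, timeout: float = 10.0) -> bool:
    """Return True if Zulip accepts the credentials.

    Rejected credentials make urlopen raise urllib.error.HTTPError,
    which is left to propagate so the status code stays visible.
    """
    request = build_profile_request(site, email, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response).get("result") == "success"
```

Running `check_credentials(...)` with the `site`, `email`, and `key` values from the zuliprc separates credential problems from webhook or network problems.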

### Logs

View service logs:
```bash
sudo journalctl -u netdata-zulip-bot -f
```

## License

MIT License - see LICENSE file for details.