# cozy-ucosm


## gateway

- tailscale (exit node enabled)
  -> allow ipv4 and ipv6 forwarding (sysctl sketch below)
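
  the forwarding bit, roughly as the tailscale exit-node docs do it (file name per their guide):

  ```bash
  # persist ip forwarding so the exit node survives reboots
  echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
  echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
  sudo sysctl -p /etc/sysctl.d/99-tailscale.conf
  ```
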
- caddy

  ```bash
  apt install golang
  go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest
  ~/go/bin/xcaddy build \
    --with github.com/caddyserver/cache-handler \
    --with github.com/darkweak/storages/badger/caddy \
    --with github.com/mholt/caddy-ratelimit
  # then https://caddyserver.com/docs/running#manual-installation

  mkdir /var/cache/caddy-badger
  chown -R caddy:caddy /var/cache/caddy-badger/
  ```
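
  roughly what that manual-installation page covers (a sketch -- user/group flags are from the caddy docs, the binary path assumes you're still in the build dir):

  ```bash
  sudo mv caddy /usr/bin/
  sudo groupadd --system caddy
  sudo useradd --system --gid caddy --create-home \
    --home-dir /var/lib/caddy --shell /usr/sbin/nologin caddy
  # then drop in the stock caddy.service from github.com/caddyserver/dist and:
  sudo systemctl daemon-reload
  sudo systemctl enable --now caddy
  ```
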

- `/etc/caddy/Caddyfile`

  ```
  {
    cache {
      badger
      api {
        prometheus
      }
    }
  }

  links.bsky.bad-example.com {
    reverse_proxy link-aggregator:6789

    @browser expression `{header.User-Agent}.startsWith("Mozilla/5.0")`
    rate_limit {
      zone global_burst {
        key {remote_host}
        events 10
        window 1s
      }
      zone global_general {
        key {remote_host}
        events 100
        window 60s
        log_key true
      }
      zone website_harsh_limit {
        key {header.Origin}
        match {
          expression {header.User-Agent}.startsWith("Mozilla/5.0")
        }
        events 1000
        window 30s
        log_key true
      }
    }
    respond /souin-api/metrics "denied" 403 # does not work
    cache {
      ttl 3s
      stale 1h
      default_cache_control public, s-maxage=3
      badger {
        path /var/cache/caddy-badger/links
      }
    }
  }

  gateway:80 {
    metrics
    cache
  }
  ```

  well... the gateway fell over IMMEDIATELY with that ^^ config, at like 2 req/s of deletion traffic. for now i removed everything except the reverse proxy config + normal caddy metrics and it's running fine on vanilla caddy. i did try reducing the rate-limiting configs to a single, fixed-key global limit but it still ate all the ram and died. maybe badger w/ the cache config was still a problem. maybe it would have been ok on a machine with more than 1GB of mem.


  alternative proxies:

  - nginx. i should probably just use this. acme-client is a piece of cake to set up, and i know how to configure it.
  - haproxy. also kind of familiar, it's old and stable. no idea how it handles low-mem (our 1gb) vs nginx.
  - sozu. popular rust thing, fast. doesn't seem to have rate-limiting or cache features?
  - rpxy. like caddy (auto-tls) but in rust and actually fast? has an "experimental" cache feature, but the cache feature looks good.
  - rama. build-your-own proxy. not sure it has both cache and limiter in its standard features?
  - pingora. build-your-own cloudflare, so, like, probably stable. has tools for cache and limiting. low-mem...?
    - cache stuff in pingora seems a little... hit and miss (byeeeee). only a test impl of Storage for the main cache feature?
    - but the rate-limiter has a guide: https://github.com/cloudflare/pingora/blob/main/docs/user_guide/rate_limiter.md

  what i want is a low-resource reverse proxy with built-in rate-limiting and caching. but maybe the cache (and/or rate-limiting) could live outside the reverse proxy:
  - varnish is a dedicated cache. has https://github.com/varnish/varnish-modules/blob/master/src/vmod_vsthrottle.vcc (sketch below)
  - apache traffic server has experimental rate-limiting plugins

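  a minimal vsthrottle sketch (VCL; the limits here are made up, the `is_denied(key, limit, period, block)` signature is from the vmod docs):

  ```vcl
  import vsthrottle;

  sub vcl_recv {
    # at most 100 requests per 60s per client ip, then block for 60s
    if (vsthrottle.is_denied(client.identity, 100, 60s, 60s)) {
      return (synth(429, "Too Many Requests"));
    }
  }
  ```
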

- victoriametrics

  ```bash
  curl -LO https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.109.1/victoria-metrics-linux-amd64-v1.109.1.tar.gz
  tar xzf victoria-metrics-linux-amd64-v1.109.1.tar.gz
  # and then https://docs.victoriametrics.com/quick-start/#starting-vm-single-from-a-binary
  sudo mkdir /etc/victoria-metrics && sudo chown -R victoriametrics:victoriametrics /etc/victoria-metrics
  ```

- `/etc/victoria-metrics/prometheus.yml`

  note: prometheus-style scrape targets are `host:port` only (paths like `gateway:80/metrics` aren't valid targets), so the path goes in `metrics_path`:

  ```yaml
  global:
    scrape_interval: '15s'

  scrape_configs:
    - job_name: 'link_aggregator'
      static_configs:
        - targets: ['link-aggregator:8765']
    - job_name: 'gateway:caddy'
      metrics_path: /metrics
      static_configs:
        - targets: ['gateway:80']
    - job_name: 'gateway:cache'
      metrics_path: /souin-api/metrics
      static_configs:
        - targets: ['gateway:80']
  ```

- `ExecStart` in `/etc/systemd/system/victoriametrics.service`:

  ```
  ExecStart=/usr/local/bin/victoria-metrics-prod -storageDataPath=/var/lib/victoria-metrics -retentionPeriod=90d -selfScrapeInterval=1m -promscrape.config=/etc/victoria-metrics/prometheus.yml
  ```
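
  the rest of the unit, for reference (a sketch -- user and dirs assumed to match the quick-start doc):

  ```ini
  [Unit]
  Description=VictoriaMetrics single-node
  Wants=network-online.target
  After=network-online.target

  [Service]
  User=victoriametrics
  Group=victoriametrics
  ExecStart=/usr/local/bin/victoria-metrics-prod -storageDataPath=/var/lib/victoria-metrics -retentionPeriod=90d -selfScrapeInterval=1m -promscrape.config=/etc/victoria-metrics/prometheus.yml
  Restart=always
  RestartSec=3

  [Install]
  WantedBy=multi-user.target
  ```
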

- grafana

  followed `https://grafana.com/docs/grafana/latest/setup-grafana/installation/debian/#install-grafana-on-debian-or-ubuntu`

  something something something, then

  ```
  sudo grafana-cli --pluginUrl https://github.com/VictoriaMetrics/victoriametrics-datasource/releases/download/v0.11.1/victoriametrics-datasource-v0.11.1.zip plugins install victoriametrics
  ```
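
  if grafana refuses to load it because it's unsigned, the usual fix is allowing it in grafana.ini (the exact plugin id here is an assumption -- check what `grafana-cli plugins ls` reports):

  ```ini
  # /etc/grafana/grafana.ini, in the [plugins] section
  allow_loading_unsigned_plugins = victoriametrics-datasource
  ```

  then `sudo systemctl restart grafana-server`.
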

- raspi node_exporter

  ```bash
  curl -LO https://github.com/prometheus/node_exporter/releases/download/v1.8.2/node_exporter-1.8.2.linux-armv7.tar.gz
  tar xzf node_exporter-1.8.2.linux-armv7.tar.gz
  sudo cp node_exporter-1.8.2.linux-armv7/node_exporter /usr/local/bin/
  sudo useradd --no-create-home --shell /bin/false node_exporter
  sudo nano /etc/systemd/system/node_exporter.service
  # [Unit]
  # Description=Node Exporter
  # Wants=network-online.target
  # After=network-online.target

  # [Service]
  # User=node_exporter
  # Group=node_exporter
  # Type=simple
  # ExecStart=/usr/local/bin/node_exporter
  # Restart=always
  # RestartSec=3

  # [Install]
  # WantedBy=multi-user.target
  sudo systemctl daemon-reload
  sudo systemctl enable node_exporter.service
  sudo systemctl start node_exporter.service
  ```

  todo: get raspi vcgencmd outputs into metrics
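
  one possible approach for that todo (untested sketch; the metric name and textfile dir are my choices): node_exporter's textfile collector plus a cron'd script.

  ```bash
  # run node_exporter with: --collector.textfile.directory=/var/lib/node_exporter
  # then on a cron/systemd timer:
  temp=$(vcgencmd measure_temp | grep -oP '[0-9.]+')   # "temp=48.3'C" -> "48.3"
  # write-then-rename so node_exporter never reads a half-written file
  echo "rpi_soc_temperature_celsius ${temp}" > /var/lib/node_exporter/vcgencmd.prom.tmp
  mv /var/lib/node_exporter/vcgencmd.prom.tmp /var/lib/node_exporter/vcgencmd.prom
  ```
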

- nginx on gateway

  ```nginx
  # in http

  ##
  # cozy cache
  ##
  proxy_cache_path /var/cache/nginx keys_zone=cozy_zone:10m;

  ##
  # cozy limit
  ##
  limit_req_zone $binary_remote_addr zone=cozy_ip_limit:10m rate=50r/s;
  limit_req_zone $server_name zone=cozy_global_limit:10m rate=1000r/s;

  # in sites-available/constellation.microcosm.blue

  upstream cozy_link_aggregator {
      server link-aggregator:6789;
      keepalive 16;
  }

  server {
      listen 8080;
      listen [::]:8080;

      server_name constellation.microcosm.blue;

      proxy_cache cozy_zone;
      proxy_cache_background_update on;
      proxy_cache_key "$scheme$proxy_host$uri$is_args$args$http_accept";
      proxy_cache_lock on; # make simultaneous requests for the same uri wait for it to appear in cache instead of hitting origin
      proxy_cache_lock_age 1s;
      proxy_cache_lock_timeout 2s;
      proxy_cache_valid 10s; # default -- should be explicitly set in the response headers
      proxy_cache_valid any 15s; # non-200s default
      proxy_read_timeout 5s;
      proxy_send_timeout 15s;
      proxy_socket_keepalive on;

      limit_req zone=cozy_ip_limit nodelay burst=100;
      limit_req zone=cozy_global_limit;
      limit_req_status 429;

      location / {
          proxy_pass http://cozy_link_aggregator;
          include proxy_params;
          proxy_http_version 1.1;
          proxy_set_header Connection ""; # for keepalive
      }
  }
  ```
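
  then check and reload:

  ```bash
  sudo nginx -t && sudo systemctl reload nginx
  ```
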

  also `systemctl edit nginx` and paste

  ```
  [Service]
  Restart=always
  ```

  —https://serverfault.com/a/1003373

  now making browsers redirect to the microcosm.blue url:

  ```nginx
  [...]
  server_name links.bsky.bad-example.com;

  add_header Access-Control-Allow-Origin * always; # bit of a hack to have it here but nginx doesn't like it in the `if`
  if ($http_user_agent ~ ^Mozilla/) {
      # for now send *browsers* to the new location, hopefully without impacting api requests
      # (yeah we're doing a UA test here and content-negotiation in the app. whatever.)
      return 301 https://constellation.microcosm.blue$request_uri;
  }
  [...]
  ```

- nginx metrics

  - download nginx-prometheus-exporter
    https://github.com/nginx/nginx-prometheus-exporter/releases/download/v1.4.1/nginx-prometheus-exporter_1.4.1_linux_amd64.tar.gz

  - err, actually going to make mistakes and try with snap first:
    `snap install nginx-prometheus-exporter`
    - so it got a binary for me but no systemd unit set up. boooo.
      `snap remove nginx-prometheus-exporter`

  - ```bash
    curl -LO https://github.com/nginx/nginx-prometheus-exporter/releases/download/v1.4.1/nginx-prometheus-exporter_1.4.1_linux_amd64.tar.gz
    tar xzf nginx-prometheus-exporter_1.4.1_linux_amd64.tar.gz
    mv nginx-prometheus-exporter /usr/local/bin
    useradd --no-create-home --shell /bin/false nginx-prometheus-exporter
    nano /etc/systemd/system/nginx-prometheus-exporter.service
    # [Unit]
    # Description=NGINX Exporter
    # Wants=network-online.target
    # After=network-online.target

    # [Service]
    # User=nginx-prometheus-exporter
    # Group=nginx-prometheus-exporter
    # Type=simple
    # ExecStart=/usr/local/bin/nginx-prometheus-exporter --nginx.scrape-uri=http://gateway:8080/stub_status --web.listen-address=gateway:9113
    # Restart=always
    # RestartSec=3

    # [Install]
    # WantedBy=multi-user.target
    systemctl daemon-reload
    systemctl start nginx-prometheus-exporter.service
    systemctl enable nginx-prometheus-exporter.service
    ```

  - nginx `/etc/nginx/sites-available/gateway-nginx-status`

    ```nginx
    server {
        listen 8080;
        listen [::]:8080;

        server_name gateway;

        location /stub_status {
            stub_status;
        }
        location / {
            return 404;
        }
    }
    ```

    ```bash
    ln -s /etc/nginx/sites-available/gateway-nginx-status /etc/nginx/sites-enabled/
    ```
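
  - quick sanity check (host/ports as configured above):

    ```bash
    curl -s http://gateway:8080/stub_status        # raw nginx counters
    curl -s http://gateway:9113/metrics | head     # exporter output
    ```
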

## bootes (pi5)

- mount sd card, touch `ssh` file, `echo "pi:$(echo raspberry | openssl passwd -6 -stdin)" > userconf.txt`
- raspi-config: enable pcie 3, set hostname, enable ssh
- put ssh key into `.ssh/authorized_keys`
- put `PasswordAuthentication no` in `/etc/ssh/sshd_config`
- `sudo apt update && sudo apt upgrade`
- `sudo apt install xfsprogs`
- `sudo mkfs.xfs -L c11n-kv /dev/nvme0n1`
- `sudo mount /dev/nvme0n1 /mnt`
- set up tailscale
- `sudo tailscale up`
- `git clone https://github.com/atcosm/links.git`
- tailscale: disable bootes key expiry
- rustup: `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`
- `cd links/constellation`
- `sudo apt install libssl-dev` needed
- `sudo apt install clang` needed for bindgen
- (in tmux) `cargo build --release`
- `mkdir ~/backup`
- `sudo mount.cifs "//truenas.local/folks data" /home/pi/backup -o user=phil,uid=pi`
- `sudo chown pi:pi /mnt/`
- `RUST_BACKTRACE=full cargo run --bin rocks-restore-from-backup --release -- --from-backup-dir "/home/pi/backup/constellation-index" --to-data-dir /mnt/constellation-index` etc
- follow above `- raspi node_exporter`
- configure victoriametrics to scrape the new pi
- configure ulimit before starting! `ulimit -n 16384`
- `RUST_BACKTRACE=full cargo run --release -- --backend rocks --data /mnt/constellation-index/ --jetstream us-east-2 --backup /home/pi/backup/constellation-index --backup-interval 6 --max-old-backups 20`
- add server to nginx gateway upstream: ` server 100.123.79.12:6789; # bootes`
- stop backups from running on the older instance! restart it without the backup flags: `RUST_BACKTRACE=full cargo run --release -- --backend rocks --data /mnt/links-2.rocks/ --jetstream us-east-1`
- stop upstreaming requests to older instance in nginx


- systemd unit for running: `sudo nano /etc/systemd/system/constellation.service`

  ```ini
  [Unit]
  Description=Constellation backlinks index
  After=network.target

  [Service]
  User=pi
  WorkingDirectory=/home/pi/links/constellation
  ExecStart=/home/pi/links/target/release/main --backend rocks --data /mnt/constellation-index/ --jetstream us-east-2 --backup /home/pi/backup/constellation-index --backup-interval 6 --max-old-backups 20
  LimitNOFILE=16384
  Restart=always

  [Install]
  WantedBy=multi-user.target
  ```
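
  since the data and backup dirs are both mounts, it's probably worth pinning them at boot and making the unit wait for them -- a sketch (the xfs label is from above; the cifs credentials file is my invention):

  ```
  # /etc/fstab -- \040 escapes the space in the share name
  LABEL=c11n-kv  /mnt  xfs  defaults,noatime  0  2
  //truenas.local/folks\040data  /home/pi/backup  cifs  credentials=/home/pi/.smbcreds,uid=pi,_netdev  0  0
  ```

  plus `RequiresMountsFor=/mnt /home/pi/backup` under `[Unit]` in constellation.service.
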

- todo: overlayfs? would need to figure out builds/updates still, also i guess logs are currently written to sd? (oof)
- todo: cross-compile for raspi?

---

some todos

- [x] tailscale: exit node
  - [!] link_aggregator: use exit node
    -> worked, but reverted for now: tailscale on the raspi was consuming ~50% cpu for the jetstream traffic. this might be near its max, since it would have been catching up at the time (max jetstream throughput), but it feels a bit too much. we have to trust the jetstream server anyway, and link_aggregator doesn't (yet) make any other external connections, so for now the raspi connects directly from my home again.
- [x] caddy: reverse proxy
  - [x] build with cache and rate-limit plugins
  - [x] configure systemd to keep it alive
- [x] configure caddy cache
- [x] configure caddy rate-limit
- [ ] configure ~caddy~ nginx to use a health check (once it's added)
- [ ] ~configure caddy to only expose cache metrics to tailnet :/~
- [x] make some grafana dashboards
- [ ] raspi: mount /dev/sda on boot
- [ ] raspi: run link_aggregator via systemd so it starts on startup (and restarts?)

- [x] use nginx instead of caddy
- [x] nginx: enable cache
- [x] nginx: rate-limit
- [ ] nginx: get metrics


---

nginx cors for constellation + small burst bump

```nginx
upstream cozy_constellation {
    server <tailnet ip>:6789; # bootes; ip so that we don't race on reboot with tailscale coming up, which nginx doesn't like
    keepalive 16;
}

server {
    server_name constellation.microcosm.blue;

    proxy_cache cozy_zone;
    proxy_cache_background_update on;
    proxy_cache_key "$scheme$proxy_host$uri$is_args$args$http_accept";
    proxy_cache_lock on; # make simultaneous requests for the same uri wait for it to appear in cache instead of hitting origin
    proxy_cache_lock_age 1s;
    proxy_cache_lock_timeout 2s;
    proxy_cache_valid 10s; # default -- should be explicitly set in the response headers
    proxy_cache_valid any 2s; # non-200s default
    proxy_read_timeout 5s;
    proxy_send_timeout 15s;
    proxy_socket_keepalive on;

    # take over cors responsibility from upstream. `always` applies it to error responses too.
    proxy_hide_header 'Access-Control-Allow-Origin';
    proxy_hide_header 'Access-Control-Allow-Methods';
    proxy_hide_header 'Access-Control-Allow-Headers';
    add_header 'Access-Control-Allow-Origin' '*' always;
    add_header 'Access-Control-Allow-Methods' 'GET' always;
    add_header 'Access-Control-Allow-Headers' '*' always;

    limit_req zone=cozy_ip_limit nodelay burst=150;
    limit_req zone=cozy_global_limit burst=1800;
    limit_req_status 429;

    location / {
        proxy_pass http://cozy_constellation;
        include proxy_params;
        proxy_http_version 1.1;
        proxy_set_header Connection ""; # for keepalive
    }

    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/constellation.microcosm.blue/fullchain.pem; # managed by Certbot
    ssl_certificate_key /etc/letsencrypt/live/constellation.microcosm.blue/privkey.pem; # managed by Certbot
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
}

server {
    if ($host = constellation.microcosm.blue) {
        return 301 https://$host$request_uri;
    } # managed by Certbot

    server_name constellation.microcosm.blue;
    listen 80;
    return 404; # managed by Certbot
}
```

re-reading about `nodelay`, i should probably remove it -- nginx would then queue requests to upstream but still service them at the configured rate. it's fine for my internet since the global limit isn't nodelay, but it's probably less "fair" to clients if there's contention around the global limit (earlier requests would get all of theirs serviced before later ones can get into the queue).

leaving it for now though.


### nginx logs to prom

```bash
curl -LO https://github.com/martin-helmich/prometheus-nginxlog-exporter/releases/download/v1.11.0/prometheus-nginxlog-exporter_1.11.0_linux_amd64.deb
apt install ./prometheus-nginxlog-exporter_1.11.0_linux_amd64.deb
systemctl enable prometheus-nginxlog-exporter.service
```

have it run as www-data so it can read the logs (maybe not the best idea, but...): in `/usr/lib/systemd/system/prometheus-nginxlog-exporter.service`, set `User` under `[Service]` and remove the capability bounding:

```systemd
User=www-data
#CapabilityBoundingSet=
```

in `nginx.conf` in `http`:

```nginx
log_format constellation_format "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\"";
```

in `sites-available/constellation.microcosm.blue` in `server`:

```nginx
# log format must match prometheus-nginx-log-exporter
access_log /var/log/nginx/constellation-access.log constellation_format;
```

config at `/etc/prometheus-nginxlog-exporter.hcl`
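
something like this (a sketch -- the listen port and namespace name are my choices; the format string has to match the nginx `log_format` above):

```hcl
listen {
  port = 4040
}

namespace "constellation" {
  source {
    files = ["/var/log/nginx/constellation-access.log"]
  }
  format = "$remote_addr - $remote_user [$time_local] \"$request\" $status $body_bytes_sent \"$http_referer\" \"$http_user_agent\" \"$http_x_forwarded_for\""
}
```
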

```bash
systemctl start prometheus-nginxlog-exporter.service
```