A tool for backing up ATProto-related data to S3

Back AT it

This is a tool I'm actively developing to back up my AT Protocol data to S3 storage.

Currently there are 2 things that can be backed up:

1: PDS data - A user's repo and their blobs
2: Tangled Knot data - The repositories directory that contains all of the repo data, and the directory that contains the SQLite database

The PDS repo data is pulled straight from the XRPC endpoint and sent straight to S3. The blob data, however, is streamed into a zip file and sent to S3 so that not all of the data is held in memory while the backup takes place (the minio library will still keep some in memory as part of the multipart request).

It's very hacky right now and needs polishing, so use with caution. Although let's face it, the worst it can do at the moment is back up some bad data, which is better than no data 🤪

How to use

Clone the repo and copy the .env.example file to .env. Fill in the .env file with your S3 variables.

For PDS data backup you need to ensure that DID and PDS_HOST are populated. (You can run this tool on any machine to back PDS data up)

For Knot data backup you need to ensure that TANGLED_KNOT_DATABASE_DIRECTORY and TANGLED_KNOT_REPOSITORY_DIRECTORY are populated. (You need to run this tool on your Knot server to back up Knot data)
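To make the two requirements concrete, here is a small sketch of how the variables named above could decide which backups run. It assumes the .env values have already been loaded into the process environment; the `plan` function is a hypothetical helper, and the tool itself may check things differently.

```go
package main

import (
	"fmt"
	"os"
)

// plan reports which backups can run, given a lookup function for settings.
// The lookup is passed in so the logic can be exercised without touching
// the real environment.
func plan(get func(string) string) (pds, knot bool) {
	pds = get("DID") != "" && get("PDS_HOST") != ""
	knot = get("TANGLED_KNOT_DATABASE_DIRECTORY") != "" &&
		get("TANGLED_KNOT_REPOSITORY_DIRECTORY") != ""
	return pds, knot
}

func main() {
	pds, knot := plan(os.Getenv)
	fmt.Println("PDS backup enabled:", pds)
	fmt.Println("Knot backup enabled:", knot)
}
```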

Run go run .

Todo

  • Turn this into a long-running app, perhaps using a cron library
  • Use query params properly when creating the URLs to fetch the repo and blobs
  • Maybe allow configuring the backup of knot repo data per user's DID?
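
For the query params item, one possible approach is to build the URLs with net/url rather than string concatenation. The sketch below assumes PDS_HOST includes the scheme (for example https://pds.example.com) and that the repo and blobs are fetched via the com.atproto.sync XRPC methods; `xrpcURL` is a hypothetical helper, not something that exists in this repo.

```go
package main

import (
	"fmt"
	"net/url"
)

// xrpcURL builds an XRPC URL on host for the given method, letting net/url
// escape the query parameters (a DID contains ':' characters, for instance).
func xrpcURL(host, method string, params map[string]string) (string, error) {
	u, err := url.Parse(host)
	if err != nil {
		return "", err
	}
	u.Path = "/xrpc/" + method
	q := url.Values{}
	for k, v := range params {
		q.Set(k, v)
	}
	// Encode sorts by key and percent-escapes each value.
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// The host, DID, and CID values here are placeholders.
	repoURL, err := xrpcURL("https://pds.example.com", "com.atproto.sync.getRepo",
		map[string]string{"did": "did:plc:abc123"})
	if err != nil {
		panic(err)
	}
	fmt.Println(repoURL)

	blobURL, err := xrpcURL("https://pds.example.com", "com.atproto.sync.getBlob",
		map[string]string{"did": "did:plc:abc123", "cid": "bafyexamplecid"})
	if err != nil {
		panic(err)
	}
	fmt.Println(blobURL)
}
```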