Monorepo for Tangled - https://tangled.org

proposal: ingest repo records #282

open
opened by oppi.li

sh.tangled.repo records are among the last few records that need atprotation (2-way sync between appview and PDS). this one is tricky because we want consistency between knot state, appview state and PDS state. the path forward is described below:

  • migration to did/rkey syntax universally: in several places we use did/repo-name as a globally unique identifier for a repository; we should migrate this to did/rkey:
    • for the tangled appview: this means reworking our ACLs, routers and DB to use did/rkey or ATURI as a globally unique identifier for a repo
    • for knots: this means changing paths on disk to be did/rkey, allowing git ops to host:did/rkey, and updating ACLs and XRPC endpoints
    • for spindles: this means updating ACLs and XRPC endpoints
  • once this is done, we can define ingestion logic for all services in the network to pull sh.tangled.repo records
    • for the tangled appview, this should create/delete/update a repo pointer record
    • for knots: this should create-or-ignore/delete-or-ignore/update-or-ignore a repo on disk. migration of repos needs to be thought out here.
    • for spindles: as above
  • define edge-case behaviors:
    • when referring to repos by rkey, it is possible for clever users to create duplicate records with the same repo-name. appview routers must handle this ambiguity with a new interstitial page that offers a redirect; knots must return an error upon push/pull on ambiguous git URLs
    • the knot cli can introduce a command backfill or sync to bring knot state in sync with the rest of the world (repos, collaborators, pubkeys, ACLs etc.).
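
To make the "-or-ignore" ingestion and the duplicate-name edge case above more concrete, here is a minimal Go sketch. Everything in it is an assumption made for illustration: `Event`, `Store`, `Ingest`, `Lookup` and `ResolveForGit` are hypothetical names rather than Tangled APIs, the store is an in-memory map instead of repos on disk, and the DIDs and rkeys are placeholders.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// RepoKey is the did/rkey pair proposed as the globally unique repo identifier.
type RepoKey struct {
	DID  string
	Rkey string
}

// Op is the kind of change seen for a sh.tangled.repo record.
type Op int

const (
	OpCreate Op = iota
	OpUpdate
	OpDelete
)

// Event is a hypothetical, simplified ingestion event (not a real Tangled type).
type Event struct {
	Op   Op
	Key  RepoKey
	Name string // repo name carried by the record; ignored for deletes
}

// ErrAmbiguous is returned when a did/repo-name matches more than one record.
var ErrAmbiguous = errors.New("ambiguous repo name")

// ErrNotFound is returned when no record matches.
var ErrNotFound = errors.New("repo not found")

// Store is an in-memory stand-in for a service's repo state.
type Store struct {
	names map[RepoKey]string
}

func NewStore() *Store {
	return &Store{names: make(map[RepoKey]string)}
}

// Ingest applies ev with "-or-ignore" semantics and reports whether state changed.
func (s *Store) Ingest(ev Event) bool {
	current, exists := s.names[ev.Key]
	switch ev.Op {
	case OpCreate:
		if exists {
			return false // create-or-ignore: replayed create
		}
		s.names[ev.Key] = ev.Name
	case OpUpdate:
		if !exists || current == ev.Name {
			return false // update-or-ignore: unknown repo or no change
		}
		s.names[ev.Key] = ev.Name
	case OpDelete:
		if !exists {
			return false // delete-or-ignore: nothing to delete
		}
		delete(s.names, ev.Key)
	default:
		return false
	}
	return true
}

// Lookup returns every did/rkey whose record carries the given name, sorted by rkey.
// An appview could list these candidates on the interstitial page.
func (s *Store) Lookup(did, name string) []RepoKey {
	var out []RepoKey
	for k, n := range s.names {
		if k.DID == did && n == name {
			out = append(out, k)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Rkey < out[j].Rkey })
	return out
}

// ResolveForGit maps a did/repo-name to exactly one did/rkey, or fails.
func (s *Store) ResolveForGit(did, name string) (RepoKey, error) {
	switch m := s.Lookup(did, name); len(m) {
	case 0:
		return RepoKey{}, ErrNotFound
	case 1:
		return m[0], nil
	default:
		return RepoKey{}, fmt.Errorf("%w: %s/%s matches %d records", ErrAmbiguous, did, name, len(m))
	}
}

func main() {
	s := NewStore()
	// Placeholder DID and rkeys: two records that share the repo name "core".
	s.Ingest(Event{Op: OpCreate, Key: RepoKey{DID: "did:plc:alice", Rkey: "3kaaa"}, Name: "core"})
	s.Ingest(Event{Op: OpCreate, Key: RepoKey{DID: "did:plc:alice", Rkey: "3kbbb"}, Name: "core"})
	_, err := s.ResolveForGit("did:plc:alice", "core")
	fmt.Println(err)
}
```

Because every operation is a no-op when it does not apply, replaying the same events (for example from a backfill or sync command) should leave the state unchanged.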

Using did/rkey as the internal identifier is a good idea. It would allow repositories to be renamed, which is an improvement over the current scheme.
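
As a rough illustration of why did/rkey survives a rename, the Go sketch below splits an AT URI into its DID, collection and rkey and derives a did/rkey path from it. The repo name never appears in the identifier. `RecordID`, `ParseATURI` and `Path` are hypothetical names, the parser only accepts full record URIs with a DID authority, and the example URI is this issue's own record, used purely for its shape.

```go
package main

import (
	"fmt"
	"strings"
)

// RecordID holds the three parts of a record-level AT URI.
type RecordID struct {
	DID        string
	Collection string
	Rkey       string
}

// ParseATURI splits at://<did>/<collection>/<rkey>.
// Sketch only: handles as authority and partial URIs are rejected.
func ParseATURI(uri string) (RecordID, error) {
	const scheme = "at://"
	if !strings.HasPrefix(uri, scheme) {
		return RecordID{}, fmt.Errorf("not an AT URI: %q", uri)
	}
	parts := strings.Split(strings.TrimPrefix(uri, scheme), "/")
	if len(parts) != 3 || parts[1] == "" || parts[2] == "" {
		return RecordID{}, fmt.Errorf("expected at://did/collection/rkey, got %q", uri)
	}
	if !strings.HasPrefix(parts[0], "did:") {
		return RecordID{}, fmt.Errorf("authority is not a DID: %q", parts[0])
	}
	return RecordID{DID: parts[0], Collection: parts[1], Rkey: parts[2]}, nil
}

// Path returns the did/rkey form proposed for routers, ACLs and on-disk paths.
func (r RecordID) Path() string {
	return r.DID + "/" + r.Rkey
}

// String rebuilds the AT URI.
func (r RecordID) String() string {
	return "at://" + r.DID + "/" + r.Collection + "/" + r.Rkey
}

func main() {
	id, err := ParseATURI("at://did:plc:qfpnj4og54vl56wngdriaxug/sh.tangled.repo.issue/3m46d5s2osl22")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id.Path())
}
```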

> once this is done, we can define ingestion logic for all services in the network to pull sh.tangled.repo records

Not sure I understood this correctly. You mean we will introduce migration logic on startup that converts existing data from the did/repo-name form to did/rkey, right? Yeah, that sounds reasonable. We can remove that extra migration logic once we're out of the alpha phase.
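
One way such a startup migration could be planned is sketched below in Go, under loud assumptions: repos are taken to live at relative paths of the form did/repo-name, and a lookup table from did/repo-name to rkey is taken to be available from the ingested records. `Move` and `PlanMigration` are hypothetical names. The function only computes the renames and touches no files, so it can be dry-run before anything moves.

```go
package main

import (
	"fmt"
	"strings"
)

// Move is one planned rename from a legacy did/repo-name path to did/rkey.
type Move struct {
	From string
	To   string
}

// PlanMigration maps legacy paths to did/rkey paths.
// rkeyByLegacyPath is assumed to be built from ingested sh.tangled.repo records.
// Paths with no matching record (including already migrated ones) are skipped.
func PlanMigration(paths []string, rkeyByLegacyPath map[string]string) (moves []Move, skipped []string) {
	for _, p := range paths {
		did, _, hasSlash := strings.Cut(p, "/")
		rkey, known := rkeyByLegacyPath[p]
		if !hasSlash || !known {
			skipped = append(skipped, p)
			continue
		}
		moves = append(moves, Move{From: p, To: did + "/" + rkey})
	}
	return moves, skipped
}

func main() {
	// Placeholder DID, repo name and rkey.
	index := map[string]string{"did:plc:alice/core": "3kbbb"}
	moves, skipped := PlanMigration([]string{"did:plc:alice/core", "did:plc:alice/3kaaa"}, index)
	for _, m := range moves {
		fmt.Println("rename", m.From, "->", m.To)
	}
	fmt.Println("left in place:", skipped)
}
```

Running the plan a second time yields no moves as long as no repo name collides with an rkey, which would let the extra logic stay in place until it is removed after alpha.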

> it is possible for clever users to create duplicate records with the same repo-name

How about using the rkey field itself as the repository name? I've seen that we already do this for default labels (e.g. the gfi label has the rkey good-first-issue). I'm not sure whether this is OK under the atproto spec.

> the knot cli can introduce a command backfill or sync

Just to make sure: the current knot also needs this anyway, right? I haven't seen any backfill logic for the public keys it stores.


Though I think we could skip this part and go straight to did-for-repo. Both are breaking changes, so it would be better to make the change once, after proper discussion. It could also atprotate the repository and even allow cross-user migrations without losing existing references: issues and PRs would migrate with the repository itself. Also, serving repository identity in the knot could solve knot<->PDS syncing.

I think the problem with having the rkeys be the actual repo names is that it kind of destroys repo renaming, as well as deleting a repo and creating a new one with the same name, unless you make every record related to that repo carry something that's still unique to the repo. (cid wouldn't work, because then you couldn't have knot migration, so created time maybe? I think just trusting what the record says would be fine, because if someone still wants to delete the repo and create it again without losing all the links to it, they could just do that directly at the knot?) Then for renaming you'd have a stand-in record that just points to the actual one, but you would still have the problem of someone renaming a repo and then creating another repo with the same name, which is bad. So yeah, I don't think rkeys holding the repo name would be a good idea, even though it's a much nicer experience, unless repo renaming is never implemented, which I think would be worse. But I don't know much about how any of this works, so idk.

I think it would be easy to allow repo renaming by keeping the old record with a field that links to the new one, and a field in the new one that links to the old one. This is essentially how it works on GitHub. It's a little jank but having the rkey be the name is something we really need.
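
A minimal Go sketch of that linked-record idea, assuming the rkey is the repo name: resolving a name follows the forward links until it reaches a record that has not been renamed. The `MovedTo` and `MovedFrom` fields are hypothetical and not part of any published lexicon, and the loop guard is an addition the comment above does not mention.

```go
package main

import (
	"errors"
	"fmt"
)

// Record is a hypothetical repo record keyed by repo name (used as the rkey).
type Record struct {
	MovedTo   string // rkey of the record that replaced this one; empty if current
	MovedFrom string // rkey of the record this one replaced; empty if never renamed
}

// ErrLoop is returned when the redirect links form a cycle.
var ErrLoop = errors.New("redirect loop")

// ErrMissing is returned when a link points at a record that does not exist.
var ErrMissing = errors.New("record not found")

// Resolve follows MovedTo links from rkey until it reaches the current record.
func Resolve(records map[string]Record, rkey string) (string, error) {
	seen := make(map[string]bool)
	for {
		rec, ok := records[rkey]
		if !ok {
			return "", ErrMissing
		}
		if rec.MovedTo == "" {
			return rkey, nil
		}
		if seen[rkey] {
			return "", ErrLoop
		}
		seen[rkey] = true
		rkey = rec.MovedTo
	}
}

func main() {
	records := map[string]Record{
		"old-name": {MovedTo: "new-name"},
		"new-name": {MovedFrom: "old-name"},
	}
	current, err := Resolve(records, "old-name")
	fmt.Println(current, err)
}
```

The sketch also shows where the scheme gets awkward: once the owner replaces the old-name record with a fresh repo, the redirect is gone.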

> having the rkeys be the actual repo names is that it kind of destroys repo renaming

True. I forgot that point while writing...

> I think it would be easy to allow repo renaming by keeping the old record with a field that links to the new one

@knotbin.com honestly now I'm against my own reponame-as-rkey idea. Redirecting can work, but at that point we are just reinventing ActivityPub's identity. The best feature of ATURI is that it is immutable. Also, doing that would break the knot's advantage over did/rkey, so I literally ruined the good part of this proposal with the rkey idea.

Labels: none yet
area: appview, knot, spindle
assignee: oppi.li
Participants: 4
AT URI: at://did:plc:qfpnj4og54vl56wngdriaxug/sh.tangled.repo.issue/3m46d5s2osl22