A better Rust ATProto crate

`Value` serde impl, some bug fixes, attribute macros, a bunch of tests, and lexicon codegen planning/partial skeleton

Orual 429fb2f9 83d7296b

+12
.claude/settings.local.json
···
···
+
{
+
"permissions": {
+
"allow": [
+
"WebSearch",
+
"WebFetch(domain:atproto.com)",
+
"WebFetch(domain:github.com)",
+
"WebFetch(domain:raw.githubusercontent.com)"
+
],
+
"deny": [],
+
"ask": []
+
}
+
}
+2 -1
.gitignore
···
/result
/result-lib
.direnv
-
/.pre-commit-config.yaml
···
/result
/result-lib
.direnv
+
.claude
/.pre-commit-config.yaml
+
CLAUDE.md
+115
CLAUDE.md
···
···
+
# CLAUDE.md
+
+
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+
## Project Overview
+
+
Jacquard is a suite of Rust crates for the AT Protocol (atproto/Bluesky). The project emphasizes spec-compliant, validated, performant baseline types with minimal boilerplate. Key design goals:
+
+
- Validated AT Protocol types including typed at:// URIs
+
- Custom lexicon extension support
+
- Lexicon `Value` type for working with unknown atproto data (dag-cbor or json)
+
- Modular design: use as much or as little of the crates as needed
+
+
## Workspace Structure
+
+
This is a Cargo workspace with several crates:
+
+
- **jacquard-common**: Core AT Protocol types (DIDs, handles, at-URIs, NSIDs, TIDs, CIDs, etc.) and the `CowStr` type for efficient string handling
+
- **jacquard-lexicon**: Lexicon parsing and Rust code generation from lexicon schemas
+
- **jacquard-api**: Generated API bindings (currently empty/in development)
+
- **jacquard-derive**: Derive macros for lexicon structures
+
- **jacquard**: Main binary (currently minimal)
+
+
## Development Commands
+
+
### Using Nix (preferred)
+
```bash
+
# Enter dev shell
+
nix develop
+
+
# Build
+
nix build
+
+
# Run
+
nix develop -c cargo run
+
```
+
+
### Using Cargo/Just
+
```bash
+
# Build
+
cargo build
+
+
# Run tests
+
cargo test
+
+
# Run specific test
+
cargo test <test_name>
+
+
# Run specific package tests
+
cargo test -p <package_name>
+
+
# Run
+
cargo run
+
+
# Auto-recompile and run
+
just watch [ARGS]
+
+
# Format and lint all
+
just pre-commit-all
+
```
+
+
## String Type Pattern
+
+
The codebase uses a consistent pattern for validated string types. Each type should have:
+
+
### Constructors
+
- `new()`: Construct from a string slice with appropriate lifetime (borrows)
+
- `new_owned()`: Construct from `impl AsRef<str>`, taking ownership
+
- `new_static()`: Construct from `&'static str` using `SmolStr`/`CowStr`'s static constructor (no allocation)
+
- `raw()`: Same as `new()` but panics instead of returning `Result`
+
- `unchecked()`: Same as `new()` but doesn't validate (marked `unsafe`)
+
- `as_str()`: Return string slice
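As a sketch of this pattern (a hypothetical `Handle` type with a toy validation rule, using std `Cow<'_, str>` in place of the crate's `CowStr`; `new_static()` is omitted since it is `SmolStr`-specific):

```rust
use std::borrow::Cow;

/// Hypothetical validated string type following the constructor pattern.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct Handle<'h>(Cow<'h, str>);

impl<'h> Handle<'h> {
    /// Stand-in rule; the real type enforces the handle spec.
    fn validate(s: &str) -> Result<(), &'static str> {
        if !s.is_empty() && s.contains('.') {
            Ok(())
        } else {
            Err("invalid handle")
        }
    }

    /// Borrow from the input with its lifetime.
    pub fn new(s: &'h str) -> Result<Self, &'static str> {
        Self::validate(s)?;
        Ok(Self(Cow::Borrowed(s)))
    }

    /// Take ownership of the input.
    pub fn new_owned(s: impl AsRef<str>) -> Result<Handle<'static>, &'static str> {
        Self::validate(s.as_ref())?;
        Ok(Handle(Cow::Owned(s.as_ref().to_owned())))
    }

    /// Like `new()`, but panics on invalid input.
    pub fn raw(s: &'h str) -> Self {
        Self::new(s).expect("invalid handle")
    }

    /// Skip validation entirely; caller guarantees validity.
    pub unsafe fn unchecked(s: &'h str) -> Self {
        Self(Cow::Borrowed(s))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```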
+
+
### Traits
+
All string types should implement:
+
- `Serialize` + `Deserialize` (custom impl for latter, sometimes for former)
+
- `FromStr`, `Display`
+
- `Debug`, `PartialEq`, `Eq`, `Hash`, `Clone`
+
- `From<T> for String`, `CowStr`, `SmolStr`
+
- `From<String>`, `From<CowStr>`, `From<SmolStr>`, or `TryFrom` if likely to fail
+
- `AsRef<str>`
+
- `Deref` with `Target = str` (usually)
+
+
### Implementation Details
+
- Use `#[repr(transparent)]` when possible (exception: at-uri type and components)
+
- Use `SmolStr` directly as inner type if most instances will be under 24 bytes
+
- Use `CowStr` for longer strings to allow borrowing from input
+
- Implement `IntoStatic` trait to take ownership of string types
+
+
## Code Style
+
+
- Avoid comments for self-documenting code
+
- Comments should not detail fixes when refactoring
+
- Keep writing in source code and comments professional
+
- Prioritize long-term maintainability over implementation speed
+
+
## Testing
+
+
- Write test cases for all critical code
+
- Tests can be run per-package or workspace-wide
+
- Use `cargo test <name>` to run specific tests
+
- Current test coverage: 89 tests in jacquard-common
+
+
## Current State & Next Steps
+
+
### Completed
+
- ✅ Comprehensive validation tests for all core string types (handle, DID, NSID, TID, record key, AT-URI, datetime, language, identifier)
+
- ✅ Validated implementations against AT Protocol specs and TypeScript reference implementation
+
- ✅ String type interface standardization (Language now has `new_static()`, Datetime has full conversion traits)
+
- ✅ Data serialization: Full serialize/deserialize for `Data<'_>`, `Array`, `Object` with format-specific handling (JSON vs CBOR)
+
- ✅ CidLink wrapper type with automatic `{"$link": "cid"}` serialization in JSON
+
- ✅ Integration test with real Bluesky thread data validates round-trip correctness
+
+
### Next Steps
+
1. **Lexicon Code Generation**: Begin work on lexicon-to-Rust code generation now that core types are stable
+80 -8
Cargo.lock
···
checksum = "d0881ea181b1df73ff77ffaaf9c7544ecc11e82fba9b5f27b262a3c73a332555"
[[package]]
name = "enum_dispatch"
version = "0.3.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
···
checksum = "7943c866cc5cd64cbc25b2e01621d07fa8eb2a1a23160ee81ce38704e97b8ecf"
[[package]]
name = "itoa"
version = "1.0.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
···
]
[[package]]
name = "jacquard-common"
version = "0.1.0"
dependencies = [
···
"miette",
"multibase",
"multihash",
"ouroboros",
"rand",
"regex",
···
]
[[package]]
name = "jacquard-lexicon"
version = "0.1.0"
[[package]]
name = "js-sys"
···
]
[[package]]
name = "proc-macro-error"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
···
[[package]]
name = "quote"
-
version = "1.0.40"
source = "registry+https://github.com/rust-lang/crates.io-index"
-
checksum = "1885c039570dc00dcb4ff087a89e185fd56bae234ddc7f056a945bf36467248d"
dependencies = [
"proc-macro2",
]
···
[[package]]
name = "serde"
-
version = "1.0.227"
source = "registry+https://github.com/rust-lang/crates.io-index"
-
checksum = "80ece43fc6fbed4eb5392ab50c07334d3e577cbf40997ee896fe7af40bba4245"
dependencies = [
"serde_core",
"serde_derive",
···
[[package]]
name = "serde_core"
-
version = "1.0.227"
source = "registry+https://github.com/rust-lang/crates.io-index"
-
checksum = "7a576275b607a2c86ea29e410193df32bc680303c82f31e275bbfcafe8b33be5"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
-
version = "1.0.227"
source = "registry+https://github.com/rust-lang/crates.io-index"
-
checksum = "51e694923b8824cf0e9b382adf0f60d4e05f348f357b38833a3fa5ed7c2ede04"
dependencies = [
"proc-macro2",
"quote",
···
"ryu",
"serde",
"serde_core",
]
[[package]]
···
checksum = "d0881ea181b1df73ff77ffaaf9c7544ecc11e82fba9b5f27b262a3c73a332555"
[[package]]
+
name = "either"
+
version = "1.15.0"
+
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "48c757948c5ede0e46177b7add2e67155f70e33c07fea8284df6576da70b3719"
+
+
[[package]]
name = "enum_dispatch"
version = "0.3.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
···
checksum = "7943c866cc5cd64cbc25b2e01621d07fa8eb2a1a23160ee81ce38704e97b8ecf"
[[package]]
+
name = "itertools"
+
version = "0.14.0"
+
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "2b192c782037fadd9cfa75548310488aabdbf3d2da73885b31bd0abd03351285"
+
dependencies = [
+
"either",
+
]
+
+
[[package]]
name = "itoa"
version = "1.0.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
···
]
[[package]]
+
name = "jacquard-api"
+
version = "0.1.0"
+
+
[[package]]
name = "jacquard-common"
version = "0.1.0"
dependencies = [
···
"miette",
"multibase",
"multihash",
+
"num-traits",
"ouroboros",
"rand",
"regex",
···
]
[[package]]
+
name = "jacquard-derive"
+
version = "0.1.0"
+
dependencies = [
+
"heck 0.5.0",
+
"itertools",
+
"jacquard-common",
+
"jacquard-lexicon",
+
"prettyplease",
+
"proc-macro2",
+
"quote",
+
"serde",
+
"serde_json",
+
"serde_repr",
+
"serde_with",
+
"syn 2.0.106",
+
]
+
+
[[package]]
name = "jacquard-lexicon"
version = "0.1.0"
+
dependencies = [
+
"heck 0.5.0",
+
"itertools",
+
"jacquard-common",
+
"prettyplease",
+
"proc-macro2",
+
"quote",
+
"serde",
+
"serde_json",
+
"serde_repr",
+
"serde_with",
+
"syn 2.0.106",
+
]
[[package]]
name = "js-sys"
···
]
[[package]]
+
name = "prettyplease"
+
version = "0.2.37"
+
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "479ca8adacdd7ce8f1fb39ce9ecccbfe93a3f1344b3d0d97f20bc0196208f62b"
+
dependencies = [
+
"proc-macro2",
+
"syn 2.0.106",
+
]
+
+
[[package]]
name = "proc-macro-error"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
···
[[package]]
name = "quote"
+
version = "1.0.41"
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "ce25767e7b499d1b604768e7cde645d14cc8584231ea6b295e9c9eb22c02e1d1"
dependencies = [
"proc-macro2",
]
···
[[package]]
name = "serde"
+
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e"
dependencies = [
"serde_core",
"serde_derive",
···
[[package]]
name = "serde_core"
+
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
+
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79"
dependencies = [
"proc-macro2",
"quote",
···
"ryu",
"serde",
"serde_core",
+
]
+
+
[[package]]
+
name = "serde_repr"
+
version = "0.1.20"
+
source = "registry+https://github.com/rust-lang/crates.io-index"
+
checksum = "175ee3e80ae9982737ca543e96133087cbd9a485eecc3bc4de9c1a37b47ea59c"
+
dependencies = [
+
"proc-macro2",
+
"quote",
+
"syn 2.0.106",
]
[[package]]
+356
codegen_plan.md
···
···
+
# Lexicon Codegen Plan
+
+
## Goal
+
Generate idiomatic Rust types from AT Protocol lexicon schemas with minimal nesting/indirection.
+
+
## Existing Infrastructure
+
+
### Already Implemented
+
- **lexicon.rs**: Complete lexicon parsing types (`LexiconDoc`, `LexUserType`, `LexObject`, etc)
+
- **fs.rs**: Directory walking for finding `.json` lexicon files
+
- **schema.rs**: `find_ref_unions()` - collects union fields from a single lexicon
+
- **output.rs**: Partial - has string type mapping and doc comment generation
+
+
### Attribute Macros
+
- `#[lexicon]` - adds `extra_data` field to structs
+
- `#[open_union]` - adds `Unknown(Data<'s>)` variant to enums
+
+
## Design Decisions
+
+
### Module/File Structure
+
- NSID `app.bsky.feed.post` → `app_bsky/feed/post.rs`
+
- Flat module names (no `app::bsky`, just `app_bsky`)
+
- Parent modules: `app_bsky/feed.rs` with `pub mod post;`
+
+
### Type Naming
+
- **Main def**: Use last segment of NSID
+
- `app.bsky.feed.post#main` → `Post`
+
- **Other defs**: Pascal-case the def name
+
- `replyRef` → `ReplyRef`
+
- **Union variants**: Use last segment of ref NSID
+
- `app.bsky.embed.images` → `Images`
+
- Collisions resolved by module path, not type name
+
- **No proliferation of `Main` types** like atrium has
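A std-only sketch of the naming rules above (function names are illustrative; the real codegen would use `heck` from the dependency list for case conversion):

```rust
/// Map an NSID like "app.bsky.feed.post" to "app_bsky/feed/post.rs",
/// joining the first two segments into one flat module name.
/// Assumes the NSID has at least two segments.
fn nsid_to_path(nsid: &str) -> String {
    let segs: Vec<&str> = nsid.split('.').collect();
    let mut parts = vec![format!("{}_{}", segs[0], segs[1])];
    parts.extend(segs[2..].iter().map(|s| s.to_string()));
    format!("{}.rs", parts.join("/"))
}

/// Pascal-case a def name: "replyRef" -> "ReplyRef".
/// (The "main" def instead takes its name from the NSID's last segment.)
fn def_type_name(def: &str) -> String {
    let mut out = String::new();
    let mut upper = true;
    for c in def.chars() {
        if c == '_' || c == '-' {
            upper = true;
        } else if upper {
            out.extend(c.to_uppercase());
            upper = false;
        } else {
            out.push(c);
        }
    }
    out
}
```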
+
+
### Type Generation
+
+
#### Records (lexRecord)
+
```rust
+
#[lexicon]
+
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]
+
#[serde(rename_all = "camelCase")]
+
pub struct Post<'s> {
+
/// Client-declared timestamp...
+
pub created_at: Datetime,
+
#[serde(skip_serializing_if = "Option::is_none")]
+
pub embed: Option<RecordEmbed<'s>>,
+
pub text: CowStr<'s>,
+
}
+
```
+
+
#### Objects (lexObject)
+
Same as records, but without `#[lexicon]` when the object is inline rather than a top-level def.
+
+
#### Unions (lexRefUnion)
+
```rust
+
#[open_union]
+
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq)]
+
#[serde(tag = "$type")]
+
pub enum RecordEmbed<'s> {
+
#[serde(rename = "app.bsky.embed.images")]
+
Images(Box<jacquard_api::app_bsky::embed::Images<'s>>),
+
#[serde(rename = "app.bsky.embed.video")]
+
Video(Box<jacquard_api::app_bsky::embed::Video<'s>>),
+
}
+
```
+
+
- Use `Box<T>` for all variants (handles circular refs)
+
- `#[open_union]` adds `Unknown(Data<'s>)` catch-all
+
+
#### Queries (lexXrpcQuery)
+
```rust
+
pub struct GetAuthorFeedParams<'s> {
+
pub actor: AtIdentifier<'s>,
+
pub limit: Option<i64>,
+
pub cursor: Option<CowStr<'s>>,
+
}
+
+
pub struct GetAuthorFeedOutput<'s> {
+
pub cursor: Option<CowStr<'s>>,
+
pub feed: Vec<FeedViewPost<'s>>,
+
}
+
```
+
+
- Flat params/output structs
+
- No nesting like `Input { params: {...} }`
+
+
#### Procedures (lexXrpcProcedure)
+
Same as queries but with both `Input` and `Output` structs.
+
+
### Field Handling
+
+
#### Optional Fields
+
- Fields not in `required: []` → `Option<T>`
+
- Add `#[serde(skip_serializing_if = "Option::is_none")]`
+
+
#### Lifetimes
+
- All types have `'a` lifetime for borrowing from input
+
- `#[serde(borrow)]` where needed for zero-copy
+
+
#### Type Mapping
+
- `LexString` with format → specific types (`Datetime`, `Did`, etc)
+
- `LexString` without format → `CowStr<'a>`
+
- `LexInteger` → `i64`
+
- `LexBoolean` → `bool`
+
- `LexBytes` → `Bytes`
+
- `LexCidLink` → `CidLink<'a>`
+
- `LexBlob` → `Blob<'a>`
+
- `LexRef` → resolve to actual type path
+
- `LexRefUnion` → generate enum
+
- `LexArray` → `Vec<T>`
+
- `LexUnknown` → `Data<'a>`
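The mapping above can be sketched as a single match. This uses a simplified stand-in enum rather than the crate's actual `lexicon.rs` types, and emits Rust type names as strings for illustration:

```rust
/// Simplified stand-in for the parsed lexicon type system.
enum LexType {
    String { format: Option<&'static str> },
    Integer,
    Boolean,
    Bytes,
    CidLink,
    Blob,
    Unknown,
    Array(Box<LexType>),
}

/// Map a lexicon type to the Rust type it should generate.
fn rust_type(t: &LexType) -> String {
    match t {
        LexType::String { format: Some("datetime") } => "Datetime".into(),
        LexType::String { format: Some("did") } => "Did<'a>".into(),
        // ...other formats map to their specific types
        LexType::String { format: _ } => "CowStr<'a>".into(),
        LexType::Integer => "i64".into(),
        LexType::Boolean => "bool".into(),
        LexType::Bytes => "Bytes".into(),
        LexType::CidLink => "CidLink<'a>".into(),
        LexType::Blob => "Blob<'a>".into(),
        LexType::Unknown => "Data<'a>".into(),
        LexType::Array(inner) => format!("Vec<{}>", rust_type(inner)),
    }
}
```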
+
+
### Reference Resolution
+
+
#### Known Refs
+
- Check corpus for ref existence
+
- `#ref: "app.bsky.embed.images"` → `jacquard_api::app_bsky::embed::Images<'a>`
+
- Handle fragments: `#ref: "com.example.foo#bar"` → `jacquard_api::com_example::foo::Bar<'a>`
+
+
#### Unknown Refs
+
- **In struct fields**: use `Data<'a>` as fallback type
+
- **In union variants**: handled by `Unknown(Data<'a>)` variant from `#[open_union]`
+
- Optional: log warnings for missing refs
+
+
## Implementation Phases
+
+
### Phase 1: Corpus Loading & Registry
+
**Goal**: Load all lexicons into memory for ref resolution
+
+
**Tasks**:
+
1. Create `LexiconCorpus` struct
+
- `HashMap<SmolStr, LexiconDoc<'static>>` - NSID → doc
+
- Methods: `load_from_dir()`, `get()`, `resolve_ref()`
+
2. Load all `.json` files from lexicon directory
+
3. Parse into `LexiconDoc` and insert into registry
+
4. Handle fragments in refs (`nsid#def`)
+
+
**Output**: Corpus registry that can resolve any ref
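A minimal sketch of the registry, with a placeholder `Doc` standing in for `LexiconDoc` and `String` keys in place of `SmolStr`; the fragment handling follows the `nsid#def` convention described above:

```rust
use std::collections::HashMap;

/// Placeholder for the parsed LexiconDoc.
struct Doc;

struct LexiconCorpus {
    docs: HashMap<String, Doc>,
}

impl LexiconCorpus {
    fn get(&self, nsid: &str) -> Option<&Doc> {
        self.docs.get(nsid)
    }

    /// Resolve a ref, handling the "nsid#def" fragment form.
    /// A bare "#def" refers to the current doc (passed as `current`);
    /// a ref without a fragment points at the "main" def.
    fn resolve_ref<'a>(&'a self, r: &str, current: &str) -> Option<(&'a Doc, String)> {
        let (nsid, def) = match r.split_once('#') {
            Some(("", def)) => (current, def),
            Some((nsid, def)) => (nsid, def),
            None => (r, "main"),
        };
        self.get(nsid).map(|doc| (doc, def.to_string()))
    }
}
```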
+
+
### Phase 2: Ref Analysis & Union Collection
+
**Goal**: Build complete picture of what refs exist and what unions need
+
+
**Tasks**:
+
1. Extend `find_ref_unions()` to work across entire corpus
+
2. For each union, collect all refs and check existence
+
3. Build `UnionRegistry`:
+
- Union name → list of (known refs, unknown refs)
+
4. Detect circular refs (optional - or just Box everything)
+
+
**Output**: Complete list of unions to generate with their variants
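The known/unknown split in task 3 amounts to partitioning each union's refs against the corpus. A std-only sketch (names are illustrative):

```rust
use std::collections::HashSet;

/// Partition a union's refs into (known, unknown) against the set of
/// loaded NSIDs, ignoring any "#fragment" when checking existence.
fn partition_refs(refs: &[&str], known_nsids: &HashSet<&str>) -> (Vec<String>, Vec<String>) {
    let mut known = Vec::new();
    let mut unknown = Vec::new();
    for &r in refs {
        let nsid = r.split('#').next().unwrap_or(r);
        if known_nsids.contains(nsid) {
            known.push(r.to_string());
        } else {
            unknown.push(r.to_string());
        }
    }
    (known, unknown)
}
```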
+
+
### Phase 3: Code Generation - Core Types
+
**Goal**: Generate Rust code for individual types
+
+
**Tasks**:
+
1. Implement type generators:
+
- `generate_struct()` for records/objects
+
- `generate_enum()` for unions
+
- `generate_field()` for object properties
+
- `generate_type()` for primitives/refs
+
2. Handle optional fields (`required` list)
+
3. Add doc comments from `description`
+
4. Apply `#[lexicon]` / `#[open_union]` macros
+
5. Add serde attributes
+
+
**Output**: `TokenStream` for each type
+
+
### Phase 4: Module Organization
+
**Goal**: Organize generated types into module hierarchy
+
+
**Tasks**:
+
1. Parse NSID into components: `["app", "bsky", "feed", "post"]`
+
2. Determine file paths: `app_bsky/feed/post.rs`
+
3. Generate module files: `app_bsky/feed.rs` with `pub mod post;`
+
4. Generate root module: `app_bsky.rs`
+
5. Handle re-exports if needed
+
+
**Output**: File path → generated code mapping
+
+
### Phase 5: File Writing
+
**Goal**: Write generated code to filesystem
+
+
**Tasks**:
+
1. Format code with `prettyplease`
+
2. Create directory structure
+
3. Write module files
+
4. Write type files
+
5. Optional: run `rustfmt`
+
+
**Output**: Generated code on disk
+
+
### Phase 6: Testing & Validation
+
**Goal**: Ensure generated code compiles and works
+
+
**Tasks**:
+
1. Generate code for test lexicons
+
2. Compile generated code
+
3. Test serialization/deserialization
+
4. Test union variant matching
+
5. Test extra_data capture
+
+
## Edge Cases & Considerations
+
+
### Circular References
+
- **Simple approach**: Union variants always use `Box<T>` → handles all circular refs
+
- **Alternative**: DFS cycle detection to only Box when needed
+
- Track visited refs and recursion stack
+
- If ref appears in rec_stack → cycle detected
+
- Algorithm:
+
```rust
+
fn has_cycle(
    corpus: &LexiconCorpus,
    start_ref: &str,
    visited: &mut HashSet<String>,
    rec_stack: &mut HashSet<String>,
) -> bool {
    visited.insert(start_ref.to_owned());
    rec_stack.insert(start_ref.to_owned());

    for child_ref in collect_refs_from_def(corpus.resolve(start_ref)) {
        if !visited.contains(&child_ref) {
            if has_cycle(corpus, &child_ref, visited, rec_stack) {
                return true;
            }
        } else if rec_stack.contains(&child_ref) {
            return true; // back edge = cycle
        }
    }

    rec_stack.remove(start_ref);
    false
}
+
```
+
- Only box variants that participate in cycles
+
- **Recommendation**: Start with simple (always Box), optimize later if needed
+
+
### Name Collisions
+
- Multiple types with same name in different lexicons
+
- Module path disambiguates: `app_bsky::feed::Post` vs `com_example::feed::Post`
+
+
### Unknown Refs
+
- Fallback to `Data<'s>` in struct fields
+
- Caught by `Unknown` variant in unions
+
- Warn during generation
+
+
### Inline Defs
+
- Nested objects/unions in same lexicon
+
- Generate as separate types in same file
+
- Keep names scoped to parent (e.g., `PostReplyRef`)
+
+
### Arrays
+
- `Vec<T>` for arrays
+
- Handle nested unions in arrays
+
+
### Tokens
+
- Simple marker types
+
- Generate as unit structs or type aliases?
+
+
## Traits for Generated Types
+
+
### Collection Trait (Records)
+
Records implement the existing `Collection` trait from jacquard-common:
+
+
```rust
+
pub struct Post<'a> {
+
// ... fields
+
}
+
+
impl Collection for Post<'_> {
+
const NSID: &'static str = "app.bsky.feed.post";
+
type Record = Post<'static>;
+
}
+
```
+
+
### XrpcRequest Trait (Queries/Procedures)
+
New trait for XRPC endpoints:
+
+
```rust
+
pub trait XrpcRequest<'x> {
+
/// The NSID for this XRPC method
+
const NSID: &'static str;
+
+
/// HTTP method (GET for queries, POST for procedures)
+
const METHOD: XrpcMethod;
+
+
/// Input encoding (MIME type, e.g., "application/json")
+
/// None for queries (no body)
+
const INPUT_ENCODING: Option<&'static str>;
+
+
/// Output encoding (MIME type)
+
const OUTPUT_ENCODING: &'static str;
+
+
/// Request parameters type (query params or body)
+
type Params: Serialize;
+
+
/// Response output type
+
type Output: Deserialize<'x>;
+
}
+
+
pub enum XrpcMethod {
+
Query, // GET
+
Procedure, // POST
+
}
+
```
+
+
**Generated implementation:**
+
```rust
+
pub struct GetAuthorFeedParams<'a> {
+
pub actor: AtIdentifier<'a>,
+
pub limit: Option<i64>,
+
pub cursor: Option<CowStr<'a>>,
+
}
+
+
pub struct GetAuthorFeedOutput<'a> {
+
pub cursor: Option<CowStr<'a>>,
+
pub feed: Vec<FeedViewPost<'a>>,
+
}
+
+
impl<'x> XrpcRequest<'x> for GetAuthorFeedParams<'_> {
+
const NSID: &'static str = "app.bsky.feed.getAuthorFeed";
+
const METHOD: XrpcMethod = XrpcMethod::Query;
+
const INPUT_ENCODING: Option<&'static str> = None; // queries have no body
+
const OUTPUT_ENCODING: &'static str = "application/json";
+
+
type Params = Self;
+
type Output = GetAuthorFeedOutput<'static>;
+
}
+
```
+
+
**Encoding variations:**
+
- Most procedures: `"application/json"` for input/output
+
- Blob uploads: `"*/*"` or specific MIME type for input
+
- CAR files: `"application/vnd.ipld.car"` for repo operations
+
- Read from lexicon's `input.encoding` and `output.encoding` fields
+
+
**Trait benefits:**
+
- Allows monomorphization (static dispatch) for performance
+
- Also supports `dyn XrpcRequest` for dynamic dispatch if needed
+
- Client code can be generic over `impl XrpcRequest`
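To illustrate the static-dispatch point, here is a pared-down stand-in for the planned trait (associated consts only, lifetimes and serde bounds omitted) with a client helper that is generic over endpoint types:

```rust
/// Pared-down stand-in for the planned XrpcRequest trait.
enum XrpcMethod {
    Query,     // GET
    Procedure, // POST
}

trait XrpcRequest {
    const NSID: &'static str;
    const METHOD: XrpcMethod;
}

struct GetAuthorFeedParams;

impl XrpcRequest for GetAuthorFeedParams {
    const NSID: &'static str = "app.bsky.feed.getAuthorFeed";
    const METHOD: XrpcMethod = XrpcMethod::Query;
}

/// Build the request URL for any endpoint type; dispatch is static,
/// so this monomorphizes per endpoint.
fn xrpc_url<R: XrpcRequest>(base: &str) -> String {
    format!("{}/xrpc/{}", base, R::NSID)
}
```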
+
+
### Subscriptions
+
WebSocket streams; deferred for now. These will need a separate trait with message types.
+
+
## Open Questions
+
+
1. **Validation**: Generate runtime validation (min/max length, regex, etc)?
+
2. **Tokens**: How to represent token types?
+
3. **Errors**: How to handle codegen errors (missing refs, invalid schemas)?
+
4. **Incremental**: Support incremental codegen (only changed lexicons)?
+
5. **Formatting**: Always run rustfmt or rely on prettyplease?
+
6. **XrpcRequest location**: Should trait live in jacquard-common or separate jacquard-xrpc crate?
+
+
## Success Criteria
+
+
- [ ] Generates code for all official AT Protocol lexicons
+
- [ ] Generated code compiles without errors
+
- [ ] No `Main` proliferation
+
- [ ] Union variants have readable names
+
- [ ] Unknown refs handled gracefully
+
- [ ] `#[lexicon]` and `#[open_union]` applied correctly
+
- [ ] Serialization round-trips correctly
+14
crates/jacquard-api/Cargo.toml
···
···
+
[package]
+
name = "jacquard-api"
+
edition.workspace = true
+
version.workspace = true
+
authors.workspace = true
+
repository.workspace = true
+
keywords.workspace = true
+
categories.workspace = true
+
readme.workspace = true
+
documentation.workspace = true
+
exclude.workspace = true
+
description.workspace = true
+
+
[dependencies]
+16
crates/jacquard-api/src/lib.rs
···
···
+
//placeholder for codegen api output
+
+
pub fn add(left: u64, right: u64) -> u64 {
+
left + right
+
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn it_works() {
+
let result = add(2, 2);
+
assert_eq!(result, 4);
+
}
+
}
+1
crates/jacquard-common/Cargo.toml
···
miette = "7.6.0"
multibase = "0.9.1"
multihash = "0.19.3"
ouroboros = "0.18.5"
rand = "0.9.2"
regex = "1.11.3"
···
miette = "7.6.0"
multibase = "0.9.1"
multihash = "0.19.3"
+
num-traits = "0.2.19"
ouroboros = "0.18.5"
rand = "0.9.2"
regex = "1.11.3"
+22
crates/jacquard-common/src/cowstr.rs
···
}
}
impl Eq for CowStr<'_> {}
impl Hash for CowStr<'_> {
···
}
}
+
impl PartialOrd for CowStr<'_> {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        // Delegate to Ord so the two impls cannot drift apart.
        Some(self.cmp(other))
    }
}
+
+
impl Ord for CowStr<'_> {
+
fn cmp(&self, other: &Self) -> std::cmp::Ordering {
+
match (self, other) {
+
(CowStr::Borrowed(s1), CowStr::Borrowed(s2)) => s1.cmp(s2),
+
(CowStr::Borrowed(s1), CowStr::Owned(s2)) => s1.cmp(&s2.as_ref()),
+
(CowStr::Owned(s1), CowStr::Borrowed(s2)) => s1.as_str().cmp(s2),
+
(CowStr::Owned(s1), CowStr::Owned(s2)) => s1.cmp(s2),
+
}
+
}
+
}
+
impl Eq for CowStr<'_> {}
impl Hash for CowStr<'_> {
-2
crates/jacquard-common/src/types.rs
···
use serde::{Deserialize, Serialize};
-
use crate::types::nsid::Nsid;
-
pub mod aturi;
pub mod blob;
pub mod cid;
···
use serde::{Deserialize, Serialize};
pub mod aturi;
pub mod blob;
pub mod cid;
+77 -1
crates/jacquard-common/src/types/aturi.rs
···
pub type UriPathBuf = UriPath<'static>;
pub static ATURI_REGEX: LazyLock<Regex> = LazyLock::new(|| {
-
Regex::new(r##"^at://(?<authority>[a-zA-Z0-9._:%-]+)(/(?<collection>[a-zA-Z0-9-.]+)(/(?<rkey>[a-zA-Z0-9._~:@!$&%')(*+,;=-]+))?)?(#(?<fragment>/[a-zA-Z0-9._~:@!$&%')(*+,;=-[]/\]*))?$"##).unwrap()
});
impl<'u> AtUri<'u> {
···
self.inner.borrow_uri().as_ref()
}
}
···
pub type UriPathBuf = UriPath<'static>;
pub static ATURI_REGEX: LazyLock<Regex> = LazyLock::new(|| {
+
// Fragment allows: / and \ and other special chars. In raw string, backslashes are literal.
+
Regex::new(r##"^at://(?<authority>[a-zA-Z0-9._:%-]+)(/(?<collection>[a-zA-Z0-9-.]+)(/(?<rkey>[a-zA-Z0-9._~:@!$&%')(*+,;=-]+))?)?(#(?<fragment>/[a-zA-Z0-9._~:@!$&%')(*+,;=\-\[\]/\\]*))?$"##).unwrap()
});
impl<'u> AtUri<'u> {
···
self.inner.borrow_uri().as_ref()
}
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn valid_at_uris() {
+
assert!(AtUri::new("at://did:plc:foo").is_ok());
+
assert!(AtUri::new("at://alice.bsky.social").is_ok());
+
assert!(AtUri::new("at://did:plc:foo/com.example.post").is_ok());
+
assert!(AtUri::new("at://did:plc:foo/com.example.post/123").is_ok());
+
}
+
+
#[test]
+
fn authority_only() {
+
let uri = AtUri::new("at://alice.test").unwrap();
+
assert_eq!(uri.authority().as_str(), "alice.test");
+
assert!(uri.collection().is_none());
+
assert!(uri.rkey().is_none());
+
}
+
+
#[test]
+
fn authority_and_collection() {
+
let uri = AtUri::new("at://alice.test/com.example.foo").unwrap();
+
assert_eq!(uri.authority().as_str(), "alice.test");
+
assert_eq!(uri.collection().unwrap().as_str(), "com.example.foo");
+
assert!(uri.rkey().is_none());
+
}
+
+
#[test]
+
fn full_uri() {
+
let uri = AtUri::new("at://alice.test/com.example.foo/123").unwrap();
+
assert_eq!(uri.authority().as_str(), "alice.test");
+
assert_eq!(uri.collection().unwrap().as_str(), "com.example.foo");
+
assert_eq!(uri.rkey().unwrap().as_ref(), "123");
+
}
+
+
#[test]
+
fn with_fragment() {
+
let uri = AtUri::new("at://alice.test/com.example.foo/123#/path").unwrap();
+
assert_eq!(uri.fragment().as_ref().unwrap().as_ref(), "/path");
+
+
// Fragment must start with /
+
assert!(AtUri::new("at://alice.test#path").is_err());
+
assert!(AtUri::new("at://alice.test#/foo/bar").is_ok());
+
}
+
+
#[test]
+
fn no_trailing_slash() {
+
assert!(AtUri::new("at://alice.test/").is_err());
+
assert!(AtUri::new("at://alice.test/com.example.foo/").is_err());
+
}
+
+
#[test]
+
fn must_have_authority() {
+
assert!(AtUri::new("at://").is_err());
+
assert!(AtUri::new("at:///com.example.foo").is_err());
+
}
+
+
#[test]
+
fn must_start_with_at_scheme() {
+
assert!(AtUri::new("alice.test").is_err());
+
assert!(AtUri::new("https://alice.test").is_err());
+
}
+
+
#[test]
+
fn max_length() {
+
// Spec says 8KB max
+
let long_did = format!("did:plc:{}", "a".repeat(8000));
+
let uri = format!("at://{}", long_did);
+
assert!(uri.len() < 8192);
+
// Should work if components are valid
+
// (our DID will fail at 2048 chars, but this tests the URI doesn't impose extra limits)
+
}
+
}
+33 -1
crates/jacquard-common/src/types/blob.rs
···
str::FromStr,
};
-
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq, Hash)]
#[serde(rename_all = "camelCase")]
pub struct Blob<'b> {
pub r#ref: Cid<'b>,
#[serde(borrow)]
pub mime_type: MimeType<'b>,
pub size: usize,
}
impl IntoStatic for Blob<'_> {
···
str::FromStr,
};
+
#[derive(Deserialize, Debug, Clone, PartialEq, Eq, Hash)]
#[serde(rename_all = "camelCase")]
pub struct Blob<'b> {
pub r#ref: Cid<'b>,
#[serde(borrow)]
pub mime_type: MimeType<'b>,
pub size: usize,
+
}
+
+
impl Serialize for Blob<'_> {
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
+
where
+
S: Serializer,
+
{
+
use serde::ser::SerializeMap;
+
+
if serializer.is_human_readable() {
+
// JSON: ref needs to be {"$link": "cid"}
+
let mut map = serializer.serialize_map(Some(4))?;
+
map.serialize_entry("$type", "blob")?;
+
+
// Serialize ref as {"$link": "cid_string"}
+
let mut ref_map = std::collections::BTreeMap::new();
+
ref_map.insert("$link", self.r#ref.as_str());
+
map.serialize_entry("ref", &ref_map)?;
+
+
map.serialize_entry("mimeType", &self.mime_type)?;
+
map.serialize_entry("size", &self.size)?;
+
map.end()
+
} else {
+
// CBOR: ref is just the CID directly
+
let mut map = serializer.serialize_map(Some(4))?;
+
map.serialize_entry("$type", "blob")?;
+
map.serialize_entry("ref", &self.r#ref)?;
+
map.serialize_entry("mimeType", &self.mime_type)?;
+
map.serialize_entry("size", &self.size)?;
+
map.end()
+
}
+
}
}
impl IntoStatic for Blob<'_> {
+266
crates/jacquard-common/src/types/cid.rs
···
self.as_str()
}
}
···
self.as_str()
}
}
+
+
/// CID link wrapper that serializes as {"$link": "cid"} in JSON
+
/// and as raw CID in CBOR
+
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
+
#[repr(transparent)]
+
pub struct CidLink<'c>(pub Cid<'c>);
+
+
impl<'c> CidLink<'c> {
+
pub fn new(cid: &'c [u8]) -> Result<Self, Error> {
+
Ok(Self(Cid::new(cid)?))
+
}
+
+
pub fn new_owned(cid: &[u8]) -> Result<CidLink<'static>, Error> {
+
Ok(CidLink(Cid::new_owned(cid)?))
+
}
+
+
pub fn new_static(cid: &'static str) -> Self {
+
Self(Cid::str(cid))
+
}
+
+
pub fn ipld(cid: IpldCid) -> CidLink<'static> {
+
CidLink(Cid::ipld(cid))
+
}
+
+
pub fn str(cid: &'c str) -> Self {
+
Self(Cid::str(cid))
+
}
+
+
pub fn cow_str(cid: CowStr<'c>) -> Self {
+
Self(Cid::cow_str(cid))
+
}
+
+
pub fn as_str(&self) -> &str {
+
self.0.as_str()
+
}
+
+
pub fn to_ipld(&self) -> Result<IpldCid, cid::Error> {
+
self.0.to_ipld()
+
}
+
+
pub fn into_inner(self) -> Cid<'c> {
+
self.0
+
}
+
}
+
+
impl fmt::Display for CidLink<'_> {
+
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+
self.0.fmt(f)
+
}
+
}
+
+
impl FromStr for CidLink<'_> {
+
type Err = Infallible;
+
+
fn from_str(s: &str) -> Result<Self, Self::Err> {
+
Ok(CidLink(Cid::from_str(s)?))
+
}
+
}
+
+
impl IntoStatic for CidLink<'_> {
+
type Output = CidLink<'static>;
+
+
fn into_static(self) -> Self::Output {
+
CidLink(self.0.into_static())
+
}
+
}
+
+
impl Serialize for CidLink<'_> {
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
+
where
+
S: Serializer,
+
{
+
if serializer.is_human_readable() {
+
// JSON: {"$link": "cid_string"}
+
use serde::ser::SerializeMap;
+
let mut map = serializer.serialize_map(Some(1))?;
+
map.serialize_entry("$link", self.0.as_str())?;
+
map.end()
+
} else {
+
// CBOR: raw CID
+
self.0.serialize(serializer)
+
}
+
}
+
}
+
+
impl<'de> Deserialize<'de> for CidLink<'_> {
+
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+
where
+
D: Deserializer<'de>,
+
{
+
if deserializer.is_human_readable() {
+
// JSON: expect {"$link": "cid_string"}
+
struct LinkVisitor;
+
+
impl<'de> Visitor<'de> for LinkVisitor {
+
type Value = CidLink<'static>;
+
+
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
+
formatter.write_str("a CID link object with $link field")
+
}
+
+
fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
+
where
+
A: serde::de::MapAccess<'de>,
+
{
+
use serde::de::Error;
+
+
let mut link: Option<String> = None;
+
+
while let Some(key) = map.next_key::<String>()? {
+
if key == "$link" {
+
link = Some(map.next_value()?);
+
} else {
+
// Skip unknown fields
+
let _: serde::de::IgnoredAny = map.next_value()?;
+
}
+
}
+
+
if let Some(cid_str) = link {
+
Ok(CidLink(Cid::from(cid_str)))
+
} else {
+
Err(A::Error::missing_field("$link"))
+
}
+
}
+
}
+
+
deserializer.deserialize_map(LinkVisitor)
+
} else {
+
// CBOR: raw CID
+
Ok(CidLink(Cid::deserialize(deserializer)?))
+
}
+
}
+
}
+
+
impl From<CidLink<'_>> for String {
+
fn from(value: CidLink) -> Self {
+
value.0.into()
+
}
+
}
+
+
impl<'c> From<CidLink<'c>> for CowStr<'c> {
+
fn from(value: CidLink<'c>) -> Self {
+
value.0.into()
+
}
+
}
+
+
impl From<String> for CidLink<'_> {
+
fn from(value: String) -> Self {
+
CidLink(Cid::from(value))
+
}
+
}
+
+
impl<'c> From<CowStr<'c>> for CidLink<'c> {
+
fn from(value: CowStr<'c>) -> Self {
+
CidLink(Cid::from(value))
+
}
+
}
+
+
impl From<IpldCid> for CidLink<'_> {
+
fn from(value: IpldCid) -> Self {
+
CidLink(Cid::from(value))
+
}
+
}
+
+
impl<'c> From<Cid<'c>> for CidLink<'c> {
+
fn from(value: Cid<'c>) -> Self {
+
CidLink(value)
+
}
+
}
+
+
impl<'c> From<CidLink<'c>> for Cid<'c> {
+
fn from(value: CidLink<'c>) -> Self {
+
value.0
+
}
+
}
+
+
impl AsRef<str> for CidLink<'_> {
+
fn as_ref(&self) -> &str {
+
self.0.as_ref()
+
}
+
}
+
+
impl Deref for CidLink<'_> {
+
type Target = str;
+
+
fn deref(&self) -> &Self::Target {
+
self.0.deref()
+
}
+
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
const TEST_CID: &str = "bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha";
+
+
#[test]
+
fn cidlink_serialize_json() {
+
let link = CidLink::str(TEST_CID);
+
let json = serde_json::to_string(&link).unwrap();
+
assert_eq!(json, r#"{"$link":"bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"}"#);
+
}
+
+
#[test]
+
fn cidlink_deserialize_json() {
+
let json = r#"{"$link":"bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"}"#;
+
let link: CidLink = serde_json::from_str(json).unwrap();
+
assert_eq!(link.as_str(), TEST_CID);
+
}
+
+
#[test]
+
fn cidlink_roundtrip_json() {
+
let link = CidLink::str(TEST_CID);
+
let json = serde_json::to_string(&link).unwrap();
+
let parsed: CidLink = serde_json::from_str(&json).unwrap();
+
assert_eq!(link, parsed);
+
assert_eq!(link.as_str(), TEST_CID);
+
}
+
+
#[test]
+
fn cidlink_constructors() {
+
let link1 = CidLink::str(TEST_CID);
+
let link2 = CidLink::cow_str(CowStr::Borrowed(TEST_CID));
+
let link3 = CidLink::from(TEST_CID.to_string());
+
let link4 = CidLink::new_static(TEST_CID);
+
+
assert_eq!(link1.as_str(), TEST_CID);
+
assert_eq!(link2.as_str(), TEST_CID);
+
assert_eq!(link3.as_str(), TEST_CID);
+
assert_eq!(link4.as_str(), TEST_CID);
+
}
+
+
#[test]
+
fn cidlink_conversions() {
+
let link = CidLink::str(TEST_CID);
+
+
// CidLink -> Cid
+
let cid: Cid = link.clone().into();
+
assert_eq!(cid.as_str(), TEST_CID);
+
+
// Cid -> CidLink
+
let link2: CidLink = cid.into();
+
assert_eq!(link2.as_str(), TEST_CID);
+
+
// CidLink -> String
+
let s: String = link.clone().into();
+
assert_eq!(s, TEST_CID);
+
+
// CidLink -> CowStr
+
let cow: CowStr = link.into();
+
assert_eq!(cow.as_ref(), TEST_CID);
+
}
+
+
#[test]
+
fn cidlink_display() {
+
let link = CidLink::str(TEST_CID);
+
assert_eq!(format!("{}", link), TEST_CID);
+
}
+
+
#[test]
+
fn cidlink_deref() {
+
let link = CidLink::str(TEST_CID);
+
assert_eq!(&*link, TEST_CID);
+
assert_eq!(link.as_ref(), TEST_CID);
+
}
+
}
+73 -1
crates/jacquard-common/src/types/datetime.rs
···
use chrono::DurationRound;
use serde::Serializer;
use serde::{Deserialize, Deserializer, Serialize, de::Error};
-
use smol_str::ToSmolStr;
use std::sync::LazyLock;
use std::{cmp, str::FromStr};
···
}
}
}
···
use chrono::DurationRound;
use serde::Serializer;
use serde::{Deserialize, Deserializer, Serialize, de::Error};
+
use smol_str::{SmolStr, ToSmolStr};
+
use std::fmt;
use std::sync::LazyLock;
use std::{cmp, str::FromStr};
···
}
}
}
+
+
impl From<chrono::DateTime<chrono::FixedOffset>> for Datetime {
+
fn from(dt: chrono::DateTime<chrono::FixedOffset>) -> Self {
+
Self::new(dt)
+
}
+
}
+
+
impl From<Datetime> for String {
+
fn from(value: Datetime) -> Self {
+
value.serialized.to_string()
+
}
+
}
+
+
impl From<Datetime> for SmolStr {
+
fn from(value: Datetime) -> Self {
+
match value.serialized {
+
CowStr::Borrowed(s) => SmolStr::new(s),
+
CowStr::Owned(s) => s,
+
}
+
}
+
}
+
+
impl From<Datetime> for CowStr<'static> {
+
fn from(value: Datetime) -> Self {
+
value.serialized
+
}
+
}
+
+
impl AsRef<str> for Datetime {
+
fn as_ref(&self) -> &str {
+
self.as_str()
+
}
+
}
+
+
impl fmt::Display for Datetime {
+
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+
f.write_str(self.as_str())
+
}
+
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn valid_datetimes() {
+
assert!(Datetime::from_str("2023-01-15T12:30:45.123456Z").is_ok());
+
assert!(Datetime::from_str("2023-01-15T12:30:45Z").is_ok());
+
assert!(Datetime::from_str("2023-01-15T12:30:45+00:00").is_ok());
+
assert!(Datetime::from_str("2023-01-15T12:30:45-05:00").is_ok());
+
}
+
+
#[test]
+
fn microsecond_precision() {
+
let dt = Datetime::from_str("2023-01-15T12:30:45.123456Z").unwrap();
+
assert!(dt.as_str().contains(".123456"));
+
}
+
+
#[test]
+
fn requires_timezone() {
+
// Missing timezone should fail
+
assert!(Datetime::from_str("2023-01-15T12:30:45").is_err());
+
}
+
+
#[test]
+
fn round_trip() {
+
let original = "2023-01-15T12:30:45.123456Z";
+
let dt = Datetime::from_str(original).unwrap();
+
assert_eq!(dt.as_str(), original);
+
}
+
}
+97
crates/jacquard-common/src/types/did.rs
···
#[repr(transparent)]
pub struct Did<'d>(CowStr<'d>);
pub static DID_REGEX: LazyLock<Regex> =
LazyLock::new(|| Regex::new(r"^did:[a-z]+:[a-zA-Z0-9._:%-]*[a-zA-Z0-9._-]$").unwrap());
···
self.as_str()
}
}
···
#[repr(transparent)]
pub struct Did<'d>(CowStr<'d>);
+
/// Regex for DID validation per AT Protocol spec.
+
///
+
/// Note: This regex allows `%` in the identifier but prevents DIDs from ending with `:` or `%`.
+
/// It does NOT validate that percent-encoding is well-formed (i.e., `%XX` where XX are hex digits).
+
/// This matches the behavior of the official TypeScript implementation, which also does not
+
/// enforce percent-encoding validity at validation time. While the spec states "percent sign
+
/// must be followed by two hex characters," this is treated as a best practice rather than
+
/// a hard validation requirement.
pub static DID_REGEX: LazyLock<Regex> =
LazyLock::new(|| Regex::new(r"^did:[a-z]+:[a-zA-Z0-9._:%-]*[a-zA-Z0-9._-]$").unwrap());
···
self.as_str()
}
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn valid_dids() {
+
assert!(Did::new("did:plc:abc123").is_ok());
+
assert!(Did::new("did:web:example.com").is_ok());
+
assert!(Did::new("did:method:val_ue").is_ok());
+
assert!(Did::new("did:method:val-ue").is_ok());
+
assert!(Did::new("did:method:val.ue").is_ok());
+
assert!(Did::new("did:method:val%20ue").is_ok());
+
}
+
+
#[test]
+
fn prefix_stripping() {
+
assert_eq!(Did::new("at://did:plc:foo").unwrap().as_str(), "did:plc:foo");
+
assert_eq!(Did::new("did:plc:foo").unwrap().as_str(), "did:plc:foo");
+
}
+
+
#[test]
+
fn must_start_with_did() {
+
assert!(Did::new("DID:plc:foo").is_err());
+
assert!(Did::new("plc:foo").is_err());
+
assert!(Did::new("foo").is_err());
+
}
+
+
#[test]
+
fn method_must_be_lowercase() {
+
assert!(Did::new("did:plc:foo").is_ok());
+
assert!(Did::new("did:PLC:foo").is_err());
+
assert!(Did::new("did:Plc:foo").is_err());
+
}
+
+
#[test]
+
fn cannot_end_with_colon_or_percent() {
+
assert!(Did::new("did:plc:foo:").is_err());
+
assert!(Did::new("did:plc:foo%").is_err());
+
assert!(Did::new("did:plc:foo:bar").is_ok());
+
}
+
+
#[test]
+
fn max_length() {
+
let valid_2048 = format!("did:plc:{}", "a".repeat(2048 - 8));
+
assert_eq!(valid_2048.len(), 2048);
+
assert!(Did::new(&valid_2048).is_ok());
+
+
let too_long_2049 = format!("did:plc:{}", "a".repeat(2049 - 8));
+
assert_eq!(too_long_2049.len(), 2049);
+
assert!(Did::new(&too_long_2049).is_err());
+
}
+
+
#[test]
+
fn allowed_characters() {
+
assert!(Did::new("did:method:abc123").is_ok());
+
assert!(Did::new("did:method:ABC123").is_ok());
+
assert!(Did::new("did:method:a_b_c").is_ok());
+
assert!(Did::new("did:method:a-b-c").is_ok());
+
assert!(Did::new("did:method:a.b.c").is_ok());
+
assert!(Did::new("did:method:a:b:c").is_ok());
+
}
+
+
#[test]
+
fn disallowed_characters() {
+
assert!(Did::new("did:method:a b").is_err());
+
assert!(Did::new("did:method:a@b").is_err());
+
assert!(Did::new("did:method:a#b").is_err());
+
assert!(Did::new("did:method:a?b").is_err());
+
}
+
+
#[test]
+
fn percent_encoding() {
+
// Valid percent encoding
+
assert!(Did::new("did:method:foo%20bar").is_ok());
+
assert!(Did::new("did:method:foo%2Fbar").is_ok());
+
+
// DIDs cannot end with %
+
assert!(Did::new("did:method:foo%").is_err());
+
+
// IMPORTANT: The regex does NOT validate that percent-encoding is well-formed.
+
// This matches the TypeScript reference implementation's behavior.
+
// While the spec says "percent sign must be followed by two hex characters",
+
// implementations treat this as a best practice, not a hard validation requirement.
+
// Thus, malformed percent encoding like %2x is accepted by the regex.
+
assert!(Did::new("did:method:foo%2x").is_ok());
+
assert!(Did::new("did:method:foo%ZZ").is_ok());
+
}
+
}
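The percent-encoding leniency documented above is a property of the pattern itself. As a stdlib-only sketch (the helper name is ours, not crate API), the shape `DID_REGEX` enforces reduces to a lowercase method, a restricted identifier charset, and a final character that is neither `:` nor `%`:

```rust
// Stdlib-only sketch of the shape DID_REGEX enforces (helper name is
// ours, not crate API): "did:", a lowercase method, then an identifier
// drawn from a restricted charset that may not end in ':' or '%'.
fn matches_did_shape(s: &str) -> bool {
    let Some(rest) = s.strip_prefix("did:") else {
        return false;
    };
    let Some((method, id)) = rest.split_once(':') else {
        return false;
    };
    !method.is_empty()
        && method.bytes().all(|b| b.is_ascii_lowercase())
        && !id.is_empty()
        && id.bytes().all(|b| b.is_ascii_alphanumeric() || b"._:%-".contains(&b))
        && !id.ends_with([':', '%'])
}

fn main() {
    // Malformed percent-encoding passes: only the charset is checked.
    assert!(matches_did_shape("did:method:foo%2x"));
    // A trailing ':' or '%' is rejected by the final character class.
    assert!(!matches_did_shape("did:plc:foo:"));
    // The method must be lowercase.
    assert!(!matches_did_shape("did:PLC:foo"));
}
```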
+120 -26
crates/jacquard-common/src/types/handle.rs
···
impl<'h> Handle<'h> {
/// Fallible constructor, validates, borrows from input
///
-
/// Accepts (and strips) preceding '@' if present
pub fn new(handle: &'h str) -> Result<Self, AtStrError> {
-
let handle = handle
.strip_prefix("at://")
-
.unwrap_or(handle)
-
.strip_prefix('@')
.unwrap_or(handle);
-
if handle.len() > 253 {
-
Err(AtStrError::too_long("handle", handle, 253, handle.len()))
-
} else if !HANDLE_REGEX.is_match(handle) {
Err(AtStrError::regex(
"handle",
-
handle,
SmolStr::new_static("invalid"),
))
-
} else if ends_with(handle, DISALLOWED_TLDS) {
-
Err(AtStrError::disallowed("handle", handle, DISALLOWED_TLDS))
} else {
-
Ok(Self(CowStr::Borrowed(handle)))
}
}
/// Fallible constructor, validates, takes ownership
pub fn new_owned(handle: impl AsRef<str>) -> Result<Self, AtStrError> {
let handle = handle.as_ref();
-
let handle = handle
.strip_prefix("at://")
-
.unwrap_or(handle)
-
.strip_prefix('@')
.unwrap_or(handle);
if handle.len() > 253 {
Err(AtStrError::too_long("handle", handle, 253, handle.len()))
} else if !HANDLE_REGEX.is_match(handle) {
···
/// Fallible constructor, validates, doesn't allocate
pub fn new_static(handle: &'static str) -> Result<Self, AtStrError> {
-
let handle = handle
.strip_prefix("at://")
-
.unwrap_or(handle)
-
.strip_prefix('@')
.unwrap_or(handle);
if handle.len() > 253 {
Err(AtStrError::too_long("handle", handle, 253, handle.len()))
} else if !HANDLE_REGEX.is_match(handle) {
···
/// or API values you know are valid (rather than using serde), this is the one to use.
/// The From<String> and From<CowStr> impls use the same logic.
///
-
/// Accepts (and strips) preceding '@' if present
pub fn raw(handle: &'h str) -> Self {
-
let handle = handle
.strip_prefix("at://")
-
.unwrap_or(handle)
-
.strip_prefix('@')
.unwrap_or(handle);
if handle.len() > 253 {
panic!("handle too long")
} else if !HANDLE_REGEX.is_match(handle) {
···
/// Infallible constructor for when you *know* the string is a valid handle.
/// Marked unsafe because responsibility for upholding the invariant is on the developer.
///
-
/// Accepts (and strips) preceding '@' if present
pub unsafe fn unchecked(handle: &'h str) -> Self {
-
let handle = handle
.strip_prefix("at://")
-
.unwrap_or(handle)
-
.strip_prefix('@')
.unwrap_or(handle);
-
Self(CowStr::Borrowed(handle))
}
pub fn as_str(&self) -> &str {
···
self.as_str()
}
}
···
impl<'h> Handle<'h> {
/// Fallible constructor, validates, borrows from input
///
+
/// Accepts (and strips) preceding '@' or 'at://' if present
pub fn new(handle: &'h str) -> Result<Self, AtStrError> {
+
let stripped = handle
.strip_prefix("at://")
+
.or_else(|| handle.strip_prefix('@'))
.unwrap_or(handle);
+
+
if stripped.len() > 253 {
+
Err(AtStrError::too_long("handle", stripped, 253, stripped.len()))
+
} else if !HANDLE_REGEX.is_match(stripped) {
Err(AtStrError::regex(
"handle",
+
stripped,
SmolStr::new_static("invalid"),
))
+
} else if ends_with(stripped, DISALLOWED_TLDS) {
+
Err(AtStrError::disallowed("handle", stripped, DISALLOWED_TLDS))
} else {
+
Ok(Self(CowStr::Borrowed(stripped)))
}
}
/// Fallible constructor, validates, takes ownership
pub fn new_owned(handle: impl AsRef<str>) -> Result<Self, AtStrError> {
let handle = handle.as_ref();
+
let stripped = handle
.strip_prefix("at://")
+
.or_else(|| handle.strip_prefix('@'))
.unwrap_or(handle);
+
let handle = stripped;
if handle.len() > 253 {
Err(AtStrError::too_long("handle", handle, 253, handle.len()))
} else if !HANDLE_REGEX.is_match(handle) {
···
/// Fallible constructor, validates, doesn't allocate
pub fn new_static(handle: &'static str) -> Result<Self, AtStrError> {
+
let stripped = handle
.strip_prefix("at://")
+
.or_else(|| handle.strip_prefix('@'))
.unwrap_or(handle);
+
let handle = stripped;
if handle.len() > 253 {
Err(AtStrError::too_long("handle", handle, 253, handle.len()))
} else if !HANDLE_REGEX.is_match(handle) {
···
/// or API values you know are valid (rather than using serde), this is the one to use.
/// The From<String> and From<CowStr> impls use the same logic.
///
+
/// Accepts (and strips) preceding '@' or 'at://' if present
pub fn raw(handle: &'h str) -> Self {
+
let stripped = handle
.strip_prefix("at://")
+
.or_else(|| handle.strip_prefix('@'))
.unwrap_or(handle);
+
let handle = stripped;
if handle.len() > 253 {
panic!("handle too long")
} else if !HANDLE_REGEX.is_match(handle) {
···
/// Infallible constructor for when you *know* the string is a valid handle.
/// Marked unsafe because responsibility for upholding the invariant is on the developer.
///
+
/// Accepts (and strips) preceding '@' or 'at://' if present
pub unsafe fn unchecked(handle: &'h str) -> Self {
+
let stripped = handle
.strip_prefix("at://")
+
.or_else(|| handle.strip_prefix('@'))
.unwrap_or(handle);
+
Self(CowStr::Borrowed(stripped))
}
pub fn as_str(&self) -> &str {
···
self.as_str()
}
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn valid_handles() {
+
assert!(Handle::new("alice.test").is_ok());
+
assert!(Handle::new("foo.bsky.social").is_ok());
+
assert!(Handle::new("a.b.c.d.e").is_ok());
+
assert!(Handle::new("a1.b2.c3").is_ok());
+
assert!(Handle::new("name-with-dash.com").is_ok());
+
}
+
+
#[test]
+
fn prefix_stripping() {
+
assert_eq!(Handle::new("@alice.test").unwrap().as_str(), "alice.test");
+
assert_eq!(Handle::new("at://alice.test").unwrap().as_str(), "alice.test");
+
assert_eq!(Handle::new("alice.test").unwrap().as_str(), "alice.test");
+
}
+
+
#[test]
+
fn max_length() {
+
// 253 chars: three 63-char segments + one 61-char segment + 3 dots = 253
+
let s1 = format!("a{}a", "b".repeat(61)); // 63
+
let s2 = format!("c{}c", "d".repeat(61)); // 63
+
let s3 = format!("e{}e", "f".repeat(61)); // 63
+
let s4 = format!("g{}g", "h".repeat(59)); // 61
+
let valid_253 = format!("{}.{}.{}.{}", s1, s2, s3, s4);
+
assert_eq!(valid_253.len(), 253);
+
assert!(Handle::new(&valid_253).is_ok());
+
+
// 254 chars: make last segment 62 chars
+
let s4_long = format!("g{}g", "h".repeat(60)); // 62
+
let too_long_254 = format!("{}.{}.{}.{}", s1, s2, s3, s4_long);
+
assert_eq!(too_long_254.len(), 254);
+
assert!(Handle::new(&too_long_254).is_err());
+
}
+
+
#[test]
+
fn segment_length_constraints() {
+
let valid_63_char_segment = format!("{}.com", "a".repeat(63));
+
assert!(Handle::new(&valid_63_char_segment).is_ok());
+
+
let too_long_64_char_segment = format!("{}.com", "a".repeat(64));
+
assert!(Handle::new(&too_long_64_char_segment).is_err());
+
}
+
+
#[test]
+
fn hyphen_placement() {
+
assert!(Handle::new("valid-label.com").is_ok());
+
assert!(Handle::new("-nope.com").is_err());
+
assert!(Handle::new("nope-.com").is_err());
+
}
+
+
#[test]
+
fn tld_must_start_with_letter() {
+
assert!(Handle::new("foo.bar").is_ok());
+
assert!(Handle::new("foo.9bar").is_err());
+
}
+
+
#[test]
+
fn disallowed_tlds() {
+
assert!(Handle::new("foo.local").is_err());
+
assert!(Handle::new("foo.localhost").is_err());
+
assert!(Handle::new("foo.arpa").is_err());
+
assert!(Handle::new("foo.invalid").is_err());
+
assert!(Handle::new("foo.internal").is_err());
+
assert!(Handle::new("foo.example").is_err());
+
assert!(Handle::new("foo.alt").is_err());
+
assert!(Handle::new("foo.onion").is_err());
+
}
+
+
#[test]
+
fn minimum_segments() {
+
assert!(Handle::new("a.b").is_ok());
+
assert!(Handle::new("a").is_err());
+
assert!(Handle::new("com").is_err());
+
}
+
+
#[test]
+
fn invalid_characters() {
+
assert!(Handle::new("foo!bar.com").is_err());
+
assert!(Handle::new("foo_bar.com").is_err());
+
assert!(Handle::new("foo bar.com").is_err());
+
assert!(Handle::new("foo@bar.com").is_err());
+
}
+
+
#[test]
+
fn empty_segments() {
+
assert!(Handle::new("foo..com").is_err());
+
assert!(Handle::new(".foo.com").is_err());
+
assert!(Handle::new("foo.com.").is_err());
+
}
+
}
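The `or_else` change above fixes a real bug: the old chain fell back to the original string whenever the second `strip_prefix` returned `None`, so an `at://` prefix without a following `@` was never actually stripped. In isolation (hypothetical free function; in the crate this logic lives inside `Handle`'s constructors):

```rust
// The corrected prefix-stripping chain: try "at://" first, then '@',
// and fall back to the input unchanged.
fn strip_handle_prefix(handle: &str) -> &str {
    handle
        .strip_prefix("at://")
        .or_else(|| handle.strip_prefix('@'))
        .unwrap_or(handle)
}

fn main() {
    assert_eq!(strip_handle_prefix("at://alice.test"), "alice.test");
    assert_eq!(strip_handle_prefix("@alice.test"), "alice.test");
    assert_eq!(strip_handle_prefix("alice.test"), "alice.test");

    // The old chain fell back to the *original* string whenever the
    // second strip_prefix returned None, so "at://" was never removed:
    let handle = "at://alice.test";
    let old = handle
        .strip_prefix("at://")
        .unwrap_or(handle)
        .strip_prefix('@')
        .unwrap_or(handle);
    assert_eq!(old, "at://alice.test"); // the bug the diff fixes
}
```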
+37
crates/jacquard-common/src/types/ident.rs
···
}
}
}
···
}
}
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn parses_did() {
+
let ident = AtIdentifier::new("did:plc:foo").unwrap();
+
assert!(matches!(ident, AtIdentifier::Did(_)));
+
assert_eq!(ident.as_str(), "did:plc:foo");
+
}
+
+
#[test]
+
fn parses_handle() {
+
let ident = AtIdentifier::new("alice.test").unwrap();
+
assert!(matches!(ident, AtIdentifier::Handle(_)));
+
assert_eq!(ident.as_str(), "alice.test");
+
}
+
+
#[test]
+
fn did_takes_precedence() {
+
// DID is tried first, so valid DIDs are parsed as DIDs
+
let ident = AtIdentifier::new("did:web:alice.test").unwrap();
+
assert!(matches!(ident, AtIdentifier::Did(_)));
+
}
+
+
#[test]
+
fn from_types() {
+
let did = Did::new("did:plc:foo").unwrap();
+
let ident: AtIdentifier = did.into();
+
assert!(matches!(ident, AtIdentifier::Did(_)));
+
+
let handle = Handle::new("alice.test").unwrap();
+
let ident: AtIdentifier = handle.into();
+
assert!(matches!(ident, AtIdentifier::Handle(_)));
+
}
+
}
+35 -3
crates/jacquard-common/src/types/language.rs
···
T: AsRef<str> + ?Sized,
{
let tag = langtag::LangTag::new(lang)?;
-
Ok(Language(SmolStr::new_inline(tag.as_str())))
}
/// Infallible constructor for when you *know* the string is a valid IETF language tag.
···
pub fn raw(lang: impl AsRef<str>) -> Self {
let lang = lang.as_ref();
let tag = langtag::LangTag::new(lang).expect("valid IETF language tag");
-
Language(SmolStr::new_inline(tag.as_str()))
}
/// Infallible constructor for when you *know* the string is a valid IETF language tag.
/// Marked unsafe because responsibility for upholding the invariant is on the developer.
pub unsafe fn unchecked(lang: impl AsRef<str>) -> Self {
let lang = lang.as_ref();
-
Self(SmolStr::new_inline(lang))
}
/// Returns the LANG as a string slice.
···
self.as_str()
}
}
···
T: AsRef<str> + ?Sized,
{
let tag = langtag::LangTag::new(lang)?;
+
Ok(Language(SmolStr::new(tag.as_str())))
+
}
+
+
/// Parses an IETF language tag from a static string.
+
pub fn new_static(lang: &'static str) -> Result<Self, langtag::InvalidLangTag<&'static str>> {
+
let tag = langtag::LangTag::new(lang)?;
+
Ok(Language(SmolStr::new_static(tag.as_str())))
}
/// Infallible constructor for when you *know* the string is a valid IETF language tag.
···
pub fn raw(lang: impl AsRef<str>) -> Self {
let lang = lang.as_ref();
let tag = langtag::LangTag::new(lang).expect("valid IETF language tag");
+
Language(SmolStr::new(tag.as_str()))
}
/// Infallible constructor for when you *know* the string is a valid IETF language tag.
/// Marked unsafe because responsibility for upholding the invariant is on the developer.
pub unsafe fn unchecked(lang: impl AsRef<str>) -> Self {
let lang = lang.as_ref();
+
Self(SmolStr::new(lang))
}
/// Returns the LANG as a string slice.
···
self.as_str()
}
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn valid_language_tags() {
+
assert!(Language::new("en").is_ok());
+
assert!(Language::new("en-US").is_ok());
+
assert!(Language::new("zh-Hans").is_ok());
+
assert!(Language::new("es-419").is_ok());
+
}
+
+
#[test]
+
fn case_insensitive_but_preserves() {
+
let lang = Language::new("en-US").unwrap();
+
assert_eq!(lang.as_str(), "en-US");
+
}
+
+
#[test]
+
fn invalid_tags() {
+
assert!(Language::new("").is_err());
+
assert!(Language::new("not_a_tag").is_err());
+
assert!(Language::new("123").is_err());
+
}
+
}
+94
crates/jacquard-common/src/types/nsid.rs
···
self.as_str()
}
}
···
self.as_str()
}
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn valid_nsids() {
+
assert!(Nsid::new("com.example.foo").is_ok());
+
assert!(Nsid::new("com.example.fooBar").is_ok());
+
assert!(Nsid::new("com.long-domain.foo").is_ok());
+
assert!(Nsid::new("a.b.c").is_ok());
+
assert!(Nsid::new("a1.b2.c3").is_ok());
+
}
+
+
#[test]
+
fn minimum_segments() {
+
assert!(Nsid::new("a.b.c").is_ok()); // 3 segments minimum
+
assert!(Nsid::new("a.b").is_err());
+
assert!(Nsid::new("a").is_err());
+
}
+
+
#[test]
+
fn domain_and_name_parsing() {
+
let nsid = Nsid::new("com.example.fooBar").unwrap();
+
assert_eq!(nsid.domain_authority(), "com.example");
+
assert_eq!(nsid.name(), "fooBar");
+
}
+
+
#[test]
+
fn max_length() {
+
// Max NSID length is 317. Five 63-char segments: 315 + 4 dots = 319, too long.
+
// Four 63-char segments + one 62-char segment: 314 + 4 dots = 318, still too long.
+
// Four 63-char segments + one 61-char segment: 313 + 4 dots = 317, exactly at the limit.
+
let s1 = format!("a{}a", "b".repeat(61));
+
let s2 = format!("c{}c", "d".repeat(61));
+
let s3 = format!("e{}e", "f".repeat(61));
+
let s4 = format!("g{}g", "h".repeat(61));
+
let s5 = format!("i{}i", "j".repeat(59));
+
let valid_317 = format!("{}.{}.{}.{}.{}", s1, s2, s3, s4, s5);
+
assert_eq!(valid_317.len(), 317);
+
assert!(Nsid::new(&valid_317).is_ok());
+
+
let s5_long = format!("i{}i", "j".repeat(60));
+
let too_long_318 = format!("{}.{}.{}.{}.{}", s1, s2, s3, s4, s5_long);
+
assert_eq!(too_long_318.len(), 318);
+
assert!(Nsid::new(&too_long_318).is_err());
+
}
+
+
#[test]
+
fn segment_length() {
+
let valid_63 = format!("{}.{}.foo", "a".repeat(63), "b".repeat(63));
+
assert!(Nsid::new(&valid_63).is_ok());
+
+
let too_long_64 = format!("{}.b.foo", "a".repeat(64));
+
assert!(Nsid::new(&too_long_64).is_err());
+
}
+
+
#[test]
+
fn first_segment_cannot_start_with_digit() {
+
assert!(Nsid::new("com.example.foo").is_ok());
+
assert!(Nsid::new("9com.example.foo").is_err());
+
}
+
+
#[test]
+
fn name_segment_rules() {
+
assert!(Nsid::new("com.example.foo").is_ok());
+
assert!(Nsid::new("com.example.fooBar123").is_ok());
+
assert!(Nsid::new("com.example.9foo").is_err()); // can't start with digit
+
assert!(Nsid::new("com.example.foo-bar").is_err()); // no hyphens in name
+
}
+
+
#[test]
+
fn domain_segment_rules() {
+
assert!(Nsid::new("foo-bar.example.baz").is_ok());
+
assert!(Nsid::new("foo.bar-baz.qux").is_ok());
+
assert!(Nsid::new("-foo.bar.baz").is_err()); // can't start with hyphen
+
assert!(Nsid::new("foo-.bar.baz").is_err()); // can't end with hyphen
+
}
+
+
#[test]
+
fn case_sensitivity() {
+
// Domain should be case-insensitive per spec (but not enforced in validation)
+
// Name is case-sensitive
+
assert!(Nsid::new("com.example.fooBar").is_ok());
+
assert!(Nsid::new("com.example.FooBar").is_ok());
+
}
+
+
#[test]
+
fn no_hyphens_in_name() {
+
assert!(Nsid::new("com.example.foo").is_ok());
+
assert!(Nsid::new("com.example.foo-bar").is_err());
+
assert!(Nsid::new("com.example.fooBar").is_ok());
+
}
+
}
+69
crates/jacquard-common/src/types/recordkey.rs
···
self.as_str()
}
}
···
self.as_str()
}
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn valid_rkeys() {
+
assert!(Rkey::new("3jzfcijpj2z2a").is_ok()); // TID format
+
assert!(Rkey::new("self").is_ok()); // literal
+
assert!(Rkey::new("com.example.foo").is_ok()); // NSID format
+
assert!(Rkey::new("foo-bar_baz").is_ok());
+
assert!(Rkey::new("foo:bar").is_ok());
+
assert!(Rkey::new("foo~bar").is_ok());
+
}
+
+
#[test]
+
fn length_constraints() {
+
assert!(Rkey::new("a").is_ok()); // min 1
+
let valid_512 = "a".repeat(512);
+
assert_eq!(valid_512.len(), 512);
+
assert!(Rkey::new(&valid_512).is_ok());
+
+
let too_long_513 = "a".repeat(513);
+
assert_eq!(too_long_513.len(), 513);
+
assert!(Rkey::new(&too_long_513).is_err());
+
}
+
+
#[test]
+
fn disallowed_literals() {
+
assert!(Rkey::new(".").is_err());
+
assert!(Rkey::new("..").is_err());
+
assert!(Rkey::new("...").is_ok()); // 3+ dots is fine
+
}
+
+
#[test]
+
fn allowed_characters() {
+
assert!(Rkey::new("abc123").is_ok());
+
assert!(Rkey::new("ABC123").is_ok());
+
assert!(Rkey::new("foo-bar").is_ok());
+
assert!(Rkey::new("foo_bar").is_ok());
+
assert!(Rkey::new("foo.bar").is_ok());
+
assert!(Rkey::new("foo:bar").is_ok());
+
assert!(Rkey::new("foo~bar").is_ok());
+
}
+
+
#[test]
+
fn disallowed_characters() {
+
assert!(Rkey::new("foo bar").is_err());
+
assert!(Rkey::new("foo@bar").is_err());
+
assert!(Rkey::new("foo#bar").is_err());
+
assert!(Rkey::new("foo/bar").is_err());
+
assert!(Rkey::new("foo\\bar").is_err());
+
}
+
+
#[test]
+
fn literal_key_self() {
+
let key = LiteralKey::<SelfRecord>::new("self").unwrap();
+
assert_eq!(key.as_str(), "self");
+
+
assert!(LiteralKey::<SelfRecord>::new("Self").is_ok()); // case insensitive
+
assert!(LiteralKey::<SelfRecord>::new("other").is_err());
+
}
+
+
#[test]
+
fn literal_key_disallowed() {
+
assert!(LiteralKey::<SelfRecord>::new(".").is_err());
+
assert!(LiteralKey::<SelfRecord>::new("..").is_err());
+
}
+
}
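The rules these tests exercise are compact enough to restate as a stdlib-only sketch (the helper is illustrative, not crate API): 1 to 512 characters drawn from `[A-Za-z0-9._~:-]`, with the literals `.` and `..` reserved:

```rust
// Stdlib-only restatement of the record-key rules the tests exercise
// (helper name is ours): 1..=512 chars from [A-Za-z0-9._~:-], with the
// literal keys "." and ".." reserved.
fn valid_rkey(s: &str) -> bool {
    (1..=512).contains(&s.len())
        && s != "."
        && s != ".."
        && s.bytes().all(|b| b.is_ascii_alphanumeric() || b"._~:-".contains(&b))
}

fn main() {
    assert!(valid_rkey("3jzfcijpj2z2a")); // TID-shaped
    assert!(valid_rkey("self"));
    assert!(valid_rkey("...")); // only "." and ".." are reserved
    assert!(!valid_rkey(".."));
    assert!(!valid_rkey("foo/bar"));
    assert!(!valid_rkey(&"a".repeat(513)));
}
```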
+1 -1
crates/jacquard-common/src/types/string.rs
···
CowStr,
types::{
aturi::AtUri,
-
cid::Cid,
datetime::Datetime,
did::Did,
handle::Handle,
···
CowStr,
types::{
aturi::AtUri,
+
cid::{Cid, CidLink},
datetime::Datetime,
did::Did,
handle::Handle,
+70
crates/jacquard-common/src/types/tid.rs
···
Self::new()
}
}
···
Self::new()
}
}
+
+
#[cfg(test)]
+
mod tests {
+
use super::*;
+
+
#[test]
+
fn valid_tids() {
+
assert!(Tid::new("3jzfcijpj2z2a").is_ok());
+
assert!(Tid::new("2222222222222").is_ok());
+
assert!(Tid::new("j7777777777777").is_err()); // 14 chars: too long, even though 'j' is a valid first char
+
}
+
+
#[test]
+
fn exact_length() {
+
assert!(Tid::new("3jzfcijpj2z2a").is_ok());
+
assert!(Tid::new("3jzfcijpj2z2").is_err()); // 12 chars
+
assert!(Tid::new("3jzfcijpj2z2aa").is_err()); // 14 chars
+
}
+
+
#[test]
+
fn first_char_constraint() {
+
// First char must be 2-7 or a-j (not k-z)
+
assert!(Tid::new("2222222222222").is_ok());
+
assert!(Tid::new("7777777777777").is_ok());
+
assert!(Tid::new("a222222222222").is_ok());
+
assert!(Tid::new("j222222222222").is_ok());
+
assert!(Tid::new("k222222222222").is_err());
+
assert!(Tid::new("z222222222222").is_err());
+
}
+
+
#[test]
+
fn remaining_chars_constraint() {
+
// Remaining 12 chars must be 2-7 or a-z
+
assert!(Tid::new("3abcdefghijkl").is_ok());
+
assert!(Tid::new("3zzzzzzzzzzzz").is_ok());
+
assert!(Tid::new("3222222222222").is_ok());
+
assert!(Tid::new("3777777777777").is_ok());
+
}
+
+
#[test]
+
fn disallowed_characters() {
+
assert!(Tid::new("3jzfcijpj2z2A").is_err()); // uppercase
+
assert!(Tid::new("3jzfcijpj2z21").is_err()); // 1 not allowed
+
assert!(Tid::new("3jzfcijpj2z28").is_err()); // 8 not allowed
+
assert!(Tid::new("3jzfcijpj2z2-").is_err()); // special char
+
}
+
+
#[test]
+
fn generation_and_comparison() {
+
let tid1 = Tid::now_0();
+
std::thread::sleep(std::time::Duration::from_micros(10));
+
let tid2 = Tid::now_0();
+
+
assert!(tid1.as_str().len() == 13);
+
assert!(tid2.as_str().len() == 13);
+
assert!(tid2.newer_than(&tid1));
+
assert!(tid1.older_than(&tid2));
+
}
+
+
#[test]
+
fn ticker_monotonic() {
+
let mut ticker = Ticker::new();
+
let tid1 = ticker.next(None);
+
let tid2 = ticker.next(Some(tid1.clone()));
+
let tid3 = ticker.next(Some(tid2.clone()));
+
+
assert!(tid2.newer_than(&tid1));
+
assert!(tid3.newer_than(&tid2));
+
}
+
}
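The first-character rule in these tests follows from the encoding: 13 base32-sortable characters carry 65 bits, and the top bit of the 64-bit TID must be zero, so the first character's value must be below 16, i.e. `2`-`7` or `a`-`j`. A stdlib-only sketch (helper name is ours, not crate API):

```rust
// The base32-sortable alphabet used by TIDs: values 0..=31 in order.
const ALPHABET: &str = "234567abcdefghijklmnopqrstuvwxyz";

// A legal *first* TID character must decode to a value below 16, so
// the 64-bit integer's high bit stays clear (13 chars x 5 bits = 65
// bits, one more than the integer holds).
fn valid_first_char(c: char) -> bool {
    ALPHABET.find(c).is_some_and(|v| v < 16)
}

fn main() {
    assert!(valid_first_char('2')); // value 0
    assert!(valid_first_char('j')); // value 15, the highest allowed
    assert!(!valid_first_char('k')); // value 16 would set the high bit
    assert!(!valid_first_char('A')); // not in the alphabet at all
}
```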
+21 -324
crates/jacquard-common/src/types/value.rs
···
-
use base64::{
-
Engine,
-
prelude::{BASE64_STANDARD, BASE64_STANDARD_NO_PAD, BASE64_URL_SAFE, BASE64_URL_SAFE_NO_PAD},
-
};
use bytes::Bytes;
use ipld_core::ipld::Ipld;
-
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use smol_str::{SmolStr, ToSmolStr};
-
use std::{collections::BTreeMap, str::FromStr};
-
use url::Url;
-
use crate::types::{
-
DataModelType, LexiconStringType,
-
blob::{Blob, MimeType},
-
string::*,
-
};
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Data<'s> {
···
Boolean(bool),
Integer(i64),
String(AtprotoStr<'s>),
-
Bytes(Bytes), // maybe need custom type for serialization
-
CidLink(Cid<'s>), // maybe need custom type for serialization
Array(Array<'s>),
Object(Object<'s>),
Blob(Blob<'s>),
···
json: &'s serde_json::Map<String, serde_json::Value>,
) -> Result<Data<'s>, AtDataError> {
if let Some(type_field) = json.get("$type").and_then(|v| v.as_str()) {
-
if infer_from_type(type_field) == DataModelType::Blob {
-
if let Some(blob) = json_to_blob(json) {
return Ok(Data::Blob(blob));
}
}
···
if key == "$type" {
map.insert(key.to_smolstr(), Data::from_json(value)?);
}
-
match string_key_type_guess(key) {
DataModelType::Null if value.is_null() => {
map.insert(key.to_smolstr(), Data::Null);
}
···
map.insert(key.to_smolstr(), Data::Integer(value.as_i64().unwrap()));
}
DataModelType::Bytes if value.is_string() => {
-
map.insert(key.to_smolstr(), decode_bytes(value.as_str().unwrap()));
}
DataModelType::CidLink => {
if let Some(value) = value.as_object() {
···
);
}
DataModelType::String(string_type) if value.is_string() => {
-
insert_string(&mut map, key, value.as_str().unwrap(), string_type);
}
_ => {
map.insert(key.to_smolstr(), Data::from_json(value)?);
···
pub fn from_cbor(cbor: &'s BTreeMap<String, Ipld>) -> Result<Data<'s>, AtDataError> {
if let Some(Ipld::String(type_field)) = cbor.get("$type") {
-
if infer_from_type(type_field) == DataModelType::Blob {
-
if let Some(blob) = cbor_to_blob(cbor) {
return Ok(Data::Blob(blob));
}
}
···
if key == "$type" {
map.insert(key.to_smolstr(), Data::from_cbor(value)?);
}
-
match (string_key_type_guess(key), value) {
(DataModelType::Null, Ipld::Null) => {
map.insert(key.to_smolstr(), Data::Null);
}
···
map.insert(key.to_smolstr(), Object::from_cbor(value)?);
}
(DataModelType::String(string_type), Ipld::String(value)) => {
-
insert_string(&mut map, key, value, string_type);
}
_ => {
map.insert(key.to_smolstr(), Data::from_cbor(value)?);
···
Ok(Data::Object(Object(map)))
}
}
-
-
pub fn insert_string<'s>(
-
map: &mut BTreeMap<SmolStr, Data<'s>>,
-
key: &'s str,
-
value: &'s str,
-
string_type: LexiconStringType,
-
) -> Result<(), AtDataError> {
-
match string_type {
-
LexiconStringType::Datetime => {
-
if let Ok(datetime) = Datetime::from_str(value) {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::Datetime(datetime)),
-
);
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::AtUri => {
-
if let Ok(value) = AtUri::new(value) {
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::AtUri(value)));
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::Did => {
-
if let Ok(value) = Did::new(value) {
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Did(value)));
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::Handle => {
-
if let Ok(value) = Handle::new(value) {
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Handle(value)));
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::AtIdentifier => {
-
if let Ok(value) = AtIdentifier::new(value) {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::AtIdentifier(value)),
-
);
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::Nsid => {
-
if let Ok(value) = Nsid::new(value) {
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Nsid(value)));
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::Cid => {
-
if let Ok(value) = Cid::new(value.as_bytes()) {
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Cid(value)));
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::Language => {
-
if let Ok(value) = Language::new(value) {
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Language(value)));
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::Tid => {
-
if let Ok(value) = Tid::new(value) {
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Tid(value)));
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::RecordKey => {
-
if let Ok(value) = Rkey::new(value) {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::RecordKey(RecordKey::from(value))),
-
);
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::Uri(_) => {
-
if let Ok(uri) = Uri::new(value) {
-
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Uri(uri)));
-
} else {
-
map.insert(
-
key.to_smolstr(),
-
Data::String(AtprotoStr::String(value.into())),
-
);
-
}
-
}
-
LexiconStringType::String => {
-
map.insert(key.to_smolstr(), Data::String(parse_string(value)));
-
}
-
}
-
Ok(())
-
}
-
-
/// smarter parsing to avoid trying as many posibilities.
-
pub fn parse_string<'s>(string: &'s str) -> AtprotoStr<'s> {
-
if string.len() < 2048 && string.starts_with("did:") {
-
if let Ok(did) = Did::new(string) {
-
return AtprotoStr::Did(did);
-
}
-
} else if string.starts_with("20") && string.ends_with("Z") {
-
// probably a date (for the next 75 years)
-
if let Ok(datetime) = Datetime::from_str(string) {
-
return AtprotoStr::Datetime(datetime);
-
}
-
} else if string.starts_with("at://") {
-
if let Ok(uri) = AtUri::new(string) {
-
return AtprotoStr::AtUri(uri);
-
}
-
} else if string.starts_with("https://") {
-
if let Ok(uri) = Url::parse(string) {
-
return AtprotoStr::Uri(Uri::Https(uri));
-
}
-
} else if string.starts_with("wss://") {
-
if let Ok(uri) = Url::parse(string) {
-
return AtprotoStr::Uri(Uri::Https(uri));
-
}
-
} else if string.starts_with("ipfs://") {
-
return AtprotoStr::Uri(Uri::Cid(Cid::str(string)));
-
} else if string.contains('.') && !string.contains([' ', '\n']) {
-
if string.len() < 253 && Url::parse(string).is_ok() {
-
// probably a handle
-
if let Ok(handle) = AtIdentifier::new(string) {
-
return AtprotoStr::AtIdentifier(handle);
-
} else {
-
return AtprotoStr::Uri(Uri::Any(string.into()));
-
}
-
} else if let Ok(nsid) = Nsid::new(string) {
-
return AtprotoStr::Nsid(nsid);
-
}
-
} else if string.len() == 13 {
-
if let Ok(tid) = Tid::new(string) {
-
return AtprotoStr::Tid(tid);
-
}
-
} else if !string.contains([' ', '\n']) {
-
// cid?
-
if let Ok(cid) = Cid::new(string.as_bytes()) {
-
return AtprotoStr::Cid(cid);
-
}
-
}
-
-
AtprotoStr::String(string.into())
-
}
-
-
/// First-level guess at what we should parse the corresponding value as
-
/// Helps speed up parsing, avoids some ambiguities.
-
pub fn string_key_type_guess(key: &str) -> DataModelType {
-
match key {
-
"cid" => DataModelType::String(LexiconStringType::Cid),
-
"uri" => DataModelType::String(LexiconStringType::Uri(super::UriType::Any)),
-
"did" => DataModelType::String(LexiconStringType::Did),
-
"handle" => DataModelType::String(LexiconStringType::AtIdentifier),
-
"ref" => DataModelType::CidLink,
-
"list" => DataModelType::String(LexiconStringType::AtUri),
-
"blobref" => DataModelType::Blob,
-
"createdAt" | "created" | "indexedAt" | "issuedAt" | "updatedAt" | "playedTime" => {
-
DataModelType::String(LexiconStringType::Datetime)
-
}
-
"size" | "width" | "height" => DataModelType::Integer,
-
"value" | "record" | "embed" => DataModelType::Object,
-
"text" | "displayName" | "alt" | "name" | "description" => {
-
DataModelType::String(LexiconStringType::String)
-
}
-
"langs" | "blobs" | "images" | "labels" => DataModelType::Array,
-
"$bytes" => DataModelType::Bytes,
-
"$link" => DataModelType::String(LexiconStringType::Cid),
-
"$type" => DataModelType::String(LexiconStringType::String),
-
-
// we assume others are strings specifically because it's easy to check if a serde_json::Value
-
// or Ipld value is at least a string, and then we fall back to Object/Map.
-
_ => DataModelType::String(LexiconStringType::String),
-
}
-
}
-
-
pub fn cbor_to_blob<'b>(blob: &'b BTreeMap<String, Ipld>) -> Option<Blob<'b>> {
-
let mime_type = blob.get("mimeType").and_then(|o| {
-
if let Ipld::String(string) = o {
-
Some(string)
-
} else {
-
None
-
}
-
});
-
if let Some(Ipld::Link(value)) = blob.get("ref") {
-
let size = blob.get("size").and_then(|o| {
-
if let Ipld::Integer(i) = o {
-
Some(*i as i64)
-
} else {
-
None
-
}
-
});
-
if let (Some(mime_type), Some(size)) = (mime_type, size) {
-
return Some(Blob {
-
r#ref: Cid::ipld(*value),
-
mime_type: MimeType::raw(mime_type),
-
size: size as usize,
-
});
-
}
-
} else if let Some(Ipld::String(value)) = blob.get("cid") {
-
if let Some(mime_type) = mime_type {
-
return Some(Blob {
-
r#ref: Cid::str(value),
-
mime_type: MimeType::raw(mime_type),
-
size: 0,
-
});
-
}
-
}
-
-
None
-
}
-
-
pub fn json_to_blob<'b>(blob: &'b serde_json::Map<String, serde_json::Value>) -> Option<Blob<'b>> {
-
let mime_type = blob.get("mimeType").and_then(|v| v.as_str());
-
if let Some(value) = blob.get("ref") {
-
if let Some(value) = value
-
.as_object()
-
.and_then(|o| o.get("$link"))
-
.and_then(|v| v.as_str())
-
{
-
let size = blob.get("size").and_then(|v| v.as_u64());
-
if let (Some(mime_type), Some(size)) = (mime_type, size) {
-
return Some(Blob {
-
r#ref: Cid::str(value),
-
mime_type: MimeType::raw(mime_type),
-
size: size as usize,
-
});
-
}
-
}
-
} else if let Some(value) = blob.get("cid").and_then(|v| v.as_str()) {
-
if let Some(mime_type) = mime_type {
-
return Some(Blob {
-
r#ref: Cid::str(value),
-
mime_type: MimeType::raw(mime_type),
-
size: 0,
-
});
-
}
-
}
-
-
None
-
}
-
-
pub fn infer_from_type(type_field: &str) -> DataModelType {
-
match type_field {
-
"blob" => DataModelType::Blob,
-
_ => DataModelType::Object,
-
}
-
}
-
-
pub fn decode_bytes<'s>(bytes: &'s str) -> Data<'s> {
-
// The first engine should just work; the rest are insurance.
-
if let Ok(bytes) = BASE64_STANDARD.decode(bytes) {
-
Data::Bytes(Bytes::from_owner(bytes))
-
} else if let Ok(bytes) = BASE64_STANDARD_NO_PAD.decode(bytes) {
-
Data::Bytes(Bytes::from_owner(bytes))
-
} else if let Ok(bytes) = BASE64_URL_SAFE.decode(bytes) {
-
Data::Bytes(Bytes::from_owner(bytes))
-
} else if let Ok(bytes) = BASE64_URL_SAFE_NO_PAD.decode(bytes) {
-
Data::Bytes(Bytes::from_owner(bytes))
-
} else {
-
Data::String(AtprotoStr::String(bytes.into()))
-
}
-
}
···
+
use crate::types::{DataModelType, blob::Blob, string::*};
use bytes::Bytes;
use ipld_core::ipld::Ipld;
use smol_str::{SmolStr, ToSmolStr};
+
use std::collections::BTreeMap;
+
pub mod parsing;
+
pub mod serde_impl;
+
+
#[cfg(test)]
+
mod tests;
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Data<'s> {
···
Boolean(bool),
Integer(i64),
String(AtprotoStr<'s>),
+
Bytes(Bytes),
+
CidLink(Cid<'s>),
Array(Array<'s>),
Object(Object<'s>),
Blob(Blob<'s>),
···
json: &'s serde_json::Map<String, serde_json::Value>,
) -> Result<Data<'s>, AtDataError> {
if let Some(type_field) = json.get("$type").and_then(|v| v.as_str()) {
+
if parsing::infer_from_type(type_field) == DataModelType::Blob {
+
if let Some(blob) = parsing::json_to_blob(json) {
return Ok(Data::Blob(blob));
}
}
···
if key == "$type" {
map.insert(key.to_smolstr(), Data::from_json(value)?);
}
+
match parsing::string_key_type_guess(key) {
DataModelType::Null if value.is_null() => {
map.insert(key.to_smolstr(), Data::Null);
}
···
map.insert(key.to_smolstr(), Data::Integer(value.as_i64().unwrap()));
}
DataModelType::Bytes if value.is_string() => {
+
map.insert(
+
key.to_smolstr(),
+
parsing::decode_bytes(value.as_str().unwrap()),
+
);
}
DataModelType::CidLink => {
if let Some(value) = value.as_object() {
···
);
}
DataModelType::String(string_type) if value.is_string() => {
+
parsing::insert_string(&mut map, key, value.as_str().unwrap(), string_type)?;
}
_ => {
map.insert(key.to_smolstr(), Data::from_json(value)?);
···
pub fn from_cbor(cbor: &'s BTreeMap<String, Ipld>) -> Result<Data<'s>, AtDataError> {
if let Some(Ipld::String(type_field)) = cbor.get("$type") {
+
if parsing::infer_from_type(type_field) == DataModelType::Blob {
+
if let Some(blob) = parsing::cbor_to_blob(cbor) {
return Ok(Data::Blob(blob));
}
}
···
if key == "$type" {
map.insert(key.to_smolstr(), Data::from_cbor(value)?);
}
+
match (parsing::string_key_type_guess(key), value) {
(DataModelType::Null, Ipld::Null) => {
map.insert(key.to_smolstr(), Data::Null);
}
···
map.insert(key.to_smolstr(), Object::from_cbor(value)?);
}
(DataModelType::String(string_type), Ipld::String(value)) => {
+
parsing::insert_string(&mut map, key, value, string_type)?;
}
_ => {
map.insert(key.to_smolstr(), Data::from_cbor(value)?);
···
Ok(Data::Object(Object(map)))
}
}
+320
crates/jacquard-common/src/types/value/parsing.rs
···
···
+
use crate::{
+
IntoStatic,
+
types::{
+
DataModelType, LexiconStringType, UriType,
+
blob::{Blob, MimeType},
+
string::*,
+
value::{AtDataError, Data},
+
},
+
};
+
use base64::{
+
Engine,
+
prelude::{BASE64_STANDARD, BASE64_STANDARD_NO_PAD, BASE64_URL_SAFE, BASE64_URL_SAFE_NO_PAD},
+
};
+
use bytes::Bytes;
+
use ipld_core::ipld::Ipld;
+
use smol_str::{SmolStr, ToSmolStr};
+
use std::{collections::BTreeMap, str::FromStr};
+
use url::Url;
+
+
pub fn insert_string<'s>(
+
map: &mut BTreeMap<SmolStr, Data<'s>>,
+
key: &'s str,
+
value: &'s str,
+
string_type: LexiconStringType,
+
) -> Result<(), AtDataError> {
+
match string_type {
+
LexiconStringType::Datetime => {
+
if let Ok(datetime) = Datetime::from_str(value) {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::Datetime(datetime)),
+
);
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::AtUri => {
+
if let Ok(value) = AtUri::new(value) {
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::AtUri(value)));
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::Did => {
+
if let Ok(value) = Did::new(value) {
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Did(value)));
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::Handle => {
+
if let Ok(value) = Handle::new(value) {
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Handle(value)));
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::AtIdentifier => {
+
if let Ok(value) = AtIdentifier::new(value) {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::AtIdentifier(value)),
+
);
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::Nsid => {
+
if let Ok(value) = Nsid::new(value) {
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Nsid(value)));
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::Cid => {
+
if let Ok(value) = Cid::new(value.as_bytes()) {
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Cid(value)));
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::Language => {
+
if let Ok(value) = Language::new(value) {
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Language(value)));
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::Tid => {
+
if let Ok(value) = Tid::new(value) {
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Tid(value)));
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::RecordKey => {
+
if let Ok(value) = Rkey::new(value) {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::RecordKey(RecordKey::from(value))),
+
);
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::Uri(_) => {
+
if let Ok(uri) = Uri::new(value) {
+
map.insert(key.to_smolstr(), Data::String(AtprotoStr::Uri(uri)));
+
} else {
+
map.insert(
+
key.to_smolstr(),
+
Data::String(AtprotoStr::String(value.into())),
+
);
+
}
+
}
+
LexiconStringType::String => {
+
map.insert(key.to_smolstr(), Data::String(parse_string(value)));
+
}
+
}
+
Ok(())
+
}
+
+
/// Smarter parsing that avoids trying as many possibilities.
+
pub fn parse_string<'s>(string: &'s str) -> AtprotoStr<'s> {
+
if string.len() < 2048 && string.starts_with("did:") {
+
if let Ok(did) = Did::new(string) {
+
return AtprotoStr::Did(did);
+
}
+
} else if string.starts_with("20") && string.ends_with("Z") {
+
// probably a date (for the next 75 years)
+
if let Ok(datetime) = Datetime::from_str(string) {
+
return AtprotoStr::Datetime(datetime);
+
}
+
} else if string.starts_with("at://") {
+
if let Ok(uri) = AtUri::new(string) {
+
return AtprotoStr::AtUri(uri);
+
}
+
} else if string.starts_with("https://") {
+
if let Ok(uri) = Url::parse(string) {
+
return AtprotoStr::Uri(Uri::Https(uri));
+
}
+
} else if string.starts_with("wss://") {
+
if let Ok(uri) = Url::parse(string) {
+
return AtprotoStr::Uri(Uri::Https(uri)); // note: wss URLs currently reuse the Https variant
+
}
+
} else if string.starts_with("ipfs://") {
+
return AtprotoStr::Uri(Uri::Cid(Cid::str(string)));
+
} else if string.contains('.') && !string.contains([' ', '\n']) {
+
if string.len() < 253 && Url::parse(string).is_ok() { // note: Url::parse rejects scheme-less strings, so bare handles may fall through to the NSID branch
+
// probably a handle
+
if let Ok(handle) = AtIdentifier::new(string) {
+
return AtprotoStr::AtIdentifier(handle);
+
} else {
+
return AtprotoStr::Uri(Uri::Any(string.into()));
+
}
+
} else if let Ok(nsid) = Nsid::new(string) {
+
return AtprotoStr::Nsid(nsid);
+
}
+
} else if string.len() == 13 {
+
if let Ok(tid) = Tid::new(string) {
+
return AtprotoStr::Tid(tid);
+
}
+
} else if !string.contains([' ', '\n']) && string.len() > 20 {
+
// CID: must be longer than typical short strings to avoid false positives
+
// Most CIDs are 46+ chars (base32 encoded), minimum realistic is around 30
+
if let Ok(cid) = Cid::new(string.as_bytes()) {
+
return AtprotoStr::Cid(cid);
+
}
+
}
+
+
AtprotoStr::String(string.into())
+
}
+
+
/// First-level guess at what we should parse the corresponding value as
+
/// Helps speed up parsing, avoids some ambiguities.
+
pub fn string_key_type_guess(key: &str) -> DataModelType {
+
match key {
+
"cid" => DataModelType::String(LexiconStringType::Cid),
+
"uri" => DataModelType::String(LexiconStringType::Uri(UriType::Any)),
+
"did" => DataModelType::String(LexiconStringType::Did),
+
"handle" => DataModelType::String(LexiconStringType::AtIdentifier),
+
"ref" => DataModelType::CidLink,
+
"list" => DataModelType::String(LexiconStringType::AtUri),
+
"blobref" => DataModelType::Blob,
+
"createdAt" | "created" | "indexedAt" | "issuedAt" | "updatedAt" | "playedTime" => {
+
DataModelType::String(LexiconStringType::Datetime)
+
}
+
"size" | "width" | "height" => DataModelType::Integer,
+
"value" | "record" | "embed" => DataModelType::Object,
+
"text" | "displayName" | "alt" | "name" | "description" => {
+
DataModelType::String(LexiconStringType::String)
+
}
+
"langs" | "blobs" | "images" | "labels" => DataModelType::Array,
+
"$bytes" => DataModelType::Bytes,
+
"$link" => DataModelType::String(LexiconStringType::Cid),
+
"$type" => DataModelType::String(LexiconStringType::String),
+
+
// we assume others are strings specifically because it's easy to check if a serde_json::Value
+
// or Ipld value is at least a string, and then we fall back to Object/Map.
+
_ => DataModelType::String(LexiconStringType::String),
+
}
+
}
+
+
pub fn cbor_to_blob<'b>(blob: &'b BTreeMap<String, Ipld>) -> Option<Blob<'b>> {
+
let mime_type = blob.get("mimeType").and_then(|o| {
+
if let Ipld::String(string) = o {
+
Some(string)
+
} else {
+
None
+
}
+
});
+
if let Some(Ipld::Link(value)) = blob.get("ref") {
+
let size = blob.get("size").and_then(|o| {
+
if let Ipld::Integer(i) = o {
+
Some(*i as i64)
+
} else {
+
None
+
}
+
});
+
if let (Some(mime_type), Some(size)) = (mime_type, size) {
+
return Some(Blob {
+
r#ref: Cid::ipld(*value),
+
mime_type: MimeType::raw(mime_type),
+
size: size as usize,
+
});
+
}
+
} else if let Some(Ipld::String(value)) = blob.get("cid") {
+
if let Some(mime_type) = mime_type {
+
return Some(Blob {
+
r#ref: Cid::str(value),
+
mime_type: MimeType::raw(mime_type),
+
size: 0,
+
});
+
}
+
}
+
+
None
+
}
+
+
pub fn json_to_blob<'b>(blob: &'b serde_json::Map<String, serde_json::Value>) -> Option<Blob<'b>> {
+
let mime_type = blob.get("mimeType").and_then(|v| v.as_str());
+
if let Some(value) = blob.get("ref") {
+
if let Some(value) = value
+
.as_object()
+
.and_then(|o| o.get("$link"))
+
.and_then(|v| v.as_str())
+
{
+
let size = blob.get("size").and_then(|v| v.as_u64());
+
if let (Some(mime_type), Some(size)) = (mime_type, size) {
+
return Some(Blob {
+
r#ref: Cid::str(value),
+
mime_type: MimeType::raw(mime_type),
+
size: size as usize,
+
});
+
}
+
}
+
} else if let Some(value) = blob.get("cid").and_then(|v| v.as_str()) {
+
if let Some(mime_type) = mime_type {
+
return Some(Blob {
+
r#ref: Cid::str(value),
+
mime_type: MimeType::raw(mime_type),
+
size: 0,
+
});
+
}
+
}
+
+
None
+
}
+
+
pub fn infer_from_type(type_field: &str) -> DataModelType {
+
match type_field {
+
"blob" => DataModelType::Blob,
+
_ => DataModelType::Object,
+
}
+
}
+
+
pub fn decode_bytes<'s>(bytes: &str) -> Data<'s> {
+
// The first engine should just work; the rest are insurance.
+
if let Ok(bytes) = BASE64_STANDARD.decode(bytes) {
+
Data::Bytes(Bytes::from_owner(bytes))
+
} else if let Ok(bytes) = BASE64_STANDARD_NO_PAD.decode(bytes) {
+
Data::Bytes(Bytes::from_owner(bytes))
+
} else if let Ok(bytes) = BASE64_URL_SAFE.decode(bytes) {
+
Data::Bytes(Bytes::from_owner(bytes))
+
} else if let Ok(bytes) = BASE64_URL_SAFE_NO_PAD.decode(bytes) {
+
Data::Bytes(Bytes::from_owner(bytes))
+
} else {
+
Data::String(AtprotoStr::String(CowStr::Borrowed(bytes).into_static()))
+
}
+
}
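The `parsing` module above leans on a prefix-dispatch heuristic: check a string's cheap surface features first, then validate with the typed constructors. A minimal standalone sketch of that idea, with a hypothetical `Kind` enum standing in for the crate's `AtprotoStr` (the real code validates each candidate with `Did::new`, `AtUri::new`, etc. rather than trusting the prefix):

```rust
// Simplified sketch of parse_string's prefix dispatch. `Kind` and
// `classify` are illustrative stand-ins, not part of jacquard-common;
// the real implementation validates each candidate before committing.
#[derive(Debug, PartialEq)]
enum Kind {
    Did,
    Datetime,
    AtUri,
    HttpsUri,
    Tid,
    Plain,
}

fn classify(s: &str) -> Kind {
    if s.len() < 2048 && s.starts_with("did:") {
        Kind::Did
    } else if s.starts_with("20") && s.ends_with('Z') {
        // probably an RFC 3339 datetime (for the next 75 years)
        Kind::Datetime
    } else if s.starts_with("at://") {
        Kind::AtUri
    } else if s.starts_with("https://") {
        Kind::HttpsUri
    } else if s.len() == 13 && s.chars().all(|c| c.is_ascii_alphanumeric()) {
        // TIDs are exactly 13 base32-sortable characters
        Kind::Tid
    } else {
        Kind::Plain
    }
}

fn main() {
    assert_eq!(classify("did:plc:hbpefio3f5csc44msmbgioxz"), Kind::Did);
    assert_eq!(classify("2025-10-01T17:15:19.282Z"), Kind::Datetime);
    assert_eq!(classify("at://did:plc/app.bsky.feed.post/3m25k3p7lek2k"), Kind::AtUri);
    assert_eq!(classify("hello world"), Kind::Plain);
    println!("ok");
}
```

The ordering matters: each branch is checked at most once, so a string never pays for validators that its prefix already rules out.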
+390
crates/jacquard-common/src/types/value/serde_impl.rs
···
···
+
use core::fmt;
+
use std::{collections::BTreeMap, str::FromStr};
+
+
use base64::{Engine, prelude::BASE64_STANDARD};
+
use bytes::Bytes;
+
use serde::{Deserialize, Deserializer, Serialize, Serializer};
+
use smol_str::SmolStr;
+
+
use crate::{
+
IntoStatic,
+
types::{
+
DataModelType, LexiconStringType,
+
blob::{Blob, MimeType},
+
string::*,
+
value::{
+
Array, AtDataError, Data, Object,
+
parsing::{decode_bytes, infer_from_type, parse_string, string_key_type_guess},
+
},
+
},
+
};
+
+
impl Serialize for Data<'_> {
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
+
where
+
S: Serializer,
+
{
+
match self {
+
Data::Null => serializer.serialize_none(),
+
Data::Boolean(b) => serializer.serialize_bool(*b),
+
Data::Integer(i) => serializer.serialize_i64(*i),
+
Data::String(s) => s.serialize(serializer),
+
Data::Bytes(bytes) => {
+
if serializer.is_human_readable() {
+
// JSON: {"$bytes": "base64 string"}
+
use serde::ser::SerializeMap;
+
let mut map = serializer.serialize_map(Some(1))?;
+
map.serialize_entry("$bytes", &BASE64_STANDARD.encode(bytes))?;
+
map.end()
+
} else {
+
// CBOR: raw bytes
+
serializer.serialize_bytes(bytes)
+
}
+
}
+
Data::CidLink(cid) => {
+
if serializer.is_human_readable() {
+
// JSON: {"$link": "cid_string"}
+
use serde::ser::SerializeMap;
+
let mut map = serializer.serialize_map(Some(1))?;
+
map.serialize_entry("$link", cid.as_str())?;
+
map.end()
+
} else {
+
// CBOR: raw cid (Cid's serialize handles this)
+
cid.serialize(serializer)
+
}
+
}
+
Data::Array(arr) => arr.serialize(serializer),
+
Data::Object(obj) => obj.serialize(serializer),
+
Data::Blob(blob) => blob.serialize(serializer),
+
}
+
}
+
}
+
+
impl<'de> Deserialize<'de> for Data<'de> {
+
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+
where
+
D: Deserializer<'de>,
+
{
+
deserializer.deserialize_any(DataVisitor)
+
}
+
}
+
+
struct DataVisitor;
+
+
impl<'de: 'v, 'v> serde::de::Visitor<'v> for DataVisitor {
+
type Value = Data<'v>;
+
+
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
+
formatter.write_str("any valid AT Protocol data value")
+
}
+
+
fn visit_none<E>(self) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Ok(Data::Null)
+
}
+
+
fn visit_unit<E>(self) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Ok(Data::Null)
+
}
+
+
fn visit_bool<E>(self, v: bool) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Ok(Data::Boolean(v))
+
}
+
+
fn visit_i64<E>(self, v: i64) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Ok(Data::Integer(v))
+
}
+
+
fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Ok(Data::Integer(v as i64))
+
}
+
+
fn visit_f64<E>(self, _v: f64) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Err(E::custom(
+
"floating point numbers not allowed in AT protocol data",
+
))
+
}
+
+
fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Ok(Data::String(AtprotoStr::String(
+
CowStr::Borrowed(v).into_static(),
+
)))
+
}
+
+
fn visit_borrowed_str<E>(self, v: &'v str) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
// Don't infer type here - just store as plain string
+
// Type inference happens in apply_type_inference based on field names
+
Ok(Data::String(AtprotoStr::String(v.into())))
+
}
+
+
fn visit_string<E>(self, v: String) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Ok(Data::String(AtprotoStr::String(v.into())))
+
}
+
+
fn visit_bytes<E>(self, v: &[u8]) -> Result<Self::Value, E>
+
where
+
E: serde::de::Error,
+
{
+
Ok(Data::Bytes(Bytes::copy_from_slice(v)))
+
}
+
+
fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
+
where
+
A: serde::de::SeqAccess<'v>,
+
{
+
let mut array = Vec::new();
+
while let Some(elem) = seq.next_element()? {
+
array.push(elem);
+
}
+
Ok(Data::Array(Array(array)))
+
}
+
+
fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
+
where
+
A: serde::de::MapAccess<'v>,
+
{
+
use serde::de::Error;
+
+
// Peek at first key to check for special single-key patterns
+
let mut temp_map: BTreeMap<SmolStr, Data<'v>> = BTreeMap::new();
+
+
while let Some(key) = map.next_key::<SmolStr>()? {
+
// Check for special patterns on single-key maps
+
if temp_map.is_empty() {
+
if key.as_str() == "$link" {
+
// {"$link": "cid_string"} pattern
+
let cid_str: String = map.next_value()?;
+
// Check if there are more keys
+
if let Some(next_key) = map.next_key::<SmolStr>()? {
+
// More keys, treat as regular object
+
temp_map.insert(key, Data::String(AtprotoStr::String(cid_str.into())));
+
let next_value: Data = map.next_value()?;
+
temp_map.insert(next_key, next_value);
+
continue;
+
} else {
+
// Only key, return CidLink
+
return Ok(Data::CidLink(Cid::from(cid_str)));
+
}
+
} else if key.as_str() == "$bytes" {
+
// {"$bytes": "base64_string"} pattern
+
let bytes_str: String = map.next_value()?;
+
// Check if there are more keys
+
if map.next_key::<SmolStr>()?.is_some() {
+
// More keys, treat as regular object - shouldn't happen but handle it
+
temp_map.insert(key, Data::String(AtprotoStr::String(bytes_str.into())));
+
continue;
+
} else {
+
// Only key, decode and return bytes
+
return Ok(decode_bytes(&bytes_str));
+
}
+
}
+
}
+
+
let value: Data = map.next_value()?;
+
temp_map.insert(key, value);
+
}
+
+
// Second pass: apply type inference and check for special patterns
+
apply_type_inference(temp_map).map_err(A::Error::custom)
+
}
+
}
+
+
fn apply_type_inference<'s>(mut map: BTreeMap<SmolStr, Data<'s>>) -> Result<Data<'s>, AtDataError> {
+
// Check for CID link pattern first: {"$link": "cid_string"}
+
if map.len() == 1 {
+
if let Some(Data::String(AtprotoStr::String(link))) = map.get("$link") {
+
// Need to extract ownership, can't borrow from map we're about to consume
+
let link_owned = link.clone();
+
return Ok(Data::CidLink(Cid::cow_str(link_owned)));
+
}
+
}
+
+
// Check for $type field to detect special structures
+
let type_field = map.get("$type").and_then(|v| {
+
if let Data::String(AtprotoStr::String(s)) = v {
+
Some(s.as_ref())
+
} else {
+
None
+
}
+
});
+
+
// Check for blob
+
if let Some(type_str) = type_field {
+
if type_str == "blob" && infer_from_type(type_str) == DataModelType::Blob {
+
// Try to construct blob from the collected data
+
let ref_cid = map.get("ref").and_then(|v| {
+
if let Data::CidLink(cid) = v {
+
Some(cid.clone())
+
} else {
+
None
+
}
+
});
+
+
let mime_type = map.get("mimeType").and_then(|v| {
+
if let Data::String(AtprotoStr::String(s)) = v {
+
Some(s.clone())
+
} else {
+
None
+
}
+
});
+
+
let size = map.get("size").and_then(|v| {
+
if let Data::Integer(i) = v {
+
Some(*i as usize)
+
} else {
+
None
+
}
+
});
+
+
if let (Some(ref_cid), Some(mime_cowstr), Some(size)) = (ref_cid, mime_type, size) {
+
return Ok(Data::Blob(Blob {
+
r#ref: ref_cid,
+
mime_type: MimeType::from(mime_cowstr),
+
size,
+
}));
+
}
+
}
+
}
+
+
// Apply type inference for string fields based on key names (mutate in place)
+
for (key, value) in map.iter_mut() {
+
if let Data::String(AtprotoStr::String(s)) = value.to_owned() {
+
let type_hint = string_key_type_guess(key.as_str());
+
let refined = match type_hint {
+
DataModelType::String(string_type) => refine_string_by_type(s, string_type),
+
DataModelType::Bytes => {
+
// Decode base64
+
decode_bytes(&s)
+
}
+
DataModelType::CidLink if key.as_str() == "$link" => {
+
Data::CidLink(Cid::from_str(&s).unwrap())
+
}
+
_ => continue, // no refinement needed
+
};
+
*value = refined;
+
}
+
}
+
+
Ok(Data::Object(Object(map)))
+
}
+
+
fn refine_string_by_type<'s>(s: CowStr<'s>, string_type: LexiconStringType) -> Data<'s> {
+
match string_type {
+
LexiconStringType::Datetime => Datetime::from_str(&s)
+
.map(|dt| Data::String(AtprotoStr::Datetime(dt)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::AtUri => AtUri::new_owned(s.clone())
+
.map(|uri| Data::String(AtprotoStr::AtUri(uri)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::Did => Did::new_owned(s.clone())
+
.map(|did| Data::String(AtprotoStr::Did(did)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::Handle => Handle::new_owned(s.clone())
+
.map(|handle| Data::String(AtprotoStr::Handle(handle)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::AtIdentifier => AtIdentifier::new_owned(s.clone())
+
.map(|ident| Data::String(AtprotoStr::AtIdentifier(ident)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::Nsid => Nsid::new_owned(s.clone())
+
.map(|nsid| Data::String(AtprotoStr::Nsid(nsid)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::Cid => Cid::new_owned(s.as_bytes())
+
.map(|cid| Data::String(AtprotoStr::Cid(cid)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.into()))),
+
LexiconStringType::Language => Language::new(&s)
+
.map(|lang| Data::String(AtprotoStr::Language(lang)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::Tid => Tid::new(s.clone())
+
.map(|tid| Data::String(AtprotoStr::Tid(tid)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::RecordKey => Rkey::new_owned(s.clone())
+
.map(|rkey| Data::String(AtprotoStr::RecordKey(RecordKey::from(rkey))))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::Uri(_) => Uri::new_owned(s.clone())
+
.map(|uri| Data::String(AtprotoStr::Uri(uri)))
+
.unwrap_or_else(|_| Data::String(AtprotoStr::String(s.clone()))),
+
LexiconStringType::String => Data::String(parse_string(&s).into_static()),
+
}
+
}
+
+
impl Serialize for Array<'_> {
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
+
where
+
S: Serializer,
+
{
+
use serde::ser::SerializeSeq;
+
let mut seq = serializer.serialize_seq(Some(self.0.len()))?;
+
for item in &self.0 {
+
seq.serialize_element(item)?;
+
}
+
seq.end()
+
}
+
}
+
+
impl<'de> Deserialize<'de> for Array<'de> {
+
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+
where
+
D: Deserializer<'de>,
+
{
+
// Just deserialize as Vec<Data> directly - the Data visitor handles everything
+
let vec: Vec<Data<'de>> = Deserialize::deserialize(deserializer)?;
+
Ok(Array(vec))
+
}
+
}
+
+
impl Serialize for Object<'_> {
+
fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
+
where
+
S: Serializer,
+
{
+
use serde::ser::SerializeMap;
+
let mut map = serializer.serialize_map(Some(self.0.len()))?;
+
for (key, value) in &self.0 {
+
map.serialize_entry(key.as_str(), value)?;
+
}
+
map.end()
+
}
+
}
+
+
impl<'de> Deserialize<'de> for Object<'de> {
+
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
+
where
+
D: Deserializer<'de>,
+
{
+
use serde::de::Error;
+
+
// Deserialize via Data, then extract the Object
+
// The Data visitor handles all the type inference and special cases
+
let data: Data<'de> = Data::deserialize(deserializer)?;
+
match data {
+
Data::Object(obj) => Ok(obj),
+
_ => Err(D::Error::custom("expected object, got something else")),
+
}
+
}
+
}
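The visitor above special-cases the atproto JSON envelopes: a map whose only key is `$link` deserializes to a CID link, and one whose only key is `$bytes` decodes to raw bytes; any additional key demotes it back to an ordinary object. A self-contained sketch of that detection rule, using a hypothetical `Envelope` enum in place of the crate's `Data` and plain `String` values in place of parsed data:

```rust
use std::collections::BTreeMap;

// Sketch of the single-key envelope rule from the Data visitor.
// `Envelope` and `detect` are illustrative stand-ins; the real visitor
// streams keys from a MapAccess and also decodes the base64 payload.
#[derive(Debug, PartialEq)]
enum Envelope {
    CidLink(String),
    Bytes(String),
    Object(BTreeMap<String, String>),
}

fn detect(map: BTreeMap<String, String>) -> Envelope {
    // Envelopes only count when the special key is the sole entry.
    if map.len() == 1 {
        if let Some(v) = map.get("$link") {
            return Envelope::CidLink(v.clone());
        }
        if let Some(v) = map.get("$bytes") {
            return Envelope::Bytes(v.clone());
        }
    }
    Envelope::Object(map)
}

fn main() {
    let mut link = BTreeMap::new();
    link.insert("$link".to_string(), "bafyexample".to_string());
    assert_eq!(detect(link), Envelope::CidLink("bafyexample".to_string()));

    // A second key means it is an ordinary object, not an envelope.
    let mut obj = BTreeMap::new();
    obj.insert("$link".to_string(), "bafyexample".to_string());
    obj.insert("size".to_string(), "3".to_string());
    assert!(matches!(detect(obj), Envelope::Object(_)));
    println!("ok");
}
```

Keeping the single-key check up front is what lets the streaming visitor bail out early on `{"$link": …}` without buffering the whole map.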
+364
crates/jacquard-common/src/types/value/test_thread.json
···
···
+
{
+
"hasOtherReplies": false,
+
"thread": [
+
{
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k",
+
"depth": 0,
+
"value": {
+
"$type": "app.bsky.unspecced.defs#threadItemPost",
+
"post": {
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k",
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
+
"author": {
+
"did": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"handle": "sharonk.bsky.social",
+
"displayName": "Sharon",
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:hbpefio3f5csc44msmbgioxz/bafkreia7dcruptjvvv7t46322zqsuqukkwblihzrm3f45r246o5zjulyn4@jpeg",
+
"associated": {
+
"chat": { "allowIncoming": "following" },
+
"activitySubscription": { "allowSubscriptions": "mutuals" }
+
},
+
"viewer": {
+
"muted": false,
+
"blockedBy": false,
+
"following": "at://did:plc:yfvwmnlztr4dwkb7hwz55r2g/app.bsky.graph.follow/3l6aft4mu2324",
+
"followedBy": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.graph.follow/3kygkag25zo2v"
+
},
+
"labels": [
+
{
+
"cts": "2024-05-11T03:48:55.341Z",
+
"src": "did:plc:e4elbtctnfqocyfcml6h2lf7",
+
"uri": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"val": "bluesky-elder",
+
"ver": 1
+
},
+
{
+
"src": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.actor.profile/self",
+
"cid": "bafyreihtzylnytd2224tatvvpktt5rwvdwppkyhlsrweshlrmnqcq32vsy",
+
"val": "!no-unauthenticated",
+
"cts": "1970-01-01T00:00:00.000Z"
+
}
+
],
+
"createdAt": "2023-04-13T22:29:27.076Z"
+
},
+
"record": {
+
"$type": "app.bsky.feed.post",
+
"createdAt": "2025-10-01T17:15:19.282Z",
+
"embed": {
+
"$type": "app.bsky.embed.record",
+
"record": {
+
"cid": "bafyreidmo5ot3qoctmgw2vcckrqzy5hexocp5vto554a5kprwxhbsy3oqi",
+
"uri": "at://did:plc:2whlowi5jjjqrdrrj4lxh2lx/app.bsky.feed.post/3m25ixj2fec2a"
+
}
+
},
+
"langs": ["en"],
+
"text": "Sora 2 going to hit boomer epistemology like a hurricane"
+
},
+
"embed": {
+
"$type": "app.bsky.embed.record#view",
+
"record": {
+
"$type": "app.bsky.embed.record#viewRecord",
+
"uri": "at://did:plc:2whlowi5jjjqrdrrj4lxh2lx/app.bsky.feed.post/3m25ixj2fec2a",
+
"cid": "bafyreidmo5ot3qoctmgw2vcckrqzy5hexocp5vto554a5kprwxhbsy3oqi",
+
"author": {
+
"did": "did:plc:2whlowi5jjjqrdrrj4lxh2lx",
+
"handle": "eliothiggins.bsky.social",
+
"displayName": "Eliot Higgins",
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:2whlowi5jjjqrdrrj4lxh2lx/bafkreiarcjakxx7hkgtfocqilj22vxgmyskl43blurk6vwu2nfvd5ihueu@jpeg",
+
"associated": { "activitySubscription": { "allowSubscriptions": "followers" } },
+
"viewer": { "muted": false, "blockedBy": false },
+
"labels": [],
+
"createdAt": "2024-11-06T08:20:47.084Z",
+
"verification": {
+
"verifications": [
+
{
+
"issuer": "did:plc:z72i7hdynmk6r22z27h6tvur",
+
"uri": "at://did:plc:z72i7hdynmk6r22z27h6tvur/app.bsky.graph.verification/3lv44jce72j2q",
+
"isValid": true,
+
"createdAt": "2025-07-29T12:33:45.033Z"
+
}
+
],
+
"verifiedStatus": "valid",
+
"trustedVerifierStatus": "none"
+
}
+
},
+
"value": {
+
"$type": "app.bsky.feed.post",
+
"createdAt": "2025-10-01T16:55:04.861Z",
+
"embed": {
+
"$type": "app.bsky.embed.video",
+
"aspectRatio": { "height": 720, "width": 1280 },
+
"video": {
+
"$type": "blob",
+
"ref": { "$link": "bafkreid7ybejd5s2vv2j7d4aajjlmdgazguemcnuliiyfn6coxpwp2mi6y" },
+
"mimeType": "video/mp4",
+
"size": 2244592
+
}
+
},
+
"langs": ["en"],
+
"text": "Really good news for fans of garbage in their timelines, Sora 2 allows anyone to use copyrighted characters to sell you cryptocurrency."
+
},
+
"labels": [],
+
"likeCount": 248,
+
"replyCount": 18,
+
"repostCount": 67,
+
"quoteCount": 52,
+
"indexedAt": "2025-10-01T16:55:06.653Z",
+
"embeds": [
+
{
+
"$type": "app.bsky.embed.video#view",
+
"cid": "bafkreid7ybejd5s2vv2j7d4aajjlmdgazguemcnuliiyfn6coxpwp2mi6y",
+
"playlist": "https://video.bsky.app/watch/did%3Aplc%3A2whlowi5jjjqrdrrj4lxh2lx/bafkreid7ybejd5s2vv2j7d4aajjlmdgazguemcnuliiyfn6coxpwp2mi6y/playlist.m3u8",
+
"thumbnail": "https://video.bsky.app/watch/did%3Aplc%3A2whlowi5jjjqrdrrj4lxh2lx/bafkreid7ybejd5s2vv2j7d4aajjlmdgazguemcnuliiyfn6coxpwp2mi6y/thumbnail.jpg",
+
"aspectRatio": { "height": 720, "width": 1280 }
+
}
+
]
+
}
+
},
+
"bookmarkCount": 3,
+
"replyCount": 1,
+
"repostCount": 7,
+
"likeCount": 76,
+
"quoteCount": 1,
+
"indexedAt": "2025-10-01T17:15:19.523Z",
+
"viewer": { "bookmarked": false, "threadMuted": false, "replyDisabled": false, "embeddingDisabled": false },
+
"labels": [],
+
"threadgate": {
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.threadgate/3m25k3p7lek2k",
+
"cid": "bafyreiclfmvpqfsfhpl4cffaw6elfdusx5wosinugljpwy2u3ckjse3rse",
+
"record": {
+
"$type": "app.bsky.feed.threadgate",
+
"allow": [
+
{ "$type": "app.bsky.feed.threadgate#followingRule" },
+
{ "$type": "app.bsky.feed.threadgate#mentionRule" },
+
{ "$type": "app.bsky.feed.threadgate#followerRule" }
+
],
+
"createdAt": "2025-10-01T17:15:19.285Z",
+
"hiddenReplies": [],
+
"post": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
+
},
+
"lists": []
+
}
+
},
+
"moreParents": false,
+
"moreReplies": 0,
+
"opThread": true,
+
"hiddenByThreadgate": false,
+
"mutedByViewer": false
+
}
+
},
+
{
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k6zjhps2k",
+
"depth": 1,
+
"value": {
+
"$type": "app.bsky.unspecced.defs#threadItemPost",
+
"post": {
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k6zjhps2k",
+
"cid": "bafyreieqxxi7nwep5nuhogkv3tgub4rk4pv5tbh3m6yyf66nqdmmrknwsa",
+
"author": {
+
"did": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"handle": "sharonk.bsky.social",
+
"displayName": "Sharon",
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:hbpefio3f5csc44msmbgioxz/bafkreia7dcruptjvvv7t46322zqsuqukkwblihzrm3f45r246o5zjulyn4@jpeg",
+
"associated": {
+
"chat": { "allowIncoming": "following" },
+
"activitySubscription": { "allowSubscriptions": "mutuals" }
+
},
+
"viewer": {
+
"muted": false,
+
"blockedBy": false,
+
"following": "at://did:plc:yfvwmnlztr4dwkb7hwz55r2g/app.bsky.graph.follow/3l6aft4mu2324",
+
"followedBy": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.graph.follow/3kygkag25zo2v"
+
},
+
"labels": [
+
{
+
"cts": "2024-05-11T03:48:55.341Z",
+
"src": "did:plc:e4elbtctnfqocyfcml6h2lf7",
+
"uri": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"val": "bluesky-elder",
+
"ver": 1
+
},
+
{
+
"src": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.actor.profile/self",
+
"cid": "bafyreihtzylnytd2224tatvvpktt5rwvdwppkyhlsrweshlrmnqcq32vsy",
+
"val": "!no-unauthenticated",
+
"cts": "1970-01-01T00:00:00.000Z"
+
}
+
],
+
"createdAt": "2023-04-13T22:29:27.076Z"
+
},
+
"record": {
+
"$type": "app.bsky.feed.post",
+
"createdAt": "2025-10-01T17:17:10.755Z",
+
"langs": ["en"],
+
"reply": {
+
"parent": {
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
+
},
+
"root": {
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
+
}
+
},
+
"text": "it's funny how these all feel like stuff that would be playing on the Cyberpunk 2077 mediafeeds"
+
},
+
"bookmarkCount": 0,
+
"replyCount": 2,
+
"repostCount": 1,
+
"likeCount": 28,
+
"quoteCount": 0,
+
"indexedAt": "2025-10-01T17:17:11.164Z",
+
"viewer": { "bookmarked": false, "threadMuted": false, "replyDisabled": false, "embeddingDisabled": false },
+
"labels": []
+
},
+
"moreParents": false,
+
"moreReplies": 0,
+
"opThread": true,
+
"hiddenByThreadgate": false,
+
"mutedByViewer": false
+
}
+
},
+
{
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25l6doydc26",
+
"depth": 2,
+
"value": {
+
"$type": "app.bsky.unspecced.defs#threadItemPost",
+
"post": {
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25l6doydc26",
+
"cid": "bafyreia7yd5gbxdw4djkmocr6sbg2bgczup6d6pdmctsspdrne3vhcll6q",
+
"author": {
+
"did": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"handle": "sharonk.bsky.social",
+
"displayName": "Sharon",
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:hbpefio3f5csc44msmbgioxz/bafkreia7dcruptjvvv7t46322zqsuqukkwblihzrm3f45r246o5zjulyn4@jpeg",
+
"associated": {
+
"chat": { "allowIncoming": "following" },
+
"activitySubscription": { "allowSubscriptions": "mutuals" }
+
},
+
"viewer": {
+
"muted": false,
+
"blockedBy": false,
+
"following": "at://did:plc:yfvwmnlztr4dwkb7hwz55r2g/app.bsky.graph.follow/3l6aft4mu2324",
+
"followedBy": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.graph.follow/3kygkag25zo2v"
+
},
+
"labels": [
+
{
+
"cts": "2024-05-11T03:48:55.341Z",
+
"src": "did:plc:e4elbtctnfqocyfcml6h2lf7",
+
"uri": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"val": "bluesky-elder",
+
"ver": 1
+
},
+
{
+
"src": "did:plc:hbpefio3f5csc44msmbgioxz",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.actor.profile/self",
+
"cid": "bafyreihtzylnytd2224tatvvpktt5rwvdwppkyhlsrweshlrmnqcq32vsy",
+
"val": "!no-unauthenticated",
+
"cts": "1970-01-01T00:00:00.000Z"
+
}
+
],
+
"createdAt": "2023-04-13T22:29:27.076Z"
+
},
+
"record": {
+
"$type": "app.bsky.feed.post",
+
"createdAt": "2025-10-01T17:34:41.609Z",
+
"langs": ["en"],
+
"reply": {
+
"parent": {
+
"cid": "bafyreieqxxi7nwep5nuhogkv3tgub4rk4pv5tbh3m6yyf66nqdmmrknwsa",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k6zjhps2k"
+
},
+
"root": {
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
+
}
+
},
+
"text": "\"Morty, I've turned myself into a rugpull, Morty!\""
+
},
+
"bookmarkCount": 0,
+
"replyCount": 0,
+
"repostCount": 0,
+
"likeCount": 3,
+
"quoteCount": 0,
+
"indexedAt": "2025-10-01T17:34:41.858Z",
+
"viewer": { "bookmarked": false, "threadMuted": false, "replyDisabled": false, "embeddingDisabled": false },
+
"labels": []
+
},
+
"moreParents": false,
+
"moreReplies": 0,
+
"opThread": true,
+
"hiddenByThreadgate": false,
+
"mutedByViewer": false
+
}
+
},
+
{
+
"uri": "at://did:plc:duaatzbzy7qm4ppl2hluilpg/app.bsky.feed.post/3m25knxro3s2t",
+
"depth": 2,
+
"value": {
+
"$type": "app.bsky.unspecced.defs#threadItemPost",
+
"post": {
+
"uri": "at://did:plc:duaatzbzy7qm4ppl2hluilpg/app.bsky.feed.post/3m25knxro3s2t",
+
"cid": "bafyreidppmo5vaepnafjlwc5doxmk7n3hrctve6cmcz67hhkcuxf5cr6tm",
+
"author": {
+
"did": "did:plc:duaatzbzy7qm4ppl2hluilpg",
+
"handle": "gentlemanengineer.bsky.social",
+
"displayName": "Chris Magerkurth",
+
"avatar": "https://cdn.bsky.app/img/avatar/plain/did:plc:duaatzbzy7qm4ppl2hluilpg/bafkreifqxgriyccwjt5pwahsx6sju2bdmtoytqyubxoxlfwk4rv3eedkia@jpeg",
+
"associated": { "activitySubscription": { "allowSubscriptions": "followers" } },
+
"viewer": { "muted": false, "blockedBy": false },
+
"labels": [],
+
"createdAt": "2025-02-22T02:03:41.644Z"
+
},
+
"record": {
+
"$type": "app.bsky.feed.post",
+
"createdAt": "2025-10-01T17:25:32.243Z",
+
"langs": ["en"],
+
"reply": {
+
"parent": {
+
"cid": "bafyreieqxxi7nwep5nuhogkv3tgub4rk4pv5tbh3m6yyf66nqdmmrknwsa",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k6zjhps2k"
+
},
+
"root": {
+
"cid": "bafyreicvplbzmlrbwdxv2zpbhibziexxnwmiskvzbapjtnidofzlh4yk64",
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
+
}
+
},
+
"text": "Or on Interdimensional Cable from Rick and Morty."
+
},
+
"bookmarkCount": 0,
+
"replyCount": 0,
+
"repostCount": 0,
+
"likeCount": 1,
+
"quoteCount": 0,
+
"indexedAt": "2025-10-01T17:25:25.858Z",
+
"viewer": { "bookmarked": false, "threadMuted": false, "replyDisabled": false, "embeddingDisabled": false },
+
"labels": []
+
},
+
"moreParents": false,
+
"moreReplies": 0,
+
"opThread": false,
+
"hiddenByThreadgate": false,
+
"mutedByViewer": false
+
}
+
}
+
],
+
"threadgate": {
+
"uri": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.threadgate/3m25k3p7lek2k",
+
"cid": "bafyreiclfmvpqfsfhpl4cffaw6elfdusx5wosinugljpwy2u3ckjse3rse",
+
"record": {
+
"$type": "app.bsky.feed.threadgate",
+
"allow": [
+
{ "$type": "app.bsky.feed.threadgate#followingRule" },
+
{ "$type": "app.bsky.feed.threadgate#mentionRule" },
+
{ "$type": "app.bsky.feed.threadgate#followerRule" }
+
],
+
"createdAt": "2025-10-01T17:15:19.285Z",
+
"hiddenReplies": [],
+
"post": "at://did:plc:hbpefio3f5csc44msmbgioxz/app.bsky.feed.post/3m25k3p7lek2k"
+
},
+
"lists": []
+
}
+
}
+305
crates/jacquard-common/src/types/value/tests.rs
···
···
+
use super::*;
+
+
/// Canonicalize JSON by sorting object keys recursively
+
fn canonicalize_json(value: &serde_json::Value) -> serde_json::Value {
+
match value {
+
serde_json::Value::Object(map) => {
+
let mut sorted_map = serde_json::Map::new();
+
let mut keys: Vec<_> = map.keys().collect();
+
keys.sort();
+
for key in keys {
+
sorted_map.insert(key.clone(), canonicalize_json(&map[key]));
+
}
+
serde_json::Value::Object(sorted_map)
+
}
+
serde_json::Value::Array(arr) => {
+
serde_json::Value::Array(arr.iter().map(canonicalize_json).collect())
+
}
+
other => other.clone(),
+
}
+
}
+
+
#[test]
+
fn serialize_deserialize_null() {
+
let data = Data::Null;
+
+
// JSON roundtrip
+
let json = serde_json::to_string(&data).unwrap();
+
assert_eq!(json, "null");
+
let parsed: Data = serde_json::from_str(&json).unwrap();
+
assert_eq!(data, parsed);
+
assert!(matches!(parsed, Data::Null));
+
}
+
+
#[test]
+
fn serialize_deserialize_boolean() {
+
let data = Data::Boolean(true);
+
+
let json = serde_json::to_string(&data).unwrap();
+
assert_eq!(json, "true");
+
let parsed: Data = serde_json::from_str(&json).unwrap();
+
assert_eq!(data, parsed);
+
}
+
+
#[test]
+
fn serialize_deserialize_integer() {
+
let data = Data::Integer(42);
+
+
let json = serde_json::to_string(&data).unwrap();
+
assert_eq!(json, "42");
+
let parsed: Data = serde_json::from_str(&json).unwrap();
+
assert_eq!(data, parsed);
+
}
+
+
#[test]
+
fn serialize_deserialize_string() {
+
let data = Data::String(AtprotoStr::String("hello world".into()));
+
+
let json = serde_json::to_string(&data).unwrap();
+
assert_eq!(json, r#""hello world""#);
+
let parsed: Data = serde_json::from_str(&json).unwrap();
+
assert_eq!(data, parsed);
+
}
+
+
#[test]
+
fn serialize_deserialize_bytes_json() {
+
let data = Data::Bytes(Bytes::from_static(b"hello"));
+
+
// JSON: should be {"$bytes": "base64"}
+
let json = serde_json::to_string(&data).unwrap();
+
assert!(json.contains("$bytes"));
+
assert!(json.contains("aGVsbG8=")); // base64("hello")
+
+
let parsed: Data = serde_json::from_str(&json).unwrap();
+
assert_eq!(data, parsed);
+
}
+
+
#[test]
+
fn serialize_deserialize_cid_link_json() {
+
let data = Data::CidLink(Cid::str("bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"));
+
+
// JSON: should be {"$link": "cid_string"}
+
let json = serde_json::to_string(&data).unwrap();
+
assert!(json.contains("$link"));
+
assert!(json.contains("bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"));
+
+
let parsed: Data = serde_json::from_str(&json).unwrap();
+
match parsed {
+
Data::CidLink(cid) => assert_eq!(cid.as_str(), "bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"),
+
_ => panic!("expected CidLink"),
+
}
+
}
+
+
#[test]
+
fn serialize_deserialize_array() {
+
let data = Data::Array(Array(vec![
+
Data::Null,
+
Data::Boolean(true),
+
Data::Integer(42),
+
Data::String(AtprotoStr::String("test".into())),
+
]));
+
+
let json = serde_json::to_string(&data).unwrap();
+
let parsed: Data = serde_json::from_str(&json).unwrap();
+
assert_eq!(data, parsed);
+
+
// Verify structure
+
if let Data::Array(Array(items)) = parsed {
+
assert_eq!(items.len(), 4);
+
assert!(matches!(items[0], Data::Null));
+
assert!(matches!(items[1], Data::Boolean(true)));
+
assert!(matches!(items[2], Data::Integer(42)));
+
if let Data::String(AtprotoStr::String(s)) = &items[3] {
+
assert_eq!(s.as_ref(), "test");
+
} else {
+
panic!("expected plain string");
+
}
+
} else {
+
panic!("expected array");
+
}
+
}
+
+
#[test]
+
fn serialize_deserialize_object() {
+
let mut map = BTreeMap::new();
+
map.insert("name".to_smolstr(), Data::String(AtprotoStr::String("alice".into())));
+
map.insert("age".to_smolstr(), Data::Integer(30));
+
map.insert("active".to_smolstr(), Data::Boolean(true));
+
+
let data = Data::Object(Object(map));
+
+
let json = serde_json::to_string(&data).unwrap();
+
let parsed: Data = serde_json::from_str(&json).unwrap();
+
assert_eq!(data, parsed);
+
}
+
+
#[test]
+
fn type_inference_datetime() {
+
// Field name "createdAt" should infer datetime type
+
let json = r#"{"createdAt": "2023-01-15T12:30:45.123456Z"}"#;
+
let data: Data = serde_json::from_str(json).unwrap();
+
+
if let Data::Object(obj) = data {
+
if let Some(Data::String(AtprotoStr::Datetime(dt))) = obj.0.get("createdAt") {
+
// Verify it's actually parsed correctly
+
assert_eq!(dt.as_str(), "2023-01-15T12:30:45.123456Z");
+
} else {
+
panic!("createdAt should be parsed as Datetime");
+
}
+
} else {
+
panic!("expected object");
+
}
+
}
+
+
#[test]
+
fn type_inference_did() {
+
let json = r#"{"did": "did:plc:abc123"}"#;
+
let data: Data = serde_json::from_str(json).unwrap();
+
+
if let Data::Object(obj) = data {
+
if let Some(Data::String(AtprotoStr::Did(did))) = obj.0.get("did") {
+
assert_eq!(did.as_str(), "did:plc:abc123");
+
} else {
+
panic!("did should be parsed as Did");
+
}
+
} else {
+
panic!("expected object");
+
}
+
}
+
+
#[test]
+
fn type_inference_uri() {
+
let json = r#"{"uri": "at://alice.test/com.example.foo/123"}"#;
+
let data: Data = serde_json::from_str(json).unwrap();
+
+
if let Data::Object(obj) = data {
+
// "uri" field gets inferred as Uri type, but at:// should parse to AtUri
+
match obj.0.get("uri") {
+
Some(Data::String(AtprotoStr::AtUri(_))) | Some(Data::String(AtprotoStr::Uri(_))) => {
+
// Success
+
}
+
_ => panic!("uri should be parsed as Uri or AtUri"),
+
}
+
} else {
+
panic!("expected object");
+
}
+
}
+
+
#[test]
+
fn blob_deserialization() {
+
let json = r#"{
+
"$type": "blob",
+
"ref": {"$link": "bafyreih4g7bvo6hdq2juolev5bfzpbo4ewkxh5mzxwgvkjp3kitc6hqkha"},
+
"mimeType": "image/png",
+
"size": 12345
+
}"#;
+
+
let data: Data = serde_json::from_str(json).unwrap();
+
+
if let Data::Blob(blob) = data {
+
assert_eq!(blob.mime_type.as_str(), "image/png");
+
assert_eq!(blob.size, 12345);
+
} else {
+
panic!("expected blob");
+
}
+
}
+
+
#[test]
+
fn reject_floats() {
+
let json = "42.5"; // float literal
+
+
let result: Result<Data, _> = serde_json::from_str(json);
+
assert!(result.is_err());
+
}
+
+
#[test]
+
fn nested_objects() {
+
let json = r#"{
+
"user": {
+
"name": "alice",
+
"profile": {
+
"bio": "test bio",
+
"createdAt": "2023-01-15T12:30:45Z"
+
}
+
}
+
}"#;
+
+
let data: Data = serde_json::from_str(json).unwrap();
+
+
// Should successfully parse with nested type inference
+
if let Data::Object(obj) = data {
+
assert!(obj.0.contains_key("user"));
+
} else {
+
panic!("expected object");
+
}
+
}
+
+
#[test]
+
fn integration_bluesky_thread() {
+
// Real bluesky thread data with complex nested structures
+
let json = include_str!("test_thread.json");
+
let data: Data = serde_json::from_str(json).unwrap();
+
+
// Verify top-level structure
+
if let Data::Object(obj) = data {
+
// Should have "thread" array
+
assert!(obj.0.contains_key("thread"));
+
+
// Verify thread is an array
+
if let Some(Data::Array(thread)) = obj.0.get("thread") {
+
assert!(!thread.0.is_empty());
+
+
// Check first thread item
+
if let Some(Data::Object(item)) = thread.0.first() {
+
// Should have "uri" field parsed as AtUri
+
if let Some(Data::String(AtprotoStr::AtUri(uri))) = item.0.get("uri") {
+
assert!(uri.as_str().starts_with("at://did:plc:"));
+
}
+
+
// Should have "value" object
+
if let Some(Data::Object(value)) = item.0.get("value") {
+
// Should have post object
+
if let Some(Data::Object(post)) = value.0.get("post") {
+
// CID should be parsed as Cid
+
if let Some(Data::String(AtprotoStr::Cid(cid))) = post.0.get("cid") {
+
assert!(cid.as_str().starts_with("bafy"));
+
}
+
+
// Author should have DID
+
if let Some(Data::Object(author)) = post.0.get("author") {
+
if let Some(Data::String(AtprotoStr::Did(did))) = author.0.get("did") {
+
assert!(did.as_str().starts_with("did:plc:"));
+
}
+
+
// createdAt should be parsed as Datetime
+
if let Some(Data::String(AtprotoStr::Datetime(_))) =
+
author.0.get("createdAt")
+
{
+
// Success
+
} else {
+
panic!("author.createdAt should be Datetime");
+
}
+
}
+
}
+
}
+
}
+
} else {
+
panic!("thread should be an array");
+
}
+
+
// Verify serialization produces same JSON structure
+
let serialized = serde_json::to_string(&obj).unwrap();
+
+
// Parse both as generic serde_json::Value to compare structure
+
let original_value: serde_json::Value = serde_json::from_str(json).unwrap();
+
let serialized_value: serde_json::Value = serde_json::from_str(&serialized).unwrap();
+
+
// Canonicalize by sorting keys
+
let original_canonical = canonicalize_json(&original_value);
+
let serialized_canonical = canonicalize_json(&serialized_value);
+
+
assert_eq!(original_canonical, serialized_canonical, "Serialized JSON should match original structure");
+
} else {
+
panic!("expected top-level object");
+
}
+
}
+29
crates/jacquard-derive/Cargo.toml
···
···
+
[package]
+
name = "jacquard-derive"
+
edition.workspace = true
+
version.workspace = true
+
authors.workspace = true
+
repository.workspace = true
+
keywords.workspace = true
+
categories.workspace = true
+
readme.workspace = true
+
documentation.workspace = true
+
exclude.workspace = true
+
description.workspace = true
+
+
[lib]
+
proc-macro = true
+
+
[dependencies]
+
heck = "0.5.0"
+
itertools = "0.14.0"
+
jacquard-common = { version = "0.1.0", path = "../jacquard-common" }
+
jacquard-lexicon = { version = "0.1.0", path = "../jacquard-lexicon" }
+
prettyplease = "0.2.37"
+
proc-macro2 = "1.0.101"
+
quote = "1.0.41"
+
serde = { version = "1.0.228", features = ["derive"] }
+
serde_json = "1.0.145"
+
serde_repr = "0.1.20"
+
serde_with = "3.14.1"
+
syn = "2.0.106"
+117
crates/jacquard-derive/src/lib.rs
···
···
+
use proc_macro::TokenStream;
+
use quote::quote;
+
use syn::{Data, DeriveInput, Fields, parse_macro_input};
+
+
/// Attribute macro that adds an `extra_data` field to structs to capture unknown fields
+
/// during deserialization.
+
///
+
/// # Example
+
/// ```ignore
+
/// #[lexicon]
+
/// struct Post<'s> {
+
/// text: &'s str,
+
/// }
+
/// // Expands to:
+
/// // struct Post<'s> {
+
/// // text: &'s str,
+
/// // #[serde(flatten)]
+
/// // pub extra_data: BTreeMap<SmolStr, Data<'s>>,
+
/// // }
+
/// ```
+
#[proc_macro_attribute]
+
pub fn lexicon(_attr: TokenStream, item: TokenStream) -> TokenStream {
+
let mut input = parse_macro_input!(item as DeriveInput);
+
+
match &mut input.data {
+
Data::Struct(data_struct) => {
+
if let Fields::Named(fields) = &mut data_struct.fields {
+
// Check if extra_data field already exists
+
let has_extra_data = fields
+
.named
+
.iter()
+
.any(|f| f.ident.as_ref().map(|i| i == "extra_data").unwrap_or(false));
+
+
if !has_extra_data {
+
// Determine the lifetime parameter to use
+
let lifetime = if let Some(lt) = input.generics.lifetimes().next() {
+
quote! { #lt }
+
} else {
+
quote! { 'static }
+
};
+
+
// Add the extra_data field
+
let new_field: syn::Field = syn::parse_quote! {
+
#[serde(flatten)]
+
pub extra_data: ::std::collections::BTreeMap<
+
::jacquard_common::smol_str::SmolStr,
+
::jacquard_common::types::value::Data<#lifetime>
+
>
+
};
+
fields.named.push(new_field);
+
}
+
} else {
+
return syn::Error::new_spanned(
+
input,
+
"lexicon attribute can only be used on structs with named fields",
+
)
+
.to_compile_error()
+
.into();
+
}
+
+
quote! { #input }.into()
+
}
+
_ => syn::Error::new_spanned(input, "lexicon attribute can only be used on structs")
+
.to_compile_error()
+
.into(),
+
}
+
}
+
+
/// Attribute macro that adds an untagged `Unknown(Data)` variant to enums, making them open unions.
+
///
+
/// # Example
+
/// ```ignore
+
/// #[open_union]
+
/// enum RecordEmbed<'s> {
+
/// #[serde(rename = "app.bsky.embed.images")]
+
/// Images(Images),
+
/// }
+
/// // Expands to:
+
/// // enum RecordEmbed<'s> {
+
/// // #[serde(rename = "app.bsky.embed.images")]
+
/// // Images(Images),
+
/// // #[serde(untagged)]
+
/// // Unknown(Data<'s>),
+
/// // }
+
/// ```
+
#[proc_macro_attribute]
+
pub fn open_union(_attr: TokenStream, item: TokenStream) -> TokenStream {
+
let mut input = parse_macro_input!(item as DeriveInput);
+
+
match &mut input.data {
+
Data::Enum(data_enum) => {
+
// Check if an `Unknown` variant already exists
+
let has_unknown = data_enum.variants.iter().any(|v| v.ident == "Unknown");
+
+
if !has_unknown {
+
// Determine the lifetime parameter to use
+
let lifetime = if let Some(lt) = input.generics.lifetimes().next() {
+
quote! { #lt }
+
} else {
+
quote! { 'static }
+
};
+
+
// Add the `Unknown` variant
+
let new_variant: syn::Variant = syn::parse_quote! {
+
#[serde(untagged)]
+
Unknown(::jacquard_common::types::value::Data<#lifetime>)
+
};
+
data_enum.variants.push(new_variant);
+
}
+
+
quote! { #input }.into()
+
}
+
_ => syn::Error::new_spanned(input, "open_union attribute can only be used on enums")
+
.to_compile_error()
+
.into(),
+
}
+
}
+89
crates/jacquard-derive/tests/lexicon.rs
···
···
+
use jacquard_derive::lexicon;
+
use serde::{Deserialize, Serialize};
+
+
#[lexicon]
+
#[derive(Serialize, Deserialize, Debug, PartialEq)]
+
#[serde(rename_all = "camelCase")]
+
struct TestRecord<'s> {
+
text: &'s str,
+
count: i64,
+
}
+
+
#[test]
+
fn test_lexicon_adds_extra_data_field() {
+
let json = r#"{"text":"hello","count":42,"unknown":"field","another":123}"#;
+
+
let record: TestRecord = serde_json::from_str(json).unwrap();
+
+
assert_eq!(record.text, "hello");
+
assert_eq!(record.count, 42);
+
assert_eq!(record.extra_data.len(), 2);
+
assert!(record.extra_data.contains_key("unknown"));
+
assert!(record.extra_data.contains_key("another"));
+
}
+
+
#[test]
+
fn test_lexicon_roundtrip() {
+
use jacquard_common::CowStr;
+
use jacquard_common::types::value::Data;
+
use std::collections::BTreeMap;
+
+
let mut extra = BTreeMap::new();
+
extra.insert(
+
"custom".into(),
+
Data::String(jacquard_common::types::string::AtprotoStr::String(
+
CowStr::Borrowed("value"),
+
)),
+
);
+
extra.insert(
+
"number".into(),
+
Data::Integer(42),
+
);
+
extra.insert(
+
"nested".into(),
+
Data::Object(jacquard_common::types::value::Object({
+
let mut nested_map = BTreeMap::new();
+
nested_map.insert(
+
"inner".into(),
+
Data::Boolean(true),
+
);
+
nested_map
+
})),
+
);
+
+
let record = TestRecord {
+
text: "test",
+
count: 100,
+
extra_data: extra,
+
};
+
+
let json = serde_json::to_string(&record).unwrap();
+
let parsed: TestRecord = serde_json::from_str(&json).unwrap();
+
+
assert_eq!(record, parsed);
+
assert_eq!(parsed.extra_data.len(), 3);
+
+
// Verify the extra fields were preserved
+
assert!(parsed.extra_data.contains_key("custom"));
+
assert!(parsed.extra_data.contains_key("number"));
+
assert!(parsed.extra_data.contains_key("nested"));
+
+
// Verify the values
+
if let Some(Data::String(s)) = parsed.extra_data.get("custom") {
+
assert_eq!(s.as_str(), "value");
+
} else {
+
panic!("expected custom field to be a string");
+
}
+
+
if let Some(Data::Integer(n)) = parsed.extra_data.get("number") {
+
assert_eq!(*n, 42);
+
} else {
+
panic!("expected number field to be an integer");
+
}
+
+
if let Some(Data::Object(obj)) = parsed.extra_data.get("nested") {
+
assert!(obj.0.contains_key("inner"));
+
} else {
+
panic!("expected nested field to be an object");
+
}
+
}
+117
crates/jacquard-derive/tests/open_union.rs
···
···
+
use jacquard_derive::open_union;
+
use serde::{Deserialize, Serialize};
+
+
#[open_union]
+
#[derive(Serialize, Deserialize, Debug, PartialEq)]
+
#[serde(tag = "$type")]
+
enum TestUnion<'s> {
+
#[serde(rename = "com.example.typeA")]
+
TypeA { value: &'s str },
+
#[serde(rename = "com.example.typeB")]
+
TypeB { count: i64 },
+
}
+
+
#[test]
+
fn test_open_union_known_variant() {
+
let json = r#"{"$type":"com.example.typeA","value":"hello"}"#;
+
let union: TestUnion = serde_json::from_str(json).unwrap();
+
+
match union {
+
TestUnion::TypeA { value } => assert_eq!(value, "hello"),
+
_ => panic!("expected TypeA"),
+
}
+
}
+
+
#[test]
+
fn test_open_union_unknown_variant() {
+
use jacquard_common::types::value::Data;
+
+
let json = r#"{"$type":"com.example.unknown","data":"something"}"#;
+
let union: TestUnion = serde_json::from_str(json).unwrap();
+
+
match union {
+
TestUnion::Unknown(Data::Object(obj)) => {
+
// Verify the captured data contains the expected fields
+
assert!(obj.0.contains_key("$type"));
+
assert!(obj.0.contains_key("data"));
+
+
// Check the actual values
+
if let Some(Data::String(type_str)) = obj.0.get("$type") {
+
assert_eq!(type_str.as_str(), "com.example.unknown");
+
} else {
+
panic!("expected $type field to be a string");
+
}
+
+
if let Some(Data::String(data_str)) = obj.0.get("data") {
+
assert_eq!(data_str.as_str(), "something");
+
} else {
+
panic!("expected data field to be a string");
+
}
+
}
+
_ => panic!("expected Unknown variant with Object data"),
+
}
+
}
+
+
#[test]
+
fn test_open_union_roundtrip() {
+
let union = TestUnion::TypeB { count: 42 };
+
let json = serde_json::to_string(&union).unwrap();
+
let parsed: TestUnion = serde_json::from_str(&json).unwrap();
+
+
assert_eq!(union, parsed);
+
+
// Verify the $type field is present
+
assert!(json.contains(r#""$type":"com.example.typeB""#));
+
}
+
+
#[test]
+
fn test_open_union_unknown_roundtrip() {
+
use jacquard_common::types::value::{Data, Object};
+
use std::collections::BTreeMap;
+
+
// Create an Unknown variant with complex data
+
let mut map = BTreeMap::new();
+
map.insert(
+
"$type".into(),
+
Data::String(jacquard_common::types::string::AtprotoStr::String(
+
"com.example.custom".into(),
+
)),
+
);
+
map.insert("field1".into(), Data::Integer(123));
+
map.insert("field2".into(), Data::Boolean(false));
+
+
let union = TestUnion::Unknown(Data::Object(Object(map)));
+
+
let json = serde_json::to_string(&union).unwrap();
+
let parsed: TestUnion = serde_json::from_str(&json).unwrap();
+
+
// Should deserialize back as Unknown since the type is not recognized
+
match parsed {
+
TestUnion::Unknown(Data::Object(obj)) => {
+
assert_eq!(obj.0.len(), 3);
+
assert!(obj.0.contains_key("$type"));
+
assert!(obj.0.contains_key("field1"));
+
assert!(obj.0.contains_key("field2"));
+
+
// Verify values
+
if let Some(Data::String(s)) = obj.0.get("$type") {
+
assert_eq!(s.as_str(), "com.example.custom");
+
} else {
+
panic!("expected $type to be a string");
+
}
+
+
if let Some(Data::Integer(n)) = obj.0.get("field1") {
+
assert_eq!(*n, 123);
+
} else {
+
panic!("expected field1 to be an integer");
+
}
+
+
if let Some(Data::Boolean(b)) = obj.0.get("field2") {
+
assert!(!*b);
+
} else {
+
panic!("expected field2 to be a boolean");
+
}
+
}
+
_ => panic!("expected Unknown variant"),
+
}
+
}
+11
crates/jacquard-lexicon/Cargo.toml
···
description.workspace = true
[dependencies]
···
description.workspace = true
[dependencies]
+
heck = "0.5.0"
+
itertools = "0.14.0"
+
jacquard-common = { version = "0.1.0", path = "../jacquard-common" }
+
prettyplease = "0.2.37"
+
proc-macro2 = "1.0.101"
+
quote = "1.0.41"
+
serde = { version = "1.0.228", features = ["derive"] }
+
serde_json = "1.0.145"
+
serde_repr = "0.1.20"
+
serde_with = "3.14.1"
+
syn = "2.0.106"
+36
crates/jacquard-lexicon/src/fs.rs
···
···
+
// Forked from atrium-codegen
+
// https://github.com/sugyan/atrium/blob/main/lexicon/atrium-codegen/src/fs.rs
+
+
use std::ffi::OsStr;
+
use std::fs::read_dir;
+
use std::io::Result;
+
use std::path::{Path, PathBuf};
+
+
fn walk<F>(path: &Path, results: &mut Vec<PathBuf>, f: &mut F) -> Result<()>
+
where
+
F: FnMut(&Path) -> bool,
+
{
+
if f(path) {
+
results.push(path.into());
+
}
+
if path.is_dir() {
+
for entry in read_dir(path)? {
+
walk(&entry?.path(), results, f)?;
+
}
+
}
+
Ok(())
+
}
+
+
pub(crate) fn find_schemas(path: &Path) -> Result<Vec<impl AsRef<Path>>> {
+
let mut results = Vec::new();
+
walk(path, &mut results, &mut |path| {
+
path.extension().and_then(OsStr::to_str) == Some("json")
+
})?;
+
Ok(results)
+
}
+
+
pub(crate) fn find_dirs(path: &Path) -> Result<Vec<impl AsRef<Path>>> {
+
let mut results = Vec::new();
+
walk(path, &mut results, &mut |path| path.is_dir())?;
+
Ok(results)
+
}
+433
crates/jacquard-lexicon/src/lexicon.rs
···
···
+
// Forked from atrium-lexicon
+
// https://github.com/atrium-rs/atrium/blob/main/lexicon/atrium-lex/src/lexicon.rs
+
// https://github.com/atrium-rs/atrium/blob/main/lexicon/atrium-lex/src/lib.rs
+
+
use jacquard_common::{CowStr, smol_str::SmolStr, types::blob::MimeType};
+
use serde::{Deserialize, Serialize};
+
use serde_repr::{Deserialize_repr, Serialize_repr};
+
use serde_with::skip_serializing_none;
+
use std::collections::BTreeMap;
+
+
#[derive(Debug, Serialize_repr, Deserialize_repr, PartialEq, Eq, Clone, Copy)]
+
#[repr(u8)]
+
pub enum Lexicon {
+
Lexicon1 = 1,
+
}
+
#[skip_serializing_none]
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+
pub struct LexiconDoc<'s> {
+
pub lexicon: Lexicon,
+
#[serde(borrow)]
+
pub id: CowStr<'s>,
+
pub revision: Option<u32>,
+
pub description: Option<CowStr<'s>>,
+
pub defs: BTreeMap<SmolStr, LexUserType<'s>>,
+
}
+
+
// primitives
+
+
#[skip_serializing_none]
+
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+
pub struct LexBoolean<'s> {
+
#[serde(borrow)]
+
pub description: Option<CowStr<'s>>,
+
pub default: Option<bool>,
+
pub r#const: Option<bool>,
+
}
+
+
/// The Lexicon type `integer`.
+
///
+
/// Lexicon integers are [specified] as signed and 64-bit, which means that values will
+
+/// always fit in an `i64`.
+///
+/// [specified]: https://atproto.com/specs/data-model#data-types
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexInteger<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub default: Option<i64>,
+    pub minimum: Option<i64>,
+    pub maximum: Option<i64>,
+    pub r#enum: Option<Vec<i64>>,
+    pub r#const: Option<i64>,
+}
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone, Copy)]
+#[serde(rename_all = "kebab-case")]
+pub enum LexStringFormat {
+    Datetime,
+    Uri,
+    AtUri,
+    Did,
+    Handle,
+    AtIdentifier,
+    Nsid,
+    Cid,
+    Language,
+    Tid,
+    RecordKey,
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(rename_all = "camelCase")]
+pub struct LexString<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub format: Option<LexStringFormat>,
+    pub default: Option<CowStr<'s>>,
+    pub min_length: Option<usize>,
+    pub max_length: Option<usize>,
+    pub min_graphemes: Option<usize>,
+    pub max_graphemes: Option<usize>,
+    pub r#enum: Option<Vec<CowStr<'s>>>,
+    pub r#const: Option<CowStr<'s>>,
+    pub known_values: Option<Vec<CowStr<'s>>>,
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexUnknown<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+}
+
+// ipld types
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(rename_all = "camelCase")]
+pub struct LexBytes<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub max_length: Option<usize>,
+    pub min_length: Option<usize>,
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexCidLink<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+}
+
+// references
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexRef<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub r#ref: CowStr<'s>,
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexRefUnion<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub refs: Vec<CowStr<'s>>,
+    pub closed: Option<bool>,
+}
+
+// blobs
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(rename_all = "camelCase")]
+pub struct LexBlob<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub accept: Option<Vec<MimeType<'s>>>,
+    pub max_size: Option<usize>,
+}
+
+// complex types
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "kebab-case")]
+pub enum LexArrayItem<'s> {
+    // lexPrimitive
+    Boolean(LexBoolean<'s>),
+    Integer(LexInteger<'s>),
+    String(LexString<'s>),
+    Unknown(LexUnknown<'s>),
+    // lexIpldType
+    Bytes(LexBytes<'s>),
+    CidLink(LexCidLink<'s>),
+    // lexBlob
+    #[serde(borrow)]
+    Blob(LexBlob<'s>),
+    // lexRefVariant
+    Ref(LexRef<'s>),
+    Union(LexRefUnion<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(rename_all = "camelCase")]
+pub struct LexArray<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub items: LexArrayItem<'s>,
+    pub min_length: Option<usize>,
+    pub max_length: Option<usize>,
+}
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "lowercase")]
+pub enum LexPrimitiveArrayItem<'s> {
+    // lexPrimitive
+    #[serde(borrow)]
+    Boolean(LexBoolean<'s>),
+    Integer(LexInteger<'s>),
+    String(LexString<'s>),
+    Unknown(LexUnknown<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(rename_all = "camelCase")]
+pub struct LexPrimitiveArray<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub items: LexPrimitiveArrayItem<'s>,
+    pub min_length: Option<usize>,
+    pub max_length: Option<usize>,
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexToken<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+}
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "kebab-case")]
+pub enum LexObjectProperty<'s> {
+    // lexRefVariant
+    #[serde(borrow)]
+    Ref(LexRef<'s>),
+    Union(LexRefUnion<'s>),
+    // lexIpldType
+    Bytes(LexBytes<'s>),
+    CidLink(LexCidLink<'s>),
+    // lexArray
+    Array(LexArray<'s>),
+    // lexBlob
+    Blob(LexBlob<'s>),
+    // lexPrimitive
+    Boolean(LexBoolean<'s>),
+    Integer(LexInteger<'s>),
+    String(LexString<'s>),
+    Unknown(LexUnknown<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexObject<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub required: Option<Vec<SmolStr>>,
+    pub nullable: Option<Vec<SmolStr>>,
+    pub properties: BTreeMap<SmolStr, LexObjectProperty<'s>>,
+}
+
+// xrpc
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "lowercase")]
+pub enum LexXrpcParametersProperty<'s> {
+    // lexPrimitive
+    #[serde(borrow)]
+    Boolean(LexBoolean<'s>),
+    Integer(LexInteger<'s>),
+    String(LexString<'s>),
+    Unknown(LexUnknown<'s>),
+    // lexPrimitiveArray
+    Array(LexPrimitiveArray<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexXrpcParameters<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub required: Option<Vec<SmolStr>>,
+    pub properties: BTreeMap<SmolStr, LexXrpcParametersProperty<'s>>,
+}
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "lowercase")]
+pub enum LexXrpcBodySchema<'s> {
+    // lexRefVariant
+    #[serde(borrow)]
+    Ref(LexRef<'s>),
+    Union(LexRefUnion<'s>),
+    // lexObject
+    Object(LexObject<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexXrpcBody<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub encoding: CowStr<'s>,
+    pub schema: Option<LexXrpcBodySchema<'s>>,
+}
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "lowercase")]
+pub enum LexXrpcSubscriptionMessageSchema<'s> {
+    // lexRefVariant
+    #[serde(borrow)]
+    Ref(LexRef<'s>),
+    Union(LexRefUnion<'s>),
+    // lexObject
+    Object(LexObject<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexXrpcSubscriptionMessage<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub schema: Option<LexXrpcSubscriptionMessageSchema<'s>>,
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexXrpcError<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub name: CowStr<'s>,
+}
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "lowercase")]
+pub enum LexXrpcQueryParameter<'s> {
+    #[serde(borrow)]
+    Params(LexXrpcParameters<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexXrpcQuery<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub parameters: Option<LexXrpcQueryParameter<'s>>,
+    pub output: Option<LexXrpcBody<'s>>,
+    pub errors: Option<Vec<LexXrpcError<'s>>>,
+}
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "lowercase")]
+pub enum LexXrpcProcedureParameter<'s> {
+    #[serde(borrow)]
+    Params(LexXrpcParameters<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexXrpcProcedure<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub parameters: Option<LexXrpcProcedureParameter<'s>>,
+    pub input: Option<LexXrpcBody<'s>>,
+    pub output: Option<LexXrpcBody<'s>>,
+    pub errors: Option<Vec<LexXrpcError<'s>>>,
+}
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "lowercase")]
+pub enum LexXrpcSubscriptionParameter<'s> {
+    #[serde(borrow)]
+    Params(LexXrpcParameters<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexXrpcSubscription<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub parameters: Option<LexXrpcSubscriptionParameter<'s>>,
+    pub message: Option<LexXrpcSubscriptionMessage<'s>>,
+    pub infos: Option<Vec<LexXrpcError<'s>>>,
+    pub errors: Option<Vec<LexXrpcError<'s>>>,
+}
+
+// database
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "lowercase")]
+pub enum LexRecordRecord<'s> {
+    #[serde(borrow)]
+    Object(LexObject<'s>),
+}
+
+#[skip_serializing_none]
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+pub struct LexRecord<'s> {
+    #[serde(borrow)]
+    pub description: Option<CowStr<'s>>,
+    pub key: Option<CowStr<'s>>,
+    pub record: LexRecordRecord<'s>,
+}
+
+// core
+
+#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone)]
+#[serde(tag = "type", rename_all = "kebab-case")]
+pub enum LexUserType<'s> {
+    // lexRecord
+    #[serde(borrow)]
+    Record(LexRecord<'s>),
+    // lexXrpcQuery
+    #[serde(rename = "query")]
+    XrpcQuery(LexXrpcQuery<'s>),
+    // lexXrpcProcedure
+    #[serde(rename = "procedure")]
+    XrpcProcedure(LexXrpcProcedure<'s>),
+    // lexXrpcSubscription
+    #[serde(rename = "subscription")]
+    XrpcSubscription(LexXrpcSubscription<'s>),
+    // lexBlob
+    Blob(LexBlob<'s>),
+    // lexArray
+    Array(LexArray<'s>),
+    // lexToken
+    Token(LexToken<'s>),
+    // lexObject
+    Object(LexObject<'s>),
+    // lexBoolean
+    Boolean(LexBoolean<'s>),
+    // lexInteger
+    Integer(LexInteger<'s>),
+    // lexString
+    String(LexString<'s>),
+    // lexBytes
+    Bytes(LexBytes<'s>),
+    // lexCidLink
+    CidLink(LexCidLink<'s>),
+    // lexUnknown
+    Unknown(LexUnknown<'s>),
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    const LEXICON_EXAMPLE_TOKEN: &str = r#"
+    {
+        "lexicon": 1,
+        "id": "com.socialapp.actorUser",
+        "defs": {
+            "main": {
+                "type": "token",
+                "description": "Actor type of 'User'"
+            }
+        }
+    }"#;
+
+    #[test]
+    fn parse() {
+        let doc = serde_json::from_str::<LexiconDoc>(LEXICON_EXAMPLE_TOKEN)
+            .expect("failed to deserialize");
+        assert_eq!(doc.lexicon, Lexicon::Lexicon1);
+        assert_eq!(doc.id, "com.socialapp.actorUser");
+        assert_eq!(doc.revision, None);
+        assert_eq!(doc.description, None);
+        assert_eq!(doc.defs.len(), 1);
+    }
+}
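The tagged-enum attributes above (`#[serde(tag = "type", rename_all = "kebab-case")]`) are what let a multi-word lexicon type string such as `"cid-link"` select a PascalCase variant such as `CidLink`. A minimal illustrative sketch of that renaming rule (a re-implementation for demonstration, not serde's actual code):

```rust
// Sketch of serde's "kebab-case" variant renaming: split on uppercase
// letters, lowercase them, and join with hyphens. This is what maps a
// lexicon `"type": "cid-link"` field onto the `CidLink` variant.
fn to_kebab_case(variant: &str) -> String {
    let mut out = String::new();
    for (i, ch) in variant.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            if i > 0 {
                out.push('-');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

fn main() {
    assert_eq!(to_kebab_case("CidLink"), "cid-link");
    assert_eq!(to_kebab_case("Blob"), "blob");
    assert_eq!(to_kebab_case("AtUri"), "at-uri");
}
```

Single-word variants (`Blob`, `Ref`, `Union`) are unchanged apart from case, which is why the enums containing only single-word variants can use `rename_all = "lowercase"` instead.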
+49 -12
crates/jacquard-lexicon/src/lib.rs
···
-pub fn add(left: u64, right: u64) -> u64 {
-    left + right
-}
-#[cfg(test)]
-mod tests {
-    use super::*;
-    #[test]
-    fn it_works() {
-        let result = add(2, 2);
-        assert_eq!(result, 4);
-    }
-}
···
+pub mod fs;
+pub mod lexicon;
+pub mod output;
+pub mod schema;
+
+// #[lexicon]
+// #[derive(serde::Serialize, serde::Deserialize, Debug, Clone, PartialEq, Eq)]
+// #[serde(rename_all = "camelCase")]
+// pub struct Post<'s> {
+//     /// Client-declared timestamp when this post was originally created.
+//     pub created_at: jacquard_common::types::string::Datetime,
+//     #[serde(skip_serializing_if = "core::option::Option::is_none")]
+//     pub embed: core::option::Option<RecordEmbed<'s>>,
+//     /// DEPRECATED: replaced by app.bsky.richtext.facet.
+//     #[serde(skip_serializing_if = "core::option::Option::is_none")]
+//     pub entities: core::option::Option<Vec<Entity<'s>>>,
+//     /// Annotations of text (mentions, URLs, hashtags, etc)
+//     #[serde(skip_serializing_if = "core::option::Option::is_none")]
+//     pub facets: core::option::Option<Vec<jacquard_api::app_bsky::richtext::Facet<'s>>>,
+//     /// Self-label values for this post. Effectively content warnings.
+//     #[serde(skip_serializing_if = "core::option::Option::is_none")]
+//     pub labels: core::option::Option<RecordLabels<'s>>,
+//     /// Indicates human language of post primary text content.
+//     #[serde(skip_serializing_if = "core::option::Option::is_none")]
+//     pub langs: core::option::Option<Vec<jacquard_common::types::string::Language>>,
+//     #[serde(skip_serializing_if = "core::option::Option::is_none")]
+//     pub reply: core::option::Option<ReplyRef<'s>>,
+//     /// Additional hashtags, in addition to any included in post text and facets.
+//     #[serde(skip_serializing_if = "core::option::Option::is_none")]
+//     pub tags: core::option::Option<Vec<jacquard_common::CowStr<'s>>>,
+//     /// The primary post content. May be an empty string, if there are embeds.
+//     #[serde(borrow)]
+//     pub text: jacquard_common::CowStr<'s>,
+// }
+
+// #[open_union]
+// #[derive(serde::Serialize, serde::Deserialize, Debug, Clone, PartialEq, Eq)]
+// #[serde(tag = "$type")]
+// pub enum RecordEmbed<'s> {
+//     #[serde(borrow)]
+//     #[serde(rename = "app.bsky.embed.images")]
+//     EmbedImages(Box<jacquard_api::app_bsky::embed::Images<'s>>),
+//     #[serde(rename = "app.bsky.embed.video")]
+//     EmbedVideo(Box<jacquard_api::app_bsky::embed::Video<'s>>),
+//     #[serde(rename = "app.bsky.embed.external")]
+//     EmbedExternal(Box<jacquard_api::app_bsky::embed::External<'s>>),
+//     #[serde(rename = "app.bsky.embed.record")]
+//     EmbedRecord(Box<jacquard_api::app_bsky::embed::Record<'s>>),
+//     #[serde(rename = "app.bsky.embed.recordWithMedia")]
+//     EmbedRecordWithMedia(Box<jacquard_api::app_bsky::embed::RecordWithMedia<'s>>),
+// }
+41
crates/jacquard-lexicon/src/output.rs
···
···
+use crate::lexicon::*;
+use heck::{ToPascalCase, ToShoutySnakeCase, ToSnakeCase};
+use itertools::Itertools;
+use jacquard_common::CowStr;
+use proc_macro2::TokenStream;
+use quote::{format_ident, quote};
+use std::collections::{HashMap, HashSet};
+use syn::{Path, Result};
+
+fn string_type<'s>(string: &'s LexString<'s>) -> Result<(TokenStream, TokenStream)> {
+    let description = description(&string.description);
+    let typ = match string.format {
+        Some(LexStringFormat::AtIdentifier) => {
+            quote!(jacquard_common::types::string::AtIdentifier<'s>)
+        }
+        Some(LexStringFormat::Cid) => quote!(jacquard_common::types::string::Cid<'s>),
+        Some(LexStringFormat::Datetime) => quote!(jacquard_common::types::string::Datetime),
+        Some(LexStringFormat::Did) => quote!(jacquard_common::types::string::Did<'s>),
+        Some(LexStringFormat::Handle) => quote!(jacquard_common::types::string::Handle<'s>),
+        Some(LexStringFormat::Nsid) => quote!(jacquard_common::types::string::Nsid<'s>),
+        Some(LexStringFormat::Language) => quote!(jacquard_common::types::string::Language),
+        Some(LexStringFormat::Tid) => quote!(jacquard_common::types::string::Tid),
+        Some(LexStringFormat::RecordKey) => quote!(
+            jacquard_common::types::string::RecordKey<jacquard_common::types::string::Rkey<'s>>
+        ),
+        Some(LexStringFormat::Uri) => quote!(jacquard_common::types::string::Uri<'s>),
+        Some(LexStringFormat::AtUri) => quote!(jacquard_common::types::string::AtUri<'s>),
+        // strings with no format constraint fall back to a plain string
+        _ => quote!(CowStr<'s>),
+    };
+    Ok((description, typ))
+}
+
+fn description<'s>(description: &Option<CowStr<'s>>) -> TokenStream {
+    if let Some(description) = description {
+        let description = description.as_ref();
+        quote!(#[doc = #description])
+    } else {
+        quote!()
+    }
+}
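The shape of the `string_type` mapping is worth seeing in miniature: each lexicon string `format` selects a validated Rust type path, and anything else falls back to `CowStr<'s>`. A std-only sketch that returns the type paths as strings (for illustration only; the real function emits `TokenStream`s):

```rust
// Illustrative sketch of the format-to-type mapping in `string_type`.
// The type paths mirror the mapping above; only a few formats are
// shown, and the rest are elided for brevity.
fn rust_type_for_format(format: Option<&str>) -> &'static str {
    match format {
        Some("did") => "jacquard_common::types::string::Did<'s>",
        Some("handle") => "jacquard_common::types::string::Handle<'s>",
        Some("datetime") => "jacquard_common::types::string::Datetime",
        Some("tid") => "jacquard_common::types::string::Tid",
        // remaining formats elided; unknown or absent formats fall back
        _ => "CowStr<'s>",
    }
}

fn main() {
    assert_eq!(rust_type_for_format(Some("did")), "jacquard_common::types::string::Did<'s>");
    assert_eq!(rust_type_for_format(None), "CowStr<'s>");
}
```

Note that `Datetime` and `Tid` carry no lifetime parameter because they are always owned, while identifier-like types borrow from the input with `'s`.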
+142
crates/jacquard-lexicon/src/schema.rs
···
···
+// Forked from atrium-codegen
+// https://github.com/sugyan/atrium/blob/main/lexicon/atrium-codegen/src/schema.rs
+
+use crate::lexicon::*;
+use heck::ToPascalCase;
+use jacquard_common::{
+    CowStr, IntoStatic,
+    smol_str::{self, SmolStr, ToSmolStr},
+};
+use std::collections::BTreeMap;
+
+pub(crate) fn find_ref_unions<'s>(
+    defs: &'s BTreeMap<SmolStr, LexUserType<'s>>,
+) -> Vec<(SmolStr, LexRefUnion<'s>)> {
+    let mut unions = Vec::new();
+    for (key, def) in defs {
+        match def {
+            LexUserType::Record(record) => {
+                let LexRecordRecord::Object(object) = &record.record;
+                find_ref_unions_in_object(object, SmolStr::new_static("Record"), &mut unions);
+            }
+            LexUserType::XrpcQuery(query) => {
+                if let Some(output) = &query.output {
+                    if let Some(schema) = &output.schema {
+                        find_ref_unions_in_body_schema(
+                            schema,
+                            SmolStr::new_static("Output"),
+                            &mut unions,
+                        );
+                    }
+                }
+            }
+            LexUserType::XrpcProcedure(procedure) => {
+                if let Some(input) = &procedure.input {
+                    if let Some(schema) = &input.schema {
+                        find_ref_unions_in_body_schema(
+                            schema,
+                            SmolStr::new_static("Input"),
+                            &mut unions,
+                        );
+                    }
+                }
+                if let Some(output) = &procedure.output {
+                    if let Some(schema) = &output.schema {
+                        find_ref_unions_in_body_schema(
+                            schema,
+                            SmolStr::new_static("Output"),
+                            &mut unions,
+                        );
+                    }
+                }
+            }
+            LexUserType::XrpcSubscription(subscription) => {
+                if let Some(message) = &subscription.message {
+                    if let Some(schema) = &message.schema {
+                        find_ref_unions_in_subscription_message_schema(
+                            schema,
+                            SmolStr::new_static("Message"),
+                            &mut unions,
+                        );
+                    }
+                }
+            }
+            LexUserType::Array(array) => {
+                find_ref_unions_in_array(
+                    array,
+                    CowStr::Borrowed(&key.to_pascal_case()).into_static(),
+                    &mut unions,
+                );
+            }
+            LexUserType::Object(object) => {
+                find_ref_unions_in_object(object, key.to_pascal_case().to_smolstr(), &mut unions);
+            }
+            _ => {}
+        }
+    }
+    unions.sort_by_cached_key(|(name, _)| name.clone());
+    unions
+}
+
+fn find_ref_unions_in_body_schema<'s>(
+    schema: &'s LexXrpcBodySchema,
+    name: SmolStr,
+    unions: &mut Vec<(SmolStr, LexRefUnion<'s>)>,
+) {
+    match schema {
+        LexXrpcBodySchema::Union(_) => unimplemented!(),
+        LexXrpcBodySchema::Object(object) => find_ref_unions_in_object(object, name, unions),
+        _ => {}
+    }
+}
+
+fn find_ref_unions_in_subscription_message_schema<'s>(
+    schema: &'s LexXrpcSubscriptionMessageSchema,
+    name: SmolStr,
+    unions: &mut Vec<(SmolStr, LexRefUnion<'s>)>,
+) {
+    match schema {
+        LexXrpcSubscriptionMessageSchema::Union(union) => {
+            unions.push((name.into(), union.clone()));
+        }
+        LexXrpcSubscriptionMessageSchema::Object(object) => {
+            find_ref_unions_in_object(object, name, unions)
+        }
+        _ => {}
+    }
+}
+
+fn find_ref_unions_in_array<'s>(
+    array: &'s LexArray,
+    name: CowStr<'s>,
+    unions: &mut Vec<(SmolStr, LexRefUnion<'s>)>,
+) {
+    if let LexArrayItem::Union(union) = &array.items {
+        unions.push((smol_str::format_smolstr!("{}", name), union.clone()));
+    }
+}
+
+fn find_ref_unions_in_object<'s>(
+    object: &'s LexObject,
+    name: SmolStr,
+    unions: &mut Vec<(SmolStr, LexRefUnion<'s>)>,
+) {
+    for (k, property) in &object.properties {
+        match property {
+            LexObjectProperty::Union(union) => {
+                unions.push((
+                    smol_str::format_smolstr!("{name}{}", k.to_pascal_case()),
+                    union.clone(),
+                ));
+            }
+            LexObjectProperty::Array(array) => {
+                find_ref_unions_in_array(
+                    array,
+                    CowStr::Borrowed(&(name.to_string() + &k.to_pascal_case())).into_static(),
+                    unions,
+                );
+            }
+            _ => {}
+        }
+    }
+}
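The naming scheme in `find_ref_unions` concatenates a parent context name (`Record`, `Input`, `Output`, `Message`, or a PascalCase def key) with the PascalCase form of the property key, so a union under the `embed` property of a record becomes `RecordEmbed`. A std-only sketch of that composition, with a simplified stand-in for the `heck` crate's `to_pascal_case`:

```rust
// Simplified pascal-casing: split on `_`/`-` and capitalize each word.
// The real code uses heck's ToPascalCase, which also handles camelCase
// boundaries and Unicode; this stand-in is for illustration only.
fn to_pascal_case(s: &str) -> String {
    s.split(|c: char| c == '_' || c == '-')
        .filter(|w| !w.is_empty())
        .map(|w| {
            let mut chars = w.chars();
            match chars.next() {
                Some(first) => first.to_ascii_uppercase().to_string() + chars.as_str(),
                None => String::new(),
            }
        })
        .collect()
}

// Generated type name for a union found under `property` of `parent`.
fn union_name(parent: &str, property: &str) -> String {
    format!("{parent}{}", to_pascal_case(property))
}

fn main() {
    assert_eq!(union_name("Record", "embed"), "RecordEmbed");
    assert_eq!(union_name("Output", "feed-item"), "OutputFeedItem");
}
```

Because the names are derived deterministically from schema structure, the final `sort_by_cached_key` gives a stable ordering for the generated union types regardless of traversal order.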