code
Clone this repository
https://tangled.org/bretton.dev/coves
git@knot.bretton.dev:bretton.dev/coves
For self-hosted knots, clone URLs may differ based on your setup.
This commit implements a complete, secure OAuth 2.0 + atProto authentication system
for Coves, including comprehensive security fixes based on code review.
## ๐ Core Features
### OAuth 2.0 + atProto Authentication
- **DPoP Token Binding (RFC 9449)**: Each session has unique cryptographic key
- **PKCE (RFC 7636)**: S256 challenge method prevents code interception
- **PAR (RFC 9126)**: Pre-registration of authorization requests
- **Complete OAuth Flow**: Login โ Authorize โ Callback โ Session Management
### Implementation Architecture
- **Handlers**: [internal/api/handlers/oauth/](internal/api/handlers/oauth/)
- `login.go` - Initiates OAuth flow with handle resolution
- `callback.go` - Processes authorization code and creates session
- `logout.go` - Session termination
- `metadata.go` - RFC 7591 client metadata endpoint
- `jwks.go` - Public key exposure (JWK Set)
- **OAuth Client**: [internal/atproto/oauth/](internal/atproto/oauth/)
- `client.go` - OAuth HTTP client with PAR, token exchange, refresh
- `dpop.go` - DPoP proof generation (ES256 signatures)
- `pkce.go` - PKCE challenge generation
- **Session Management**: [internal/core/oauth/](internal/core/oauth/)
- `session.go` - OAuth data models (OAuthRequest, OAuthSession)
- `repository.go` - PostgreSQL storage with atomic operations
- `auth_service.go` - Authentication business logic
- **Middleware**: [internal/api/middleware/auth.go](internal/api/middleware/auth.go)
- `RequireAuth` - Enforces authentication
- `OptionalAuth` - Loads user context if available
- Automatic token refresh (< 5 min to expiry)
### Database Schema
- **oauth_requests**: Temporary state during authorization flow (10-min TTL)
- **oauth_sessions**: Long-lived authenticated sessions
- **Indexes**: Performance optimizations for session queries
- **Auto-cleanup**: Trigger-based expiration handling
### DPoP Transport
- **HTTP RoundTripper**: [internal/atproto/xrpc/dpop_transport.go](internal/atproto/xrpc/dpop_transport.go)
- Automatic DPoP proof injection on all requests
- Nonce rotation handling (automatic retry on 401)
- PDS and auth server nonce tracking
## ๐ Security Features (PR Review Hardening)
### Critical Security Fixes
โ
**CSRF/Replay Protection**: Atomic `GetAndDeleteRequest()` prevents state reuse
โ
**Cookie Secret Validation**: Enforced minimum 32 bytes for session security
โ
**Error Sanitization**: No internal error details exposed to users
โ
**HTTPS Enforcement**: Production-only HTTPS cookies with explicit localhost checks
โ
**Clean Architecture**: Business logic extracted to `AuthService` layer
### Additional Security Measures
โ
**No Token Leakage**: Never log response bodies containing credentials
โ
**Race-Free**: Fixed concurrent access to DPoP nonces with proper mutex handling
โ
**Input Validation**: Handle format checking, state parameter verification
โ
**Session Isolation**: One active session per DID (upgradeable to multiple)
โ
**Automatic Cleanup**: Hourly background job removes expired sessions/requests
### Token Binding & Proof-of-Possession
- Each session generates unique ES256 key pair
- Access tokens cryptographically bound to client
- DPoP proofs include:
- JWK header (public key)
- HTTP method and URL (prevents token replay)
- Access token hash (`ath` claim)
- JTI (unique token ID)
- Server nonce (when required)
## ๐ฏ Configuration & Setup
### Environment Variables
```bash
# OAuth Configuration (.env.dev)
OAUTH_PRIVATE_JWK=base64:... # Client private key (ES256)
OAUTH_COOKIE_SECRET=... # Session cookie secret (min 32 bytes)
APPVIEW_PUBLIC_URL=http://127.0.0.1:8081
```
### Base64 Encoding Support
- Helper: `GetEnvBase64OrPlain()` supports both plain and base64-encoded values
- Prevents shell escaping issues with JSON in environment variables
- Format: `OAUTH_PRIVATE_JWK=base64:eyJhbGci...` or plain JSON
### Cookie Store Singleton
- Global singleton initialized at startup: `oauth.InitCookieStore(secret)`
- Shared across all handlers for consistent session management
- Validates secret length on initialization
### Database Migration
```sql
-- Migration 003: OAuth tables
-- Migration 004: Performance indexes
```
## ๐ Code Quality & Testing
### Test Coverage
- `env_test.go` - Base64 environment variable handling (8 test cases)
- `dpop_test.go` - DPoP proof structure validation
- `oauth_test.go` - Integration tests for OAuth endpoints
### Linter Compliance
- Fixed errcheck violations (defer close, error handling)
- Formatted with gofmt
- Added nolint directives where appropriate
### Constants & Configuration
- [constants.go](internal/api/handlers/oauth/constants.go) - Named configuration values
- `SessionMaxAge = 7 * 24 * 60 * 60`
- `TokenRefreshThreshold = 5 * time.Minute`
- `MinCookieSecretLength = 32`
## ๐ Implementation Decisions
### Custom OAuth vs Indigo Library
- **Decision**: Implement custom OAuth client
- **Rationale**: Indigo OAuth library explicitly unstable; custom implementation gives full control over edge cases and nonce retry logic
- **Future**: Migrate when indigo reaches stable v1.0
### Session Storage
- **Decision**: PostgreSQL with one session per DID
- **Rationale**: Simple for initial implementation, easy to upgrade to multiple sessions later, transaction support
### DPoP Key Management
- **Decision**: Unique key per session, stored in database
- **Rationale**: RFC 9449 compliance, token binding security, survives server restarts
## ๐ Performance Optimizations
- `idx_oauth_sessions_did_expires` - Fast session expiry queries
- Partial index for active sessions (`WHERE expires_at > NOW()`)
- Hourly cleanup prevents table bloat
- Cookie store singleton reduces memory allocations
## โ
Production Readiness
### Real-World Validation
โ
Successfully tested with live PDS: `https://pds.bretton.dev`
โ
Handle resolution: `bretton.dev` โ DID โ PDS discovery
โ
Complete authorization flow with DPoP nonce retry
โ
Session storage and retrieval validated
โ
Token refresh logic confirmed working
### Security Checklist
โ
DPoP token binding prevents theft/replay
โ
PKCE prevents authorization code interception
โ
PAR reduces attack surface
โ
Atomic state operations prevent CSRF
โ
HTTP-only, secure, SameSite cookies
โ
Private keys never exposed in public endpoints
โ
Automatic token expiration (60 min access, ~90 day refresh)
## ๐ฆ Files Changed
- **27 files**: 3,130 additions, 1 deletion
- **New packages**: oauth handlers, OAuth client, auth middleware
- **New migrations**: OAuth tables + indexes
- **Updated**: main.go (OAuth initialization), .env.dev (configuration docs)
๐ค Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
## Core Features
### Identity Resolution System (internal/atproto/identity/)
- Implement DNS/HTTPS handle resolution using Bluesky Indigo library
- Add DID document resolution via PLC directory
- Create PostgreSQL-backed caching layer (24h TTL)
- Support bidirectional caching (handle โ DID)
- Add atomic cache purge with single-query CTE optimization
### Database Schema
- Add identity_cache table with timezone-aware timestamps
- Support handle normalization via database trigger
- Enable automatic expiry checking
### Handle Update System
- Add UpdateHandle method to UserService and UserRepository
- Implement handle change detection in Jetstream consumer
- Update database BEFORE purging cache (prevents race condition)
- Purge BOTH old handle and DID entries on handle changes
- Add structured logging for cache operations
### Bug Fixes
- Fix timezone bugs throughout codebase (use UTC consistently)
- Fix rate limiter timestamp handling
- Resolve pre-existing test isolation bug in identity cache tests
- Fix Makefile test command to exclude restricted directories
### Testing
- Add comprehensive identity resolution test suite (450+ lines)
- Add handle change integration test with cache verification
- All 46+ integration test subtests passing
- Test both local and real atProto handle resolution
### Configuration
- Add IDENTITY_PLC_URL and IDENTITY_CACHE_TTL env vars
- Add golangci-lint configuration
- Update Makefile to avoid permission denied errors
## Architecture Decisions
- Use decorator pattern for caching resolver
- Maintain layer separation (no SQL in handlers)
- Reject database triggers for cache invalidation (keeps logic in app layer)
- Follow atProto best practices from QuickDID
## Files Changed
- 7 new files (identity system + migration + tests)
- 12 modified files (integration + bug fixes)
- ~800 lines of production code
- ~450 lines of tests
๐ค Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Groups all atProto-related code under internal/atproto/ for better
code organization:
internal/
โโโ atproto/
โ โโโ lexicon/ # Protocol definitions
โ โโโ jetstream/ # Firehose consumer
โโโ core/ # Business logic
โโโ db/ # Persistence
Changes:
- Moved internal/jetstream/ โ internal/atproto/jetstream/
- Updated import paths in cmd/server/main.go
- Updated import paths in all test files
All tests passing:
โ
Unit tests
โ
Integration tests
โ
E2E tests
๐ค Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
IDE configuration files should not be tracked in git.
These files are already in .gitignore but were committed before
the ignore rule was added.
๐ค Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Addresses PR review feedback with security, validation, and reliability improvements.
## Security & Validation Improvements
- Add lexicon-compliant error types (InvalidHandle, WeakPassword, etc.)
- Implement official atProto handle validation per spec
- Normalizes to lowercase before validation
- Validates TLD restrictions (.local, .onion, etc. disallowed)
- Max 253 char length enforcement
- Reference: https://atproto.com/specs/handle
- Add password validation (min 8 chars)
- Protects PDS from spam by malicious third-party clients
- PDS remains authoritative on final acceptance
- Add HTTP client timeout (10s) to prevent hanging on slow PDS
- Map service errors to proper XRPC error responses with correct status codes
## Test Reliability Improvements
- Replace fixed time.Sleep() with retry-with-timeout pattern
- Inline retry loops with 500ms polling intervals
- Configurable deadlines per test scenario (10-15s)
- 2x faster test execution on fast systems
- More reliable on slow CI environments
- Add E2E test database setup helper
- Fix test expectations to match new error messages
## Architecture Documentation
- Add TODO comments for future improvements:
- Race condition in Jetstream consumer (sync.Once needed)
- DIDโPDS URL resolution via PLC directory for federation
- Document that current implementation works for local dev
- Mark federation support as future enhancement
## Files Changed
New files:
- internal/core/users/errors.go - Domain error types
- tests/e2e/user_signup_test.go - Full E2E test coverage
- internal/atproto/lexicon/social/coves/actor/signup.json - Lexicon spec
- docs/E2E_TESTING.md - E2E testing guide
- internal/jetstream/user_consumer.go - Event consumer
- tests/integration/jetstream_consumer_test.go - Consumer tests
- tests/integration/user_test.go - User service tests
Modified:
- internal/core/users/service.go - Enhanced validation + HTTP timeout
- internal/api/routes/user.go - Lexicon error mapping
- tests/integration/user_test.go - Updated test expectations
## Test Results
โ
All unit/integration tests pass
โ
Full E2E test suite passes (10.3s)
โ
Validates complete signup flow: XRPC โ PDS โ Jetstream โ AppView โ PostgreSQL
๐ค Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implements a minimal, production-ready user management system for Coves
with atProto DID-based identity and comprehensive security improvements.
## Core Features
- atProto-compliant user model (DID + handle)
- Single clean migration (001_create_users_table.sql)
- XRPC endpoint: social.coves.actor.getProfile
- Handle-based authentication (resolves handle โ DID)
- PostgreSQL AppView indexing
## Security & Performance Fixes
- **Rate limiting**: 100 req/min per IP (in-memory middleware)
- **Input validation**: atProto handle regex validation
- Alphanumeric + hyphens + dots only
- No consecutive hyphens, must start/end with alphanumeric
- 1-253 character length limit
- **Database constraints**: Proper unique constraint error handling
- Clear error messages for duplicate DID/handle
- No internal details leaked to API consumers
- **Performance**: Removed duplicate DB checks (3 calls โ 1 call)
## Breaking Changes
- Replaced email/username model with DID/handle
- Deleted legacy migrations (001, 005)
- Removed old repository and service test files
## Architecture
- Repository: Parameterized queries, context-aware
- Service: Business logic with proper validation
- Handler: Minimal XRPC implementation
- Middleware: Rate limiting for public endpoints
## Testing
- Full integration test coverage (4 test suites, all passing)
- Duplicate creation validation tests
- Handle format validation (9 edge cases)
- XRPC endpoint tests (success/error scenarios)
## Documentation
- Updated TESTING_SUMMARY.md with .test handle convention
- Added TODO for federated PDS support
- RFC3339 timestamp formatting
๐ค Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>