whalescale/AGENTS.md
2026-04-18 20:03:54 -05:00

# Agents Guide: Whalescale
## Architecture: Integrated Plane
Whalescale is a unified agent — the control plane and data plane share the same process, the same encryption sessions, and the same UDP sockets. **There is no separate WireGuard process.** All logic — discovery, NAT traversal, encryption, multipath scheduling, and TUN management — lives in one program.
```
┌──────────────────────────────────────────────┐
│           TUN Device (IP packets)            │
├──────────────────────────────────────────────┤
│              Reordering Buffer               │
├──────────────────────────────────────────────┤
│                Path Scheduler                │
├───────────┬───────────┬──────────────────────┤
│  Path 1   │  Path 2   │        Path N        │
│  5G/UDP   │ WiFi/UDP  │       IPv6/UDP       │
├───────────┴───────────┴──────────────────────┤
│           Noise_IK Session Manager           │
├──────────────────────────────────────────────┤
│      Whalescale Agent (single process)       │
└──────────────────────────────────────────────┘
```
**Never split logic across processes or re-introduce a WireGuard dependency.** The unified architecture exists specifically to avoid the coordination conflicts that arise when the control plane and data plane are separate.
## Core Technical Constraints
* **Identity:** Uses Ed25519 keys. The Whalescale Node ID **is** the Ed25519 public key. No separate WireGuard key.
* **Handshake:** Noise_IK, implemented in userspace via an established Rust Noise library (not the WireGuard kernel module).
* **Discovery:** No DHT. Discovery relies on **Manual Bootstrap**, **Anchors**, **Gossip**, and the **LKG Cache**.
* **Connectivity:** Best-effort UDP hole punching. Symmetric ↔ Symmetric pairs communicate via anchor relay (encrypted forwarding, not TURN).
* **Multipath:** A single peer session can use multiple network interfaces simultaneously (e.g., 5G + WiFi) with a reordering buffer.
* **IPv6:** Preferred transport when available. Eliminates NAT entirely. Always attempt IPv6 paths first.
* **Anchors:** First-class concept. At least one anchor (cone NAT or public IP, stable address) is required for the network to support mobile/symmetric-NAT nodes. Recommend at least two anchors on different ISPs.
* **No port prediction/sweeping:** Categorically ineffective against CGNAT. Do not re-implement.
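
The connectivity and IPv6 constraints above can be read as a small decision rule. The following is a minimal sketch of that rule, not the project's implementation: `NatKind`, `Strategy`, `choose_strategy`, and `order_candidates` are hypothetical names introduced only for illustration, and the real NAT classification in `whalescale-nat` may be finer-grained.

```rust
use std::net::SocketAddr;

/// Coarse NAT classification (hypothetical; assumed for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum NatKind {
    Public,
    Cone,
    Symmetric,
}

/// How a pair of peers reaches each other (hypothetical names).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Strategy {
    /// Both ends have a usable IPv6 path, so no NAT traversal is needed.
    DirectIpv6,
    /// Direct IPv4 UDP, with best-effort hole punching where a NAT is involved.
    HolePunch,
    /// Encrypted forwarding through an anchor.
    AnchorRelay,
}

/// Apply the constraints in order: IPv6 first, relay only for
/// Symmetric <-> Symmetric pairs, hole punching otherwise.
fn choose_strategy(local: NatKind, remote: NatKind, ipv6_both: bool) -> Strategy {
    if ipv6_both {
        return Strategy::DirectIpv6;
    }
    match (local, remote) {
        (NatKind::Symmetric, NatKind::Symmetric) => Strategy::AnchorRelay,
        _ => Strategy::HolePunch,
    }
}

/// Order candidate addresses so IPv6 ones are attempted first.
/// The sort is stable, so the original order is kept within each family.
fn order_candidates(mut addrs: Vec<SocketAddr>) -> Vec<SocketAddr> {
    // `false` (IPv6) sorts before `true` (IPv4).
    addrs.sort_by_key(|a| !a.is_ipv6());
    addrs
}
```

Note that there is deliberately no port-prediction branch: a Symmetric ↔ Symmetric pair without IPv6 goes straight to the anchor relay.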
## What Whalescale Is NOT
* **Not a WireGuard wrapper.** Whalescale owns its data plane.
* **Not a DHT.** No Kademlia, no distributed hash tables.
* **Not a TURN/STUN service.** Anchor relay is packet forwarding through existing encrypted tunnels, not dedicated relay infrastructure.
* **Not designed for internet-scale.** The target is trusted, known-peer networks (tens to low hundreds of nodes).
## Module Structure (Target)
Rust workspace with the following crates:
* `crates/whalescale-agent/` — Main entry point, event loop, configuration (binary crate)
* `crates/whalescale-session/` — Noise_IK handshake, session state, key rotation
* `crates/whalescale-transport/` — UDP socket management, packet framing, send/receive
* `crates/whalescale-path/` — Path discovery, scheduling, health monitoring, reordering buffer
* `crates/whalescale-multipath/` — Multipath scheduler (weighted round-robin), bandwidth estimation, feedback loop, reordering buffer logic, test bench framework
* `crates/whalescale-gossip/` — Gossip protocol, LKG cache, anti-entropy, conflict resolution
* `crates/whalescale-nat/` — NAT type detection, hole punching, UPnP/NAT-PMP/PCP
* `crates/whalescale-anchor/` — Anchor management, relay forwarding, mutual keepalive
* `crates/whalescale-tun/` — TUN device read/write, IP packet routing
* `crates/whalescale-bootstrap/` — Pre-session discovery protocol, manual bootstrap
* `crates/whalescale-crypto/` — Ed25519 key management, Noise library integration (thin wrapper)
* `crates/whalescale-types/` — Shared types, constants, and protocol definitions
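
To make the weighted round-robin scheduler named for `whalescale-multipath` concrete, here is a minimal sketch using the "smooth" weighted round-robin variant. The choice of variant, the type names (`PathSlot`, `WrrScheduler`), and the static integer weights are all assumptions for illustration; `MULTIPATH.md` remains the authoritative specification, and a real scheduler would derive weights from the bandwidth estimation and feedback loop.

```rust
/// One candidate path with a static weight (hypothetical fields).
struct PathSlot {
    id: u8,
    weight: i64,
    current: i64,
}

/// Smooth weighted round-robin over a fixed set of paths.
struct WrrScheduler {
    paths: Vec<PathSlot>,
}

impl WrrScheduler {
    /// Build a scheduler from `(path_id, weight)` pairs.
    fn new(weights: &[(u8, u32)]) -> Self {
        let paths = weights
            .iter()
            .map(|&(id, weight)| PathSlot {
                id,
                weight: i64::from(weight),
                current: 0,
            })
            .collect();
        Self { paths }
    }

    /// Pick the path for the next packet, or `None` if no path has
    /// a positive weight.
    fn next_path(&mut self) -> Option<u8> {
        let total: i64 = self.paths.iter().map(|p| p.weight).sum();
        if total <= 0 {
            return None;
        }
        // Every path earns credit proportional to its weight...
        for p in &mut self.paths {
            p.current += p.weight;
        }
        // ...and the path with the most credit sends, then pays back
        // the total so heavier paths are interleaved rather than bursted.
        let best = self.paths.iter_mut().max_by_key(|p| p.current)?;
        best.current -= total;
        Some(best.id)
    }
}
```

With weights 3 and 1, every window of four picks sends three packets on the first path and one on the second, spread out rather than back to back, which keeps reordering pressure on the receiver lower than a bursty schedule would.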
## Development Context
* **Language:** Rust.
* **Module:** `eganshub.net/whalescale`.
* **Critical Files:**
  * `DESIGN.md`: The primary source of truth for architectural decisions.
  * `MULTIPATH.md`: Full specification for the multipath transport subsystem.
  * `ANTI_PATTERN.md`: Crucial for avoiding regressive design (e.g., trying to add a DHT, re-introducing WireGuard, or implementing port prediction).
## Implementation Phases
| Phase | Scope |
|-------|-------|
| 1 | Noise_IK session, single path, single peer, TUN integration |
| 2 | Multipath transport — weighted round-robin scheduler, reordering buffer, path management, feedback loop, bandwidth estimation |
| 3 | Multi-peer session management, LKG cache, gossip |
| 4 | NAT traversal (hole punching, UPnP/PCP, anchor signaling) |
| 5 | LAN discovery, IPv6 preference, anchor relay |
| 6 | Adaptive path scheduling, test bench framework, scheduler comparison |
**Phase 2 is the novel work.** The multipath transport has significant open questions (inner TCP interaction, reordering depth tuning, scheduler optimality) that require empirical validation. See `MULTIPATH.md` §11 for the test bench specification.
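
To illustrate why reordering depth is a tuning question, the following is a minimal sketch of a sequence-numbered reordering buffer. It is an assumption-laden illustration, not the design in `MULTIPATH.md`: `ReorderBuffer`, the `u64` sequence numbers, and the "skip the gap once more than `max_depth` packets are waiting" policy are all hypothetical, and a real implementation would likely also flush on a timer.

```rust
use std::collections::BTreeMap;

/// Holds out-of-order packets until the gap before them is filled
/// or the buffer grows past `max_depth` (hypothetical policy).
struct ReorderBuffer {
    next_seq: u64,
    max_depth: usize,
    pending: BTreeMap<u64, Vec<u8>>,
}

impl ReorderBuffer {
    fn new(max_depth: usize) -> Self {
        Self {
            next_seq: 0,
            max_depth,
            pending: BTreeMap::new(),
        }
    }

    /// Accept one packet and return every packet that is now
    /// deliverable to the TUN device, in sequence order.
    fn push(&mut self, seq: u64, payload: Vec<u8>) -> Vec<Vec<u8>> {
        if seq < self.next_seq {
            // Already delivered or skipped: treat as late or duplicate.
            return Vec::new();
        }
        self.pending.insert(seq, payload);

        // Too many packets waiting on a gap: give up on the missing
        // ones and resume from the oldest packet we actually hold.
        if self.pending.len() > self.max_depth {
            if let Some(oldest) = self.pending.keys().next().copied() {
                self.next_seq = oldest;
            }
        }

        // Drain the contiguous run starting at `next_seq`.
        let mut out = Vec::new();
        while let Some(p) = self.pending.remove(&self.next_seq) {
            out.push(p);
            self.next_seq += 1;
        }
        out
    }
}
```

A small `max_depth` releases packets quickly but hands more gaps to the inner TCP flow as apparent loss; a large one hides more reordering at the cost of added latency. That trade-off is the kind of question the test bench is meant to answer empirically.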