Transport stack, protocol design, bridge API, NAT traversal, and resilience — the full picture of agent-to-agent communication.
From your agent code down to raw UDP — every layer is purpose-built for agent-to-agent communication.
TCP head-of-line blocks on packet loss. QUIC multiplexes independent streams over UDP, so one lost packet on stream 3 doesn't stall streams 1, 2, or 4. Connection setup takes 1 RTT, versus 2–3 RTTs for TCP plus TLS.
Every agent has an Ed25519 keypair. When two agents connect, they perform a Noise handshake to establish an end-to-end encrypted channel.
The relay facilitates the connection but cannot decrypt the traffic. Even if someone compromises the relay, they see gibberish.
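The handshake can be sketched in Noise notation. Stock libp2p's Noise transport uses the XX pattern, and this sketch assumes Subway inherits it unmodified:

```text
-> e              # initiator sends an ephemeral key
<- e, ee, s, es   # responder: ephemeral key, DH, encrypted static key, DH
-> s, se          # initiator: encrypted static key, DH; channel is now secure
```

Each DH result is mixed into the key schedule, and (in stock libp2p) each peer proves ownership of its Ed25519 identity key by signing its Noise static key in the handshake payload. The session keys are derived pairwise, which is why the relay only ever sees ciphertext.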
The workspace ships everything an agent needs — core API, networking, bridge, CLI, and protobuf definitions.
AgentNode API — connect, send, call, broadcast, subscribe. The primary interface for building agents.
Standalone P2P networking layer. Request-response, gossipsub, name registry. Zero external deps.
REST + WebSocket bridge for non-Rust clients. Combined actix-web server on a single port.
Protobuf message definitions — AgentMessage, RpcRequest, RpcResponse.
CLI binary. Commands: bridge, agent, send, broadcast. Agent mode with interactive shell + control socket.
Separate private repo. Handles relay server logic. Deployed to Fly.io.
A hashmap and a circuit breaker. Nothing more.
The relay does two things. It maps names to peer IDs (an in-memory hashmap). And it bridges connections for agents behind NAT (relay circuits). That's it. No message queue. No database. No state to back up.
Think of it as DNS combined with a mail carrier. The relay knows addresses and helps agents behind firewalls reach each other. Once connected, agents talk directly. The relay sees ciphertext.
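The name-registry half of that fits in a few lines of plain Rust. A minimal sketch: `NameRegistry` and its methods are illustrative stand-ins, not the relay's actual types.

```rust
use std::collections::HashMap;

// Illustrative stand-in for a libp2p PeerId (really a multihash, not a string).
type PeerId = String;

/// The relay's entire registry: names mapped to peer IDs, held in memory only.
struct NameRegistry {
    entries: HashMap<String, PeerId>,
}

impl NameRegistry {
    fn new() -> Self {
        Self { entries: HashMap::new() }
    }

    /// Register (or overwrite) a name. Nothing is written to disk:
    /// if the relay restarts, agents simply re-register.
    fn register(&mut self, name: &str, peer: PeerId) {
        self.entries.insert(name.to_string(), peer);
    }

    /// Resolve a name to a peer ID, if registered.
    fn resolve(&self, name: &str) -> Option<&PeerId> {
        self.entries.get(name)
    }
}

fn main() {
    let mut reg = NameRegistry::new();
    reg.register("beta.relay", "12D3KooWbetaExample".to_string());
    println!("beta.relay  -> {:?}", reg.resolve("beta.relay"));
    println!("gamma.relay -> {:?}", reg.resolve("gamma.relay"));
}
```

Because the map is the only registry state, losing the relay loses nothing durable: the worst case is a round of re-registration.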
Three messaging primitives. Everything else builds on top.
One-way message. Agent A sends to Agent B by name. ACK prevents circuit teardown. Used for notifications, status updates, commands.
let msg = node.new_agent_message("task", payload);
node.send("beta.relay", msg).await?;

Blocks until the target responds or a 30s timeout elapses. A correlation ID ties each request to its response. The async handler spawns per-request tasks. Also available by PeerId directly.
let resp = node.call("beta.relay", req).await?;
// or by peer:
node.call_by_peer_id(pid, req).await?;

Wildcard subscriptions (metrics.*) match any sub-topic. No self-delivery — broadcasters never receive their own messages. Topics are routed via message metadata through the shared gossipsub mesh.
node.subscribe("metrics.*", |msg| {});
node.broadcast("metrics.cpu", m).await?;
node.unsubscribe("metrics.*").await?;

on_message(handler): receive all direct messages (SUBWAY_MSG)
handle_rpc(handler): serve RPC requests synchronously
handle_rpc_async(handler): serve RPC requests with an async handler (spawns per request)
resolve(name) → PeerId: name resolution via the relay
new_agent_message(type, payload): create a message with an auto-generated ID and timestamp
connected_peer_count() → usize: number of connected peers on the mesh
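The request-response flow (correlation ID, blocking wait, timeout) can be modeled with std channels. This is a toy single-process sketch, not Subway's implementation; the `Request`/`Response` structs and the `call` helper are invented for illustration.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

// Illustrative shapes; the real protocol uses protobuf RpcRequest/RpcResponse.
struct Request { correlation_id: u64, body: String }
struct Response { correlation_id: u64, body: String }

/// Toy call(): register a correlation ID, send the request, then block
/// until the matching response arrives or the timeout elapses.
fn call(
    pending: &Arc<Mutex<HashMap<u64, Sender<Response>>>>,
    to_remote: &Sender<Request>,
    correlation_id: u64,
    body: &str,
    timeout: Duration,
) -> Option<Response> {
    let (tx, rx) = channel();
    pending.lock().unwrap().insert(correlation_id, tx);
    to_remote.send(Request { correlation_id, body: body.to_string() }).ok()?;
    rx.recv_timeout(timeout).ok()
}

fn main() {
    let pending: Arc<Mutex<HashMap<u64, Sender<Response>>>> =
        Arc::new(Mutex::new(HashMap::new()));
    let (to_remote, remote_rx) = channel::<Request>();

    // Fake "remote agent": echoes each request back under its correlation ID.
    let p = Arc::clone(&pending);
    thread::spawn(move || {
        for req in remote_rx {
            if let Some(tx) = p.lock().unwrap().remove(&req.correlation_id) {
                let _ = tx.send(Response {
                    correlation_id: req.correlation_id,
                    body: format!("echo: {}", req.body),
                });
            }
        }
    });

    let resp = call(&pending, &to_remote, 1, "ping", Duration::from_secs(30));
    println!("{:?}", resp.map(|r| (r.correlation_id, r.body)));
}
```

The correlation ID is what lets many calls be in flight over one connection at once: responses can arrive in any order and still find their waiting caller.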
Subscribe to topics. Broadcast to topics. Wildcards match sub-topics automatically. Broadcasters never receive their own messages. Production verified on the live mesh.
metrics.* → metrics.cpu, metrics.mem
agents.* → agents.status, agents.error
* → matches everything

When an agent broadcasts, it never receives its own message — even if subscribed to a matching topic. Filtering happens on sender PeerId at the subscriber level.
Topics are embedded in message metadata and routed through the shared gossipsub mesh. No per-topic gossipsub subscriptions needed. Scales to any number of topics.
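The wildcard semantics can be captured by a small matcher. An illustrative sketch only, not Subway's routing code; in particular, whether nested sub-topics (e.g. metrics.cpu.user) match metrics.* is an assumption made here.

```rust
/// Match a subscription pattern against a concrete topic.
/// "*" matches everything; "metrics.*" matches any topic under "metrics.".
/// (Matching arbitrarily nested sub-topics is an assumption of this sketch.)
fn topic_matches(pattern: &str, topic: &str) -> bool {
    if pattern == "*" {
        return true;
    }
    match pattern.strip_suffix(".*") {
        // "metrics.*" -> prefix "metrics"; topic must continue with "."
        Some(prefix) => topic
            .strip_prefix(prefix)
            .map_or(false, |rest| rest.starts_with('.')),
        // No wildcard: require an exact topic match.
        None => pattern == topic,
    }
}

fn main() {
    assert!(topic_matches("metrics.*", "metrics.cpu"));
    assert!(topic_matches("metrics.*", "metrics.mem"));
    assert!(!topic_matches("metrics.*", "agents.status"));
    assert!(topic_matches("*", "anything.at.all"));
    assert!(topic_matches("agents.status", "agents.status"));
    println!("all wildcard checks passed");
}
```

A subscriber would run this check against each incoming message's topic (and drop messages whose sender PeerId is its own, for the no-self-delivery rule).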
Rust API: node.subscribe() / node.broadcast()
CLI: subscribe <topic> / broadcast <topic> <msg>
HTTP bridge: POST /v1/broadcast · GET /v1/subscribe (SSE)
The bridge exposes Subway's P2P capabilities over HTTP and WebSocket. TypeScript, Python, Go — anything that speaks JSON can connect.
/v1/send · /v1/call · /v1/broadcast · /v1/subscribe?topic=X · /v1/resolve/{name} · /v1/health · /v1/stats

Names without a TLD get .relay appended automatically.

Most devices don't have public IP addresses. They sit behind NAT — outbound connections work, inbound connections are blocked. Two agents behind NAT can't connect to each other directly.
The relay has a public IP. Both agents connect outbound, and the relay bridges them via a circuit. Then libp2p attempts DCUtR (Direct Connection Upgrade through Relays), a hole punch via simultaneous connect. If it succeeds, traffic bypasses the relay entirely.
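The sequence, roughly (a sketch of the flow described above, not a protocol trace):

```text
A ──outbound──▶ relay ◀──outbound── B     # both dial out; NAT permits this
A ◀════════ relay circuit ════════▶ B     # relayed traffic; ciphertext to the relay
A ◀──── simultaneous dial (DCUtR) ──▶ B   # hole punch; on success, direct path
```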
# one command — detects brew, falls back to binary
$ curl -sSL https://subway.dev/install.sh | sh

# what happens:
#   brew found? → brew install subway-dev/tap/subway
#   no brew?    → downloads binary to ~/.local/bin

╭─────────────────────────────────────╮
│ subway — P2P transport for agents │
╰─────────────────────────────────────╯
Homebrew detected — installing via brew...
✓ installed via homebrew
✓ upgrade later with: brew upgrade subway

# verify
$ subway --version
subway 0.0.1
“Subway is designed to survive network disruptions without manual intervention.”
Name registry: in-memory hashmap. Not persisted to disk.
Connections: libp2p connection state. Not persisted.
Identity: Ed25519 keys on disk. The only state that persists.
Agents shouldn't depend on centralized brokers. Subway provides a native peer-to-peer network for autonomous systems.
Kafka: requires a ZooKeeper/KRaft cluster. Stateful brokers. Partition management. Schema registry. Ops overhead.
Redis: single point of failure. Pub/sub is ephemeral. Cluster mode adds complexity. Memory-bound scaling.
Subway: no central broker. The relay is stateless. Agents connect directly. End-to-end encrypted. Scales with the mesh.