Common types for the Velo distributed systems stack.
## Overview
This crate provides the foundational types used across Velo for identity and addressing. The design prioritizes:
- **Compact representations** for embedding in fixed-size handles
- **Transport-agnostic addressing** without enumerating all possible transports
- **KV-store friendly** serialization using opaque `Bytes`
## Identity Types
### InstanceId
`InstanceId` is a UUID-based identifier that serves as the **source of truth** for identifying a running Velo instance. It is used for:
- Transport-level routing
- Discovery registration
- Peer management
```rust
use velo_common::InstanceId;

let instance_id = InstanceId::new_v4();
let uuid = instance_id.as_uuid();
```
### WorkerId
`WorkerId` is a deterministic 64-bit identifier derived from `InstanceId` via xxh3 hash. The compact representation enables embedding worker identity into fixed-size handles.
**Design rationale**: A `u128` handle can encode:
- 64 bits for `WorkerId`
- 64 bits for additional data (sequence numbers, flags, etc.)
This value-semantics approach simplifies passing identity through systems that work with fixed-size integers.
The derivation is deterministic: calling `worker_id()` multiple times on the same `InstanceId` returns the same value.
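As a rough illustration of the handle layout described above, the sketch below packs a 64-bit worker identifier and 64 bits of auxiliary data into one `u128`. The helper names `pack_handle` and `unpack_handle` are hypothetical and not part of `velo_common`, a plain `u64` stands in for `WorkerId`, and placing the worker bits in the upper half is an assumption.

```rust
/// Hypothetical helper: worker id in the high 64 bits, payload in the low 64 bits.
fn pack_handle(worker_id: u64, payload: u64) -> u128 {
    ((worker_id as u128) << 64) | (payload as u128)
}

/// Hypothetical inverse of `pack_handle`: returns `(worker_id, payload)`.
fn unpack_handle(handle: u128) -> (u64, u64) {
    ((handle >> 64) as u64, handle as u64)
}

fn main() {
    // Round-trip a worker id and a sequence number through a single u128.
    let handle = pack_handle(0xDEAD_BEEF, 42);
    let (worker_id, payload) = unpack_handle(handle);
    assert_eq!(worker_id, 0xDEAD_BEEF);
    assert_eq!(payload, 42);
}
```

Because both halves are plain integers, the handle can be copied, compared, and stored without any allocation.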
## Address Types
### WorkerAddress
`WorkerAddress` is an opaque byte container holding transport endpoint information. Internally, it's a MessagePack-encoded map of `TransportKey -> Bytes`, but this structure is intentionally hidden from consumers.
**Key design decisions**:
1. **Opaque values**: Transport endpoints are stored as raw bytes. They could be simple strings (`"tcp://127.0.0.1:5555"`) or complex serialized objects. The interpretation is left to the transport implementation.
2. **No transport enum**: Rather than defining an enum of all possible transports with their configurations, we use string keys (`"tcp"`, `"rdma"`, `"grpc"`, etc.). This allows transports to be added without modifying the common types.
3. **KV-store friendly**: The entire address serializes to a `Bytes` blob, suitable for storage in etcd, Redis, or any key-value store without schema changes.
```rust
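use velo_common::WorkerAddress;
use std::collections::HashMap;

// Addresses are typically constructed by velo-transports transport builders.
// `map` stands for the decoded `TransportKey -> Bytes` map of such an address;
// how it is obtained is not shown in this snippet.
assert!(map.get("tcp").is_some()); // &str lookup works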
```
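To make the "no transport enum" decision concrete, here is a minimal stand-in that uses only the standard library. `AddressSketch`, `with_endpoint`, and `endpoint` are hypothetical names, `Vec<u8>` stands in for `Bytes`, and the MessagePack encoding is left out; this is not the `WorkerAddress` implementation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a transport-keyed address.
#[derive(Debug, Default, Clone, PartialEq)]
struct AddressSketch {
    endpoints: HashMap<String, Vec<u8>>,
}

impl AddressSketch {
    /// Register an endpoint under a free-form transport key.
    fn with_endpoint(mut self, key: &str, value: impl Into<Vec<u8>>) -> Self {
        self.endpoints.insert(key.to_owned(), value.into());
        self
    }

    /// Look up the raw endpoint bytes; interpreting them is left to the transport.
    fn endpoint(&self, key: &str) -> Option<&[u8]> {
        self.endpoints.get(key).map(Vec::as_slice)
    }
}

fn main() {
    // A new transport key needs no change to the type itself.
    let addr = AddressSketch::default()
        .with_endpoint("tcp", "tcp://127.0.0.1:5555")
        .with_endpoint("rdma", vec![0x01u8, 0x02]);

    assert_eq!(addr.endpoint("tcp"), Some(&b"tcp://127.0.0.1:5555"[..]));
    assert!(addr.endpoint("grpc").is_none());
}
```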
### PeerInfo
Combines `InstanceId` and `WorkerAddress` into a single structure representing a discoverable peer. This is the primary type exchanged during peer discovery and registration.
`WorkerAddress` instances are constructed by transport builders in `velo-transports`. Each transport (TCP, gRPC, NATS, UCX, etc.) contributes its own endpoint data.
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Build & Test
```bash
# Build
cargo build -p velo-events
# Run all tests
cargo test -p velo-events
# Run a single test
cargo test -p velo-events <test_name>
# Check (no codegen)
cargo check -p velo-events
```
## Architecture
`velo-events` is a generational event system for coordinating async awaiters with minimal overhead. Events can be triggered (success) or poisoned (error), and entries are recycled across generations.
### Core types (`event.rs`, `manager.rs`)
- **`Event`**: concrete RAII guard for a single event. Dropping without calling `trigger(self)` or `poison(self, ...)` auto-poisons the event. `into_handle(self)` disarms the guard and returns the bare handle. `trigger` and `poison` consume `self`, preventing double-completion at compile time.
- **`EventManager`**: concrete struct that manages a collection of events: `new_event`, `awaiter`, `poll`, `trigger`, `poison`, `merge_events`, `force_shutdown`. Create with `EventManager::local()` for local use or `EventManager::new(base, backend)` for distributed setups.
- **`EventBackend`**: public trait with 3 methods (`trigger`, `poison`, `awaiter`) that serves as the routing customization point. `EventSystemBase` implements this for the local path; distributed backends implement it to add network routing.
### Base implementation (`base/`)
- **`EventSystemBase`**: the core event storage, allocation, and recycling engine. Uses `DashMap` for concurrent event storage with a free-list for entry recycling. Implements `EventBackend` for local trigger/poison/awaiter routing. Constructors: `EventSystemBase::local()` (random system_id, local flag set) and `EventSystemBase::distributed(system_id)` (explicit id, no local flag). Public `_inner` methods (`trigger_inner`, `poison_inner`, `awaiter_inner`) allow distributed backends to delegate local operations.
### Handle encoding (`handle.rs`)
`EventHandle` packs identity into a single `u128`: `[system_id: 64][local_index: 32][generation: 32]`. Bit 31 of `local_index` distinguishes local (bit set) from distributed (bit clear) handles. Both local and distributed systems have unique non-zero `system_id` values. `EventSystemBase` validates that handles belong to the system that created them.
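The bit layout can be sketched with plain integer arithmetic. The function names below (`encode`, `decode`, `is_local`) are hypothetical and not the `EventHandle` API, and the sketch assumes the fields are laid out from most significant to least significant in the order listed.

```rust
/// Bit 31 of `local_index` marks a handle as local.
const LOCAL_FLAG: u32 = 1 << 31;

/// Hypothetical packing: system_id in bits 127..64, local_index in 63..32, generation in 31..0.
fn encode(system_id: u64, local_index: u32, generation: u32) -> u128 {
    ((system_id as u128) << 64) | ((local_index as u128) << 32) | (generation as u128)
}

/// Hypothetical inverse: returns `(system_id, local_index, generation)`.
fn decode(raw: u128) -> (u64, u32, u32) {
    ((raw >> 64) as u64, (raw >> 32) as u32, raw as u32)
}

/// A handle is local when the flag bit of its `local_index` is set.
fn is_local(raw: u128) -> bool {
    let (_, local_index, _) = decode(raw);
    (local_index & LOCAL_FLAG) != 0
}

fn main() {
    let local = encode(7, LOCAL_FLAG | 3, 1);
    let distributed = encode(7, 3, 1);

    assert_eq!(decode(local), (7, LOCAL_FLAG | 3, 1));
    assert!(is_local(local));
    assert!(!is_local(distributed));
}
```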
### Slot machinery (`slot/`)
Single-lock synchronization primitives. See [docs/slot-state-machine.md](docs/slot-state-machine.md)
for invariants. Any change to `slot/` must preserve all invariants (I1-I6)
and update the document.
Key types:
- **`EventEntry`**: per-index state machine with a single `ParkingMutex<EventState>` protecting generation tracking, waker registration, and poison history.
- **`EventAwaiter`**: `Future` impl that resolves to `Result<()>`. Supports both immediate (already-complete) and pending modes. Delegates poll to `EventEntry::poll_waiter`.
`DistributedEventFactory` creates an `EventManager` pre-configured with a `system_id` for distributed (Nova-managed) deployments.
## Key Design Decisions
- `Event` is an RAII guard by default: dropping without triggering auto-poisons. `into_handle()` is the explicit opt-out for manager-level operations. `Clone` is intentionally not implemented; each event is a unique ownership token.
- `EventManager` is a concrete `Clone` struct holding `Arc<EventSystemBase>` (lifecycle) + `Arc<dyn EventBackend>` (routing). `EventManager::local()` creates both from the same `EventSystemBase`. `EventManager::new(base, backend)` accepts a custom backend for distributed routing.
- `EventBackend` is the public routing trait (3 methods) that enables distributed routing without touching the core event lifecycle. Distributed backends call `EventSystemBase::trigger_inner` / `poison_inner` / `awaiter_inner` for local handles and route remote handles over the network.
- Slot entries track a `BTreeMap<Generation, PoisonArc>` for poison history, allowing past-generation poison queries.
- Generation overflow causes entry retirement and a new entry allocation (transparent retry loop in `new_event_with_backend`).
- `force_shutdown` poisons all pending events and rejects future allocations via an `AtomicBool` flag.
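The poison-history decision in the list above can be pictured with a small standard-library model. `PoisonHistory`, `record`, and `reason_for` are hypothetical names, `Arc<str>` stands in for `PoisonArc`, and `u32` is assumed for `Generation`; the real slot state lives behind a mutex in `slot/` and is not reproduced here.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Assumed width; the handle layout reserves 32 bits for the generation.
type Generation = u32;

/// Hypothetical per-entry record of which generations were poisoned.
#[derive(Default)]
struct PoisonHistory {
    by_generation: BTreeMap<Generation, Arc<str>>,
}

impl PoisonHistory {
    /// Remember the reason a given generation was poisoned.
    fn record(&mut self, generation: Generation, reason: &str) {
        self.by_generation.insert(generation, Arc::from(reason));
    }

    /// Return the poison reason for a past generation, if it was poisoned.
    fn reason_for(&self, generation: Generation) -> Option<Arc<str>> {
        self.by_generation.get(&generation).cloned()
    }
}

fn main() {
    let mut history = PoisonHistory::default();
    history.record(2, "worker crashed");

    // Generation 2 was poisoned; generation 1 completed normally.
    assert_eq!(history.reason_for(2).as_deref(), Some("worker crashed"));
    assert!(history.reason_for(1).is_none());
}
```

Keying by generation means a late awaiter holding an old handle can still learn why that generation failed, even after the entry has been recycled.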
A generational event system for coordinating async tasks with minimal overhead.
Events can be created, awaited, merged into precondition graphs, and poisoned
on failure. The local implementation lives in this crate; a distributed event
system can be built on top via active messaging.
## Core concepts
| Operation | What it does |
|-----------|-------------|
| **Create** | `manager.new_event()` allocates a pending event and returns an `Event` — an RAII guard you can trigger or await. |
| **Await** | `manager.awaiter(handle)?.await` suspends the current task until the event completes (or is poisoned). |
| **Merge** | `manager.merge_events(vec![a, b, c])` creates a new event that completes only after **all** inputs complete — this is how you build precondition graphs. |
| **Poison** | Events can fail with a reason string. Dropping an `Event` without triggering it auto-poisons so events are never silently lost. |
## Usage
### Create, trigger, await
```rust,no_run
use velo_events::EventManager;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let manager = EventManager::local();
    let event = manager.new_event()?;
    let handle = event.handle();

    // Spawn a task that waits for the event
    let mgr = manager.clone();
    let waiter = tokio::spawn(async move {
        mgr.awaiter(handle)?.await
    });

    // Complete the event; this consumes self and disarms the drop guard
    event.trigger()?;
    waiter.await??;
    Ok(())
}
```
### RAII drop safety
`Event` is an RAII guard: dropping it without calling `trigger()` or `poison()`
automatically poisons the event so waiters are never silently abandoned. Both
`trigger` and `poison` consume `self`, preventing double-completion at compile
time.
To opt out of auto-poisoning (e.g. when handing ownership to a manager-level
operation), call `into_handle()`:
```rust,no_run
use velo_events::EventManager;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let manager = EventManager::local();
    let event = manager.new_event()?;
    let handle = event.handle();

    // If this function returns early or panics, the event
    // drops and is automatically poisoned.
    do_work()?;

    event.trigger()?; // success: consumes the event
    Ok(())
}

fn do_work() -> anyhow::Result<()> {
    Ok(())
}
```
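The drop-guard behaviour, including the `into_handle()` opt-out, can be modelled with the standard library alone. The `Guard` and `Outcome` types below are hypothetical and only imitate the semantics described in this section; they are not the `Event` implementation, and a shared `Mutex` stands in for the real event slot.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, PartialEq)]
enum Outcome {
    Pending,
    Triggered,
    Poisoned(String),
}

/// Hypothetical guard that completes a shared slot at most once.
struct Guard {
    slot: Arc<Mutex<Outcome>>,
    armed: bool,
}

impl Guard {
    fn new(slot: Arc<Mutex<Outcome>>) -> Self {
        Guard { slot, armed: true }
    }

    /// Consumes the guard, so a second completion does not compile.
    fn trigger(mut self) {
        *self.slot.lock().unwrap() = Outcome::Triggered;
        self.armed = false;
    }

    /// Disarms the guard and hands back the bare slot.
    fn into_handle(mut self) -> Arc<Mutex<Outcome>> {
        self.armed = false;
        Arc::clone(&self.slot)
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        // Still armed means neither `trigger` nor `into_handle` ran.
        if self.armed {
            *self.slot.lock().unwrap() = Outcome::Poisoned("dropped without trigger".into());
        }
    }
}

fn main() {
    let slot = Arc::new(Mutex::new(Outcome::Pending));
    drop(Guard::new(Arc::clone(&slot)));
    assert_eq!(*slot.lock().unwrap(), Outcome::Poisoned("dropped without trigger".into()));

    let slot = Arc::new(Mutex::new(Outcome::Pending));
    Guard::new(Arc::clone(&slot)).trigger();
    assert_eq!(*slot.lock().unwrap(), Outcome::Triggered);

    let slot = Arc::new(Mutex::new(Outcome::Pending));
    let handle = Guard::new(Arc::clone(&slot)).into_handle();
    assert_eq!(*handle.lock().unwrap(), Outcome::Pending);
}
```

The third case is the opt-out: after `into_handle()` the slot stays pending, and completing it becomes the responsibility of whoever holds the handle.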
### Merging events (precondition graphs)
`merge_events` lets you express "wait for all of these before proceeding":
```rust,no_run
use velo_events::EventManager;

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    let manager = EventManager::local();
    let load_weights = manager.new_event()?;
    let load_tokenizer = manager.new_event()?;

    // merged event completes only after both inputs complete
    let ready = manager.merge_events(vec![
        load_weights.handle(),
        load_tokenizer.handle(),
    ])?;

    load_weights.trigger()?;
    load_tokenizer.trigger()?;
    manager.awaiter(ready)?.await?;
    Ok(())
}
```
Because merged events are themselves events, you can merge merges to build
arbitrary DAGs of preconditions.
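The "all inputs complete" rule, and the way it composes into a DAG, can be sketched as a pure function over statuses. `Status` and `merge` are hypothetical names used only for this illustration. The sketch assumes a merge stays pending until every input has completed and then reports the collected poison reasons; the crate's actual timing may differ.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Status {
    Pending,
    Done,
    Poisoned(Vec<String>),
}

/// Hypothetical "all-of" combinator over input statuses.
fn merge(inputs: &[Status]) -> Status {
    let mut reasons = Vec::new();
    for input in inputs {
        match input {
            // Any unfinished input keeps the merge pending (assumption).
            Status::Pending => return Status::Pending,
            Status::Done => {}
            Status::Poisoned(r) => reasons.extend(r.iter().cloned()),
        }
    }
    if reasons.is_empty() {
        Status::Done
    } else {
        Status::Poisoned(reasons)
    }
}

fn main() {
    let weights = Status::Done;
    let tokenizer = Status::Pending;
    assert_eq!(merge(&[weights.clone(), tokenizer]), Status::Pending);

    // Merging a merge: the result of one merge feeds another.
    let inner = merge(&[weights, Status::Done]);
    let outer = merge(&[inner, Status::Poisoned(vec!["disk full".to_string()])]);
    assert_eq!(outer, Status::Poisoned(vec!["disk full".to_string()]));
}
```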
### Poison propagation
When an event is poisoned, all awaiters receive an error containing the
reason. Merged events accumulate poison reasons from their inputs: