KVM-based microVMM for the Volt platform: - Sub-second VM boot times - Minimal memory footprint - Landlock LSM + seccomp security - Virtio device support - Custom kernel management Copyright (c) Armored Gates LLC. All rights reserved. Licensed under AGPSL v5.0
# Volt vs Firecracker: Architecture & Security Comparison

**Date:** 2025-07-11  
**Volt version:** 0.1.0 (pre-release)  
**Firecracker version:** 1.6.0  
**Scope:** Qualitative comparison of architecture, security, and features

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [Security Model](#2-security-model)
3. [Architecture](#3-architecture)
4. [Feature Comparison Matrix](#4-feature-comparison-matrix)
5. [Boot Protocol](#5-boot-protocol)
6. [Maturity & Ecosystem](#6-maturity--ecosystem)
7. [Volt Advantages](#7-volt-advantages)
8. [Gap Analysis & Roadmap](#8-gap-analysis--roadmap)

---

## 1. Executive Summary

Volt and Firecracker are both KVM-based microVMMs written in Rust and designed for fast, secure VM provisioning. Firecracker is a mature, production-proven system (powering AWS Lambda and Fargate) with a battle-tested multi-layer security model. Volt is an early-stage project that targets the same space with a leaner architecture and some distinct design choices, most notably Landlock-first sandboxing (planned; vs. Firecracker's jailer/chroot model), content-addressed storage via Stellarium, and aggressive boot-time optimization targeting <125ms.

**Bottom line:** Firecracker is production-ready with a proven security posture. Volt has a solid foundation and several architectural advantages, but requires significant work on security hardening, device integration, and testing before it can be considered production-grade.

---

## 2. Security Model

### 2.1 Firecracker Security Stack

Firecracker uses a **defense-in-depth** model with six distinct security layers, orchestrated by its `jailer` companion binary:

| Layer | Mechanism | What It Does |
|-------|-----------|-------------|
| 1 | **Jailer (chroot + pivot_root)** | Filesystem isolation: the VMM process sees only its own jail directory |
| 2 | **User/PID namespaces** | UID/GID and PID isolation from the host |
| 3 | **Network namespaces** | Network stack isolation per VM |
| 4 | **Cgroups (v1/v2)** | CPU, memory, IO resource limits |
| 5 | **seccomp-bpf** | Syscall allowlist (~50 syscalls); everything else is denied |
| 6 | **Capability dropping** | All Linux capabilities dropped after setup |

Additional security features:

- **CPUID filtering**: strips VMX, SMX, TSX, PMU, power management leaves
- **CPU templates** (T2, T2CL, T2S, C3, V1N1): normalize CPUID across host hardware for live migration safety and to reduce guest attack surface
- **MMDS (MicroVM Metadata Service)**: isolated metadata delivery without host network access (alternative to IMDS)
- **Rate-limited API**: Unix socket only, no TCP
- **No PCI bus**: virtio-mmio only, eliminating PCI attack surface
- **Snapshot security**: encrypted snapshot support for secure state save/restore

### 2.2 Volt Security Stack (Current)

Volt currently has **two implemented security layers** with plans for more:

| Layer | Status | Mechanism |
|-------|--------|-----------|
| 1 | ✅ Implemented | **KVM hardware isolation**: inherent to any KVM VMM |
| 2 | ✅ Implemented | **CPUID filtering**: strips VMX, SMX, TSX, MPX, PMU, power management; sets HYPERVISOR bit |
| 3 | 📋 Planned | **Landlock LSM**: filesystem path restrictions (see `docs/landlock-analysis.md`) |
| 4 | 📋 Planned | **seccomp-bpf**: syscall filtering |
| 5 | 📋 Planned | **Capability dropping**: privilege reduction |
| 6 | ❌ Not planned | **Jailer-style isolation**: Volt intends to use Landlock instead |

### 2.3 CPUID Filtering Comparison

Both VMMs filter CPUID to create a minimal guest profile. The approach is very similar:

| CPUID Leaf | Volt | Firecracker | Notes |
|------------|------|-------------|-------|
| 0x1 (Features) | Strips VMX, SMX, DTES64, MONITOR, DS_CPL; sets HYPERVISOR | Same + strips more via templates | Functionally equivalent |
| 0x4 (Cache topology) | Adjusts core count | Adjusts core count | Match |
| 0x6 (Thermal/Power) | Clear all | Clear all | Match |
| 0x7 (Extended features) | Strips TSX (HLE/RTM), MPX, RDT | Same + template-specific stripping | Volt covers the essentials |
| 0xA (PMU) | Clear all | Clear all | Match |
| 0xB (Topology) | Sets per-vCPU APIC ID | Sets per-vCPU APIC ID | Match |
| 0x40000000 (Hypervisor) | KVM signature | KVM signature | Match |
| 0x80000001 (Extended) | Ensures SYSCALL, NX, LM | Ensures SYSCALL, NX, LM | Match |
| 0x80000007 (Power mgmt) | Only invariant TSC | Only invariant TSC | Match |
| CPU templates | ❌ Not supported | ✅ T2, T2CL, T2S, C3, V1N1 | Firecracker normalizes across hardware |

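
The masking described in the table for leaves 0x1, 0x6, 0x7 and 0xA can be sketched in plain Rust. This is a minimal illustration, not Volt's `kvm/cpuid.rs`: the `CpuidEntry` struct and `filter_entry` function are hypothetical stand-ins (a real VMM would operate on the entries returned by KVM), and the bit positions follow the Intel SDM numbering for CPUID.1:ECX and CPUID.7.0:EBX.

```rust
/// Feature bits in CPUID leaf 0x1, register ECX (Intel SDM numbering).
const ECX_DTES64: u32 = 1 << 2;
const ECX_MONITOR: u32 = 1 << 3;
const ECX_DS_CPL: u32 = 1 << 4;
const ECX_VMX: u32 = 1 << 5;
const ECX_SMX: u32 = 1 << 6;
const ECX_HYPERVISOR: u32 = 1 << 31;

/// Feature bits in CPUID leaf 0x7 (subleaf 0), register EBX.
const EBX_HLE: u32 = 1 << 4;
const EBX_RTM: u32 = 1 << 11;
const EBX_RDT_M: u32 = 1 << 12;
const EBX_MPX: u32 = 1 << 14;
const EBX_RDT_A: u32 = 1 << 15;

/// Hypothetical, simplified stand-in for one CPUID entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CpuidEntry {
    pub function: u32,
    pub index: u32,
    pub eax: u32,
    pub ebx: u32,
    pub ecx: u32,
    pub edx: u32,
}

/// Applies the per-leaf filtering rules from the comparison table.
pub fn filter_entry(entry: &mut CpuidEntry) {
    match (entry.function, entry.index) {
        // Leaf 0x1: hide virtualization/debug features, advertise a hypervisor.
        (0x1, _) => {
            entry.ecx &= !(ECX_DTES64 | ECX_MONITOR | ECX_DS_CPL | ECX_VMX | ECX_SMX);
            entry.ecx |= ECX_HYPERVISOR;
        }
        // Leaves 0x6 (thermal/power) and 0xA (PMU): clear everything.
        (0x6, _) | (0xA, _) => {
            entry.eax = 0;
            entry.ebx = 0;
            entry.ecx = 0;
            entry.edx = 0;
        }
        // Leaf 0x7 subleaf 0: strip TSX (HLE/RTM), MPX and RDT.
        (0x7, 0) => {
            entry.ebx &= !(EBX_HLE | EBX_RTM | EBX_RDT_M | EBX_MPX | EBX_RDT_A);
        }
        // All other leaves pass through unchanged in this sketch.
        _ => {}
    }
}
```

Calling `filter_entry` on each entry before handing the list to the vCPU would reproduce the "strip and set HYPERVISOR" behavior for these four leaves; the topology and extended leaves in the table need per-vCPU inputs and are left out here.
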
### 2.4 Gap Analysis: What Volt Needs

| Security Feature | Priority | Effort | Notes |
|-----------------|----------|--------|-------|
| **seccomp-bpf filter** | 🔴 Critical | Medium | Must-have for production. ~50 syscall allowlist. |
| **Capability dropping** | 🔴 Critical | Low | Drop all caps after KVM/TAP setup. Simple to implement. |
| **Landlock sandboxing** | 🟡 High | Medium | Restrict filesystem to kernel, disk images, /dev/kvm, /dev/net/tun. Kernel 5.13+ required. |
| **CPU templates** | 🟡 High | Medium | Needed for cross-host migration and security normalization. |
| **Resource limits (cgroups)** | 🟡 High | Low-Medium | Prevent VM from exhausting host resources. |
| **Network namespace isolation** | 🟠 Medium | Medium | Isolate VM network from host. Currently relies on TAP device only. |
| **PID namespace** | 🟠 Medium | Low | Hide host processes from VMM. |
| **MMDS equivalent** | 🟢 Low | Medium | Metadata service for guests. Not needed for all use cases. |
| **Snapshot encryption** | 🟢 Low | Medium | Only needed when snapshots are implemented. |

---

## 3. Architecture

### 3.1 Code Structure

**Firecracker** (~70K lines Rust, production):

```
src/vmm/
├── arch/x86_64/          # x86 boot, regs, CPUID, MSRs
├── cpu_config/           # CPU templates (T2, C3, etc.)
├── devices/              # Virtio backends, legacy, MMDS
├── vstate/               # VM/vCPU state management
├── resources/            # Resource allocation
├── persist/              # Snapshot/restore
├── rate_limiter/         # IO rate limiting
├── seccomp/              # seccomp filters
└── vmm_config/           # Configuration validation

src/jailer/               # Separate binary: chroot, namespaces, cgroups
src/seccompiler/          # Separate binary: BPF compiler
src/snapshot_editor/      # Separate binary: snapshot manipulation
src/cpu_template_helper/  # Separate binary: CPU template generation
```

**Volt** (~18K lines Rust, early stage):

```
vmm/src/
├── api/                  # REST API (Axum-based Unix socket)
│   ├── handlers.rs       # Request handlers
│   ├── routes.rs         # Route definitions
│   ├── server.rs         # Server setup
│   └── types.rs          # API types
├── boot/                 # Boot protocol
│   ├── gdt.rs            # GDT setup
│   ├── initrd.rs         # Initrd loading
│   ├── linux.rs          # Linux boot params (zero page)
│   ├── loader.rs         # ELF64/bzImage loader
│   ├── pagetable.rs      # Identity + high-half page tables
│   └── pvh.rs            # PVH boot structures
├── config/               # VM configuration (JSON-based)
├── devices/
│   ├── serial.rs         # 8250 UART
│   └── virtio/           # Virtio device framework
│       ├── block.rs      # virtio-blk with file backend
│       ├── net.rs        # virtio-net with TAP backend
│       ├── mmio.rs       # Virtio-MMIO transport
│       ├── queue.rs      # Virtqueue implementation
│       └── vhost_net.rs  # vhost-net acceleration (WIP)
├── kvm/                  # KVM interface
│   ├── cpuid.rs          # CPUID filtering
│   ├── memory.rs         # Guest memory (mmap, huge pages)
│   ├── vcpu.rs           # vCPU run loop, register setup
│   └── vm.rs             # VM lifecycle, IRQ chip, PIT
├── net/                  # Network backends
│   ├── macvtap.rs        # macvtap support
│   ├── networkd.rs       # systemd-networkd integration
│   └── vhost.rs          # vhost-net kernel offload
├── storage/              # Storage layer
│   ├── boot.rs           # Boot storage
│   └── stellarium.rs     # CAS integration
└── vmm/                  # VMM orchestration

stellarium/               # Separate crate: content-addressed image storage
```

### 3.2 Device Model

| Device | Volt | Firecracker | Notes |
|--------|------|-------------|-------|
| **Transport** | virtio-mmio | virtio-mmio | Both avoid PCI for simplicity/security |
| **virtio-blk** | ✅ Implemented (file backend, BlockBackend trait) | ✅ Production (file, rate-limited, io_uring) | Volt has trait for CAS backends |
| **virtio-net** | 🔨 Code exists, disabled in mod.rs (`// TODO: Fix net module`) | ✅ Production (TAP, rate-limited, MMDS) | Volt has TAP + macvtap + vhost-net code, but not integrated |
| **Serial (8250 UART)** | ✅ Inline in vCPU run loop | ✅ Full 8250 emulation | Volt handles COM1 I/O directly in exit handler |
| **virtio-vsock** | ❌ | ✅ | Host-guest communication channel |
| **virtio-balloon** | ❌ | ✅ | Dynamic memory management |
| **virtio-rng** | ❌ | ❌ | Neither implements (guest uses /dev/urandom) |
| **i8042 (keyboard/reset)** | ❌ | ✅ (minimal) | Firecracker handles reboot via i8042 |
| **RTC (CMOS)** | ❌ | ❌ | Neither implements (guests use KVM clock) |
| **In-kernel IRQ chip** | ✅ (8259 PIC + IOAPIC) | ✅ (8259 PIC + IOAPIC) | Both delegate to KVM |
| **In-kernel PIT** | ✅ (8254 timer) | ✅ (8254 timer) | Both delegate to KVM |

### 3.3 API Surface

**Firecracker REST API** (Unix socket, well-documented OpenAPI spec):

```
PUT    /machine-config            # Configure VM before boot
GET    /machine-config            # Read configuration
PUT    /boot-source               # Set kernel, initrd, boot args
PUT    /drives/{id}               # Add/configure block device
PATCH  /drives/{id}               # Update block device (hotplug)
PUT    /network-interfaces/{id}   # Add/configure network device
PATCH  /network-interfaces/{id}   # Update network device
PUT    /vsock                     # Configure vsock
PUT    /actions                   # Start, pause, resume, stop VM
GET    /                          # Health check + version
PUT    /snapshot/create           # Create snapshot
PUT    /snapshot/load             # Load snapshot
GET    /vm                        # Get VM info
PATCH  /vm                        # Update VM state
PUT    /metrics                   # Configure metrics endpoint
PUT    /mmds                      # Configure MMDS
GET    /mmds                      # Read MMDS data
```

**Volt REST API** (Unix socket, Axum-based):

```
PUT    /v1/vm/config              # Configure VM
GET    /v1/vm/config              # Read configuration
PUT    /v1/vm/state               # Change state (start/pause/resume/stop)
GET    /v1/vm/state               # Get current state
GET    /health                    # Health check
GET    /v1/metrics                # Prometheus-format metrics
```

**Key differences:**

- Firecracker's API is **pre-boot configuration**: you configure everything via API, then issue `InstanceStart`
- Volt currently uses **CLI arguments** for boot configuration; the API is simpler and manages lifecycle
- Firecracker has per-device endpoints (drives, network interfaces); Volt doesn't yet
- Firecracker has snapshot/restore APIs; Volt doesn't

### 3.4 vCPU Model

Both use a **one-thread-per-vCPU** model:

| Aspect | Volt | Firecracker |
|--------|------|-------------|
| Thread model | 1 thread per vCPU | 1 thread per vCPU |
| Run loop | `crossbeam_channel` commands → `KVM_RUN` → handle exits | Direct `KVM_RUN` in dedicated thread |
| Serial handling | Inline in vCPU exit handler (writes COM1 directly to stdout) | Separate serial device with event-driven epoll |
| IO exit handling | Match on port in exit handler | Event-driven device model with registered handlers |
| Signal handling | `signal-hook-tokio` + broadcast channels | `epoll` + custom signal handling |
| Async runtime | **Tokio** (full features) | **None** (pure synchronous `epoll`) |

**Notable difference:** Volt pulls in Tokio for its API server and signal handling. Firecracker uses raw `epoll` with no async runtime, which contributes to its smaller binary size and deterministic behavior. This is a deliberate Firecracker design choice: async runtimes add unpredictable latency from task scheduling.

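
The "match on port in exit handler" style from the table can be illustrated with a small, self-contained mock. Everything here is an assumption made for illustration: `VmExit`, `Action` and `handle_exit` are hypothetical names rather than Volt's or `kvm-ioctls`' types, the console is a generic writer instead of stdout, and treating `Hlt` as a stop condition is a simplification.

```rust
use std::io::Write;

/// COM1 data register and line status register (base 0x3f8, LSR at base + 5).
const COM1_DATA: u16 = 0x3f8;
const COM1_LSR: u16 = 0x3fd;
/// LSR value reporting "transmit holding register empty" and "transmitter idle".
const LSR_TX_READY: u8 = 0x60;

/// Hypothetical mock of the exit reasons a vCPU run loop would see.
#[derive(Debug)]
pub enum VmExit<'a> {
    IoOut { port: u16, data: &'a [u8] },
    IoIn { port: u16, data: &'a mut [u8] },
    Hlt,
    Shutdown,
}

/// What the run loop should do after handling one exit.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Continue,
    Stop,
}

/// Inline port matching: serial output is written straight to `console`.
pub fn handle_exit<W: Write>(exit: VmExit<'_>, console: &mut W) -> std::io::Result<Action> {
    match exit {
        // Guest wrote a byte to COM1: forward it to the console.
        VmExit::IoOut { port: COM1_DATA, data } => {
            console.write_all(data)?;
            Ok(Action::Continue)
        }
        // Guest polls the line status register: always report "ready to send".
        VmExit::IoIn { port: COM1_LSR, data } => {
            if let Some(byte) = data.first_mut() {
                *byte = LSR_TX_READY;
            }
            Ok(Action::Continue)
        }
        // Writes to unhandled ports are ignored.
        VmExit::IoOut { .. } => Ok(Action::Continue),
        // Reads from unhandled ports return all ones, like an empty bus.
        VmExit::IoIn { data, .. } => {
            data.fill(0xff);
            Ok(Action::Continue)
        }
        VmExit::Hlt | VmExit::Shutdown => Ok(Action::Stop),
    }
}
```

The trade-off against an event-driven device model is visible even at this size: adding a device means adding match arms to the run loop rather than registering a handler, which is simple for one UART but scales poorly as devices are added.
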
### 3.5 Memory Management

| Feature | Volt | Firecracker |
|---------|------|-------------|
| Huge pages (2MB) | ✅ Default enabled, fallback to 4K | ✅ Supported |
| MMIO hole handling | ✅ Splits around 3-4GB gap | ✅ Splits around 3-4GB gap |
| Memory backend | Direct `mmap` (anonymous) | `vm-memory` crate (GuestMemoryMmap) |
| Dirty page tracking | ✅ API exists | ✅ Production (for snapshots) |
| Memory ballooning | ❌ | ✅ virtio-balloon |
| Memory prefaulting | ✅ MAP_POPULATE | ✅ Supported |
| Guest memory abstraction | Custom `GuestMemoryManager` | `vm-memory` crate (shared across rust-vmm) |

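
The "splits around 3-4GB gap" row can be made concrete with a short sketch. The exact gap boundaries are an assumption here (3 GiB to 4 GiB, taken literally from the table; real VMMs may start the hole elsewhere below 4 GiB), and `guest_memory_regions` is a hypothetical helper, not a function from either codebase. A guest with 6 GiB of RAM would get one 3 GiB region at address 0 and a second 3 GiB region starting at 4 GiB.

```rust
const GIB: u64 = 1 << 30;

/// Assumed bounds of the MMIO hole below 4 GiB (illustrative values).
const MMIO_GAP_START: u64 = 3 * GIB;
const MMIO_GAP_END: u64 = 4 * GIB;

/// Returns `(guest_physical_start, length)` pairs for `mem_size` bytes of RAM,
/// placing anything that does not fit below the gap directly above it.
pub fn guest_memory_regions(mem_size: u64) -> Vec<(u64, u64)> {
    if mem_size <= MMIO_GAP_START {
        // Small guests fit entirely below the hole.
        vec![(0, mem_size)]
    } else {
        // Larger guests are split: low region up to the hole, rest above 4 GiB.
        vec![
            (0, MMIO_GAP_START),
            (MMIO_GAP_END, mem_size - MMIO_GAP_START),
        ]
    }
}
```
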

---

## 4. Feature Comparison Matrix

| Feature | Volt | Firecracker | Notes |
|---------|------|-------------|-------|
| **Core** | | | |
| KVM-based | ✅ | ✅ | |
| Written in Rust | ✅ | ✅ | |
| x86_64 support | ✅ | ✅ | |
| aarch64 support | ❌ | ✅ | |
| Multi-vCPU | ✅ (1-255) | ✅ (1-32) | |
| **Boot** | | | |
| Linux boot protocol | ✅ | ✅ | |
| PVH boot structures | ✅ | ✅ | |
| ELF64 (vmlinux) | ✅ | ✅ | |
| bzImage | ✅ | ✅ | |
| PE (EFI stub) | ❌ | ❌ | |
| **Devices** | | | |
| virtio-blk | ✅ (file backend) | ✅ (file, rate-limited, io_uring) | |
| virtio-net | 🔨 (code exists, not integrated) | ✅ (TAP, rate-limited) | |
| virtio-vsock | ❌ | ✅ | |
| virtio-balloon | ❌ | ✅ | |
| Serial console | ✅ (inline) | ✅ (full 8250) | |
| vhost-net | 🔨 (code exists, not integrated) | ❌ (userspace only) | Potential advantage |
| **Networking** | | | |
| TAP backend | ✅ (CLI --tap) | ✅ (API) | |
| macvtap backend | 🔨 (code exists) | ❌ | Potential advantage |
| Rate limiting (net) | ❌ | ✅ | |
| MMDS | ❌ | ✅ | |
| **Storage** | | | |
| Raw image files | ✅ | ✅ | |
| Rate limiting (disk) | ❌ | ✅ | |
| io_uring backend | ❌ | ✅ | |
| Content-addressed storage | 🔨 (Stellarium) | ❌ | Unique to Volt |
| **Security** | | | |
| CPUID filtering | ✅ | ✅ | |
| CPU templates | ❌ | ✅ (T2, C3, V1N1, etc.) | |
| seccomp-bpf | ❌ | ✅ | |
| Jailer (chroot/namespaces) | ❌ | ✅ | |
| Landlock LSM | 📋 Planned | ❌ | |
| Capability dropping | ❌ | ✅ | |
| Cgroup integration | ❌ | ✅ | |
| **API** | | | |
| REST API (Unix socket) | ✅ (Axum) | ✅ (custom HTTP) | |
| Pre-boot configuration via API | ❌ (CLI only) | ✅ | |
| Swagger/OpenAPI spec | ❌ | ✅ | |
| Metrics (Prometheus) | ✅ (basic) | ✅ (comprehensive) | |
| **Operations** | | | |
| Snapshot/Restore | ❌ | ✅ | |
| Live migration | ❌ | ✅ (via snapshots) | |
| Hot-plug (drives) | ❌ | ✅ | |
| Logging (structured) | ✅ (tracing, JSON) | ✅ (structured) | |
| **Configuration** | | | |
| CLI arguments | ✅ | ❌ (API-only) | |
| JSON config file | ✅ | ❌ (API-only) | |
| API-driven config | 🔨 (partial) | ✅ (exclusively) | |

---

## 5. Boot Protocol

### 5.1 Supported Boot Methods

| Method | Volt | Firecracker |
|--------|------|-------------|
| **Linux boot protocol (64-bit)** | ✅ Primary | ✅ Primary |
| **PVH boot** | ✅ Structures written, used for E820/start_info | ✅ Full PVH with 32-bit entry |
| **32-bit protected mode entry** | ❌ | ✅ (PVH path) |
| **EFI handover** | ❌ | ❌ |

### 5.2 Kernel Format Support

| Format | Volt | Firecracker |
|--------|------|-------------|
| ELF64 (vmlinux) | ✅ Custom loader (hand-parsed ELF) | ✅ via `linux-loader` crate |
| bzImage | ✅ Custom loader (hand-parsed setup header) | ✅ via `linux-loader` crate |
| PE (EFI stub) | ❌ | ❌ |

**Interesting difference:** Volt implements its own ELF and bzImage parsers by hand, while Firecracker uses the `linux-loader` crate from the rust-vmm ecosystem. Volt *does* list `linux-loader` as a dependency in Cargo.toml but doesn't use it; the custom loaders in `boot/loader.rs` do their own parsing.

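
As a rough illustration of what hand-parsed format detection involves, the sketch below distinguishes the two supported formats by their magic values: the ELF identification bytes with class `ELFCLASS64`, and the Linux setup header's `0xAA55` boot flag at offset `0x1FE` plus the `HdrS` signature at offset `0x202`. The names `KernelFormat` and `detect_kernel_format` are hypothetical, this is not the code in `boot/loader.rs`, and a real loader would go on to validate the protocol version and load flags.

```rust
/// Kernel image formats distinguished by this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum KernelFormat {
    Elf64,
    BzImage,
    Unknown,
}

const ELF_MAGIC: [u8; 4] = [0x7f, b'E', b'L', b'F'];
/// Value of the EI_CLASS byte (offset 4) for 64-bit ELF objects.
const ELFCLASS64: u8 = 2;
/// Offset of the 0xAA55 boot flag in the Linux setup header (little-endian).
const BOOT_FLAG_OFFSET: usize = 0x1fe;
/// Offset of the "HdrS" signature in the Linux setup header.
const HDRS_OFFSET: usize = 0x202;

/// Classifies a kernel image by inspecting its leading bytes.
pub fn detect_kernel_format(image: &[u8]) -> KernelFormat {
    // ELF64: magic in bytes 0..4, class byte at offset 4.
    if image.len() > 4 && image[..4] == ELF_MAGIC && image[4] == ELFCLASS64 {
        return KernelFormat::Elf64;
    }
    // bzImage: boot flag and setup header signature; `get` avoids panics on short inputs.
    let boot_flag = image.get(BOOT_FLAG_OFFSET..BOOT_FLAG_OFFSET + 2);
    let signature = image.get(HDRS_OFFSET..HDRS_OFFSET + 4);
    if boot_flag == Some(&[0x55, 0xaa][..]) && signature == Some(&b"HdrS"[..]) {
        return KernelFormat::BzImage;
    }
    KernelFormat::Unknown
}
```

Note that the setup header check alone does not separate a bzImage from an older zImage; this sketch treats any image with a valid setup header as a bzImage.
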
### 5.3 Boot Sequence Comparison

**Firecracker boot flow:**

1. API server starts, waits for configuration
2. User sends `PUT /boot-source`, `/machine-config`, `/drives`, `/network-interfaces`
3. User sends `PUT /actions` with `InstanceStart`
4. Firecracker creates VM, memory, vCPUs, devices in sequence
5. Kernel loaded, boot_params written
6. vCPU thread starts `KVM_RUN`

**Volt boot flow:**

1. CLI arguments parsed, configuration validated
2. KVM system initialized, VM created
3. Memory allocated (with huge pages)
4. Kernel loaded (ELF64 or bzImage auto-detected)
5. Initrd loaded (if specified)
6. GDT, page tables, boot_params, PVH structures written
7. CPUID filtered and applied to vCPUs
8. Boot MSRs configured
9. vCPU registers set (long mode, 64-bit)
10. API server starts (if socket specified)
11. vCPU threads start `KVM_RUN`

**Key difference:** Firecracker is API-first (no CLI for VM config). Volt is CLI-first with optional API. For orchestration at scale (e.g., Lambda-style), Firecracker's API-only model is better. For developer experience and quick testing, Volt's CLI is more convenient.

### 5.4 Page Table Setup

| Feature | Volt | Firecracker |
|---------|------|-------------|
| PML4 address | 0x1000 | 0x9000 |
| Identity mapping | 0 → 4GB (2MB pages) | 0 → 1GB (2MB pages) |
| High kernel mapping | ✅ 0xFFFFFFFF80000000+ → 0-2GB | ❌ None |
| Page table coverage | More thorough | Minimal (the kernel sets up its own quickly) |

Volt's dual identity + high-kernel page table setup is more thorough and handles the case where the kernel expects virtual addresses early. However, Firecracker's minimal approach works because the Linux kernel's `__startup_64()` builds its own page tables very early in boot.

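
To make the two mappings concrete, the sketch below computes where the high kernel base lands in the x86_64 four-level hierarchy and builds the 2 MiB page-directory entries for an identity map. It only does the arithmetic; the function names are hypothetical, and writing the tables into guest memory at the PML4 address is left out.

```rust
/// x86_64 page table entry flags used for 2 MiB mappings.
const PTE_PRESENT: u64 = 1 << 0;
const PTE_WRITABLE: u64 = 1 << 1;
/// Page-size bit: in a page-directory entry it selects a 2 MiB page.
const PTE_HUGE: u64 = 1 << 7;
const PAGE_2M: u64 = 2 << 20;

/// Index into the PML4 (virtual address bits 47..39).
pub fn pml4_index(va: u64) -> usize {
    ((va >> 39) & 0x1ff) as usize
}

/// Index into the PDPT (virtual address bits 38..30).
pub fn pdpt_index(va: u64) -> usize {
    ((va >> 30) & 0x1ff) as usize
}

/// Index into the page directory (virtual address bits 29..21).
pub fn pd_index(va: u64) -> usize {
    ((va >> 21) & 0x1ff) as usize
}

/// Page-directory entries that identity-map `[0, size)` with 2 MiB pages.
/// Entry `i` maps guest-physical address `i * 2 MiB`.
pub fn identity_pd_entries(size: u64) -> Vec<u64> {
    (0..size / PAGE_2M)
        .map(|i| (i * PAGE_2M) | PTE_PRESENT | PTE_WRITABLE | PTE_HUGE)
        .collect()
}
```

For the high kernel base `0xFFFFFFFF80000000`, these helpers give PML4 slot 511 and PDPT slots 510 and 511 for the 2 GiB window, while a 4 GiB identity map needs 2048 page-directory entries spread over four page directories.
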
### 5.5 Register State at Entry

| Register | Volt | Firecracker (Linux boot) |
|----------|------|--------------------------|
| CR0 | 0x80000011 (PE + ET + PG) | 0x80000011 (PE + ET + PG) |
| CR4 | 0x20 (PAE) | 0x20 (PAE) |
| EFER | 0x500 (LME + LMA) | 0x500 (LME + LMA) |
| CS selector | 0x08 | 0x08 |
| RSI | boot_params address | boot_params address |
| FPU (fcw) | ✅ 0x37f | ✅ 0x37f |
| Boot MSRs | ✅ 11 MSRs configured | ✅ Matching set |

After the CPUID fix documented in `cpuid-implementation.md`, the register states are now very similar.

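
The hexadecimal values in the table are compositions of individual architectural bits, which the following sketch spells out. `BootRegs` and `long_mode_boot_regs` are hypothetical names used only for illustration; in a real VMM these values would be written into the KVM special-register and general-register structures. Calling `long_mode_boot_regs(0x1000, addr)` with Volt's PML4 address yields `cr0 = 0x80000011`, `cr4 = 0x20` and `efer = 0x500`, matching the table.

```rust
/// CR0 bits: protected mode enable, extension type, paging.
const CR0_PE: u64 = 1 << 0;
const CR0_ET: u64 = 1 << 4;
const CR0_PG: u64 = 1 << 31;
/// CR4 bit: physical address extension (required for long mode).
const CR4_PAE: u64 = 1 << 5;
/// EFER bits: long mode enable and long mode active.
const EFER_LME: u64 = 1 << 8;
const EFER_LMA: u64 = 1 << 10;

/// Hypothetical container for the register state at 64-bit kernel entry.
#[derive(Debug, PartialEq, Eq)]
pub struct BootRegs {
    pub cr0: u64,
    pub cr3: u64,
    pub cr4: u64,
    pub efer: u64,
    pub cs_selector: u16,
    pub rsi: u64,
}

/// Builds the long-mode entry state: paging on, PML4 in CR3,
/// and the boot_params (zero page) address in RSI.
pub fn long_mode_boot_regs(pml4_addr: u64, boot_params_addr: u64) -> BootRegs {
    BootRegs {
        cr0: CR0_PE | CR0_ET | CR0_PG,
        cr3: pml4_addr,
        cr4: CR4_PAE,
        efer: EFER_LME | EFER_LMA,
        cs_selector: 0x08,
        rsi: boot_params_addr,
    }
}
```
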

---

## 6. Maturity & Ecosystem

### 6.1 Lines of Code

| Metric | Volt | Firecracker |
|--------|------|-------------|
| VMM Rust lines | ~18,000 | ~70,000 |
| Total (with tools) | ~20,000 (VMM + Stellarium) | ~100,000+ (VMM + Jailer + seccompiler + tools) |
| Test lines | ~1,000 (unit tests in modules) | ~30,000+ (unit + integration + performance) |
| Documentation | 6 markdown docs | Extensive (docs/, website, API spec) |

### 6.2 Dependencies

| Aspect | Volt | Firecracker |
|--------|------|-------------|
| Cargo.lock packages | ~285 | ~200-250 |
| Async runtime | ✅ Tokio (full) | ❌ None (raw epoll) |
| HTTP framework | Axum + Hyper + Tower | Custom HTTP parser |
| rust-vmm crates used | kvm-ioctls, kvm-bindings, vm-memory, virtio-queue, virtio-bindings, linux-loader | kvm-ioctls, kvm-bindings, vm-memory, virtio-queue, linux-loader, event-manager, seccompiler, vmm-sys-util |
| Serialization | serde + serde_json | serde + serde_json |
| CLI | clap (derive) | None (API-only) |
| Logging | tracing + tracing-subscriber | log + serde_json (custom) |

**Notable:** Volt has more dependencies (~285 crates) despite less code, primarily because of Tokio and the Axum HTTP stack. Firecracker keeps its dependency tree tight by avoiding async runtimes and heavy frameworks.

### 6.3 Community & Support

| Aspect | Volt | Firecracker |
|--------|------|-------------|
| License | Apache 2.0 | Apache 2.0 |
| Maintainer | Single developer | AWS team + community |
| GitHub stars | N/A (new) | ~26,000+ |
| CVE tracking | N/A | Active (security@ email, advisories) |
| Production users | None | AWS Lambda, Fargate, Fly.io (partial), Koyeb |
| Documentation | Internal only | Extensive public docs, blog posts, presentations |
| SDK/Client libraries | None | Python, Go clients exist |
| CI/CD | None visible | Extensive (buildkite, GitHub Actions) |

---

## 7. Volt Advantages

Despite being early-stage, Volt has several genuine architectural advantages and unique design choices:

### 7.1 Content-Addressed Storage (Stellarium)

Volt includes `stellarium`, a dedicated content-addressed storage system for VM images:

- **BLAKE3 hashing** for content identification (faster than SHA-256)
- **Content-defined chunking** via FastCDC (deduplication across images)
- **Zstd/LZ4 compression** per chunk
- **Sled embedded database** for the chunk index
- **BlockBackend trait** in virtio-blk designed for CAS integration

Firecracker has no equivalent; it expects pre-provisioned raw disk images. Stellarium could enable:

- Instant VM cloning via shared chunk references
- Efficient storage of many similar images
- Network-based image fetching with dedup

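
The deduplication idea behind "shared chunk references" can be shown with a toy in-memory store. This is not Stellarium's design: it deliberately swaps in fixed-size chunking for FastCDC and the standard library's 64-bit `DefaultHasher` for BLAKE3 (so it is not collision-resistant), skips compression, and uses a `HashMap` where Stellarium uses Sled. `ChunkStore` and its methods are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Chunk identifier: a 64-bit stand-in for a cryptographic content hash.
pub type ChunkId = u64;

/// Toy content-addressed store: each distinct chunk is kept exactly once.
#[derive(Default)]
pub struct ChunkStore {
    chunks: HashMap<ChunkId, Vec<u8>>,
}

impl ChunkStore {
    /// Splits `image` into fixed-size chunks (`chunk_size` must be non-zero),
    /// stores unseen chunks, and returns the manifest of chunk IDs in order.
    pub fn put_image(&mut self, image: &[u8], chunk_size: usize) -> Vec<ChunkId> {
        image
            .chunks(chunk_size)
            .map(|chunk| {
                let mut hasher = DefaultHasher::new();
                chunk.hash(&mut hasher);
                let id = hasher.finish();
                // Only the first occurrence of a chunk is copied into the store.
                self.chunks.entry(id).or_insert_with(|| chunk.to_vec());
                id
            })
            .collect()
    }

    /// Reassembles an image from its manifest; `None` if a chunk is missing.
    pub fn read_image(&self, manifest: &[ChunkId]) -> Option<Vec<u8>> {
        let mut out = Vec::new();
        for id in manifest {
            out.extend_from_slice(self.chunks.get(id)?);
        }
        Some(out)
    }

    /// Number of distinct chunks currently stored.
    pub fn unique_chunks(&self) -> usize {
        self.chunks.len()
    }
}
```

In this model a VM "clone" is just a copy of the manifest, and storing a second image that differs in one chunk adds only that chunk to the store, which is the property the bullet points above rely on.
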
### 7.2 Landlock-First Security Model

Rather than requiring a privileged jailer process (Firecracker's approach), Volt plans to use Landlock LSM for filesystem isolation:

| Aspect | Volt (planned) | Firecracker |
|--------|----------------|-------------|
| Privilege needed | **Unprivileged** (no root) | Root required for jailer setup |
| Mechanism | Landlock `restrict_self()` | chroot + pivot_root + namespaces |
| Flexibility | Path-based rules, stackable | Fixed jail directory structure |
| Kernel requirement | 5.13+ (degradable) | Any Linux with namespaces |
| Setup complexity | In-process, automatic | External jailer binary, manual setup |

This is a genuine advantage for deployment simplicity: no root required, no separate jailer binary, no complex jail directory setup.

### 7.3 CLI-First Developer Experience

Volt can boot a VM with a single command:

```bash
volt-vmm --kernel vmlinux.bin --memory 256M --cpus 2 --tap tap0
```

Firecracker requires:

```bash
# Start Firecracker (API mode only)
firecracker --api-sock /tmp/fc.sock &

# Configure via API
curl -X PUT --unix-socket /tmp/fc.sock \
  -d '{"kernel_image_path":"vmlinux.bin"}' \
  http://localhost/boot-source

curl -X PUT --unix-socket /tmp/fc.sock \
  -d '{"vcpu_count":2,"mem_size_mib":256}' \
  http://localhost/machine-config

curl -X PUT --unix-socket /tmp/fc.sock \
  -d '{"action_type":"InstanceStart"}' \
  http://localhost/actions
```

For development, testing, and scripting, the CLI approach is significantly more ergonomic.

### 7.4 More Thorough Page Tables

Volt sets up both identity-mapped (0-4GB) and high-kernel-mapped (0xFFFFFFFF80000000+) page tables. This provides a more robust boot environment that can handle kernels expecting virtual addresses early in startup.

### 7.5 macvtap and vhost-net Support (In Progress)

Volt has code for macvtap networking and vhost-net kernel offload:

- **macvtap**: direct attachment to host NIC without bridge, lower overhead
- **vhost-net**: kernel-space packet processing, significant throughput improvement

Firecracker implements virtio-net entirely in userspace on top of TAP, which has higher per-packet overhead. If Volt completes the vhost-net integration, it could have a meaningful networking performance advantage.

### 7.6 Modern Rust Ecosystem

| Choice | Volt | Firecracker | Advantage |
|--------|------|-------------|-----------|
| Error handling | `thiserror` + `anyhow` | Custom error types | More ergonomic for developers |
| Logging | `tracing` (structured, spans) | `log` crate | Better observability |
| Concurrency | `parking_lot` + `crossbeam` | `std::sync` | Lower contention |
| CLI | `clap` (derive macros) | N/A | Developer experience |
| HTTP | Axum (modern, typed) | Custom HTTP parser | Faster development |

### 7.7 Smaller Binary (Potential)

With aggressive release profile settings already configured:

```toml
[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true
```

The Volt binary could be significantly smaller than Firecracker's (~3-4MB) due to less code. However, the Tokio dependency adds weight. If Tokio were replaced with a lighter async solution or raw epoll, binary size could be very competitive.

### 7.8 systemd-networkd Integration

Volt includes code for direct systemd-networkd integration (in `net/networkd.rs`), which could simplify network setup on modern Linux hosts without manual bridge/TAP configuration.

---

## 8. Gap Analysis & Roadmap

### 8.1 Critical Gaps (Must Fix Before Any Production Use)

| Gap | Description | Effort |
|-----|-------------|--------|
| **seccomp filter** | No syscall filtering; after a VMM escape, an attacker has access to all syscalls | 2-3 days |
| **Capability dropping** | VMM process retains all capabilities of its user | 1 day |
| **virtio-net integration** | Code exists but is disabled (`// TODO: Fix net module`), so VMs have no networking | 3-5 days |
| **Device model integration** | virtio devices aren't wired into the vCPU IO exit handler | 3-5 days |
| **Integration tests** | No boot-to-userspace tests | 1-2 weeks |

### 8.2 Important Gaps (Needed for Competitive Feature Parity)

| Gap | Description | Effort |
|-----|-------------|--------|
| **Landlock sandboxing** | Analyzed but not implemented | 2-3 days |
| **Snapshot/Restore** | No state save/restore capability | 2-3 weeks |
| **vsock** | No host-guest communication channel (important for orchestration) | 1-2 weeks |
| **Rate limiting** | No IO rate limiting on block or net devices | 1 week |
| **CPU templates** | No CPUID normalization across hardware | 1-2 weeks |
| **aarch64 support** | x86_64 only | 2-4 weeks |

### 8.3 Nice-to-Have Gaps (Differentiation Opportunities)

| Gap | Description | Effort |
|-----|-------------|--------|
| **Stellarium integration** | CAS storage exists as separate crate, not wired into virtio-blk | 1-2 weeks |
| **vhost-net completion** | Kernel-offloaded networking (code exists) | 1-2 weeks |
| **macvtap completion** | Direct NIC attachment networking (code exists) | 1 week |
| **io_uring block backend** | Higher IOPS for block devices | 1-2 weeks |
| **Balloon device** | Dynamic memory management | 1-2 weeks |
| **API parity with Firecracker** | Per-device endpoints, pre-boot config | 1-2 weeks |

---

## Summary

Volt is a promising early-stage microVMM with some genuinely innovative ideas (Landlock-first security, content-addressed storage, CLI-first UX) and a clean Rust codebase. Its architecture is sound and closely mirrors Firecracker's proven approach where it matters (KVM setup, CPUID filtering, boot protocol).

**The biggest risk is the security gap.** Without seccomp, capability dropping, and Landlock, Volt is not suitable for multi-tenant or production use. However, these are all well-understood problems with clear implementation paths.

**The biggest opportunity is the Stellarium + Landlock combination.** A VMM that can boot from content-addressed storage without requiring root privileges would be genuinely differentiated from Firecracker and could enable new deployment patterns (edge, developer laptops, rootless containers).

---

*Document generated: 2025-07-11*
*Based on Volt source analysis and Firecracker 1.6.0 documentation/binaries*