Volt VMM (Neutron Stardust): source-available under AGPSL v5.0
KVM-based microVMM for the Volt platform:
- Sub-second VM boot times
- Minimal memory footprint
- Landlock LSM + seccomp security
- Virtio device support
- Custom kernel management

Copyright (c) Armored Gates LLC. All rights reserved.
Licensed under AGPSL v5.0
docs/benchmark-firecracker.md
# Firecracker VMM Benchmark Results

**Date:** 2026-03-08
**Firecracker Version:** v1.14.2 (latest stable)
**Binary:** static-pie linked, x86_64, not stripped
**Test Host:** julius — Intel Xeon Silver 4210R @ 2.40GHz, 20 cores, Linux 6.1.0-42-amd64
**Kernel:** vmlinux-4.14.174 (Firecracker's official guest kernel, 21,441,304 bytes)
**Methodology:** No rootfs attached — kernel boots to VFS panic. Matches Volt test methodology.

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [Binary Size](#2-binary-size)
3. [Cold Boot Time](#3-cold-boot-time)
4. [Startup Breakdown](#4-startup-breakdown)
5. [Memory Overhead](#5-memory-overhead)
6. [CPU Features (CPUID)](#6-cpu-features-cpuid)
7. [Thread Model](#7-thread-model)
8. [Comparison with Volt](#8-comparison-with-volt)
9. [Methodology Notes](#9-methodology-notes)

---

## 1. Executive Summary

| Metric | Firecracker v1.14.2 | Notes |
|--------|---------------------|-------|
| Binary size | 3.44 MB (3,436,512 bytes) | Static-pie, not stripped |
| Cold boot to kernel panic (wall) | **1,127ms median** | Includes ~500ms i8042 stall |
| Cold boot (no i8042 stall) | **353ms median** | With `i8042.noaux i8042.nokbd` |
| Kernel internal boot time | **912ms** / **138ms** | Default / no-i8042 |
| VMM overhead (startup→VM running) | **~80ms** | FC process + API + KVM setup |
| RSS at 128MB guest | **52 MB** | ~50MB VMM overhead |
| RSS at 256MB guest | **56 MB** | +4MB vs 128MB guest |
| RSS at 512MB guest | **60 MB** | +8MB vs 128MB guest |
| Threads during VM run | 3 | main + fc_api + fc_vcpu 0 |

**Key Finding:** The ~912ms kernel boot time with the default Firecracker kernel (4.14.174) is dominated by i8042 keyboard controller probing, including a **~500ms probe timeout**. With the i8042 probes disabled, kernel initialization takes only ~138ms. This is a kernel issue, not a VMM issue.

---

## 2. Binary Size

```
-rwxr-xr-x 1 karl karl 3,436,512 Feb 26 11:32 firecracker-v1.14.2-x86_64
```

| Property | Value |
|----------|-------|
| Size | 3.44 MB (3,436,512 bytes) |
| Format | ELF 64-bit LSB pie executable, x86-64 |
| Linking | Static-pie (no shared library dependencies) |
| Stripped | No (includes symbol table) |
| Debug sections | 0 |
| Language | Rust |

### Related Binaries

| Binary | Size |
|--------|------|
| firecracker | 3.44 MB |
| jailer | 2.29 MB |
| cpu-template-helper | 2.58 MB |
| snapshot-editor | 1.23 MB |
| seccompiler-bin | 1.16 MB |
| rebase-snap | 0.52 MB |

---

## 3. Cold Boot Time

### Default Boot Args (`console=ttyS0 reboot=k panic=1 pci=off`)

10 iterations, 128MB guest RAM, 1 vCPU:

| Iteration | Wall Clock (ms) | Kernel Time (s) |
|-----------|-----------------|------------------|
| 1 | 1,130 | 0.9156 |
| 2 | 1,144 | 0.9097 |
| 3 | 1,132 | 0.9112 |
| 4 | 1,113 | 0.9138 |
| 5 | 1,126 | 0.9115 |
| 6 | 1,128 | 0.9130 |
| 7 | 1,143 | 0.9099 |
| 8 | 1,117 | 0.9119 |
| 9 | 1,123 | 0.9119 |
| 10 | 1,115 | 0.9169 |

| Statistic | Wall Clock (ms) | Kernel Time (ms) |
|-----------|-----------------|-------------------|
| **Min** | 1,113 | 910 |
| **Median** | 1,127 | 912 |
| **Max** | 1,144 | 917 |
| **Mean** | 1,127 | 913 |
| **Stddev** | ~10 | ~2 |

### Optimized Boot Args (`... i8042.noaux i8042.nokbd`)

Disabling the i8042 keyboard and aux-port probes removes the ~500ms probe timeout and cuts median kernel time from 912ms to 138ms:

| Iteration | Wall Clock (ms) | Kernel Time (s) |
|-----------|-----------------|------------------|
| 1 | 330 | 0.1418 |
| 2 | 347 | 0.1383 |
| 3 | 357 | 0.1391 |
| 4 | 358 | 0.1379 |
| 5 | 351 | 0.1367 |
| 6 | 371 | 0.1385 |
| 7 | 346 | 0.1376 |
| 8 | 378 | 0.1393 |
| 9 | 328 | 0.1382 |
| 10 | 355 | 0.1388 |

| Statistic | Wall Clock (ms) | Kernel Time (ms) |
|-----------|-----------------|-------------------|
| **Min** | 328 | 137 |
| **Median** | 353 | 138 |
| **Max** | 378 | 142 |
| **Mean** | 352 | 139 |

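The summary rows can be recomputed from the per-iteration samples. The sketch below copies the samples from the two iteration tables and uses Python's `statistics` module; the choice of the sample standard deviation is an assumption, since the report does not say which estimator produced the "~10" and "~2" figures.

```python
from statistics import mean, median, stdev

# Samples copied from the iteration tables in this section.
DEFAULT_WALL_MS = [1130, 1144, 1132, 1113, 1126, 1128, 1143, 1117, 1123, 1115]
DEFAULT_KERNEL_S = [0.9156, 0.9097, 0.9112, 0.9138, 0.9115,
                    0.9130, 0.9099, 0.9119, 0.9119, 0.9169]
NO_I8042_WALL_MS = [330, 347, 357, 358, 351, 371, 346, 378, 328, 355]
NO_I8042_KERNEL_S = [0.1418, 0.1383, 0.1391, 0.1379, 0.1367,
                     0.1385, 0.1376, 0.1393, 0.1382, 0.1388]


def to_ms(samples_s):
    """Convert kernel timestamps from seconds to milliseconds."""
    return [s * 1000 for s in samples_s]


def summarize(samples):
    """Return the statistics reported in the summary tables."""
    return {
        "min": min(samples),
        "median": median(samples),
        "max": max(samples),
        "mean": mean(samples),
        "stdev": stdev(samples),  # sample (n-1) estimator: an assumption
    }


if __name__ == "__main__":
    runs = {
        "default wall (ms)": DEFAULT_WALL_MS,
        "default kernel (ms)": to_ms(DEFAULT_KERNEL_S),
        "no-i8042 wall (ms)": NO_I8042_WALL_MS,
        "no-i8042 kernel (ms)": to_ms(NO_I8042_KERNEL_S),
    }
    for label, samples in runs.items():
        stats = summarize(samples)
        print(label, {name: round(value, 1) for name, value in stats.items()})
```

With ten samples the median is the average of the fifth and sixth sorted values, which is why it can fall on a number that does not appear in the iteration table.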
### Wall Clock vs Kernel Time Gap Analysis

The ~200ms gap between wall-clock and kernel-internal time (about 215ms in both configurations: 1,127 − 912 and 353 − 138) breaks down roughly as:
- **~80ms** — Firecracker process startup + API configuration + KVM VM creation
- **~125ms** — Kernel time between panic message and process exit (reboot handling, serial flush)

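As a quick consistency check, the gap can be recomputed from the median figures in the tables above. This is only a worked-arithmetic sketch; the variable names are illustrative and the inputs are the rounded medians, so a residual of a few milliseconds is expected.

```python
# Median figures from the tables above, in milliseconds.
WALL_MS = {"default": 1127, "no_i8042": 353}
KERNEL_MS = {"default": 912, "no_i8042": 138}

VMM_STARTUP_MS = 80     # process start through InstanceStart
PANIC_TO_EXIT_MS = 125  # panic message through process exit


def gap_ms(config):
    """Wall-clock time not accounted for by the kernel's own timestamps."""
    return WALL_MS[config] - KERNEL_MS[config]


def residual_ms(config):
    """Part of the gap left over after the two attributed components."""
    return gap_ms(config) - (VMM_STARTUP_MS + PANIC_TO_EXIT_MS)


if __name__ == "__main__":
    for config in WALL_MS:
        print(config, "gap:", gap_ms(config), "ms, residual:", residual_ms(config), "ms")
```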

---

## 4. Startup Breakdown

Measured with nanosecond wall-clock timing of each API call:

| Phase | Duration | Cumulative | Description |
|-------|----------|------------|-------------|
| **FC process start → socket ready** | 7-9 ms | 8 ms | Firecracker binary loads, creates API socket |
| **PUT /boot-source** | 12-16 ms | 22 ms | Loads + validates kernel ELF (21MB) |
| **PUT /machine-config** | 8-15 ms | 33 ms | Validates machine configuration |
| **PUT /actions (InstanceStart)** | 44-74 ms | 80 ms | Creates KVM VM, allocates guest memory, sets up vCPU, page tables, starts vCPU thread |
| **Kernel boot (with i8042)** | ~912 ms | 992 ms | Includes 500ms i8042 probe timeout |
| **Kernel boot (no i8042)** | ~138 ms | 218 ms | Pure kernel initialization |
| **Kernel panic → process exit** | ~125 ms | — | Reboot handling, serial flush |

### API Overhead Detail (5 runs)

| Run | Socket | Boot-src | Machine-cfg | InstanceStart | Total to VM |
|-----|--------|----------|-------------|---------------|-------------|
| 1 | 9ms | 11ms | 8ms | 48ms | 76ms |
| 2 | 9ms | 14ms | 14ms | 63ms | 101ms |
| 3 | 8ms | 12ms | 15ms | 65ms | 101ms |
| 4 | 9ms | 13ms | 8ms | 44ms | 75ms |
| 5 | 9ms | 14ms | 9ms | 74ms | 108ms |
| **Median** | **9ms** | **13ms** | **9ms** | **63ms** | **101ms** |

The InstanceStart phase is the most variable (44-74ms) because it does the heavy lifting: `KVM_CREATE_VM`, mmap of guest memory, page table setup, vCPU register configuration, vCPU thread creation, and entry into `KVM_RUN`.

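Per-call timings like those in the table could be collected with a small client along the following lines. This is a sketch, not the harness used for these results: the helper names (`UnixHTTPConnection`, `put_json`, `boot_requests`, `time_requests`) and the socket path in the usage note are hypothetical, and only the three API calls named in this section are issued.

```python
import http.client
import json
import socket
import time


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        super().__init__("localhost", timeout=timeout)
        self._socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self._socket_path)
        self.sock = sock


def put_json(socket_path, path, body):
    """Send one PUT request with a JSON body and return the HTTP status code."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("PUT", path, body=json.dumps(body),
                     headers={"Content-Type": "application/json"})
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def boot_requests(kernel_path, boot_args, mem_mib=128, vcpus=1):
    """The three API calls timed in this section, in order."""
    return [
        ("/boot-source", {"kernel_image_path": kernel_path, "boot_args": boot_args}),
        ("/machine-config", {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        ("/actions", {"action_type": "InstanceStart"}),
    ]


def time_requests(requests, send):
    """Call send(path, body) for each request; return (path, status, elapsed_ms)."""
    timings = []
    for path, body in requests:
        start = time.perf_counter_ns()
        status = send(path, body)
        elapsed_ms = (time.perf_counter_ns() - start) / 1e6
        timings.append((path, status, elapsed_ms))
    return timings
```

A run against a live process might look like `time_requests(boot_requests("/path/to/vmlinux", "console=ttyS0 reboot=k panic=1 pci=off"), lambda p, b: put_json("/tmp/firecracker.socket", p, b))`, where both paths are placeholders. Passing `send` as a parameter keeps the timing loop testable without a running VMM.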
### Seccomp Impact

| Mode | Avg Wall Clock (5 runs) |
|------|------------------------|
| With seccomp | 8ms to exit |
| Without seccomp (`--no-seccomp`) | 8ms to exit |

Seccomp has no measurable impact on boot time (measured with `--no-api --config-file` mode).

---

## 5. Memory Overhead

### RSS by Guest Memory Size

Measured during active VM execution (kernel booted, pre-panic):

| Guest Memory | RSS (KB) | RSS (MB) | VSZ (KB) | VSZ (MB) | VMM Overhead |
|-------------|----------|----------|----------|----------|-------------|
| — (pre-boot) | 3,396 | 3 | — | — | Base process |
| 128 MB | 51,260–53,520 | 50–52 | 139,084 | 135 | ~50 MB |
| 256 MB | 57,616–57,972 | 56–57 | 270,156 | 263 | ~54 MB |
| 512 MB | 61,704–62,068 | 60–61 | 532,300 | 519 | ~58 MB |

### Memory Breakdown (128MB guest)

From `/proc/PID/smaps_rollup` and `/proc/PID/status`:

| Metric | Value |
|--------|-------|
| Pss (proportional) | 51,800 KB |
| Pss_Anon | 49,432 KB |
| Pss_File | 2,364 KB |
| AnonHugePages | 47,104 KB |
| VmData | 136,128 KB (132 MB) |
| VmExe | 2,380 KB (2.3 MB) |
| VmStk | 132 KB |
| VmLib | 8 KB |
| Memory regions | 29 |
| Threads | 3 |

### Key Observations

1. **Guest memory is mmap'd but demand-paged**: VSZ scales linearly with guest size, but RSS only reflects touched pages
2. **VMM base overhead is ~3.3 MB** (pre-boot RSS of 3,396 KB)
3. **~50 MB RSS at 128MB guest**: The kernel touches ~47MB during boot (page tables, kernel code, data structures)
4. **AnonHugePages = 47MB**: THP (Transparent Huge Pages) backs guest memory, reducing TLB pressure
5. **Scaling**: RSS grows by only ~4-6 MB for each doubling of guest memory (128 → 256 → 512 MB), because guest pages are touched on demand

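The demand-paging observation can be made concrete by comparing RSS against the configured guest size. The sketch below parses `/proc/PID/status`-style text and computes the resident fraction; the function names are illustrative, and `SAMPLE_STATUS` is reconstructed from values in the tables above rather than copied from the raw data files.

```python
def parse_status_kb(status_text):
    """Extract the kB-valued fields (VmRSS, VmData, ...) from /proc/PID/status text."""
    fields = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def kb_to_mib(kb):
    """Convert kibibytes (the unit /proc reports as 'kB') to mebibytes."""
    return kb / 1024


def resident_fraction(rss_kb, guest_mib):
    """RSS as a fraction of the configured guest memory size."""
    return kb_to_mib(rss_kb) / guest_mib


# Illustrative input assembled from the tables above (not a raw capture).
SAMPLE_STATUS = """\
VmRSS:\t   51260 kB
VmData:\t  136128 kB
VmStk:\t     132 kB
VmExe:\t    2380 kB
VmLib:\t       8 kB
Threads:\t3
"""

if __name__ == "__main__":
    fields = parse_status_kb(SAMPLE_STATUS)
    print("VmRSS MiB:", round(kb_to_mib(fields["VmRSS"]), 1))
    # Lower bound of each RSS range from the RSS-by-guest-size table.
    for guest_mib, rss_kb in [(128, 51260), (256, 57616), (512, 61704)]:
        print(guest_mib, "MiB guest:", f"{resident_fraction(rss_kb, guest_mib):.0%} resident")
```

The resident fraction falls as the guest grows, which is what demand paging predicts when the boot workload is the same at every size.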
### Pre-boot vs Post-boot Memory

| Phase | RSS |
|-------|-----|
| After FC process start | 3,396 KB (3.3 MB) |
| After boot-source + machine-config | 3,396 KB (3.3 MB) — no change |
| After InstanceStart (VM running) | 51,260+ KB (~50 MB) |

All guest memory allocation happens during InstanceStart. The API configuration phase adds no measurable RSS.

---

## 6. CPU Features (CPUID)

Firecracker v1.14.2 exposes the following CPU features to guests (as reported by kernel 4.14.174):

### XSAVE Features Exposed

| Feature | XSAVE Bit | Offset | Size |
|---------|-----------|--------|------|
| x87 FPU | 0x001 | — | — |
| SSE | 0x002 | — | — |
| AVX | 0x004 | 576 | 256 bytes |
| MPX bounds | 0x008 | 832 | 64 bytes |
| MPX CSR | 0x010 | 896 | 64 bytes |
| AVX-512 opmask | 0x020 | 960 | 64 bytes |
| AVX-512 Hi256 | 0x040 | 1024 | 512 bytes |
| AVX-512 ZMM_Hi256 | 0x080 | 1536 | 1024 bytes |
| PKU | 0x200 | 2560 | 8 bytes |

Total XSAVE context: 2,568 bytes (compacted format).

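The offsets and the 2,568-byte total follow from packing the components back to back after the 512-byte legacy area (x87 + SSE) and the 64-byte XSAVE header. The sketch below reproduces that arithmetic from the sizes in the table; it is a simplification that ignores the optional 64-byte alignment of individual components, which makes no difference for this particular set of sizes.

```python
LEGACY_AREA_BYTES = 512   # x87 FPU + SSE state
XSAVE_HEADER_BYTES = 64

# (name, feature bit, size in bytes) for the extended components in the table.
COMPONENTS = [
    ("AVX", 0x004, 256),
    ("MPX bounds", 0x008, 64),
    ("MPX CSR", 0x010, 64),
    ("AVX-512 opmask", 0x020, 64),
    ("AVX-512 Hi256", 0x040, 512),
    ("AVX-512 ZMM_Hi256", 0x080, 1024),
    ("PKU", 0x200, 8),
]


def compacted_layout(components):
    """Return ([(name, offset)], total_bytes) for back-to-back packing."""
    offset = LEGACY_AREA_BYTES + XSAVE_HEADER_BYTES
    layout = []
    for name, _bit, size in components:
        layout.append((name, offset))
        offset += size
    return layout, offset


def feature_mask(components):
    """OR together the legacy bits (x87, SSE) and the extended feature bits."""
    mask = 0x001 | 0x002
    for _name, bit, _size in components:
        mask |= bit
    return mask


if __name__ == "__main__":
    layout, total = compacted_layout(COMPONENTS)
    for name, offset in layout:
        print(f"{name:<20} offset {offset}")
    print("total bytes:", total, "feature mask:", hex(feature_mask(COMPONENTS)))
```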
### CPU Identity (as seen by guest)

```
vendor_id: GenuineIntel
model name: Intel(R) Xeon(R) Processor @ 2.40GHz
family: 0x6
model: 0x55
stepping: 0x7
```

Firecracker masks the host's CPU model string: the guest sees a generic "Intel(R) Xeon(R) Processor @ 2.40GHz", with the host's "Silver 4210R" removed.

### Security Mitigations Active in Guest

| Mitigation | Status |
|-----------|--------|
| NX (Execute Disable) | Active |
| Spectre V1 | usercopy/swapgs barriers |
| Spectre V2 | Enhanced IBRS |
| SpectreRSB | RSB filling on context switch |
| IBPB | Conditional on context switch |
| SSBD | Via prctl and seccomp |
| TAA | TSX disabled |

### Paravirt Features

| Feature | Present |
|---------|---------|
| KVM hypervisor detection | ✅ |
| kvm-clock | ✅ (MSRs 4b564d01/4b564d00) |
| KVM async PF | ✅ |
| KVM stealtime | ✅ |
| PV qspinlock | ✅ |
| x2apic | ✅ |

### Devices Visible to Guest

| Device | Type | Notes |
|--------|------|-------|
| Serial (ttyS0) | I/O 0x3f8 | 8250/16550 UART (U6_16550A) |
| i8042 keyboard | I/O 0x60, 0x64 | PS/2 controller |
| IOAPIC | MMIO 0xfec00000 | 24 GSIs |
| Local APIC | MMIO 0xfee00000 | x2apic mode |
| virtio-mmio | MMIO | Not probed (pci=off, no rootfs) |

---

## 7. Thread Model

Firecracker uses a minimal thread model:

| Thread | Name | Role |
|--------|------|------|
| Main | `firecracker-bin` | Event loop, serial I/O, device emulation |
| API | `fc_api` | HTTP API server on Unix socket |
| vCPU 0 | `fc_vcpu 0` | KVM_RUN loop for vCPU 0 |

With N vCPUs, the process runs N+2 threads in total: main, API, and one per vCPU.

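Thread names like those in the table can be read from `/proc/PID/task/*/comm` on Linux. The following is a minimal sketch with illustrative function names; `proc_root` is a parameter only so the function can be exercised against a fake directory tree, and a thread that exits between the directory listing and the read would raise an error that this sketch does not handle.

```python
import os


def thread_names(pid, proc_root="/proc"):
    """Return the comm name of every thread of `pid`, ordered by thread ID."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    names = []
    for tid in sorted(os.listdir(task_dir), key=int):
        with open(os.path.join(task_dir, tid, "comm")) as comm_file:
            names.append(comm_file.read().strip())
    return names


def expected_thread_count(vcpus, api_thread=True):
    """Main thread + optional API thread + one thread per vCPU."""
    return 1 + (1 if api_thread else 0) + vcpus


if __name__ == "__main__":
    # Inspect this Python process as a stand-in for a VMM PID.
    print(thread_names(os.getpid()))
    print("expected for 1 vCPU:", expected_thread_count(1))
```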
### Process Details

| Property | Value |
|----------|-------|
| Seccomp | Level 2 (strict) |
| NoNewPrivs | Yes |
| Capabilities | None (all dropped) |
| Seccomp filters | 1 |
| FD limit | 1,048,576 |

---

## 8. Comparison with Volt

### Binary Size

| VMM | Size | Linking |
|-----|------|---------|
| Firecracker v1.14.2 | 3.44 MB (3,436,512 bytes) | Static-pie, not stripped |
| Volt 0.1.0 | 3.26 MB (3,258,448 bytes) | Dynamic (release build) |

Volt is **5% smaller**, but the comparison is not like-for-like: Firecracker is statically linked (includes musl libc), while the Volt build is dynamically linked.

### Boot Time Comparison

Both tested with the same kernel (vmlinux-4.14.174), same boot args, no rootfs:

| Metric | Firecracker | Volt | Delta |
|--------|-------------|-----------|-------|
| Wall clock (default boot) | 1,127ms median | TBD | — |
| Kernel internal time | 912ms | TBD | — |
| VMM startup overhead | ~80ms | TBD | — |
| Wall clock (no i8042) | 353ms median | TBD | — |

**Note:** Fill in Volt numbers from `benchmark-volt-vmm.md` for direct comparison.

### Memory Overhead

| Guest Size | Firecracker RSS | Volt RSS | Delta |
|-----------|-----------------|---------------|-------|
| Pre-boot (base) | 3.3 MB | TBD | — |
| 128 MB | 50–52 MB | TBD | — |
| 256 MB | 56–57 MB | TBD | — |
| 512 MB | 60–61 MB | TBD | — |

### Architecture Differences Affecting Performance

| Aspect | Firecracker | Volt |
|--------|-------------|-----------|
| API model | REST over Unix socket (default; `--no-api --config-file` bypasses it) | Direct (no API server) |
| Thread model | main + api + N×vcpu | main + N×vcpu |
| Memory allocation | During InstanceStart | During VM setup |
| Kernel loading | Via API call (separate step) | At startup |
| Seccomp | BPF filter, ~50 syscalls | Planned |
| Guest memory | mmap + demand-paging + THP | TBD |

Firecracker's API-driven startup takes ~80ms before the guest runs, in exchange for runtime configurability. A direct-launch VMM like Volt can potentially start faster by eliminating the socket setup and HTTP parsing, although the largest phase measured here, InstanceStart (44-74ms), is KVM setup rather than API handling.

---

## 9. Methodology Notes

### Test Environment

- **Host OS:** Debian (Linux 6.1.0-42-amd64)
- **CPU:** Intel Xeon Silver 4210R @ 2.40GHz (Cascade Lake)
- **KVM:** `/dev/kvm` with user `karl` in group `kvm`
- **Firecracker:** Downloaded from GitHub releases, not jailed (bare process)
- **No jailer:** Tests run without the jailer for apples-to-apples VMM comparison

### What's Measured

- **Wall clock time:** `date +%s%N` before FC process start to detection of "Rebooting in" in serial output
- **Kernel internal time:** Extracted from kernel log timestamps (`[0.912xxx]` before "Rebooting in")
- **RSS:** `ps -p PID -o rss=` captured during VM execution
- **VMM overhead:** Time from process start to InstanceStart API return

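The kernel-internal time extraction can be sketched as a small log parser. This assumes the timestamp of interest is the `[seconds.micros]` prefix on the line that contains "Rebooting in"; the function name and the sample log lines are illustrative and are not taken from `kernel-output-official.txt`.

```python
import re

# Kernel printk prefix: "[    0.912000] message"
TIMESTAMP_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def kernel_time_at(log_text, marker="Rebooting in"):
    """Return the timestamp in seconds of the first line containing `marker`.

    Returns None when no timestamped line contains the marker.
    """
    for line in log_text.splitlines():
        match = TIMESTAMP_LINE.match(line.strip())
        if match and marker in match.group(2):
            return float(match.group(1))
    return None


# Made-up serial output for illustration only.
SAMPLE_LOG = (
    "[    0.905000] Kernel panic - not syncing: VFS: Unable to mount root fs\r\n"
    "[    0.912000] Rebooting in 1 seconds..\r\n"
)

if __name__ == "__main__":
    seconds = kernel_time_at(SAMPLE_LOG)
    print(f"kernel internal time: {seconds * 1000:.0f} ms")
```

Because `str.splitlines()` also splits on `\r\n`, the parser tolerates the carriage returns that serial consoles typically emit.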
### Caveats

1. **No rootfs:** Kernel panics at VFS mount. This measures pure boot, not a complete VM startup with userspace.
2. **i8042 timeout:** The default kernel (4.14.174) spends ~500ms probing the PS/2 keyboard controller. This is a kernel config issue, not a VMM issue. A custom kernel with `CONFIG_SERIO_I8042=n` would eliminate this.
3. **Serial output buffering:** Firecracker's serial port occasionally hits `WouldBlock` errors, which may slightly affect kernel timing (serial I/O blocks the vCPU when the buffer fills).
4. **No huge page pre-allocation:** Tests use default THP (Transparent Huge Pages). Pre-allocating huge pages would reduce memory allocation latency.
5. **Both kernels identical:** The "official" Firecracker kernel and `vmlinux-4.14` symlink point to the same 21MB binary (vmlinux-4.14.174).

### Kernel Boot Timeline (annotated)

```
0ms      FC process starts
8ms      API socket ready
22ms     Kernel loaded (PUT /boot-source)
33ms     Machine configured (PUT /machine-config)
80ms     VM running (PUT /actions InstanceStart)
         ┌─── Kernel execution begins ────┐
~84ms    │ Memory init, e820 map          │
~84ms    │ KVM hypervisor detected        │
~84ms    │ kvm-clock initialized          │
~88ms    │ SMP init, CPU0 identified      │
~113ms   │ devtmpfs, clocksource          │
~150ms   │ Network stack init             │
~176ms   │ Serial driver registered       │
~188ms   │ i8042 probe begins             │ ← 500ms stall
~464ms   │ i8042 KBD port registered      │
~976ms   │ i8042 keyboard input created   │ ← i8042 probe complete
~980ms   │ VFS: Cannot open root device   │
~985ms   │ Kernel panic                   │
~993ms   │ "Rebooting in 1 seconds.."     │
         └────────────────────────────────┘
~1130ms  Serial output flushed, process exits
```

---

## Raw Data Files

All raw benchmark data is stored in `/tmp/fc-bench-results/`:

- `boot-times-official.txt` — 10 iterations of wall-clock + kernel times
- `precise-boot-times.txt` — 10 iterations with --no-api mode
- `memory-official.txt` — RSS/VSZ for 128/256/512 MB guest sizes
- `smaps-detail-{128,256,512}.txt` — Detailed memory maps
- `status-official-{128,256,512}.txt` — /proc/PID/status snapshots
- `kernel-output-official.txt` — Full kernel serial output

---

*Generated by automated benchmark suite, 2026-03-08*