volt-vmm/docs/benchmark-firecracker.md
Karl Clinger 40ed108dd5 Volt VMM (Neutron Stardust): source-available under AGPSL v5.0
KVM-based microVMM for the Volt platform:
- Sub-second VM boot times
- Minimal memory footprint
- Landlock LSM + seccomp security
- Virtio device support
- Custom kernel management

Copyright (c) Armored Gates LLC. All rights reserved.
Licensed under AGPSL v5.0
2026-03-21 01:04:35 -05:00

# Firecracker VMM Benchmark Results
**Date:** 2026-03-08
**Firecracker Version:** v1.14.2 (latest stable)
**Binary:** static-pie linked, x86_64, not stripped
**Test Host:** julius — Intel Xeon Silver 4210R @ 2.40GHz, 20 cores, Linux 6.1.0-42-amd64
**Kernel:** vmlinux-4.14.174 (Firecracker's official guest kernel, 21,441,304 bytes)
**Methodology:** No rootfs attached — kernel boots to VFS panic. Matches Volt test methodology.
---
## Table of Contents
1. [Executive Summary](#1-executive-summary)
2. [Binary Size](#2-binary-size)
3. [Cold Boot Time](#3-cold-boot-time)
4. [Startup Breakdown](#4-startup-breakdown)
5. [Memory Overhead](#5-memory-overhead)
6. [CPU Features (CPUID)](#6-cpu-features-cpuid)
7. [Thread Model](#7-thread-model)
8. [Comparison with Volt](#8-comparison-with-volt)
9. [Methodology Notes](#9-methodology-notes)
---
## 1. Executive Summary
| Metric | Firecracker v1.14.2 | Notes |
|--------|---------------------|-------|
| Binary size | 3.44 MB (3,436,512 bytes) | Static-pie, not stripped |
| Cold boot to kernel panic (wall) | **1,127ms median** | Includes ~500ms i8042 stall |
| Cold boot (no i8042 stall) | **353ms median** | With `i8042.noaux i8042.nokbd` |
| Kernel internal boot time | **912ms** / **138ms** | Default / no-i8042 |
| VMM overhead (startup→VM running) | **~80ms** | FC process + API + KVM setup |
| RSS at 128MB guest | **52 MB** | ~50MB VMM overhead |
| RSS at 256MB guest | **56 MB** | +4MB vs 128MB guest |
| RSS at 512MB guest | **60 MB** | +8MB vs 128MB guest |
| Threads during VM run | 3 | main + `fc_api` + `fc_vcpu 0` |
**Key Finding:** With the default Firecracker kernel (4.14.174), the ~912ms kernel boot time is dominated by a **~500ms i8042 keyboard controller probe timeout**. With the i8042 probe disabled, kernel initialization takes only ~138ms. This is a kernel configuration issue, not a VMM issue.
---
## 2. Binary Size
```
-rwxr-xr-x 1 karl karl 3,436,512 Feb 26 11:32 firecracker-v1.14.2-x86_64
```
| Property | Value |
|----------|-------|
| Size | 3.44 MB (3,436,512 bytes) |
| Format | ELF 64-bit LSB pie executable, x86-64 |
| Linking | Static-pie (no shared library dependencies) |
| Stripped | No (includes symbol table) |
| Debug sections | 0 |
| Language | Rust |
### Related Binaries
| Binary | Size |
|--------|------|
| firecracker | 3.44 MB |
| jailer | 2.29 MB |
| cpu-template-helper | 2.58 MB |
| snapshot-editor | 1.23 MB |
| seccompiler-bin | 1.16 MB |
| rebase-snap | 0.52 MB |
---
## 3. Cold Boot Time
### Default Boot Args (`console=ttyS0 reboot=k panic=1 pci=off`)
10 iterations, 128MB guest RAM, 1 vCPU:
| Iteration | Wall Clock (ms) | Kernel Time (s) |
|-----------|-----------------|------------------|
| 1 | 1,130 | 0.9156 |
| 2 | 1,144 | 0.9097 |
| 3 | 1,132 | 0.9112 |
| 4 | 1,113 | 0.9138 |
| 5 | 1,126 | 0.9115 |
| 6 | 1,128 | 0.9130 |
| 7 | 1,143 | 0.9099 |
| 8 | 1,117 | 0.9119 |
| 9 | 1,123 | 0.9119 |
| 10 | 1,115 | 0.9169 |
| Statistic | Wall Clock (ms) | Kernel Time (ms) |
|-----------|-----------------|-------------------|
| **Min** | 1,113 | 910 |
| **Median** | 1,127 | 912 |
| **Max** | 1,144 | 917 |
| **Mean** | 1,127 | 913 |
| **Stddev** | ~10 | ~2 |
### Optimized Boot Args (`... i8042.noaux i8042.nokbd`)
Disabling the i8042 keyboard controller removes a ~500ms probe timeout:
| Iteration | Wall Clock (ms) | Kernel Time (s) |
|-----------|-----------------|------------------|
| 1 | 330 | 0.1418 |
| 2 | 347 | 0.1383 |
| 3 | 357 | 0.1391 |
| 4 | 358 | 0.1379 |
| 5 | 351 | 0.1367 |
| 6 | 371 | 0.1385 |
| 7 | 346 | 0.1376 |
| 8 | 378 | 0.1393 |
| 9 | 328 | 0.1382 |
| 10 | 355 | 0.1388 |
| Statistic | Wall Clock (ms) | Kernel Time (ms) |
|-----------|-----------------|-------------------|
| **Min** | 328 | 137 |
| **Median** | 353 | 138 |
| **Max** | 378 | 142 |
| **Mean** | 352 | 139 |
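
The summary rows can be re-derived from the per-iteration wall-clock samples. The snippet below is a minimal sketch using Python's `statistics` module; the helper name `summarize` is illustrative and not part of the benchmark suite, and the standard deviation is computed as a sample (n-1) deviation, which is an assumption about how the original figure was obtained.

```python
import statistics

# Wall-clock samples (ms) copied from the two iteration tables above.
default_wall_ms = [1130, 1144, 1132, 1113, 1126, 1128, 1143, 1117, 1123, 1115]
no_i8042_wall_ms = [330, 347, 357, 358, 351, 371, 346, 378, 328, 355]


def summarize(samples):
    """Return min, median, max, mean and sample stddev for a list of timings."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
        "mean": statistics.mean(samples),
        "stddev": statistics.stdev(samples),  # sample (n-1) standard deviation
    }


for label, samples in [("default", default_wall_ms), ("no i8042", no_i8042_wall_ms)]:
    s = summarize(samples)
    print(f"{label}: min={s['min']} median={s['median']:.0f} "
          f"max={s['max']} mean={s['mean']:.1f} stddev={s['stddev']:.1f}")
```

With an even number of samples, `statistics.median` averages the two middle values, which is why the medians fall between measured data points.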
### Wall Clock vs Kernel Time Gap Analysis
The ~200ms gap between wall clock and kernel internal time is:
- **~80ms** — Firecracker process startup + API configuration + KVM VM creation
- **~125ms** — Kernel time between panic message and process exit (reboot handling, serial flush)
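
As a quick consistency check, the gap can be computed from the medians in both configurations and compared with the two components listed above. This is only an arithmetic sketch; the dictionary layout and the name `gap_ms` are illustrative.

```python
# Medians in milliseconds, taken from the statistics tables in this section.
runs = {
    "default": {"wall": 1127, "kernel": 912},
    "no_i8042": {"wall": 353, "kernel": 138},
}
vmm_startup_ms = 80   # process start until the VM is running
exit_tail_ms = 125    # panic message until process exit


def gap_ms(run):
    """Wall-clock time not accounted for by the kernel's own timestamps."""
    return run["wall"] - run["kernel"]


for name, run in runs.items():
    residual = gap_ms(run) - (vmm_startup_ms + exit_tail_ms)
    print(f"{name}: gap={gap_ms(run)}ms, listed components="
          f"{vmm_startup_ms + exit_tail_ms}ms, residual={residual}ms")
```

The gap is the same in both configurations, which is consistent with it being VMM and teardown cost rather than kernel initialization work.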
---
## 4. Startup Breakdown
Measured with nanosecond wall-clock timing of each API call:
| Phase | Duration | Cumulative | Description |
|-------|----------|------------|-------------|
| **FC process start → socket ready** | 7-9 ms | 8 ms | Firecracker binary loads, creates API socket |
| **PUT /boot-source** | 12-16 ms | 22 ms | Loads + validates kernel ELF (21MB) |
| **PUT /machine-config** | 8-15 ms | 33 ms | Validates machine configuration |
| **PUT /actions (InstanceStart)** | 44-74 ms | 80 ms | Creates KVM VM, allocates guest memory, sets up vCPU, page tables, starts vCPU thread |
| **Kernel boot (with i8042)** | ~912 ms | 992 ms | Includes 500ms i8042 probe timeout |
| **Kernel boot (no i8042)** | ~138 ms | 218 ms | Pure kernel initialization |
| **Kernel panic → process exit** | ~125 ms | — | Reboot handling, serial flush |
### API Overhead Detail (5 runs)
| Run | Socket | Boot-src | Machine-cfg | InstanceStart | Total to VM |
|-----|--------|----------|-------------|---------------|-------------|
| 1 | 9ms | 11ms | 8ms | 48ms | 76ms |
| 2 | 9ms | 14ms | 14ms | 63ms | 101ms |
| 3 | 8ms | 12ms | 15ms | 65ms | 101ms |
| 4 | 9ms | 13ms | 8ms | 44ms | 75ms |
| 5 | 9ms | 14ms | 9ms | 74ms | 108ms |
| **Median** | **9ms** | **13ms** | **9ms** | **63ms** | **101ms** |
The InstanceStart phase is the most variable (44-74ms) because it does the heavy lifting: KVM_CREATE_VM, mmap guest memory, set up page tables, configure vCPU registers, create vCPU thread, and enter KVM_RUN.
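
The per-call timings above could be collected with a small client along the following lines. This is a sketch, not the harness that produced the numbers: the class and function names are hypothetical, the socket and kernel paths are placeholders, and the request bodies use the field names from Firecracker's public API (`kernel_image_path`, `boot_args`, `vcpu_count`, `mem_size_mib`, `action_type`), which should be checked against the API specification for the version under test.

```python
import http.client
import json
import socket
import time


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)


def build_requests(kernel_path, boot_args, mem_mib=128, vcpus=1):
    """Return the three PUT requests in the order shown in the breakdown table."""
    return [
        ("/boot-source", {"kernel_image_path": kernel_path, "boot_args": boot_args}),
        ("/machine-config", {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        ("/actions", {"action_type": "InstanceStart"}),
    ]


def timed_put(conn, path, body):
    """Send one PUT and return (HTTP status, elapsed milliseconds)."""
    start = time.monotonic_ns()
    conn.request("PUT", path, body=json.dumps(body),
                 headers={"Content-Type": "application/json"})
    response = conn.getresponse()
    response.read()  # drain the body so the connection can be reused
    return response.status, (time.monotonic_ns() - start) / 1e6


def run_sequence(socket_path, requests):
    """Issue each request in order and return [(path, status, elapsed_ms), ...]."""
    conn = UnixHTTPConnection(socket_path)
    try:
        return [(path, *timed_put(conn, path, body)) for path, body in requests]
    finally:
        conn.close()
```

A run would look like `run_sequence("/tmp/fc.sock", build_requests("/path/to/vmlinux", "console=ttyS0 reboot=k panic=1 pci=off"))`, with both paths being placeholders. A monotonic clock is used here so that the measurement is not affected by wall-clock adjustments.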
### Seccomp Impact
| Mode | Avg Wall Clock (5 runs) |
|------|------------------------|
| With seccomp | 8ms to exit |
| Without seccomp (`--no-seccomp`) | 8ms to exit |
Seccomp has no measurable impact on boot time (measured with `--no-api --config-file` mode).
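
For the `--no-api --config-file` mode, a config file for the rootfs-less setup used here might be generated as sketched below. The key names follow Firecracker's JSON configuration format as understood here and should be verified against the documentation for v1.14.2. The empty `drives` list, the kernel path, and the assumption that the optimized boot args are the default args plus the two i8042 flags are all assumptions made for this example.

```python
import json

DEFAULT_ARGS = "console=ttyS0 reboot=k panic=1 pci=off"
# Assumption: the optimized run appends the i8042 flags to the default args.
NO_I8042_ARGS = DEFAULT_ARGS + " i8042.noaux i8042.nokbd"


def make_config(kernel_path, boot_args, mem_mib=128, vcpus=1):
    """Build a minimal Firecracker-style config dict with no block devices."""
    return {
        "boot-source": {"kernel_image_path": kernel_path, "boot_args": boot_args},
        "drives": [],  # no rootfs attached, so the kernel panics at VFS mount
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
    }


# Hypothetical kernel path; replace with the real location of the guest kernel.
config = make_config("/path/to/vmlinux-4.14.174", NO_I8042_ARGS)
print(json.dumps(config, indent=2))
```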
---
## 5. Memory Overhead
### RSS by Guest Memory Size
Measured during active VM execution (kernel booted, pre-panic):
| Guest Memory | RSS (KB) | RSS (MB) | VSZ (KB) | VSZ (MB) | VMM Overhead |
|-------------|----------|----------|----------|----------|-------------|
| — (pre-boot) | 3,396 | 3 | — | — | Base process |
| 128 MB | 51,260-53,520 | 50-52 | 139,084 | 135 | ~50 MB |
| 256 MB | 57,616-57,972 | 56-57 | 270,156 | 263 | ~54 MB |
| 512 MB | 61,704-62,068 | 60-61 | 532,300 | 519 | ~58 MB |
### Memory Breakdown (128MB guest)
From `/proc/PID/smaps_rollup` and `/proc/PID/status`:
| Metric | Value |
|--------|-------|
| Pss (proportional) | 51,800 KB |
| Pss_Anon | 49,432 KB |
| Pss_File | 2,364 KB |
| AnonHugePages | 47,104 KB |
| VmData | 136,128 KB (132 MB) |
| VmExe | 2,380 KB (2.3 MB) |
| VmStk | 132 KB |
| VmLib | 8 KB |
| Memory regions | 29 |
| Threads | 3 |
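
Both `/proc/PID/smaps_rollup` and `/proc/PID/status` report sizes as `Name: value kB` lines, so one parser can cover both files. The following is a minimal sketch; the function names are illustrative and not taken from the benchmark suite.

```python
def parse_kb_fields(text):
    """Parse 'Name:   12345 kB' lines into a {name: kilobytes} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "<name>: <digits> kB"; skip everything else.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary(pid):
    """Merge the kB-valued fields of smaps_rollup and status for one process."""
    merged = {}
    for source in ("smaps_rollup", "status"):
        with open(f"/proc/{pid}/{source}") as handle:
            merged.update(parse_kb_fields(handle.read()))
    return merged
```

Calling `memory_summary(pid)` on a running Firecracker process would be expected to yield keys such as `Pss`, `Pss_Anon`, `AnonHugePages`, `VmData`, and `VmExe`, matching the rows in the table above.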
### Key Observations
1. **Guest memory is mmap'd but demand-paged**: VSZ scales linearly with guest size, but RSS only reflects touched pages
2. **VMM base overhead is ~3.3 MB** (3,396 KB pre-boot RSS)
3. **~50 MB RSS at 128MB guest**: The kernel touches ~47MB during boot (page tables, kernel code, data structures)
4. **AnonHugePages = 47MB**: THP (Transparent Huge Pages) is used for guest memory, reducing TLB pressure
5. **Scaling**: RSS increases by only ~4MB each time guest memory doubles (128MB → 256MB → 512MB), because guest pages are only touched on demand
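
The demand-paging behaviour described in the first observation can be reproduced outside of any VMM. The sketch below is Linux-only and purely illustrative: it maps anonymous memory with Python's `mmap` module, touches half of it, and samples `VmRSS` at each step. The 64 MB size and the function names are arbitrary choices for the example, and with THP enabled the kernel may fault in more than one 4 KB page per touch.

```python
import mmap


def rss_kb():
    """Return this process's VmRSS in kB, read from /proc/self/status (Linux only)."""
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found in /proc/self/status")


def demand_paging_demo(size=64 * 1024 * 1024, touch_fraction=0.5):
    """Map `size` bytes anonymously, touch a fraction, and sample RSS at each step."""
    before = rss_kb()
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        mapped = rss_kb()  # address space reserved, no pages faulted in yet
        for offset in range(0, int(size * touch_fraction), mmap.PAGESIZE):
            region[offset] = 1  # first write to a page triggers a page fault
        touched = rss_kb()
    finally:
        region.close()
    return before, mapped, touched


before, mapped, touched = demand_paging_demo()
print(f"mapping added {mapped - before} kB, touching half added {touched - mapped} kB")
```

The mapping step should leave RSS nearly unchanged while the touch step grows it by roughly the touched amount, which mirrors how VSZ tracks guest size while RSS tracks only the pages the guest kernel actually used.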
### Pre-boot vs Post-boot Memory
| Phase | RSS |
|-------|-----|
| After FC process start | 3,396 KB (3.3 MB) |
| After boot-source + machine-config | 3,396 KB (3.3 MB) — no change |
| After InstanceStart (VM running) | 51,260+ KB (~50 MB) |
All guest memory allocation happens during InstanceStart. The API configuration phase uses zero additional memory.
---
## 6. CPU Features (CPUID)
Firecracker v1.14.2 exposes the following CPU features to guests (as reported by kernel 4.14.174):
### XSAVE Features Exposed
| Feature | XSAVE Bit | Offset | Size |
|---------|-----------|--------|------|
| x87 FPU | 0x001 | — | — |
| SSE | 0x002 | — | — |
| AVX | 0x004 | 576 | 256 bytes |
| MPX bounds | 0x008 | 832 | 64 bytes |
| MPX CSR | 0x010 | 896 | 64 bytes |
| AVX-512 opmask | 0x020 | 960 | 64 bytes |
| AVX-512 Hi256 | 0x040 | 1024 | 512 bytes |
| AVX-512 ZMM_Hi256 | 0x080 | 1536 | 1024 bytes |
| PKU | 0x200 | 2560 | 8 bytes |
Total XSAVE context: 2,568 bytes (compacted format).
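
The offsets in the table follow from packing the extended components back to back after the 512-byte legacy area and the 64-byte XSAVE header. The sketch below reproduces that arithmetic; it deliberately ignores the optional 64-byte alignment that a component may request in compacted format, which does not change the result for the sizes listed here.

```python
LEGACY_AREA = 512   # x87 + SSE state (FXSAVE-compatible region)
XSAVE_HEADER = 64

# (name, XSAVE feature bit, size in bytes) for the extended components in the table.
components = [
    ("AVX", 0x004, 256),
    ("MPX bounds", 0x008, 64),
    ("MPX CSR", 0x010, 64),
    ("AVX-512 opmask", 0x020, 64),
    ("AVX-512 Hi256", 0x040, 512),
    ("AVX-512 ZMM_Hi256", 0x080, 1024),
    ("PKU", 0x200, 8),
]


def compacted_layout(parts):
    """Return ([(name, offset, size), ...], total_size) for back-to-back packing."""
    offset = LEGACY_AREA + XSAVE_HEADER
    layout = []
    for name, _bit, size in parts:
        layout.append((name, offset, size))
        offset += size
    return layout, offset


layout, total = compacted_layout(components)
for name, offset, size in layout:
    print(f"{name:<18} offset={offset:<5} size={size}")
print(f"total context size: {total} bytes")

# Combined feature mask, including x87 (0x001) and SSE (0x002).
mask = 0x001 | 0x002
for _name, bit, _size in components:
    mask |= bit
print(f"feature mask: {mask:#x}")
```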
### CPU Identity (as seen by guest)
```
vendor_id: GenuineIntel
model name: Intel(R) Xeon(R) Processor @ 2.40GHz
family: 0x6
model: 0x55
stepping: 0x7
```
Firecracker strips the full CPU model name and reports a generic "Intel(R) Xeon(R) Processor @ 2.40GHz" (removed "Silver 4210R" from host).
### Security Mitigations Active in Guest
| Mitigation | Status |
|-----------|--------|
| NX (Execute Disable) | Active |
| Spectre V1 | usercopy/swapgs barriers |
| Spectre V2 | Enhanced IBRS |
| SpectreRSB | RSB filling on context switch |
| IBPB | Conditional on context switch |
| SSBD | Via prctl and seccomp |
| TAA | TSX disabled |
### Paravirt Features
| Feature | Present |
|---------|---------|
| KVM hypervisor detection | ✅ |
| kvm-clock | ✅ (MSRs 4b564d01/4b564d00) |
| KVM async PF | ✅ |
| KVM stealtime | ✅ |
| PV qspinlock | ✅ |
| x2apic | ✅ |
### Devices Visible to Guest
| Device | Type | Notes |
|--------|------|-------|
| Serial (ttyS0) | I/O 0x3f8 | 8250/16550 UART (U6_16550A) |
| i8042 keyboard | I/O 0x60, 0x64 | PS/2 controller |
| IOAPIC | MMIO 0xfec00000 | 24 GSIs |
| Local APIC | MMIO 0xfee00000 | x2apic mode |
| virtio-mmio | MMIO | Not probed (pci=off, no rootfs) |
---
## 7. Thread Model
Firecracker uses a minimal thread model:
| Thread | Name | Role |
|--------|------|------|
| Main | `firecracker-bin` | Event loop, serial I/O, device emulation |
| API | `fc_api` | HTTP API server on Unix socket |
| vCPU 0 | `fc_vcpu 0` | KVM_RUN loop for vCPU 0 |
With N vCPUs, there would be N+2 threads total.
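
Thread names can be read from `/proc/PID/task/*/comm`. The helper below is a sketch with illustrative names; the `proc_root` parameter exists only so the function can be exercised against a fake directory tree.

```python
from pathlib import Path


def thread_names(pid, proc_root="/proc"):
    """Return the comm name of every thread of `pid`, ordered by thread ID."""
    task_dir = Path(proc_root) / str(pid) / "task"
    tid_dirs = sorted(task_dir.iterdir(), key=lambda entry: int(entry.name))
    return [(tid_dir / "comm").read_text().strip() for tid_dir in tid_dirs]


def expected_thread_count(vcpus):
    """Main thread + API thread + one thread per vCPU, per the table above."""
    return vcpus + 2
```

For a single-vCPU guest, `len(thread_names(pid))` would be expected to equal `expected_thread_count(1)`, i.e. 3.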
### Process Details
| Property | Value |
|----------|-------|
| Seccomp | Mode 2 (filter) |
| NoNewPrivs | Yes |
| Capabilities | None (all dropped) |
| Seccomp filters | 1 |
| FD limit | 1,048,576 |
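
These properties correspond to fields in `/proc/PID/status`. The sketch below parses them from the file's text; the function names are illustrative. The `Seccomp` field uses the kernel's numeric modes, where 0 is disabled, 1 is strict, and 2 is filter (BPF) mode.

```python
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def parse_status(text):
    """Split 'Key:<whitespace>value' lines of /proc/PID/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def sandbox_summary(status_text):
    """Extract the sandbox-related fields and decode them."""
    fields = parse_status(status_text)
    return {
        "seccomp": SECCOMP_MODES.get(int(fields["Seccomp"]), "unknown"),
        "seccomp_filters": int(fields["Seccomp_filters"]),
        "no_new_privs": fields["NoNewPrivs"] == "1",
        "effective_caps": int(fields["CapEff"], 16),  # 0 means all dropped
    }
```

`Seccomp_filters` is only present on newer kernels, so a tool targeting older hosts would need to treat it as optional.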
---
## 8. Comparison with Volt
### Binary Size
| VMM | Size | Linking |
|-----|------|---------|
| Firecracker v1.14.2 | 3.44 MB (3,436,512 bytes) | Static-pie, not stripped |
| Volt 0.1.0 | 3.26 MB (3,258,448 bytes) | Dynamic (release build) |
Volt is **5% smaller**, but the comparison is not like-for-like: Firecracker is statically linked (includes musl libc), while Volt is dynamically linked.
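
The size difference works out as follows, using the byte counts from the table. The helper name is illustrative.

```python
firecracker_bytes = 3_436_512
volt_bytes = 3_258_448


def percent_smaller(candidate, baseline):
    """How much smaller `candidate` is than `baseline`, as a percentage."""
    return (baseline - candidate) / baseline * 100


print(f"Volt is {percent_smaller(volt_bytes, firecracker_bytes):.1f}% smaller "
      f"({firecracker_bytes - volt_bytes:,} bytes)")
```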
### Boot Time Comparison
Both tested with the same kernel (vmlinux-4.14.174), same boot args, no rootfs:
| Metric | Firecracker | Volt | Delta |
|--------|-------------|-----------|-------|
| Wall clock (default boot) | 1,127ms median | TBD | — |
| Kernel internal time | 912ms | TBD | — |
| VMM startup overhead | ~80ms | TBD | — |
| Wall clock (no i8042) | 353ms median | TBD | — |
**Note:** Fill in Volt numbers from `benchmark-volt-vmm.md` for direct comparison.
### Memory Overhead
| Guest Size | Firecracker RSS | Volt RSS | Delta |
|-----------|-----------------|---------------|-------|
| Pre-boot (base) | 3.3 MB | TBD | — |
| 128 MB | 50-52 MB | TBD | — |
| 256 MB | 56-57 MB | TBD | — |
| 512 MB | 60-61 MB | TBD | — |
### Architecture Differences Affecting Performance
| Aspect | Firecracker | Volt |
|--------|-------------|-----------|
| API model | REST over Unix socket (always on) | Direct (no API server) |
| Thread model | main + api + N×vcpu | main + N×vcpu |
| Memory allocation | During InstanceStart | During VM setup |
| Kernel loading | Via API call (separate step) | At startup |
| Seccomp | BPF filter, ~50 syscalls | Planned |
| Guest memory | mmap + demand-paging + THP | TBD |
Firecracker's API-based architecture adds ~80ms overhead but enables runtime configuration. A direct-launch VMM like Volt can potentially start faster by eliminating the socket setup and HTTP parsing.
---
## 9. Methodology Notes
### Test Environment
- **Host OS:** Debian (Linux 6.1.0-42-amd64)
- **CPU:** Intel Xeon Silver 4210R @ 2.40GHz (Cascade Lake)
- **KVM:** `/dev/kvm` with user `karl` in group `kvm`
- **Firecracker:** Downloaded from GitHub releases, not jailed (bare process)
- **No jailer:** Tests run without the jailer for apples-to-apples VMM comparison
### What's Measured
- **Wall clock time:** `date +%s%N` before FC process start to detection of "Rebooting in" in serial output
- **Kernel internal time:** Extracted from kernel log timestamps (`[0.912xxx]` before "Rebooting in")
- **RSS:** `ps -p PID -o rss=` captured during VM execution
- **VMM overhead:** Time from process start to InstanceStart API return
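
The wall-clock and kernel-time measurements could be scripted roughly as below. This is a sketch rather than the actual benchmark harness: it uses `time.monotonic_ns()` in place of `date +%s%N`, assumes the VMM writes the guest serial console to its stdout, and the function name `measure_boot` is hypothetical.

```python
import re
import subprocess
import time

# Kernel log lines start with a timestamp such as "[    0.912345]".
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def measure_boot(cmd, marker="Rebooting in"):
    """Run `cmd` and return (wall_ms, kernel_s) at the first line containing `marker`.

    Both values are None if the marker never appears in the output.
    """
    wall_ms = kernel_s = None
    start = time.monotonic_ns()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          text=True, errors="replace") as proc:
        for line in proc.stdout:
            if marker in line:
                wall_ms = (time.monotonic_ns() - start) / 1e6
                match = TIMESTAMP.search(line)
                if match:
                    kernel_s = float(match.group(1))
                break
        proc.kill()  # stop the VMM if it is still running after the marker
    return wall_ms, kernel_s
```

For Firecracker, `cmd` might look like `["firecracker", "--no-api", "--config-file", "vm.json"]`, where `vm.json` is a placeholder config path.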
### Caveats
1. **No rootfs:** Kernel panics at VFS mount. This measures pure boot, not a complete VM startup with userspace.
2. **i8042 timeout:** The default kernel (4.14.174) spends ~500ms probing the PS/2 keyboard controller. This is a kernel config issue, not a VMM issue. A custom kernel with `CONFIG_SERIO_I8042=n` would eliminate this.
3. **Serial output buffering:** Firecracker's serial port occasionally hits `WouldBlock` errors, which may slightly affect kernel timing (serial I/O blocks the vCPU when the buffer fills).
4. **No huge page pre-allocation:** Tests use default THP (Transparent Huge Pages). Pre-allocating huge pages would reduce memory allocation latency.
5. **Both kernels identical:** The "official" Firecracker kernel and `vmlinux-4.14` symlink point to the same 21MB binary (vmlinux-4.14.174).
### Kernel Boot Timeline (annotated)
```
     0ms  FC process starts
     8ms  API socket ready
    22ms  Kernel loaded (PUT /boot-source)
    33ms  Machine configured (PUT /machine-config)
    80ms  VM running (PUT /actions InstanceStart)
          ┌── Kernel execution begins ───┐
   ~84ms  │ Memory init, e820 map        │
   ~84ms  │ KVM hypervisor detected      │
   ~84ms  │ kvm-clock initialized        │
   ~88ms  │ SMP init, CPU0 identified    │
  ~113ms  │ devtmpfs, clocksource        │
  ~150ms  │ Network stack init           │
  ~176ms  │ Serial driver registered     │
  ~188ms  │ i8042 probe begins           │ ← 500ms stall
  ~464ms  │ i8042 KBD port registered    │
  ~976ms  │ i8042 keyboard input created │ ← i8042 probe complete
  ~980ms  │ VFS: Cannot open root device │
  ~985ms  │ Kernel panic                 │
  ~993ms  │ "Rebooting in 1 seconds.."   │
          └──────────────────────────────┘
 ~1130ms  Serial output flushed, process exits
```
---
## Raw Data Files
All raw benchmark data is stored in `/tmp/fc-bench-results/`:
- `boot-times-official.txt` — 10 iterations of wall-clock + kernel times
- `precise-boot-times.txt` — 10 iterations with --no-api mode
- `memory-official.txt` — RSS/VSZ for 128/256/512 MB guest sizes
- `smaps-detail-{128,256,512}.txt` — Detailed memory maps
- `status-official-{128,256,512}.txt` — /proc/PID/status snapshots
- `kernel-output-official.txt` — Full kernel serial output
---
*Generated by automated benchmark suite, 2026-03-08*