KVM-based microVMM for the Volt platform: - Sub-second VM boot times - Minimal memory footprint - Landlock LSM + seccomp security - Virtio device support - Custom kernel management Copyright (c) Armored Gates LLC. All rights reserved. Licensed under AGPSL v5.0
# Firecracker VMM Benchmark Results

**Date:** 2026-03-08
**Firecracker Version:** v1.14.2 (latest stable)
**Binary:** static-pie linked, x86_64, not stripped
**Test Host:** julius (Intel Xeon Silver 4210R @ 2.40GHz, 20 cores, Linux 6.1.0-42-amd64)
**Kernel:** vmlinux-4.14.174 (Firecracker's official guest kernel, 21,441,304 bytes)
**Methodology:** No rootfs attached; the kernel boots to a VFS panic. Matches Volt test methodology.

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [Binary Size](#2-binary-size)
3. [Cold Boot Time](#3-cold-boot-time)
4. [Startup Breakdown](#4-startup-breakdown)
5. [Memory Overhead](#5-memory-overhead)
6. [CPU Features (CPUID)](#6-cpu-features-cpuid)
7. [Thread Model](#7-thread-model)
8. [Comparison with Volt](#8-comparison-with-volt)
9. [Methodology Notes](#9-methodology-notes)

---

## 1. Executive Summary

| Metric | Firecracker v1.14.2 | Notes |
|--------|---------------------|-------|
| Binary size | 3.44 MB (3,436,512 bytes) | Static-pie, not stripped |
| Cold boot to kernel panic (wall) | **1,127ms median** | Includes ~500ms i8042 stall |
| Cold boot (no i8042 stall) | **353ms median** | With `i8042.noaux i8042.nokbd` |
| Kernel internal boot time | **912ms** / **138ms** | Default / no-i8042 |
| VMM overhead (startup→VM running) | **~80ms** | FC process + API + KVM setup |
| RSS at 128MB guest | **52 MB** | ~50MB VMM overhead |
| RSS at 256MB guest | **56 MB** | +4MB vs 128MB guest |
| RSS at 512MB guest | **60 MB** | +8MB vs 128MB guest |
| Threads during VM run | 3 | main + `fc_api` + `fc_vcpu 0` |

**Key Finding:** The ~912ms "boot time" with the default Firecracker kernel (4.14.174) is dominated by i8042 keyboard controller probing, including a **~500ms timeout**. With the i8042 disabled, kernel initialization takes only ~138ms. This is a kernel issue, not a VMM issue.

---

## 2. Binary Size

```
-rwxr-xr-x 1 karl karl 3,436,512 Feb 26 11:32 firecracker-v1.14.2-x86_64
```

| Property | Value |
|----------|-------|
| Size | 3.44 MB (3,436,512 bytes) |
| Format | ELF 64-bit LSB pie executable, x86-64 |
| Linking | Static-pie (no shared library dependencies) |
| Stripped | No (includes symbol table) |
| Debug sections | 0 |
| Language | Rust |

### Related Binaries

| Binary | Size |
|--------|------|
| firecracker | 3.44 MB |
| jailer | 2.29 MB |
| cpu-template-helper | 2.58 MB |
| snapshot-editor | 1.23 MB |
| seccompiler-bin | 1.16 MB |
| rebase-snap | 0.52 MB |

---

## 3. Cold Boot Time

### Default Boot Args (`console=ttyS0 reboot=k panic=1 pci=off`)

10 iterations, 128MB guest RAM, 1 vCPU:

| Iteration | Wall Clock (ms) | Kernel Time (s) |
|-----------|-----------------|------------------|
| 1 | 1,130 | 0.9156 |
| 2 | 1,144 | 0.9097 |
| 3 | 1,132 | 0.9112 |
| 4 | 1,113 | 0.9138 |
| 5 | 1,126 | 0.9115 |
| 6 | 1,128 | 0.9130 |
| 7 | 1,143 | 0.9099 |
| 8 | 1,117 | 0.9119 |
| 9 | 1,123 | 0.9119 |
| 10 | 1,115 | 0.9169 |

| Statistic | Wall Clock (ms) | Kernel Time (ms) |
|-----------|-----------------|-------------------|
| **Min** | 1,113 | 910 |
| **Median** | 1,127 | 912 |
| **Max** | 1,144 | 917 |
| **Mean** | 1,127 | 913 |
| **Stddev** | ~10 | ~2 |
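
The summary statistics can be re-derived from the ten iterations listed above. The following is a minimal sketch using Python's `statistics` module; the names `wall_ms`, `kernel_s`, and `summarize` are illustrative, and the use of the population standard deviation is an assumption, since the report does not say which estimator produced the "~10" and "~2" figures.

```python
import statistics

# Values copied from the default-boot-args iteration table.
wall_ms = [1130, 1144, 1132, 1113, 1126, 1128, 1143, 1117, 1123, 1115]
kernel_s = [0.9156, 0.9097, 0.9112, 0.9138, 0.9115,
            0.9130, 0.9099, 0.9119, 0.9119, 0.9169]
kernel_ms = [t * 1000 for t in kernel_s]  # seconds -> milliseconds


def summarize(samples):
    """Return min/median/max/mean/stddev rounded to whole milliseconds."""
    return {
        "min": round(min(samples)),
        "median": round(statistics.median(samples)),
        "max": round(max(samples)),
        "mean": round(statistics.mean(samples)),
        # Assumption: population stddev; the sample stddev is slightly larger.
        "stddev": round(statistics.pstdev(samples)),
    }


for label, samples in (("wall", wall_ms), ("kernel", kernel_ms)):
    print(label, summarize(samples))
```

With only ten samples, the median is the mean of the 5th and 6th sorted values, which is why it can land on a number that never appears in the iteration table.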

### Optimized Boot Args (`... i8042.noaux i8042.nokbd`)

Disabling the i8042 keyboard controller skips the probe and its ~500ms timeout, cutting median kernel time from 912ms to 138ms:

| Iteration | Wall Clock (ms) | Kernel Time (s) |
|-----------|-----------------|------------------|
| 1 | 330 | 0.1418 |
| 2 | 347 | 0.1383 |
| 3 | 357 | 0.1391 |
| 4 | 358 | 0.1379 |
| 5 | 351 | 0.1367 |
| 6 | 371 | 0.1385 |
| 7 | 346 | 0.1376 |
| 8 | 378 | 0.1393 |
| 9 | 328 | 0.1382 |
| 10 | 355 | 0.1388 |

| Statistic | Wall Clock (ms) | Kernel Time (ms) |
|-----------|-----------------|-------------------|
| **Min** | 328 | 137 |
| **Median** | 353 | 138 |
| **Max** | 378 | 142 |
| **Mean** | 352 | 139 |
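
For reference, the two boot-argument variants can be generated programmatically. This is a hedged sketch: it assumes the `...` in the heading stands for the default arguments, and that the JSON keys (`boot-source`, `kernel_image_path`, `boot_args`, `machine-config`, `vcpu_count`, `mem_size_mib`) match the Firecracker configuration format, which should be checked against the v1.14.2 documentation. The kernel path is a placeholder.

```python
import json

DEFAULT_ARGS = "console=ttyS0 reboot=k panic=1 pci=off"
I8042_OFF = "i8042.noaux i8042.nokbd"  # flags used in the optimized run


def boot_source(kernel_path, disable_i8042=False):
    """Build the boot-source payload, optionally appending the i8042 flags."""
    args = DEFAULT_ARGS
    if disable_i8042:
        args = f"{args} {I8042_OFF}"
    return {"kernel_image_path": kernel_path, "boot_args": args}


def vm_config(kernel_path, mem_mib=128, vcpus=1, disable_i8042=True):
    """Assemble a config document for a rootfs-less boot (assumed layout)."""
    return {
        "boot-source": boot_source(kernel_path, disable_i8042),
        "machine-config": {"vcpu_count": vcpus, "mem_size_mib": mem_mib},
    }


# Placeholder path; substitute the real kernel location.
print(json.dumps(vm_config("/path/to/vmlinux-4.14.174"), indent=2))
```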

### Wall Clock vs Kernel Time Gap Analysis

The ~200ms gap between wall clock and kernel internal time is:

- **~80ms:** Firecracker process startup + API configuration + KVM VM creation
- **~125ms:** Time between the kernel panic message and process exit (reboot handling, serial flush)
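
The gap can be checked against the medians from the two statistics tables in this section. A small sketch, with illustrative names, showing that both configurations leave the same gap and that the two listed components account for all but about 10ms of it:

```python
# Medians (ms) taken from the statistics tables in this section.
WALL_MEDIAN = {"default": 1127, "no_i8042": 353}
KERNEL_MEDIAN = {"default": 912, "no_i8042": 138}

VMM_STARTUP_MS = 80      # process start + API configuration + KVM VM creation
PANIC_TO_EXIT_MS = 125   # reboot handling + serial flush


def gap_breakdown(config):
    """Return (gap, explained, unexplained) in milliseconds for one config."""
    gap = WALL_MEDIAN[config] - KERNEL_MEDIAN[config]
    explained = VMM_STARTUP_MS + PANIC_TO_EXIT_MS
    return gap, explained, gap - explained


for config in WALL_MEDIAN:
    print(config, gap_breakdown(config))
```

The gap is identical with and without the i8042 probe, which is consistent with the stall being entirely inside the guest kernel.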

---

## 4. Startup Breakdown

Measured with nanosecond wall-clock timing of each API call:

| Phase | Duration | Cumulative | Description |
|-------|----------|------------|-------------|
| **FC process start → socket ready** | 7-9 ms | 8 ms | Firecracker binary loads, creates API socket |
| **PUT /boot-source** | 12-16 ms | 22 ms | Loads + validates kernel ELF (21MB) |
| **PUT /machine-config** | 8-15 ms | 33 ms | Validates machine configuration |
| **PUT /actions (InstanceStart)** | 44-74 ms | 80 ms | Creates KVM VM, allocates guest memory, sets up vCPU, page tables, starts vCPU thread |
| **Kernel boot (with i8042)** | ~912 ms | 992 ms | Includes 500ms i8042 probe timeout |
| **Kernel boot (no i8042)** | ~138 ms | 218 ms | Pure kernel initialization |
| **Kernel panic → process exit** | ~125 ms | n/a | Reboot handling, serial flush |

### API Overhead Detail (5 runs)

| Run | Socket | Boot-src | Machine-cfg | InstanceStart | Total to VM |
|-----|--------|----------|-------------|---------------|-------------|
| 1 | 9ms | 11ms | 8ms | 48ms | 76ms |
| 2 | 9ms | 14ms | 14ms | 63ms | 101ms |
| 3 | 8ms | 12ms | 15ms | 65ms | 101ms |
| 4 | 9ms | 13ms | 8ms | 44ms | 75ms |
| 5 | 9ms | 14ms | 9ms | 74ms | 108ms |
| **Median** | **9ms** | **13ms** | **9ms** | **63ms** | **101ms** |

The InstanceStart phase is the most variable (44-74ms) because it does the heavy lifting: `KVM_CREATE_VM`, `mmap` of guest memory, page table setup, vCPU register configuration, vCPU thread creation, and entry into `KVM_RUN`.
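
Per-call timings like those above can be collected with a small client that issues the three `PUT` requests over the API socket. The sketch below is not the benchmark suite's actual harness: `UnixHTTPConnection`, `timed_put`, and `boot_sequence` are hypothetical helpers, the payload field names are assumed to match the Firecracker API, and the socket and kernel paths in the usage note are placeholders.

```python
import http.client
import json
import socket
import time


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP connection that talks to a Unix domain socket instead of TCP."""

    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)


def timed_put(conn, endpoint, payload):
    """Send one PUT request and return (HTTP status, elapsed milliseconds)."""
    start = time.perf_counter_ns()
    conn.request("PUT", endpoint, body=json.dumps(payload),
                 headers={"Content-Type": "application/json"})
    response = conn.getresponse()
    response.read()  # drain the body so the connection can be reused
    return response.status, (time.perf_counter_ns() - start) / 1e6


def boot_sequence(conn, kernel_path, boot_args, mem_mib=128, vcpus=1):
    """Issue the three calls from the table and time each one."""
    calls = [
        ("/boot-source", {"kernel_image_path": kernel_path, "boot_args": boot_args}),
        ("/machine-config", {"vcpu_count": vcpus, "mem_size_mib": mem_mib}),
        ("/actions", {"action_type": "InstanceStart"}),
    ]
    return [(endpoint, *timed_put(conn, endpoint, payload))
            for endpoint, payload in calls]
```

A run against a live process would look like `boot_sequence(UnixHTTPConnection("/tmp/firecracker.sock"), "/path/to/vmlinux", "console=ttyS0 reboot=k panic=1 pci=off")`, yielding one `(endpoint, status, ms)` tuple per call. Timing on the client side includes request serialization and socket round-trips, so it slightly overstates the time spent inside the VMM.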

### Seccomp Impact

| Mode | Avg Wall Clock (5 runs) |
|------|------------------------|
| With seccomp | 8ms to exit |
| Without seccomp (`--no-seccomp`) | 8ms to exit |

Seccomp has no measurable impact on boot time (measured with `--no-api --config-file` mode).

---

## 5. Memory Overhead

### RSS by Guest Memory Size

Measured during active VM execution (kernel booted, pre-panic):

| Guest Memory | RSS (KB) | RSS (MB) | VSZ (KB) | VSZ (MB) | VMM Overhead |
|-------------|----------|----------|----------|----------|-------------|
| None (pre-boot) | 3,396 | 3 | n/a | n/a | Base process |
| 128 MB | 51,260–53,520 | 50–52 | 139,084 | 135 | ~50 MB |
| 256 MB | 57,616–57,972 | 56–57 | 270,156 | 263 | ~54 MB |
| 512 MB | 61,704–62,068 | 60–61 | 532,300 | 519 | ~58 MB |

### Memory Breakdown (128MB guest)

From `/proc/PID/smaps_rollup` and `/proc/PID/status`:

| Metric | Value |
|--------|-------|
| Pss (proportional) | 51,800 KB |
| Pss_Anon | 49,432 KB |
| Pss_File | 2,364 KB |
| AnonHugePages | 47,104 KB |
| VmData | 136,128 KB (132 MB) |
| VmExe | 2,380 KB (2.3 MB) |
| VmStk | 132 KB |
| VmLib | 8 KB |
| Memory regions | 29 |
| Threads | 3 |
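
Both `/proc/PID/status` and `/proc/PID/smaps_rollup` report sizes as `Name:   <value> kB` lines, so one parser covers the metrics in this table. A minimal sketch with hypothetical helper names; the sample text in the usage note is reconstructed from the table values, not copied from a capture.

```python
def parse_kb_fields(text):
    """Collect every 'Name: <int> kB' line into a {name: kilobytes} dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip lines without a numeric kB value (e.g. 'Threads: 3' or the
        # address-range header at the top of smaps_rollup).
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_proc_kb(pid, filename):
    """Read /proc/<pid>/<filename> ('status' or 'smaps_rollup') on Linux."""
    with open(f"/proc/{pid}/{filename}") as handle:
        return parse_kb_fields(handle.read())


# Reconstructed sample, using values from the table above.
sample = """\
VmData:\t  136128 kB
VmExe:\t    2380 kB
VmStk:\t     132 kB
Threads:\t3
AnonHugePages:     47104 kB
"""
print(parse_kb_fields(sample))
```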

### Key Observations

1. **Guest memory is mmap'd but demand-paged**: VSZ scales linearly with guest size, but RSS only reflects touched pages
2. **VMM base overhead is ~3.3 MB** (3,396 KB pre-boot RSS)
3. **~50 MB RSS at 128MB guest**: The kernel touches ~47MB during boot (page tables, kernel code, data structures)
4. **AnonHugePages = 47MB**: THP (Transparent Huge Pages) is used for guest memory, reducing TLB pressure
5. **Scaling**: RSS increases by roughly 4 MB each time guest memory doubles (128 → 256 → 512 MB), because guest pages are only touched on demand
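
The scaling observation can be made concrete from the RSS table. The sketch below uses the upper bound of each measured RSS range (an arbitrary choice made for illustration) and reports both the growth between sizes and the fraction of guest memory that is actually resident.

```python
# Upper bound of each RSS range (KB) from the "RSS by Guest Memory Size" table.
RSS_KB = {128: 53_520, 256: 57_972, 512: 62_068}


def rss_deltas(rss_by_guest_mb):
    """RSS growth in KB between consecutive guest sizes."""
    sizes = sorted(rss_by_guest_mb)
    return [rss_by_guest_mb[b] - rss_by_guest_mb[a]
            for a, b in zip(sizes, sizes[1:])]


def resident_percent(rss_by_guest_mb):
    """RSS as a percentage of configured guest memory, per guest size."""
    return {mb: round(100 * kb / (mb * 1024), 1)
            for mb, kb in rss_by_guest_mb.items()}


print(rss_deltas(RSS_KB))        # growth per doubling, in KB
print(resident_percent(RSS_KB))  # share of guest RAM that is resident
```

The resident share falls as the guest grows, which is what demand paging predicts for a kernel that touches a roughly fixed working set during boot.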

### Pre-boot vs Post-boot Memory

| Phase | RSS |
|-------|-----|
| After FC process start | 3,396 KB (3.3 MB) |
| After boot-source + machine-config | 3,396 KB (3.3 MB), no change |
| After InstanceStart (VM running) | 51,260+ KB (~50 MB) |

All guest memory allocation happens during InstanceStart. The API configuration phase adds no measurable RSS.

---

## 6. CPU Features (CPUID)

Firecracker v1.14.2 exposes the following CPU features to guests (as reported by kernel 4.14.174):

### XSAVE Features Exposed

| Feature | XSAVE Bit | Offset | Size |
|---------|-----------|--------|------|
| x87 FPU | 0x001 | n/a | n/a |
| SSE | 0x002 | n/a | n/a |
| AVX | 0x004 | 576 | 256 bytes |
| MPX bounds | 0x008 | 832 | 64 bytes |
| MPX CSR | 0x010 | 896 | 64 bytes |
| AVX-512 opmask | 0x020 | 960 | 64 bytes |
| AVX-512 Hi256 | 0x040 | 1024 | 512 bytes |
| AVX-512 ZMM_Hi256 | 0x080 | 1536 | 1024 bytes |
| PKU | 0x200 | 2560 | 8 bytes |

Total XSAVE context: 2,568 bytes (compacted format).
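
The offsets and the 2,568-byte total follow from packing the extended components back to back after the 512-byte legacy x87/SSE region and the 64-byte XSAVE header. A short sketch that reproduces the table; it assumes no alignment padding between components, which holds for this particular feature set but is not guaranteed in general.

```python
LEGACY_AREA = 512   # x87 FPU + SSE state
XSAVE_HEADER = 64   # header that follows the legacy area

# (name, feature bit, size in bytes) for the extended components in the table.
COMPONENTS = [
    ("AVX", 0x004, 256),
    ("MPX bounds", 0x008, 64),
    ("MPX CSR", 0x010, 64),
    ("AVX-512 opmask", 0x020, 64),
    ("AVX-512 Hi256", 0x040, 512),
    ("AVX-512 ZMM_Hi256", 0x080, 1024),
    ("PKU", 0x200, 8),
]


def compacted_layout(components):
    """Return ([(name, offset)], total_size), packing components contiguously."""
    offset = LEGACY_AREA + XSAVE_HEADER
    layout = []
    for name, _bit, size in components:
        layout.append((name, offset))
        offset += size
    return layout, offset


def feature_mask(components):
    """OR together all feature bits, including x87 (0x001) and SSE (0x002)."""
    mask = 0x001 | 0x002
    for _name, bit, _size in components:
        mask |= bit
    return mask


layout, total = compacted_layout(COMPONENTS)
print(layout, total, hex(feature_mask(COMPONENTS)))
```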

### CPU Identity (as seen by guest)

```
vendor_id: GenuineIntel
model name: Intel(R) Xeon(R) Processor @ 2.40GHz
family: 0x6
model: 0x55
stepping: 0x7
```

Firecracker strips the full CPU model name and reports a generic "Intel(R) Xeon(R) Processor @ 2.40GHz" (removed "Silver 4210R" from host).

### Security Mitigations Active in Guest

| Mitigation | Status |
|-----------|--------|
| NX (Execute Disable) | Active |
| Spectre V1 | usercopy/swapgs barriers |
| Spectre V2 | Enhanced IBRS |
| SpectreRSB | RSB filling on context switch |
| IBPB | Conditional on context switch |
| SSBD | Via prctl and seccomp |
| TAA | TSX disabled |

### Paravirt Features

| Feature | Present |
|---------|---------|
| KVM hypervisor detection | ✅ |
| kvm-clock | ✅ (MSRs 4b564d01/4b564d00) |
| KVM async PF | ✅ |
| KVM stealtime | ✅ |
| PV qspinlock | ✅ |
| x2apic | ✅ |

### Devices Visible to Guest

| Device | Type | Notes |
|--------|------|-------|
| Serial (ttyS0) | I/O 0x3f8 | 8250/16550 UART (U6_16550A) |
| i8042 keyboard | I/O 0x60, 0x64 | PS/2 controller |
| IOAPIC | MMIO 0xfec00000 | 24 GSIs |
| Local APIC | MMIO 0xfee00000 | x2apic mode |
| virtio-mmio | MMIO | Not probed (pci=off, no rootfs) |

---

## 7. Thread Model

Firecracker uses a minimal thread model:

| Thread | Name | Role |
|--------|------|------|
| Main | `firecracker-bin` | Event loop, serial I/O, device emulation |
| API | `fc_api` | HTTP API server on Unix socket |
| vCPU 0 | `fc_vcpu 0` | KVM_RUN loop for vCPU 0 |

With N vCPUs, there would be N+2 threads total.
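
Thread names like those in the table can be read from `/proc/PID/task/*/comm` on Linux. The helper below is a hypothetical sketch rather than the tooling used for this report; the `proc_root` parameter exists only so the function can be pointed at a test directory.

```python
import os


def thread_names(pid, proc_root="/proc"):
    """Return the comm name of every thread of `pid`, ordered by thread ID."""
    task_dir = os.path.join(proc_root, str(pid), "task")
    names = []
    for tid in sorted(os.listdir(task_dir), key=int):
        with open(os.path.join(task_dir, tid, "comm")) as handle:
            names.append(handle.read().strip())
    return names


def expected_thread_count(vcpus):
    """Main thread + API thread + one thread per vCPU."""
    return vcpus + 2
```

Comparing `len(thread_names(pid))` with `expected_thread_count(n)` for a running VM gives a quick check of the N+2 rule.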

### Process Details

| Property | Value |
|----------|-------|
| Seccomp | Mode 2 (filter) |
| NoNewPrivs | Yes |
| Capabilities | None (all dropped) |
| Seccomp filters | 1 |
| FD limit | 1,048,576 |

---

## 8. Comparison with Volt

### Binary Size

| VMM | Size | Linking |
|-----|------|---------|
| Firecracker v1.14.2 | 3.44 MB (3,436,512 bytes) | Static-pie, not stripped |
| Volt 0.1.0 | 3.26 MB (3,258,448 bytes) | Dynamic (release build) |

Volt is **5% smaller**, though Firecracker is statically linked (includes musl libc).
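
The size figures can be reproduced from the byte counts. The sketch below also makes explicit that "MB" in this report means decimal megabytes (10^6 bytes); the function names are illustrative.

```python
FIRECRACKER_BYTES = 3_436_512
VOLT_BYTES = 3_258_448


def percent_smaller(smaller, larger):
    """How much smaller `smaller` is than `larger`, as a percentage."""
    return (larger - smaller) / larger * 100


def decimal_mb(size_bytes):
    """Size in decimal megabytes, as used in the tables."""
    return size_bytes / 1_000_000


print(round(percent_smaller(VOLT_BYTES, FIRECRACKER_BYTES), 1))
print(round(decimal_mb(FIRECRACKER_BYTES), 2), round(decimal_mb(VOLT_BYTES), 2))
```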

### Boot Time Comparison

Both tested with the same kernel (vmlinux-4.14.174), same boot args, no rootfs:

| Metric | Firecracker | Volt | Delta |
|--------|-------------|-----------|-------|
| Wall clock (default boot) | 1,127ms median | TBD | TBD |
| Kernel internal time | 912ms | TBD | TBD |
| VMM startup overhead | ~80ms | TBD | TBD |
| Wall clock (no i8042) | 353ms median | TBD | TBD |

**Note:** Fill in Volt numbers from `benchmark-volt-vmm.md` for direct comparison.

### Memory Overhead

| Guest Size | Firecracker RSS | Volt RSS | Delta |
|-----------|-----------------|---------------|-------|
| Pre-boot (base) | 3.3 MB | TBD | TBD |
| 128 MB | 50–52 MB | TBD | TBD |
| 256 MB | 56–57 MB | TBD | TBD |
| 512 MB | 60–61 MB | TBD | TBD |

### Architecture Differences Affecting Performance

| Aspect | Firecracker | Volt |
|--------|-------------|-----------|
| API model | REST over Unix socket (always on) | Direct (no API server) |
| Thread model | main + api + N×vcpu | main + N×vcpu |
| Memory allocation | During InstanceStart | During VM setup |
| Kernel loading | Via API call (separate step) | At startup |
| Seccomp | BPF filter, ~50 syscalls | Planned |
| Guest memory | mmap + demand-paging + THP | TBD |

Firecracker's startup path takes ~80ms before the guest runs, part of which is API socket setup and HTTP request handling; in exchange, the API enables runtime configuration. A direct-launch VMM like Volt can potentially start faster by eliminating the socket setup and HTTP parsing.

---

## 9. Methodology Notes

### Test Environment

- **Host OS:** Debian (Linux 6.1.0-42-amd64)
- **CPU:** Intel Xeon Silver 4210R @ 2.40GHz (Cascade Lake)
- **KVM:** `/dev/kvm` with user `karl` in group `kvm`
- **Firecracker:** Downloaded from GitHub releases, not jailed (bare process)
- **No jailer:** Tests run without the jailer for apples-to-apples VMM comparison

### What's Measured

- **Wall clock time:** `date +%s%N` before FC process start to detection of "Rebooting in" in serial output
- **Kernel internal time:** Extracted from kernel log timestamps (`[0.912xxx]` before "Rebooting in")
- **RSS:** `ps -p PID -o rss=` captured during VM execution
- **VMM overhead:** Time from process start to InstanceStart API return
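
The first two measurements above can be sketched in a few lines. This is not the suite's actual script: it swaps `date +%s%N` for Python's monotonic clock, the function names are hypothetical, and it assumes kernel log lines carry the usual `[    0.912345]` timestamp prefix.

```python
import re
import subprocess
import time

TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")  # e.g. "[    0.912345] ..."


def kernel_time_at(log_lines, marker="Rebooting in"):
    """Return the last kernel timestamp (seconds) seen up to the marker line."""
    last = None
    for line in log_lines:
        match = TIMESTAMP.match(line)
        if match:
            last = float(match.group(1))
        if marker in line:
            return last
    return None  # marker never appeared


def wall_ms_until(cmd, marker="Rebooting in"):
    """Start `cmd` and return milliseconds until `marker` appears in its output."""
    start = time.monotonic_ns()
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    try:
        for line in proc.stdout:
            if marker in line:
                return (time.monotonic_ns() - start) / 1e6
        return None  # process ended without printing the marker
    finally:
        proc.kill()
        proc.wait()
        proc.stdout.close()
```

Reading the serial output line by line means the measured time includes pipe buffering delays, so a harness built this way would report slightly more than the instant the guest wrote the message.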

### Caveats

1. **No rootfs:** Kernel panics at VFS mount. This measures pure boot, not a complete VM startup with userspace.
2. **i8042 timeout:** The default kernel (4.14.174) spends ~500ms probing the PS/2 keyboard controller. This is a kernel config issue, not a VMM issue. A custom kernel with `CONFIG_SERIO_I8042=n` would eliminate this.
3. **Serial output buffering:** Firecracker's serial port occasionally hits `WouldBlock` errors, which may slightly affect kernel timing (serial I/O blocks the vCPU when the buffer fills).
4. **No huge page pre-allocation:** Tests use default THP (Transparent Huge Pages). Pre-allocating huge pages would reduce memory allocation latency.
5. **Both kernels identical:** The "official" Firecracker kernel and `vmlinux-4.14` symlink point to the same 21MB binary (vmlinux-4.14.174).

### Kernel Boot Timeline (annotated)

```
0ms      FC process starts
8ms      API socket ready
22ms     Kernel loaded (PUT /boot-source)
33ms     Machine configured (PUT /machine-config)
80ms     VM running (PUT /actions InstanceStart)
         ┌─── Kernel execution begins ────┐
~84ms    │ Memory init, e820 map          │
~84ms    │ KVM hypervisor detected        │
~84ms    │ kvm-clock initialized          │
~88ms    │ SMP init, CPU0 identified      │
~113ms   │ devtmpfs, clocksource          │
~150ms   │ Network stack init             │
~176ms   │ Serial driver registered       │
~188ms   │ i8042 probe begins             │ ← 500ms stall
~464ms   │ i8042 KBD port registered      │
~976ms   │ i8042 keyboard input created   │ ← i8042 probe complete
~980ms   │ VFS: Cannot open root device   │
~985ms   │ Kernel panic                   │
~993ms   │ "Rebooting in 1 seconds.."     │
         └────────────────────────────────┘
~1130ms  Serial output flushed, process exits
```

---

## Raw Data Files

All raw benchmark data is stored in `/tmp/fc-bench-results/`:

- `boot-times-official.txt`: 10 iterations of wall-clock + kernel times
- `precise-boot-times.txt`: 10 iterations with `--no-api` mode
- `memory-official.txt`: RSS/VSZ for 128/256/512 MB guest sizes
- `smaps-detail-{128,256,512}.txt`: Detailed memory maps
- `status-official-{128,256,512}.txt`: `/proc/PID/status` snapshots
- `kernel-output-official.txt`: Full kernel serial output

---

*Generated by automated benchmark suite, 2026-03-08*