KVM-based microVMM for the Volt platform:
- Sub-second VM boot times
- Minimal memory footprint
- Landlock LSM + seccomp security
- Virtio device support
- Custom kernel management
Copyright (c) Armored Gates LLC. All rights reserved. Licensed under AGPSL v5.0
Landlock LSM Analysis for Volt
Date: 2026-03-08
Status: Research Complete
Author: Edgar (Subagent)
Executive Summary
Landlock is a Linux Security Module that enables unprivileged sandboxing—allowing processes to restrict their own capabilities without requiring root privileges. For Volt (a VMM), Landlock provides compelling defense-in-depth benefits, but comes with kernel version requirements that must be carefully considered.
Recommendation: Make Landlock optional but strongly encouraged. When detected (kernel 5.13+), enable it by default. Document that users on older kernels have reduced defense-in-depth.
1. What is Landlock?
Landlock is a stackable Linux Security Module (LSM) that enables unprivileged processes to restrict their own ambient rights. Unlike traditional LSMs (SELinux, AppArmor), Landlock doesn't require system administrator configuration—applications can self-sandbox.
Core Capabilities
| ABI Version | Kernel | Features |
|---|---|---|
| ABI 1 | 5.13+ | Filesystem access control (13 access rights) |
| ABI 2 | 5.19+ | LANDLOCK_ACCESS_FS_REFER (cross-directory moves/links) |
| ABI 3 | 6.2+ | LANDLOCK_ACCESS_FS_TRUNCATE |
| ABI 4 | 6.7+ | Network access control (TCP bind/connect) |
| ABI 5 | 6.10+ | LANDLOCK_ACCESS_FS_IOCTL_DEV (device ioctls) |
| ABI 6 | 6.12+ | IPC scoping (signals, abstract Unix sockets) |
| ABI 7 | 6.15+ | Audit logging support |
How It Works
- Create a ruleset defining handled access types:
struct landlock_ruleset_attr ruleset_attr = {
    .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE | LANDLOCK_ACCESS_FS_WRITE_FILE | ...
};
int ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
- Add rules for allowed paths:
struct landlock_path_beneath_attr path_beneath = {
    .allowed_access = LANDLOCK_ACCESS_FS_READ_FILE,
    .parent_fd = open("/allowed/path", O_PATH | O_CLOEXEC),
};
landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH, &path_beneath, 0);
- Enforce the ruleset (irrevocable):
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); // Required first
landlock_restrict_self(ruleset_fd, 0);
Key Properties
- Unprivileged: No CAP_SYS_ADMIN required (just PR_SET_NO_NEW_PRIVS)
- Stackable: Multiple layers can be applied; restrictions only accumulate
- Irrevocable: Once enforced, cannot be removed for process lifetime
- Inherited: Child processes inherit parent's Landlock domain
- Hierarchy-based: Rules attach to a file hierarchy identified by an open file descriptor, not to a path string
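The "stackable" and "irrevocable" properties can be pictured with a small model. The sketch below is conceptual only: the `Layer` type, the two-bit access mask, and the string-prefix matching are illustrative assumptions, not the kernel's data structures. It shows that effective access is the intersection of what every layer allows, so a later layer can narrow access but never widen it.

```rust
// Illustrative access bits; these are NOT the kernel's LANDLOCK_ACCESS_FS_* values.
const READ: u8 = 0b01;
const WRITE: u8 = 0b10;

/// One sandbox layer: a list of (hierarchy root, allowed access bits).
struct Layer {
    rules: Vec<(&'static str, u8)>,
}

/// True if `path` is `root` itself or lies beneath it.
/// Simplified string matching; a root of "/" is not handled.
fn is_beneath(path: &str, root: &str) -> bool {
    path == root
        || path
            .strip_prefix(root)
            .map_or(false, |rest| rest.starts_with('/'))
}

impl Layer {
    /// Access this layer grants to `path`: the union of all matching rules.
    /// A path matched by no rule gets no access at all.
    fn allowed(&self, path: &str) -> u8 {
        self.rules
            .iter()
            .filter(|(root, _)| is_beneath(path, root))
            .fold(0, |acc, (_, bits)| acc | bits)
    }
}

/// Effective access is the intersection across every stacked layer.
/// With no layers the process is unrestricted.
fn effective_access(layers: &[Layer], path: &str) -> u8 {
    layers
        .iter()
        .fold(READ | WRITE, |acc, layer| acc & layer.allowed(path))
}
```

In this model, stacking a second layer that grants only `READ` under `/vm/disks` on top of a first layer granting `READ | WRITE` under `/vm` leaves `/vm/disks/a.img` readable but no longer writable, and removes all access to the rest of `/vm`.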
2. Kernel Version Requirements
Minimum Requirements by Feature
| Feature | Minimum Kernel | Distro Support |
|---|---|---|
| Basic filesystem | 5.13 (June 2021) | Ubuntu 22.04+, Debian 12+, RHEL 9+ |
| File referencing | 5.19 (July 2022) | Ubuntu 22.10+, Debian 12+ |
| File truncation | 6.2 (Feb 2023) | Ubuntu 23.04+, Fedora 38+ |
| Network (TCP) | 6.7 (Jan 2024) | Ubuntu 24.04+, Fedora 39+ |
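As a rough planning aid, the minimum-kernel column above can be turned into a lookup. This is a sketch under stated assumptions: the function name is hypothetical, it covers only ABI 1 through 4 as listed in the table, and it reasons about upstream version numbers only. Distribution kernels may backport Landlock or ship with it disabled, so the runtime query remains the authoritative check.

```rust
/// Highest Landlock ABI (1 to 4) expected from an upstream kernel version,
/// or 0 if the kernel predates Landlock. Hypothetical helper for planning only.
fn expected_landlock_abi(major: u32, minor: u32) -> u32 {
    // (kernel major, kernel minor, ABI), newest first.
    const TABLE: [(u32, u32, u32); 4] = [(6, 7, 4), (6, 2, 3), (5, 19, 2), (5, 13, 1)];
    TABLE
        .iter()
        // Tuples compare lexicographically, so this is a version comparison.
        .find(|&&(maj, min, _)| (major, minor) >= (maj, min))
        .map_or(0, |&(_, _, abi)| abi)
}
```

For example, a 6.1 kernel is newer than 5.19 but older than 6.2, so the lookup yields ABI 2.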
Distro Compatibility Matrix
| Distribution | Default Kernel | Landlock ABI | Network Support |
|---|---|---|---|
| Ubuntu 20.04 LTS | 5.4 | ❌ None | ❌ |
| Ubuntu 22.04 LTS | 5.15 | ⚠️ ABI 1 (when enabled) | ❌ |
| Ubuntu 24.04 LTS | 6.8 | ✅ ABI 4 | ✅ |
| Debian 11 | 5.10 | ❌ None | ❌ |
| Debian 12 | 6.1 | ✅ ABI 2 | ❌ |
| RHEL 8 | 4.18 | ❌ None | ❌ |
| RHEL 9 | 5.14 | ✅ ABI 1 | ❌ |
| Fedora 40 | 6.8+ | ✅ ABI 4+ | ✅ |
Detection at Runtime
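// Needs <errno.h> and <linux/landlock.h>. landlock_create_ruleset() here is a thin
// syscall(SYS_landlock_create_ruleset, ...) wrapper; libc may not provide one.
int abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
if (abi < 0) {
    if (errno == ENOSYS) {
        /* Landlock is not compiled into this kernel: continue without it */
    } else if (errno == EOPNOTSUPP) {
        /* Landlock is compiled in but disabled at boot: continue without it */
    }
}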
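On the Rust side, the same decision can be isolated in a pure function that takes the raw return value and `errno` of the version query, which keeps it testable without a Landlock-capable kernel. This is a minimal sketch: the type and function names are hypothetical, and the errno constants assume the numbering used on x86-64 and aarch64 Linux (real code would take them from the `libc` crate).

```rust
// Assumed errno values for x86-64/aarch64 Linux; other architectures differ.
const ENOSYS: i32 = 38;
const EOPNOTSUPP: i32 = 95;

/// Outcome of the LANDLOCK_CREATE_RULESET_VERSION query (hypothetical type).
#[derive(Debug, PartialEq)]
enum LandlockStatus {
    /// The query returned a positive ABI version.
    Available { abi: u32 },
    /// ENOSYS: Landlock is not compiled into the kernel.
    NotBuilt,
    /// EOPNOTSUPP: Landlock is compiled in but disabled at boot.
    DisabledAtBoot,
    /// Any other failure, carrying the errno for logging.
    Other(i32),
}

/// Classifies the syscall result; `errno` is only consulted on failure.
fn classify_version_query(ret: i64, errno: i32) -> LandlockStatus {
    if ret > 0 {
        return LandlockStatus::Available { abi: ret as u32 };
    }
    match errno {
        ENOSYS => LandlockStatus::NotBuilt,
        EOPNOTSUPP => LandlockStatus::DisabledAtBoot,
        other => LandlockStatus::Other(other),
    }
}
```

Keeping the two unavailable cases distinct lets the startup log tell an operator whether a kernel rebuild or only a boot parameter change would be needed.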
3. Advantages for Volt VMM
3.1 Defense in Depth Against VM Escape
If a guest exploits a vulnerability in the VMM (memory corruption, etc.) and achieves code execution in the VMM process, Landlock limits what the attacker can do:
| Attack Vector | Without Landlock | With Landlock |
|---|---|---|
| Read host files | Full access | Only allowed paths |
| Write host files | Full access | Only VM disk images |
| Execute binaries | Any executable | Denied (no EXECUTE right) |
| Network access | Unrestricted | Only specified ports (ABI 4+) |
| Device access | All /dev | Only /dev/kvm, /dev/net/tun |
3.2 Restricting VMM Process Capabilities
Volt can declare exactly what it needs:
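// Example Volt Landlock policy (landlock crate; error types unified by the caller)
use landlock::{Access, AccessFs, PathBeneath, PathFd, Ruleset, RulesetAttr, RulesetCreatedAttr, ABI};
// Handle every ABI 1 right so that anything not granted below, including
// execute, is denied. Handling only read/write would leave execute unrestricted.
let rw = AccessFs::ReadFile | AccessFs::WriteFile;
let mut ruleset = Ruleset::default()
    .handle_access(AccessFs::from_all(ABI::V1))?
    .create()?;
// Allow read-only access to kernel/initrd
ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new(kernel_path)?, AccessFs::ReadFile))?;
ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new(initrd_path)?, AccessFs::ReadFile))?;
// Allow read-write access to VM disk images
for disk in &vm_config.disks {
    ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new(&disk.path)?, rw))?;
}
// Allow /dev/kvm and /dev/net/tun
ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new("/dev/kvm")?, rw))?;
ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new("/dev/net/tun")?, rw))?;
ruleset.restrict_self()?;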
3.3 Comparison with seccomp-bpf
| Aspect | seccomp-bpf | Landlock |
|---|---|---|
| Controls | System call invocation | Resource access (files, network) |
| Granularity | Syscall number + args | Path hierarchies, ports |
| Use case | "Can call open()" | "Can access /tmp/vm-disk.img" |
| Complexity | Complex (BPF programs) | Simple (path-based rules) |
| Kernel version | 3.5+ | 5.13+ |
| Pointer args | Cannot inspect | N/A (path-based) |
| Complementary? | ✅ Yes | ✅ Yes |
Key insight: seccomp and Landlock are complementary, not alternatives.
- seccomp: "You may only call these 50 syscalls" (attack surface reduction)
- Landlock: "You may only access these specific files" (resource restriction)
A properly sandboxed VMM should use both:
- seccomp to limit syscall surface
- Landlock to limit accessible resources
4. Disadvantages and Considerations
4.1 Kernel Version Requirement
The 5.13+ requirement excludes:
- Ubuntu 20.04 LTS (EOL April 2025, but still deployed)
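- Any 5.13+ system where Landlock is compiled out or missing from the active LSM list (Ubuntu 22.04 LTS ships 5.15, which is new enough by version alone)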
- RHEL 8 (mainstream support until 2029)
- Debian 11 (EOL June 2026)
Mitigation: Make Landlock optional; gracefully degrade when unavailable.
4.2 ABI Evolution Complexity
Supporting multiple Landlock ABI versions requires careful coding:
switch (abi) {
case 1:
ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
__attribute__((fallthrough));
case 2:
ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
__attribute__((fallthrough));
case 3:
ruleset_attr.handled_access_net = 0; // No network support
// ...
}
Mitigation: Use a Landlock library (e.g., landlock crate for Rust) that handles ABI negotiation.
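To make the negotiation concrete, here is the same downgrade idea written additively in Rust: start from the ABI 1 rights and add each later right only if the detected ABI supports it. This is a sketch for illustration, not what the `landlock` crate does internally. The function names are hypothetical, and the bit positions are my reading of `linux/landlock.h` and should be checked against the headers in use.

```rust
// Bit positions as defined in linux/landlock.h (verify against your headers).
const ACCESS_FS_ALL_V1: u64 = (1 << 13) - 1; // the 13 rights of ABI 1
const ACCESS_FS_REFER: u64 = 1 << 13; // ABI 2
const ACCESS_FS_TRUNCATE: u64 = 1 << 14; // ABI 3
const ACCESS_FS_IOCTL_DEV: u64 = 1 << 15; // ABI 5

/// Filesystem rights that may be placed in `handled_access_fs` for `abi`.
/// Passing a right the running kernel does not know makes ruleset creation fail.
fn handled_fs_for_abi(abi: u32) -> u64 {
    let mut mask = 0;
    if abi >= 1 {
        mask |= ACCESS_FS_ALL_V1;
    }
    if abi >= 2 {
        mask |= ACCESS_FS_REFER;
    }
    if abi >= 3 {
        mask |= ACCESS_FS_TRUNCATE;
    }
    if abi >= 5 {
        mask |= ACCESS_FS_IOCTL_DEV;
    }
    mask
}

/// TCP bind/connect rules exist only from ABI 4 onward.
fn tcp_rules_supported(abi: u32) -> bool {
    abi >= 4
}
```

The additive form avoids fallthrough entirely: a new ABI adds one `if` line, and an unknown future ABI value simply gets every right this code knows about.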
4.3 Path Resolution Subtleties
- Bind mounts: Rules apply to the same files via either path
- OverlayFS: Rules do NOT propagate between layers and merged view
- Symlinks: Rules apply to the target, not the symlink itself
Mitigation: Document clearly; test with containerized/overlayfs scenarios.
4.4 No Dynamic Rule Modification
Once landlock_restrict_self() is called:
- Cannot remove rules
- Cannot expand allowed paths
- Can only stack further layers, each of which restricts access further
For Volt: Must know all needed paths at restriction time. For hotplug support, pre-declare potential hotplug paths (as Cloud Hypervisor does with --landlock-rules).
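One way to honor this constraint is to assemble the complete rule list, boot-time paths plus pre-declared hotplug paths, before any restriction is applied. The sketch below assumes a hypothetical `PATH,ACCESS` spec syntax with `r` and `rw` modes; it is neither Cloud Hypervisor's parser nor a settled Volt interface, and all type and function names are illustrative.

```rust
use std::path::PathBuf;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Access {
    ReadOnly,
    ReadWrite,
}

#[derive(Debug, PartialEq)]
struct PathRule {
    path: PathBuf,
    access: Access,
}

/// Parses a hypothetical "PATH,r" or "PATH,rw" hotplug spec.
/// Splitting at the last comma lets the path itself contain commas.
fn parse_hotplug_rule(spec: &str) -> Result<PathRule, String> {
    let (path, access) = spec
        .rsplit_once(',')
        .ok_or_else(|| format!("expected PATH,ACCESS, got {spec:?}"))?;
    if path.is_empty() {
        return Err("empty path in hotplug rule".to_string());
    }
    let access = match access {
        "r" => Access::ReadOnly,
        "rw" => Access::ReadWrite,
        other => return Err(format!("unknown access mode {other:?}")),
    };
    Ok(PathRule { path: PathBuf::from(path), access })
}

/// Builds the full rule list up front. Any malformed spec aborts startup,
/// because rules cannot be added to an enforced domain afterwards.
fn complete_rule_set(boot: Vec<PathRule>, hotplug_specs: &[&str]) -> Result<Vec<PathRule>, String> {
    let mut rules = boot;
    for spec in hotplug_specs {
        rules.push(parse_hotplug_rule(spec)?);
    }
    Ok(rules)
}
```

Failing fast on a bad spec is a deliberate choice here: discovering the mistake at hotplug time would be too late to correct it.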
5. What Firecracker and Cloud Hypervisor Do
5.1 Firecracker
Firecracker uses a multi-layered approach via its "jailer" wrapper:
| Layer | Mechanism | Purpose |
|---|---|---|
| 1 | chroot + pivot_root | Filesystem isolation |
| 2 | User namespaces | UID/GID isolation |
| 3 | Network namespaces | Network isolation |
| 4 | Cgroups | Resource limits |
| 5 | seccomp-bpf | Syscall filtering |
| 6 | Capability dropping | Privilege reduction |
Notably missing: Landlock. Firecracker relies on the jailer's chroot for filesystem isolation, which requires:
- Root privileges to set up (then drops them)
- Careful hardlink/copy of resources into chroot
Firecracker's jailer is mature and battle-tested but requires privileged setup.
5.2 Cloud Hypervisor
Cloud Hypervisor has native Landlock support (--landlock flag):
./cloud-hypervisor \
--kernel ./vmlinux.bin \
--disk path=disk.raw \
--landlock \
--landlock-rules path="/path/to/hotplug",access="rw"
Features:
- Enabled via CLI flag (optional)
- Supports pre-declaring hotplug paths
- Falls back gracefully if kernel lacks support
- Combined with seccomp for defense in depth
Cloud Hypervisor's approach is a good model for Volt.
6. Recommendation for Volt
Implementation Strategy
┌─────────────────────────────────────────────────────────────┐
│ Security Layer Stack │
├─────────────────────────────────────────────────────────────┤
│ Layer 5: Landlock (optional, 5.13+) │
│ - Filesystem path restrictions │
│ - Network port restrictions (6.7+) │
├─────────────────────────────────────────────────────────────┤
│ Layer 4: seccomp-bpf (required) │
│ - Syscall allowlist │
│ - Argument filtering │
├─────────────────────────────────────────────────────────────┤
│ Layer 3: Capability dropping (required) │
│ - Drop all caps except CAP_NET_ADMIN if needed │
├─────────────────────────────────────────────────────────────┤
│ Layer 2: User namespaces (optional) │
│ - Run as unprivileged user │
├─────────────────────────────────────────────────────────────┤
│ Layer 1: KVM isolation (inherent) │
│ - Hardware virtualization boundary │
└─────────────────────────────────────────────────────────────┘
Specific Recommendations
- Make Landlock optional, default-enabled when available
pub struct VoltConfig {
    /// Enable Landlock sandboxing (requires kernel 5.13+)
    /// Default: auto (enabled if available)
    pub landlock: LandlockMode, // Auto | Enabled | Disabled
}
- Do NOT require kernel 5.13+
  - Too many production systems are still on older kernels
  - Landlock adds defense-in-depth, but seccomp + capabilities are an adequate baseline
  - Log a warning if Landlock is unavailable
- Support hotplug path pre-declaration (like Cloud Hypervisor)
volt-vmm --disk /vm/disk.img \
  --landlock \
  --landlock-allow-path /vm/hotplug/,rw
- Use the landlock Rust crate
  - Handles ABI version detection
  - Provides ergonomic API
  - Maintained, well-tested
- Minimum practical policy for VMM:
// Read-only
  - kernel image
  - initrd
  - any read-only disks
// Read-write
  - VM disk images
  - VM state/snapshot paths
  - API socket path
  - Logging paths
// Devices (special handling may be needed)
  - /dev/kvm
  - /dev/net/tun
  - /dev/vhost-net (if used)
- Document security posture clearly:
Volt Security Layers:
✅ KVM hardware isolation (always)
✅ seccomp syscall filtering (always)
✅ Capability dropping (always)
⚠️ Landlock filesystem restrictions (kernel 5.13+ required)
⚠️ Landlock network restrictions (kernel 6.7+ required)
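The `Auto | Enabled | Disabled` setting recommended above could be resolved against the detected ABI roughly as follows. This is a sketch under assumptions: `Decision` and `resolve_landlock` are hypothetical names, and treating an explicit `Enabled` on an unsupported kernel as a hard error is one possible design choice that the recommendations do not spell out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum LandlockMode {
    Auto,
    Enabled,
    Disabled,
}

/// What the VMM should do at startup (hypothetical type).
#[derive(Debug, PartialEq)]
enum Decision {
    /// Apply a policy tailored to the detected ABI.
    Enforce { abi: u32 },
    /// Run without Landlock; `warn` says whether to log a warning.
    Skip { warn: bool },
}

/// `detected_abi` is `None` when the kernel has no usable Landlock.
fn resolve_landlock(mode: LandlockMode, detected_abi: Option<u32>) -> Result<Decision, String> {
    match (mode, detected_abi) {
        // The user opted out explicitly: nothing to warn about.
        (LandlockMode::Disabled, _) => Ok(Decision::Skip { warn: false }),
        // Auto or Enabled, and the kernel supports Landlock.
        (_, Some(abi)) => Ok(Decision::Enforce { abi }),
        // Auto degrades gracefully but leaves a trace in the log.
        (LandlockMode::Auto, None) => Ok(Decision::Skip { warn: true }),
        // Assumed behavior: an explicit request that cannot be met is fatal.
        (LandlockMode::Enabled, None) => {
            Err("Landlock was requested but is unavailable on this kernel".to_string())
        }
    }
}
```

Making the explicit `Enabled` case fatal means an operator who asked for the sandbox never runs silently without it, while `Auto` keeps the "just works" behavior on older kernels.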
Why Not Require 5.13+?
| Consideration | Impact |
|---|---|
| Ubuntu 22.04 LTS | Most common cloud image; ships 5.15 but Landlock often disabled |
| RHEL 8 | Enterprise deployments; kernel 4.18 |
| Embedded/IoT | Often run older LTS kernels |
| User expectations | VMMs should "just work" |
Landlock is excellent defense-in-depth, but not a hard requirement. The base security (KVM + seccomp + capabilities) is strong. Landlock makes it stronger.
7. Implementation Checklist
- Add landlock crate dependency
- Implement Landlock policy configuration
- Detect Landlock ABI at runtime
- Apply appropriate policy based on ABI version
- Support --landlock / --no-landlock CLI flags
- Support --landlock-rules for hotplug paths
- Log Landlock status at startup (enabled/disabled/unavailable)
- Document Landlock in security documentation
- Add integration tests with Landlock enabled
- Test on kernels without Landlock (graceful fallback)