KVM-based microVMM for the Volt platform: - Sub-second VM boot times - Minimal memory footprint - Landlock LSM + seccomp security - Virtio device support - Custom kernel management Copyright (c) Armored Gates LLC. All rights reserved. Licensed under AGPSL v5.0
# Landlock LSM Analysis for Volt

**Date:** 2026-03-08
**Status:** Research Complete
**Author:** Edgar (Subagent)

## Executive Summary

Landlock is a Linux Security Module that enables unprivileged sandboxing: a process can restrict its own access rights without root privileges. For Volt (a VMM), Landlock provides compelling defense-in-depth benefits, but it comes with kernel version requirements that must be carefully considered.

**Recommendation:** Make Landlock **optional but strongly encouraged**. When detected (kernel 5.13+), enable it by default. Document that users on older kernels have reduced defense-in-depth.

---

## 1. What is Landlock?

Landlock is a **stackable Linux Security Module (LSM)** that enables unprivileged processes to restrict their own ambient rights. Unlike traditional LSMs (SELinux, AppArmor), Landlock doesn't require system administrator configuration; applications can self-sandbox.

### Core Capabilities

| ABI Version | Kernel | Features |
|-------------|--------|----------|
| ABI 1 | 5.13+ | Filesystem access control (13 access rights) |
| ABI 2 | 5.19+ | `LANDLOCK_ACCESS_FS_REFER` (cross-directory moves/links) |
| ABI 3 | 6.2+ | `LANDLOCK_ACCESS_FS_TRUNCATE` |
| ABI 4 | 6.7+ | Network access control (TCP bind/connect) |
| ABI 5 | 6.10+ | `LANDLOCK_ACCESS_FS_IOCTL_DEV` (device ioctls) |
| ABI 6 | 6.12+ | IPC scoping (signals, abstract Unix sockets) |
| ABI 7 | 6.15+ | Audit logging support |

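
The version table can be turned into a small lookup, for example to print a hint in diagnostics. This is a minimal sketch, not Volt code: the function name `expected_landlock_abi` is hypothetical, it covers ABIs 1 through 6 only, and it assumes a mainline kernel. Distro kernels may backport or disable Landlock, so runtime detection remains the authoritative check.

```rust
/// Highest Landlock ABI (1 to 6) that a mainline kernel of the given version
/// is expected to provide, following the ABI table in this section.
/// Returns 0 for kernels older than 5.13, which predate Landlock.
fn expected_landlock_abi(major: u32, minor: u32) -> u32 {
    // (minimum kernel version, ABI) pairs, newest first.
    const TABLE: [((u32, u32), u32); 6] = [
        ((6, 12), 6),
        ((6, 10), 5),
        ((6, 7), 4),
        ((6, 2), 3),
        ((5, 19), 2),
        ((5, 13), 1),
    ];
    TABLE
        .iter()
        // Tuples compare lexicographically: major first, then minor.
        .find(|(min, _)| (major, minor) >= *min)
        .map(|(_, abi)| *abi)
        .unwrap_or(0)
}
```

For instance, `expected_landlock_abi(6, 8)` yields 4, while `expected_landlock_abi(5, 4)` yields 0.
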
### How It Works

1. **Create a ruleset** defining handled access types:
   ```c
   struct landlock_ruleset_attr ruleset_attr = {
       .handled_access_fs = LANDLOCK_ACCESS_FS_READ_FILE |
                            LANDLOCK_ACCESS_FS_WRITE_FILE | ...
   };
   int ruleset_fd = landlock_create_ruleset(&ruleset_attr, sizeof(ruleset_attr), 0);
   ```

2. **Add rules** for allowed paths:
   ```c
   struct landlock_path_beneath_attr path_beneath = {
       .allowed_access = LANDLOCK_ACCESS_FS_READ_FILE,
       .parent_fd = open("/allowed/path", O_PATH | O_CLOEXEC),
   };
   landlock_add_rule(ruleset_fd, LANDLOCK_RULE_PATH_BENEATH, &path_beneath, 0);
   ```

3. **Enforce the ruleset** (irrevocable):
   ```c
   prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0); // Required first
   landlock_restrict_self(ruleset_fd, 0);
   ```

### Key Properties

- **Unprivileged:** No `CAP_SYS_ADMIN` required (just `PR_SET_NO_NEW_PRIVS`)
- **Stackable:** Multiple layers can be applied; restrictions only accumulate
- **Irrevocable:** Once enforced, a restriction cannot be removed for the lifetime of the process
- **Inherited:** Child processes inherit the parent's Landlock domain
- **Hierarchy-based:** Rules attach to file hierarchies identified by an open file descriptor, not to path strings

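
The "stackable" property can be illustrated with a toy model. The sketch below is not the kernel's implementation: the `Layer` type and the three right bits are made up for illustration and are not the `LANDLOCK_ACCESS_FS_*` values. It only shows why adding a layer can never widen access.

```rust
/// One Landlock layer, reduced to what it says about a single file:
/// `handled` is the set of rights the layer restricts, `allowed` the subset
/// that its rules grant on that file. Rights are toy bit flags here.
#[derive(Clone, Copy)]
struct Layer {
    handled: u32,
    allowed: u32,
}

const READ: u32 = 1 << 0;
const WRITE: u32 = 1 << 1;
const EXECUTE: u32 = 1 << 2;

/// A request is granted only if every layer either does not handle the
/// right or explicitly allows it. An empty stack restricts nothing.
fn is_allowed(layers: &[Layer], right: u32) -> bool {
    layers
        .iter()
        .all(|l| (l.handled & right) == 0 || (l.allowed & right) != 0)
}
```
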
---

## 2. Kernel Version Requirements

### Minimum Requirements by Feature

| Feature | Minimum Kernel | Distro Support |
|---------|---------------|----------------|
| Basic filesystem | 5.13 (June 2021) | Ubuntu 22.04+, Debian 12+, RHEL 9+ |
| File referencing | 5.19 (July 2022) | Ubuntu 22.10+, Debian 12+ |
| File truncation | 6.2 (Feb 2023) | Ubuntu 23.04+, Fedora 38+ |
| Network (TCP) | 6.7 (Jan 2024) | Ubuntu 24.04+, Fedora 39+ |

### Distro Compatibility Matrix

| Distribution | Default Kernel | Landlock ABI | Network Support |
|--------------|---------------|--------------|-----------------|
| Ubuntu 20.04 LTS | 5.4 | ❌ None | ❌ |
| Ubuntu 22.04 LTS | 5.15 | ✅ ABI 1 | ❌ |
| Ubuntu 24.04 LTS | 6.8 | ✅ ABI 4 | ✅ |
| Debian 11 | 5.10 | ❌ None | ❌ |
| Debian 12 | 6.1 | ✅ ABI 2 | ❌ |
| RHEL 8 | 4.18 | ❌ None | ❌ |
| RHEL 9 | 5.14 | ✅ ABI 1 | ❌ |
| Fedora 40 | 6.8+ | ✅ ABI 4+ | ✅ |

The ABI column is derived from each distribution's default kernel version. Distro kernels can backport features or ship with Landlock disabled, so treat the matrix as a guide and rely on runtime detection.

### Detection at Runtime

```c
int abi = landlock_create_ruleset(NULL, 0, LANDLOCK_CREATE_RULESET_VERSION);
if (abi < 0) {
    if (errno == ENOSYS) {
        /* Landlock not compiled in */
    } else if (errno == EOPNOTSUPP) {
        /* Landlock disabled */
    }
    /* In both cases: continue without Landlock */
}
```

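
A complementary check that needs no syscall wrapper is to look for `landlock` in the kernel's list of active LSMs. This is a different technique from the version query above: it reports whether Landlock is active but not which ABI is available. The sketch assumes securityfs is mounted at `/sys/kernel/security`; the function names are hypothetical.

```rust
use std::fs;

/// Returns true if `landlock` appears in a comma-separated LSM list such as
/// the contents of /sys/kernel/security/lsm.
fn lsm_list_has_landlock(list: &str) -> bool {
    list.trim().split(',').any(|name| name.trim() == "landlock")
}

/// Best-effort check. `None` means the list could not be read, for example
/// because securityfs is not mounted or not readable in this environment.
fn landlock_listed_as_active() -> Option<bool> {
    fs::read_to_string("/sys/kernel/security/lsm")
        .ok()
        .map(|contents| lsm_list_has_landlock(&contents))
}
```
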
---

## 3. Advantages for Volt VMM

### 3.1 Defense in Depth Against VM Escape

If a guest exploits a vulnerability in the VMM (memory corruption, etc.) and achieves code execution in the VMM process, Landlock limits what the attacker can do:

| Attack Vector | Without Landlock | With Landlock |
|--------------|------------------|---------------|
| Read host files | Full access | Only allowed paths |
| Write host files | Full access | Only VM disk images |
| Execute binaries | Any executable | Denied (no EXECUTE right) |
| Network access | Unrestricted | Only specified ports (ABI 4+) |
| Device access | All /dev | Only /dev/kvm, /dev/net/tun |

### 3.2 Restricting VMM Process Capabilities

Volt can declare exactly what it needs. The sketch below uses the `landlock` crate's builder API; `kernel_path`, `initrd_path`, and `vm_config` stand for values from Volt's configuration, and error handling is reduced to `?`:

```rust
use landlock::{
    Access, AccessFs, PathBeneath, PathFd, Ruleset, RulesetAttr, RulesetCreatedAttr, ABI,
};

// Example Volt Landlock policy
let read_only = AccessFs::ReadFile;
let read_write = AccessFs::ReadFile | AccessFs::WriteFile;

// Handle every ABI 1 right so that anything not granted below
// (execute, unlink, mkdir, ...) is denied
let mut ruleset = Ruleset::default()
    .handle_access(AccessFs::from_all(ABI::V1))?
    .create()?;

// Allow read-only access to kernel/initrd
ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new(kernel_path)?, read_only))?;
ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new(initrd_path)?, read_only))?;

// Allow read-write access to VM disk images
for disk in &vm_config.disks {
    ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new(&disk.path)?, read_write))?;
}

// Allow /dev/kvm and /dev/net/tun
ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new("/dev/kvm")?, read_write))?;
ruleset = ruleset.add_rule(PathBeneath::new(PathFd::new("/dev/net/tun")?, read_write))?;

ruleset.restrict_self()?;
```

### 3.3 Comparison with seccomp-bpf

| Aspect | seccomp-bpf | Landlock |
|--------|-------------|----------|
| **Controls** | System call invocation | Resource access (files, network) |
| **Granularity** | Syscall number + args | Path hierarchies, ports |
| **Use case** | "Can call open()" | "Can access /tmp/vm-disk.img" |
| **Complexity** | Complex (BPF programs) | Simple (path-based rules) |
| **Kernel version** | 3.5+ | 5.13+ |
| **Pointer args** | Cannot inspect | N/A (path-based) |
| **Complementary?** | ✅ Yes | ✅ Yes |

**Key insight:** seccomp and Landlock are **complementary**, not alternatives.

- **seccomp:** "You may only call these 50 syscalls" (attack surface reduction)
- **Landlock:** "You may only access these specific files" (resource restriction)

A properly sandboxed VMM should use **both**:

1. seccomp to limit syscall surface
2. Landlock to limit accessible resources

---

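## 4. Disadvantages and Considerations

### 4.1 Kernel Version Requirement

The 5.13+ requirement excludes:

- Ubuntu 20.04 LTS on its original 5.4 kernel (standard support ended in 2025, but still deployed)
- RHEL 8 (kernel 4.18; mainstream support until 2029)
- Debian 11 (kernel 5.10; LTS support ends in 2026)

Ubuntu 22.04 LTS ships 5.15, which meets the minimum but provides only ABI 1. A new enough kernel is also not a guarantee: Landlock can be left out of the kernel configuration or excluded through the `lsm=` boot parameter.

**Mitigation:** Make Landlock optional; gracefully degrade when unavailable.
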
### 4.2 ABI Evolution Complexity

Supporting multiple Landlock ABI versions requires careful coding:

```c
switch (abi) {
case 1:
    ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_REFER;
    __attribute__((fallthrough));
case 2:
    ruleset_attr.handled_access_fs &= ~LANDLOCK_ACCESS_FS_TRUNCATE;
    __attribute__((fallthrough));
case 3:
    ruleset_attr.handled_access_net = 0; // No network support
    // ...
}
```

**Mitigation:** Use a Landlock library (e.g., `landlock` crate for Rust) that handles ABI negotiation.

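
The same best-effort downgrade can be written as a mask computation. This is a minimal sketch with hypothetical function names, covering filesystem rights only. The bit positions are assumed to match the `LANDLOCK_ACCESS_FS_*` constants in `<linux/landlock.h>`; a real implementation should take them from the kernel headers or from the `landlock` crate rather than hard-coding them.

```rust
// Assumed bit positions of the LANDLOCK_ACCESS_FS_* constants.
const ACCESS_FS_ABI1: u64 = (1 << 13) - 1; // the 13 rights available since ABI 1
const ACCESS_FS_REFER: u64 = 1 << 13; // ABI 2
const ACCESS_FS_TRUNCATE: u64 = 1 << 14; // ABI 3
const ACCESS_FS_IOCTL_DEV: u64 = 1 << 15; // ABI 5

/// Filesystem rights that a kernel reporting `abi` can handle.
fn supported_fs_rights(abi: u32) -> u64 {
    let mut mask = 0;
    if abi >= 1 {
        mask |= ACCESS_FS_ABI1;
    }
    if abi >= 2 {
        mask |= ACCESS_FS_REFER;
    }
    if abi >= 3 {
        mask |= ACCESS_FS_TRUNCATE;
    }
    if abi >= 5 {
        mask |= ACCESS_FS_IOCTL_DEV;
    }
    mask
}

/// Drops the rights the running kernel does not know about, so that
/// creating the ruleset does not fail on older ABIs.
fn downgrade_fs_rights(requested: u64, abi: u32) -> u64 {
    requested & supported_fs_rights(abi)
}
```
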
### 4.3 Path Resolution Subtleties

- Bind mounts: Rules apply to the same files via either path
- OverlayFS: Rules do NOT propagate between layers and merged view
- Symlinks: Rules apply to the target, not the symlink itself

**Mitigation:** Document clearly; test with containerized/overlayfs scenarios.

### 4.4 No Dynamic Rule Modification

Once `landlock_restrict_self()` is called, the process:

- Cannot remove rules
- Cannot expand allowed paths
- Can only stack additional layers, which restrict access further

**For Volt:** All needed paths must be known at restriction time. For hotplug support, pre-declare potential hotplug paths (as Cloud Hypervisor does with `--landlock-rules`).

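
One way to make this constraint visible in code is to collect every path in a policy object that refuses additions once it has been enforced. The sketch below is hypothetical (`PathPolicy`, `Access`, and `seal` are illustrative names, not Volt or `landlock` crate APIs); `seal()` stands in for the point where `landlock_restrict_self()` would be called.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Access {
    ReadOnly,
    ReadWrite,
}

/// Collects allowed paths up front; nothing can be added after `seal()`.
#[derive(Default)]
struct PathPolicy {
    rules: Vec<(PathBuf, Access)>,
    sealed: bool,
}

impl PathPolicy {
    fn allow(&mut self, path: impl Into<PathBuf>, access: Access) -> Result<(), String> {
        if self.sealed {
            return Err("policy already enforced; paths cannot be added".to_string());
        }
        self.rules.push((path.into(), access));
        Ok(())
    }

    /// Marks the policy as enforced (where the real code would restrict itself).
    fn seal(&mut self) {
        self.sealed = true;
    }

    /// True if `path` lies beneath a declared path with sufficient access.
    fn covers(&self, path: &Path, want: Access) -> bool {
        self.rules.iter().any(|(root, have)| {
            path.starts_with(root) && (want == Access::ReadOnly || *have == Access::ReadWrite)
        })
    }
}
```

A disk hotplugged from a pre-declared directory is covered, while a path first mentioned after `seal()` is rejected.
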

---

## 5. What Firecracker and Cloud Hypervisor Do

### 5.1 Firecracker

Firecracker uses a **multi-layered approach** via its "jailer" wrapper:

| Layer | Mechanism | Purpose |
|-------|-----------|---------|
| 1 | chroot + pivot_root | Filesystem isolation |
| 2 | User namespaces | UID/GID isolation |
| 3 | Network namespaces | Network isolation |
| 4 | Cgroups | Resource limits |
| 5 | seccomp-bpf | Syscall filtering |
| 6 | Capability dropping | Privilege reduction |

**Notably missing: Landlock.** Firecracker relies on the jailer's chroot for filesystem isolation, which requires:

- Root privileges to set up (then drops them)
- Careful hardlink/copy of resources into chroot

Firecracker's jailer is mature and battle-tested but requires privileged setup.

### 5.2 Cloud Hypervisor

Cloud Hypervisor **has native Landlock support** (`--landlock` flag):

```bash
./cloud-hypervisor \
    --kernel ./vmlinux.bin \
    --disk path=disk.raw \
    --landlock \
    --landlock-rules path="/path/to/hotplug",access="rw"
```

**Features:**

- Enabled via CLI flag (optional)
- Supports pre-declaring hotplug paths
- Falls back gracefully if kernel lacks support
- Combined with seccomp for defense in depth

**Cloud Hypervisor's approach is a good model for Volt.**

---

## 6. Recommendation for Volt

### Implementation Strategy

```
┌─────────────────────────────────────────────────────────────┐
│                    Security Layer Stack                     │
├─────────────────────────────────────────────────────────────┤
│ Layer 5: Landlock (optional, 5.13+)                         │
│   - Filesystem path restrictions                            │
│   - Network port restrictions (6.7+)                        │
├─────────────────────────────────────────────────────────────┤
│ Layer 4: seccomp-bpf (required)                             │
│   - Syscall allowlist                                       │
│   - Argument filtering                                      │
├─────────────────────────────────────────────────────────────┤
│ Layer 3: Capability dropping (required)                     │
│   - Drop all caps except CAP_NET_ADMIN if needed            │
├─────────────────────────────────────────────────────────────┤
│ Layer 2: User namespaces (optional)                         │
│   - Run as unprivileged user                                │
├─────────────────────────────────────────────────────────────┤
│ Layer 1: KVM isolation (inherent)                           │
│   - Hardware virtualization boundary                        │
└─────────────────────────────────────────────────────────────┘
```

### Specific Recommendations

1. **Make Landlock optional, default-enabled when available**
   ```rust
   pub enum LandlockMode {
       Auto, // enabled if available (default)
       Enabled,
       Disabled,
   }

   pub struct VoltConfig {
       /// Landlock sandboxing mode (Landlock requires kernel 5.13+)
       pub landlock: LandlockMode,
   }
   ```

2. **Do NOT require kernel 5.13+**
   - Too many production systems still run older kernels
   - Landlock adds defense-in-depth, but seccomp + capabilities are an adequate baseline
   - Log a warning if Landlock is unavailable

3. **Support hotplug path pre-declaration** (like Cloud Hypervisor)
   ```bash
   volt-vmm --disk /vm/disk.img \
       --landlock \
       --landlock-allow-path /vm/hotplug/,rw
   ```

4. **Use the `landlock` Rust crate**
   - Handles ABI version detection
   - Provides ergonomic API
   - Maintained, well-tested

5. **Minimum practical policy for VMM:**
   ```
   // Read-only
   - kernel image
   - initrd
   - any read-only disks

   // Read-write
   - VM disk images
   - VM state/snapshot paths
   - API socket path
   - Logging paths

   // Devices (special handling may be needed; on ABI 5+, device ioctls
   // also need LANDLOCK_ACCESS_FS_IOCTL_DEV if that right is handled)
   - /dev/kvm
   - /dev/net/tun
   - /dev/vhost-net (if used)
   ```

6. **Document security posture clearly:**
   ```
   Volt Security Layers:
   ✅ KVM hardware isolation (always)
   ✅ seccomp syscall filtering (always)
   ✅ Capability dropping (always)
   ⚠️ Landlock filesystem restrictions (kernel 5.13+ required)
   ⚠️ Landlock network restrictions (kernel 6.7+ required)
   ```

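
The interaction between the configured mode and runtime detection can be sketched as a small decision function. This is an illustration under stated assumptions, not Volt's implementation: `Decision` and `decide` are hypothetical names, and treating `Enabled` on an unsupported kernel as a startup error is an assumed policy that the recommendations above do not spell out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum LandlockMode {
    Auto,
    Enabled,
    Disabled,
}

#[derive(Debug, PartialEq)]
enum Decision {
    /// Apply a Landlock policy tailored to the detected ABI.
    Apply { abi: u32 },
    /// Run without Landlock; `warn` asks the caller to log a warning.
    Skip { warn: bool },
    /// Refuse to start (assumed behavior for `Enabled` without kernel support).
    Fail(String),
}

/// `detected_abi` is `None` when runtime detection found no usable Landlock.
fn decide(mode: LandlockMode, detected_abi: Option<u32>) -> Decision {
    match (mode, detected_abi) {
        (LandlockMode::Disabled, _) => Decision::Skip { warn: false },
        (_, Some(abi)) if abi >= 1 => Decision::Apply { abi },
        (LandlockMode::Auto, _) => Decision::Skip { warn: true },
        (LandlockMode::Enabled, _) => {
            Decision::Fail("Landlock requested but not available on this kernel".to_string())
        }
    }
}
```

With `Auto`, a kernel without Landlock still boots the VM but triggers the startup warning; the status line from recommendation 6 can be derived from the returned `Decision`.
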
### Why Not Require 5.13+?

| Consideration | Impact |
|---------------|--------|
| Ubuntu 22.04 LTS | Most common cloud image; ships 5.15, which offers only ABI 1, and Landlock can still be disabled by kernel configuration or the `lsm=` boot parameter |
| RHEL 8 | Enterprise deployments; kernel 4.18 |
| Embedded/IoT | Often run older LTS kernels |
| User expectations | VMMs should "just work" |

**Landlock is excellent defense-in-depth, but not a hard requirement.** The base security (KVM + seccomp + capabilities) is strong. Landlock makes it stronger.

---

## 7. Implementation Checklist

- [ ] Add `landlock` crate dependency
- [ ] Implement Landlock policy configuration
- [ ] Detect Landlock ABI at runtime
- [ ] Apply appropriate policy based on ABI version
- [ ] Support `--landlock` / `--no-landlock` CLI flags
- [ ] Support `--landlock-rules` for hotplug paths
- [ ] Log Landlock status at startup (enabled/disabled/unavailable)
- [ ] Document Landlock in security documentation
- [ ] Add integration tests with Landlock enabled
- [ ] Test on kernels without Landlock (graceful fallback)

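
For the hotplug-path item, argument parsing could look like the sketch below. It assumes the `PATH,ACCESS` form shown in recommendation 3 of section 6 (`/vm/hotplug/,rw`); the `ro` keyword, the `PathAccess` type, and the function name are assumptions made for illustration, since the final flag syntax has not been decided.

```rust
use std::path::PathBuf;

#[derive(Debug, PartialEq)]
enum PathAccess {
    ReadOnly,
    ReadWrite,
}

/// Parses a `PATH,ACCESS` rule such as `/vm/hotplug/,rw`.
fn parse_path_rule(spec: &str) -> Result<(PathBuf, PathAccess), String> {
    // Split at the last comma so that paths containing commas still parse.
    let (path, access) = spec
        .rsplit_once(',')
        .ok_or_else(|| format!("expected PATH,ACCESS but got `{}`", spec))?;
    if path.is_empty() {
        return Err("path must not be empty".to_string());
    }
    let access = match access {
        "ro" => PathAccess::ReadOnly,
        "rw" => PathAccess::ReadWrite,
        other => return Err(format!("unknown access `{}` (expected ro or rw)", other)),
    };
    Ok((PathBuf::from(path), access))
}
```
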

---

## References

- [Landlock Documentation](https://landlock.io/)
- [Kernel Landlock API](https://docs.kernel.org/userspace-api/landlock.html)
- [Cloud Hypervisor Landlock docs](https://github.com/cloud-hypervisor/cloud-hypervisor/blob/main/docs/landlock.md)
- [Firecracker Jailer](https://github.com/firecracker-microvm/firecracker/blob/main/docs/jailer.md)
- [LWN: Landlock sets sail](https://lwn.net/Articles/859908/)
- [Rust landlock crate](https://crates.io/crates/landlock)