volt-vmm/designs/networkd-virtio-net.md
Karl Clinger 40ed108dd5 Volt VMM (Neutron Stardust): source-available under AGPSL v5.0
KVM-based microVMM for the Volt platform:
- Sub-second VM boot times
- Minimal memory footprint
- Landlock LSM + seccomp security
- Virtio device support
- Custom kernel management

Copyright (c) Armored Gates LLC. All rights reserved.
Licensed under AGPSL v5.0
2026-03-21 01:04:35 -05:00

systemd-networkd Enhanced virtio-net

Overview

This design enhances Volt's virtio-net implementation by integrating with systemd-networkd for declarative, lifecycle-managed network configuration. Instead of creating and configuring TAP devices itself, Volt writes networkd unit files and lets networkd create, configure, and remove the devices.

Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         systemd-networkd                             │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐   │
│  │ volt-vmm-br0     │  │ vm-{uuid}.netdev │  │ vm-{uuid}.network│   │
│  │ (.netdev bridge) │  │ (TAP definition) │  │ (bridge attach)  │   │
│  └────────┬─────────┘  └────────┬─────────┘  └────────┬─────────┘   │
│           │                     │                     │              │
│           └─────────────────────┼─────────────────────┘              │
│                                 ▼                                    │
│                        ┌───────────────┐                             │
│                        │    br0        │ ◄── Unified bridge          │
│                        │  (bridge)     │     (VMs + Voltainer)       │
│                        └───────┬───────┘                             │
│                                │                                     │
│              ┌─────────────────┼─────────────────┐                   │
│              ▼                 ▼                 ▼                   │
│        ┌─────────┐       ┌─────────┐       ┌─────────┐               │
│        │ tap0    │       │ veth0   │       │ tap1    │               │
│        │ (VM-1)  │       │ (cont.) │       │ (VM-2)  │               │
│        └────┬────┘       └────┬────┘       └────┬────┘               │
└─────────────┼────────────────┼────────────────┼─────────────────────┘
              │                │                │
              ▼                ▼                ▼
        ┌─────────┐       ┌─────────┐      ┌─────────┐
        │  Volt   │       │Voltainer│      │  Volt   │
        │  VM-1   │       │Container│      │  VM-2   │
        └─────────┘       └─────────┘      └─────────┘

Benefits

  1. Declarative Configuration: Network topology defined in unit files, version-controllable
  2. Automatic Cleanup: systemd removes TAP devices when VM exits
  3. Lifecycle Integration: TAP created before VM starts, destroyed after
  4. Unified Networking: VMs and Voltainer containers share the same bridge infrastructure
  5. vhost-net Acceleration: Kernel-level packet processing bypasses userspace
  6. Predictable Naming: TAP names derived from VM UUID

Components

1. Bridge Infrastructure (One-time Setup)

# /etc/systemd/network/10-volt-vmm-br0.netdev
[NetDev]
Name=br0
Kind=bridge
MACAddress=52:54:00:00:00:01

[Bridge]
STP=false
ForwardDelaySec=0

# /etc/systemd/network/10-volt-vmm-br0.network
[Match]
Name=br0

[Network]
Address=10.42.0.1/24
IPForward=yes
IPMasquerade=both
ConfigureWithoutCarrier=yes

2. Per-VM TAP Template

Volt generates these dynamically:

# /run/systemd/network/50-vm-{uuid}.netdev
[NetDev]
Name=tap-{short_uuid}
Kind=tap
MACAddress=none

[Tap]
User=root
Group=root
VNetHeader=true
MultiQueue=true
PacketInfo=false

# /run/systemd/network/50-vm-{uuid}.network
[Match]
Name=tap-{short_uuid}

[Network]
Bridge=br0
ConfigureWithoutCarrier=yes
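
As a rough illustration of how the two per-VM units might be rendered from the template above, the sketch below fills in the placeholders for one VM. The function names, the buffer-based interface, and the choice of the first eight characters of the UUID as `{short_uuid}` are assumptions made for this example; the design does not define them. Linux interface names are limited to 15 characters, so `tap-` plus eight characters fits.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical derivation of {short_uuid}: the first 8 characters of the
 * UUID string. out must hold at least 9 bytes. */
void short_uuid(const char *uuid, char out[9])
{
    size_t n = strlen(uuid);
    if (n > 8)
        n = 8;
    memcpy(out, uuid, n);
    out[n] = '\0';
}

/* Render the body of 50-vm-{uuid}.netdev into buf.
 * Returns 0 on success, -1 if buf is too small. */
int render_tap_netdev(char *buf, size_t len, const char *sid)
{
    int n = snprintf(buf, len,
                     "[NetDev]\n"
                     "Name=tap-%s\n"
                     "Kind=tap\n"
                     "MACAddress=none\n"
                     "\n"
                     "[Tap]\n"
                     "User=root\n"
                     "Group=root\n"
                     "VNetHeader=true\n"
                     "MultiQueue=true\n"
                     "PacketInfo=false\n",
                     sid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Render the body of 50-vm-{uuid}.network into buf.
 * Returns 0 on success, -1 if buf is too small. */
int render_tap_network(char *buf, size_t len, const char *sid,
                       const char *bridge)
{
    int n = snprintf(buf, len,
                     "[Match]\n"
                     "Name=tap-%s\n"
                     "\n"
                     "[Network]\n"
                     "Bridge=%s\n"
                     "ConfigureWithoutCarrier=yes\n",
                     sid, bridge);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would write each rendered buffer to the matching path under /run/systemd/network/ before asking networkd to reload.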

3. vhost-net Acceleration

vhost-net offloads packet processing to the kernel:

┌─────────────────────────────────────────────────┐
│                   Guest VM                       │
│  ┌─────────────────────────────────────────┐    │
│  │           virtio-net driver              │    │
│  └─────────────────┬───────────────────────┘    │
└───────────────────┬┼────────────────────────────┘
                    ││
         ┌──────────┘│
         │           │     KVM Exit (rare)
         ▼           ▼
┌────────────────────────────────────────────────┐
│              vhost-net (kernel)                 │
│                                                 │
│  - Processes virtqueue directly in kernel       │
│  - Zero-copy between TAP and guest memory       │
│  - Avoids userspace context switches            │
│  - ~30-50% throughput improvement               │
└────────────────────┬───────────────────────────┘
                     │
                     ▼
              ┌─────────────┐
              │ TAP device  │
              └─────────────┘

Without vhost-net:

Guest → KVM exit → QEMU/Volt userspace → syscall → TAP → kernel → network

With vhost-net:

Guest → vhost-net (kernel) → TAP → network

Integration with Voltainer

Both Volt VMs and Voltainer containers connect to the same bridge:

Voltainer Network Zone

# /etc/voltainer/network/zone-default.yaml
kind: NetworkZone
name: default
bridge: br0
subnet: 10.42.0.0/24
gateway: 10.42.0.1
dhcp:
  enabled: true
  range: 10.42.0.100-10.42.0.254

Volt VM Allocation

VMs get static IPs from a reserved range (10.42.0.2-10.42.0.99):

network:
  - zone: default
    mac: "52:54:00:ab:cd:ef"
    ipv4: "10.42.0.10/24"
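
Because the static VM range and the Voltainer DHCP range share one subnet, it may be worth validating a requested address before it is written into a VM config. The following is a minimal sketch of such a check; the function name is hypothetical, and the bounds are hard-coded from the reserved range stated above.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if addr (with or without a "/prefix" suffix) is a valid IPv4
 * address inside the reserved static range 10.42.0.2-10.42.0.99,
 * 0 otherwise. */
int in_static_vm_range(const char *addr)
{
    char ip[INET_ADDRSTRLEN];
    size_t n = strcspn(addr, "/");      /* length up to an optional "/24" */
    if (n == 0 || n >= sizeof ip)
        return 0;
    memcpy(ip, addr, n);
    ip[n] = '\0';

    struct in_addr a;
    if (inet_pton(AF_INET, ip, &a) != 1)
        return 0;

    uint32_t host = ntohl(a.s_addr);
    uint32_t lo = (10u << 24) | (42u << 16) | 2u;   /* 10.42.0.2  */
    uint32_t hi = (10u << 24) | (42u << 16) | 99u;  /* 10.42.0.99 */
    return host >= lo && host <= hi;
}
```

With this check, `10.42.0.10/24` is accepted, while the gateway `10.42.0.1` and anything in the DHCP range starting at `10.42.0.100` are rejected.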

File Locations

| File Type | Location | Persistence |
|---|---|---|
| Bridge .netdev/.network | /etc/systemd/network/ | Permanent |
| VM TAP .netdev/.network | /run/systemd/network/ | Runtime only |
| Voltainer zone config | /etc/voltainer/network/ | Permanent |
| vhost-net module | Kernel built-in | N/A |

Lifecycle

VM Start

  1. Volt generates .netdev and .network in /run/systemd/network/
  2. networkctl reload triggers networkd to create TAP
  3. Wait for TAP interface to appear (networkctl status tap-XXX)
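  4. Open /dev/net/tun with O_RDWR and attach to the TAP with the TUNSETIFF ioctl, using flags that match the .netdev (IFF_TAP | IFF_NO_PI | IFF_VNET_HDR, plus IFF_MULTI_QUEUE when MultiQueue=true)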
  5. Enable vhost-net via /dev/vhost-net ioctl
  6. Boot VM with virtio-net using the TAP fd

VM Stop

  1. Close vhost-net and TAP file descriptors
  2. Delete .netdev and .network from /run/systemd/network/
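  3. networkctl reload makes networkd drop the VM's configuration
  4. TAP interface is removed; on systemd versions where a reload does not delete an existing netdev, remove it explicitly (networkctl delete tap-XXX)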

vhost-net Setup Sequence

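#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

// Error checks are omitted for brevity: open() and every ioctl() return -1 on failure.

// 1. Open vhost-net device
int vhost_fd = open("/dev/vhost-net", O_RDWR);

// 2. Set owner (binds this vhost fd to the calling process; the TAP is attached in step 7)
ioctl(vhost_fd, VHOST_SET_OWNER, 0);

// 3. Negotiate features: acknowledge only bits that both vhost-net and the guest accept
uint64_t features;
ioctl(vhost_fd, VHOST_GET_FEATURES, &features);
features &= guest_features;  // guest_features: bits acked by the guest driver
ioctl(vhost_fd, VHOST_SET_FEATURES, &features);

// 4. Set memory region table
struct vhost_memory *mem = ...;  // Guest memory regions
ioctl(vhost_fd, VHOST_SET_MEM_TABLE, mem);

// 5. Set vring info. Repeat steps 5-7 for each queue: index 0 is RX, index 1 is TX.
struct vhost_vring_state state = { .index = 0, .num = queue_size };
ioctl(vhost_fd, VHOST_SET_VRING_NUM, &state);

struct vhost_vring_state base = { .index = 0, .num = 0 };  // Start at avail index 0
ioctl(vhost_fd, VHOST_SET_VRING_BASE, &base);

// Ring addresses are host virtual addresses, not guest physical addresses
struct vhost_vring_addr addr = {
    .index = 0,
    .desc_user_addr = desc_addr,
    .used_user_addr = used_addr,
    .avail_user_addr = avail_addr,
};
ioctl(vhost_fd, VHOST_SET_VRING_ADDR, &addr);

// 6. Set kick/call eventfds
struct vhost_vring_file kick = { .index = 0, .fd = kick_eventfd };
ioctl(vhost_fd, VHOST_SET_VRING_KICK, &kick);

struct vhost_vring_file call = { .index = 0, .fd = call_eventfd };
ioctl(vhost_fd, VHOST_SET_VRING_CALL, &call);

// 7. Associate with TAP backend (this starts packet processing for the queue)
struct vhost_vring_file backend = { .index = 0, .fd = tap_fd };
ioctl(vhost_fd, VHOST_NET_SET_BACKEND, &backend);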
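
The sequence above leaves the memory table elided. As a hedged illustration of what `VHOST_SET_MEM_TABLE` expects, the sketch below builds a table with a single region. The one-region layout and the helper name are assumptions for this example; a real guest may need several regions, and the design does not say how Volt lays out guest memory.

```c
#include <stdint.h>
#include <stdlib.h>
#include <linux/vhost.h>

/* Allocate a one-region vhost memory table that maps guest-physical
 * [gpa, gpa + size) to the host virtual address hva.
 * Returns NULL on allocation failure; the caller frees the result. */
struct vhost_memory *build_mem_table(uint64_t gpa, uint64_t size, void *hva)
{
    /* The header is followed by a flexible array of regions. */
    struct vhost_memory *mem =
        calloc(1, sizeof *mem + sizeof(struct vhost_memory_region));
    if (mem == NULL)
        return NULL;

    mem->nregions = 1;
    mem->regions[0].guest_phys_addr = gpa;
    mem->regions[0].memory_size = size;
    mem->regions[0].userspace_addr = (uint64_t)(uintptr_t)hva;
    return mem;
}
```

The returned pointer is what would be passed as the third argument to `ioctl(vhost_fd, VHOST_SET_MEM_TABLE, mem)`.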

Performance Comparison

| Metric | userspace virtio-net | vhost-net |
|---|---|---|
| Throughput (1500 MTU) | ~5 Gbps | ~8 Gbps |
| Throughput (Jumbo 9000) | ~8 Gbps | ~15 Gbps |
| Latency (ping) | ~200 µs | ~80 µs |
| CPU usage | Higher | 30-50% lower |
| Context switches | Many | Minimal |

Configuration Examples

Minimal VM with Networking

{
  "vcpus": 2,
  "memory_mib": 512,
  "kernel": "vmlinux",
  "network": [{
    "id": "eth0",
    "mode": "networkd",
    "bridge": "br0",
    "mac": "52:54:00:12:34:56",
    "vhost": true
  }]
}

Multi-NIC VM

{
  "network": [
    {
      "id": "mgmt",
      "bridge": "br-mgmt",
      "vhost": true
    },
    {
      "id": "data",
      "bridge": "br-data",
      "mtu": 9000,
      "vhost": true,
      "multiqueue": 4
    }
  ]
}

Error Handling

| Error | Cause | Recovery |
|---|---|---|
| TAP creation timeout | networkd slow/unresponsive | Retry with backoff, fall back to direct creation |
| vhost-net open fails | Module not loaded | Fall back to userspace virtio-net |
| Bridge not found | Infrastructure not set up | Create bridge or fail with clear error |
| MAC conflict | Duplicate MAC on bridge | Auto-regenerate MAC |
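
Two of the recoveries above are simple enough to sketch: falling back to userspace virtio-net when the vhost device cannot be opened, and regenerating a MAC after a conflict. The names below are hypothetical, and keeping the `52:54:00` prefix with three random trailing bytes is an assumption based on the example MACs in this document, not a stated policy.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

enum net_backend { BACKEND_VHOST_NET, BACKEND_USERSPACE };

/* Try to open the vhost device at path. On success *fd_out holds the fd;
 * on failure *fd_out is -1 and the caller uses userspace virtio-net. */
enum net_backend pick_backend(const char *path, int *fd_out)
{
    *fd_out = open(path, O_RDWR);
    return *fd_out >= 0 ? BACKEND_VHOST_NET : BACKEND_USERSPACE;
}

/* Format a MAC with the 52:54:00 prefix and the given trailing bytes.
 * out must hold 18 bytes (17 characters plus the terminator). */
void format_vm_mac(const uint8_t tail[3], char out[18])
{
    snprintf(out, 18, "52:54:00:%02x:%02x:%02x", tail[0], tail[1], tail[2]);
}

/* Pick fresh trailing bytes from /dev/urandom. Returns 0 on success.
 * The caller should re-check the bridge for conflicts afterwards. */
int regenerate_mac(char out[18])
{
    uint8_t tail[3];
    int fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, tail, sizeof tail);
    close(fd);
    if (n != (ssize_t)sizeof tail)
        return -1;
    format_vm_mac(tail, out);
    return 0;
}
```

In use, `pick_backend("/dev/vhost-net", &fd)` would decide the data path at VM start, and `regenerate_mac()` would be called in a loop until the bridge reports no duplicate.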

Future Enhancements

  1. SR-IOV Passthrough: Direct VF assignment for bare-metal performance
  2. DPDK Backend: Alternative to TAP for ultra-low-latency
  3. vhost-user: Offload the device backend to a separate userspace process for isolation
  4. Network Namespace Integration: Per-VM network namespaces for isolation