# Volt ELF Loading & Memory Layout Analysis

**Date**: 2025-01-20
**Status**: ✅ **ALL ISSUES RESOLVED**
**Kernel**: vmlinux with Virtual 0xffffffff81000000 → Physical 0x1000000, Entry at physical 0x1000000

## Executive Summary

| Component | Status | Notes |
|-----------|--------|-------|
| ELF Loading | ✅ Correct | Loads to correct physical addresses |
| Entry Point | ✅ Correct | Virtual address used (page tables handle translation) |
| RSI → boot_params | ✅ Correct | RSI set to BOOT_PARAMS_ADDR (0x20000) |
| Page Tables (identity) | ✅ Correct | Maps physical 0-4GB to virtual 0-4GB |
| Page Tables (high-half) | ✅ Correct | Maps 0xffffffff80000000+ to physical 0+ |
| Memory Layout | ✅ **FIXED** | Addresses relocated above page table area |
| Constants | ✅ **FIXED** | Cleaned up and documented |

---

## 1. ELF Loading Analysis (loader.rs)

### Current Implementation

```rust
let dest_addr = if ph.p_paddr >= layout::HIGH_MEMORY_START {
    ph.p_paddr
} else {
    load_addr + ph.p_paddr
};
```

### Verification

For vmlinux with:
- `p_paddr = 0x1000000` (16MB physical)
- `p_vaddr = 0xffffffff81000000` (high-half virtual)

The code correctly:
1. Detects `p_paddr (0x1000000) >= HIGH_MEMORY_START (0x100000)` → true
2. Uses `p_paddr` directly as `dest_addr = 0x1000000`
3. Loads kernel to physical address 0x1000000 ✅

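The destination-address rule verified above can be isolated in a small helper, which makes both branches easy to exercise. This is only a sketch: `segment_dest_addr` is a hypothetical name that does not appear in `loader.rs`, and `HIGH_MEMORY_START` is assumed to be 0x100000, the value quoted in the verification steps.

```rust
/// Assumed value of `layout::HIGH_MEMORY_START` (1MB), as quoted in the
/// verification steps; the real constant lives in the layout module.
const HIGH_MEMORY_START: u64 = 0x10_0000;

/// Hypothetical helper mirroring the loader's rule for one program header:
/// segments whose physical address is at or above 1MB are placed exactly
/// there, while lower segments are treated as offsets from `load_addr`.
fn segment_dest_addr(p_paddr: u64, load_addr: u64) -> u64 {
    if p_paddr >= HIGH_MEMORY_START {
        p_paddr
    } else {
        load_addr + p_paddr
    }
}
```

With the vmlinux values above, `segment_dest_addr(0x1000000, 0x100000)` takes the first branch and yields 0x1000000; a segment with `p_paddr = 0x1000` would be relocated to 0x101000 instead.
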
### Entry Point

```rust
entry_point: elf.e_entry, // Returns virtual address (e.g., 0xffffffff81000000 + startup_64_offset)
```

This is **correct** because the high-half page tables map `0xffffffff80000000+` onto physical `0x0+`, so the virtual entry address resolves to the physical location where the kernel was loaded.

---

## 2. Memory Layout Analysis

### Memory Map Before Fixes

```
Physical Address      Size      Structure
──────────────────────────────────────────────────────────────────────
0x0000 - 0x04FF       0x500     Reserved (IVT, BDA)
0x0500 - 0x052F       0x030     GDT (3 entries)
0x0530 - 0x0FFF       ~0xAD0    Unused gap
0x1000 - 0x1FFF       0x1000    PML4 (Page Map Level 4)
0x2000 - 0x2FFF       0x1000    PDPT_LOW (identity mapping)
0x3000 - 0x3FFF       0x1000    PDPT_HIGH (kernel mapping)
0x4000 - 0x7FFF       0x4000    PD tables (for identity mapping, up to 4GB)
                                ├─ 0x4000: PD for 0-1GB
                                ├─ 0x5000: PD for 1-2GB
                                ├─ 0x6000: PD for 2-3GB
                                └─ 0x7000: PD for 3-4GB  ← OVERLAP!
0x7000 - 0x7FFF       0x1000    boot_params (Linux zero page)  ← COLLISION!
0x8000 - 0x8FFF       0x1000    CMDLINE
0x8000+               0x2000    PD tables for high-half kernel mapping
0x9000 - 0x9XXX       ~0x500    E820 memory map
...
0x100000              varies    Kernel load address (1MB)
0x1000000             varies    Kernel (16MB physical for vmlinux)
```

### 🔴 CRITICAL: Memory Overlap

**Problem**: In the original layout, the page tables grow with guest memory size and run into `boot_params` at 0x7000 (and, for larger guests, into CMDLINE at 0x8000 and the E820 map at 0x9000). Because the two high-half PD tables are placed directly after the identity-mapping PD tables, this happens for any guest larger than 1GB.

The number of identity-mapping PD tables is computed as:

```rust
let num_2mb_pages = (map_size + 0x1FFFFF) / 0x200000;
let num_pd_tables = ((num_2mb_pages + 511) / 512).max(1) as usize;
```

Each PD covers 1GB (512 entries × 2MB per entry), so a 4GB identity mapping needs ceil(4GB / 1GB) = 4 PD tables. For 4GB = 4 * 1024 * 1024 * 1024 bytes:
- num_2mb_pages = 4GB / 2MB = 2048 pages
- num_pd_tables = (2048 + 511) / 512 = 4 (larger counts are capped at 4 by `.min(4)` in the loop)

**The 4 PD tables are at 0x4000, 0x5000, 0x6000, 0x7000**, overlapping boot_params!

The high-half PD tables then start at `high_pd_base`:

```rust
let high_pd_base = PD_ADDR + (num_pd_tables.min(4) as u64 * PAGE_TABLE_SIZE);
```

For 4GB this is 0x4000 + 4 * 0x1000 = 0x8000, overlapping CMDLINE!

Applying the same arithmetic to other guest sizes, with two high-half PD tables (0x2000 bytes) following the identity PD tables:

```
Memory Size   Identity PDs   Identity PD Range   High-Half PD Range   Overlaps boot_params (0x7000)?
────────────────────────────────────────────────────────────────────────────────────────────────────
128 MB        1              0x4000-0x4FFF       0x5000-0x6FFF        No
512 MB        1              0x4000-0x4FFF       0x5000-0x6FFF        No
1 GB          1              0x4000-0x4FFF       0x5000-0x6FFF        No
2 GB          2              0x4000-0x5FFF       0x6000-0x7FFF        Yes (high-half PD)
3 GB          3              0x4000-0x6FFF       0x7000-0x8FFF        Yes (high-half PD; also CMDLINE)
4 GB          4              0x4000-0x7FFF       0x8000-0x9FFF        Yes (identity PD; also CMDLINE, E820)
```

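The page-directory arithmetic above is easy to get wrong by hand, so it helps to run it for several guest sizes. The following is a minimal sketch rather than the actual VMM code: `pd_layout` and `overlaps_old_boot_params` are hypothetical names, the cap of 4 is folded into the count, and the two high-half PD tables are an assumption taken from the 0x2000-byte high-half entry in the memory map.

```rust
const PD_ADDR: u64 = 0x4000;
const PAGE_TABLE_SIZE: u64 = 0x1000;
/// boot_params location in the original (pre-fix) layout.
const OLD_BOOT_PARAMS_ADDR: u64 = 0x7000;
/// Assumed number of PD tables used for the high-half kernel mapping.
const HIGH_PD_TABLES: u64 = 2;

/// Returns (identity PD table count capped at 4, high_pd_base) for a
/// mapping of `map_size` bytes, following the formulas quoted above.
fn pd_layout(map_size: u64) -> (u64, u64) {
    let num_2mb_pages = (map_size + 0x1F_FFFF) / 0x20_0000;
    let num_pd_tables = ((num_2mb_pages + 511) / 512).max(1).min(4);
    let high_pd_base = PD_ADDR + num_pd_tables * PAGE_TABLE_SIZE;
    (num_pd_tables, high_pd_base)
}

/// True when the page tables extend past the start of the old boot_params page.
fn overlaps_old_boot_params(map_size: u64) -> bool {
    let (_, high_pd_base) = pd_layout(map_size);
    // Exclusive end of the last high-half PD table.
    let page_tables_end = high_pd_base + HIGH_PD_TABLES * PAGE_TABLE_SIZE;
    page_tables_end > OLD_BOOT_PARAMS_ADDR
}
```

Under these assumptions a 1GB guest ends its page tables exactly at 0x7000 and is safe, while a 2GB guest already pushes the second high-half PD table onto the old `boot_params` page.
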
---

## 3. Page Table Mapping Verification

### High-Half Kernel Mapping (0xffffffff80000000+)

For virtual address `0xffffffff81000000`:

| Level | Index Calculation | Index | Maps To |
|-------|-------------------|-------|---------|
| PML4 | `(0xffffffff81000000 >> 39) & 0x1FF` | 511 | PDPT_HIGH at 0x3000 |
| PDPT | `(0xffffffff81000000 >> 30) & 0x1FF` | 510 | PD at high_pd_base |
| PD | `(0xffffffff81000000 >> 21) & 0x1FF` | 8 | Physical 8 × 2MB = 0x1000000 ✅ |

The mapping is correct:
- `0xffffffff80000000` → physical `0x0`
- `0xffffffff81000000` → physical `0x1000000` ✅

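The index calculations in the table can be reproduced with a few shifts and masks. The sketch below uses hypothetical helper names and assumes 2MB pages, with entry `i` of the first high-half PD mapping physical `i × 2MB` as described above; the translation helper is therefore only meaningful for addresses in the first gigabyte of the high-half window (PDPT index 510).

```rust
/// Splits a virtual address into its (PML4, PDPT, PD) indices for
/// 4-level paging with 2MB pages: 9 bits per level, taken from bit 39,
/// bit 30, and bit 21 respectively.
fn page_table_indices(vaddr: u64) -> (u64, u64, u64) {
    let pml4 = (vaddr >> 39) & 0x1FF;
    let pdpt = (vaddr >> 30) & 0x1FF;
    let pd = (vaddr >> 21) & 0x1FF;
    (pml4, pdpt, pd)
}

/// Translates a high-half address to physical, assuming PD entry `i`
/// maps physical `i * 2MB` (valid only under PDPT index 510).
fn high_half_to_phys(vaddr: u64) -> u64 {
    let (_, _, pd) = page_table_indices(vaddr);
    // PD index selects the 2MB frame; the low 21 bits are the page offset.
    pd * 0x20_0000 + (vaddr & 0x1F_FFFF)
}
```

For `0xffffffff81000000` this gives indices (511, 510, 8) and physical address 0x1000000, matching the table.
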
---

## 4. RSI Register Setup

In `vcpu.rs`:

```rust
let regs = kvm_regs {
    rip: kernel_entry,      // Entry point (virtual address)
    rsi: boot_params_addr,  // Boot params pointer (Linux boot protocol)
    rflags: 0x2,
    rsp: 0x8000,
    ..Default::default()
};
```

RSI correctly points to `boot_params_addr` (0x7000 at the time of this analysis, 0x20000 after Fix 1). ✅ The `rsp: 0x8000` shown here is the pre-fix value; Fix 2 replaces it with `0x1FFF0`.

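To show the register state with the relocated addresses from the Applied Fixes section, here is a small self-contained sketch. `BootRegs` is a stand-in struct defined only for this example, not the real `kvm_regs` type from the KVM bindings, and `initial_regs` is a hypothetical helper name.

```rust
/// Stand-in for the subset of `kvm_regs` fields set at boot.
/// This is NOT the real bindings type; it exists only for illustration.
#[derive(Debug, Default, Clone, PartialEq)]
struct BootRegs {
    rip: u64,
    rsi: u64,
    rsp: u64,
    rflags: u64,
}

/// Post-fix addresses, taken from the relocation table under "Applied Fixes".
const BOOT_PARAMS_ADDR: u64 = 0x20000;
const BOOT_STACK_POINTER: u64 = 0x1FFF0;

/// Hypothetical helper building the initial vCPU register state.
fn initial_regs(kernel_entry: u64) -> BootRegs {
    BootRegs {
        rip: kernel_entry,       // entry point (virtual address)
        rsi: BOOT_PARAMS_ADDR,   // boot_params pointer (Linux boot protocol)
        rflags: 0x2,             // bit 1 of RFLAGS is reserved and always set
        rsp: BOOT_STACK_POINTER, // above the page tables, below boot_params
    }
}
```

Keeping the addresses in named constants rather than literals avoids the kind of hardcoded `0x8000` stack pointer that had to be corrected in `vcpu.rs`.
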
---

## 5. Constants Inconsistency

### mod.rs layout module:

```rust
pub const PVH_START_INFO_ADDR: u64 = 0x7000;  // Used
pub const ZERO_PAGE_ADDR: u64 = 0x10000;      // NOT USED - misleading!
```

### linux.rs:

```rust
pub const BOOT_PARAMS_ADDR: u64 = 0x7000;     // Used
```

The `ZERO_PAGE_ADDR` constant is defined but never used, which is confusing since "zero page" is another name for boot_params in Linux terminology.

---

## Applied Fixes

### Fix 1: Relocated Boot Structures ✅

Moved all boot structures above the page table area (0xA000 max):

| Structure | Old Address | New Address | Status |
|-----------|-------------|-------------|--------|
| BOOT_PARAMS_ADDR | 0x7000 | 0x20000 | ✅ Already done |
| PVH_START_INFO_ADDR | 0x7000 | 0x21000 | ✅ Fixed |
| E820_MAP_ADDR | 0x9000 | 0x22000 | ✅ Fixed |
| CMDLINE_ADDR | 0x8000 | 0x30000 | ✅ Already done |
| BOOT_STACK_POINTER | 0x8FF0 | 0x1FFF0 | ✅ Fixed |

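One way to keep such a layout from regressing is to encode the ordering as compile-time assertions next to the constants. The sketch below uses the new addresses from the table; `PAGE_TABLE_AREA_END` and `BOOT_STRUCT_SIZE` are names introduced here for illustration, and the assertions themselves are a suggestion, not something the source confirms is present in `mod.rs`.

```rust
/// Exclusive upper bound of the page table area (0xA000 max, per Fix 1).
const PAGE_TABLE_AREA_END: u64 = 0xA000;
/// Assumed size reserved for each boot structure (one 4KB page).
const BOOT_STRUCT_SIZE: u64 = 0x1000;

const BOOT_STACK_POINTER: u64 = 0x1FFF0;
const BOOT_PARAMS_ADDR: u64 = 0x20000;
const PVH_START_INFO_ADDR: u64 = 0x21000;
const E820_MAP_ADDR: u64 = 0x22000;
const CMDLINE_ADDR: u64 = 0x30000;

// Evaluated at compile time: the build fails if any constant is moved
// into the page table area or onto a neighbouring structure.
const _: () = assert!(BOOT_STACK_POINTER > PAGE_TABLE_AREA_END);
const _: () = assert!(BOOT_STACK_POINTER < BOOT_PARAMS_ADDR);
const _: () = assert!(BOOT_PARAMS_ADDR + BOOT_STRUCT_SIZE <= PVH_START_INFO_ADDR);
const _: () = assert!(PVH_START_INFO_ADDR + BOOT_STRUCT_SIZE <= E820_MAP_ADDR);
const _: () = assert!(E820_MAP_ADDR + BOOT_STRUCT_SIZE <= CMDLINE_ADDR);
```

Because the checks run in a `const` context, they add no runtime cost and catch a bad edit to any single address before the VMM is ever started.
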
### Fix 2: Updated vcpu.rs ✅

Changed hardcoded stack pointer from `0x8000` to `0x1FFF0`:
- File: `vmm/src/kvm/vcpu.rs`
- Stack now safely above page tables but below boot structures

### Fix 3: Added Layout Documentation ✅

Updated `mod.rs` with comprehensive memory map documentation:

```text
0x0000 - 0x04FF   : Reserved (IVT, BDA)
0x0500 - 0x052F   : GDT (3 entries)
0x1000 - 0x1FFF   : PML4
0x2000 - 0x2FFF   : PDPT_LOW (identity mapping)
0x3000 - 0x3FFF   : PDPT_HIGH (kernel high-half mapping)
0x4000 - 0x7FFF   : PD tables for identity mapping (up to 4 for 4GB)
0x8000 - 0x9FFF   : PD tables for high-half kernel mapping
0xA000 - 0x1FFFF  : Reserved / available
0x20000           : boot_params (Linux zero page) - 4KB
0x21000           : PVH start_info - 4KB
0x22000           : E820 memory map - 4KB
0x30000           : Boot command line - 4KB
0x31000 - 0xFFFFF : Stack and scratch space
0x100000          : Kernel load address (1MB)
```

Note that the initial boot stack pointer (`0x1FFF0`, growing downward) sits at the top of the `0xA000 - 0x1FFFF` range, just below `boot_params`.

### Verification Results ✅

All memory sizes from 128MB to 16GB now pass without overlaps:

```
Memory:   128 MB - Page tables: 0x1000-0x6FFF ✅
Memory:   512 MB - Page tables: 0x1000-0x6FFF ✅
Memory:  1024 MB - Page tables: 0x1000-0x6FFF ✅
Memory:  2048 MB - Page tables: 0x1000-0x7FFF ✅
Memory:  4096 MB - Page tables: 0x1000-0x9FFF ✅
Memory:  8192 MB - Page tables: 0x1000-0x9FFF ✅
Memory: 16384 MB - Page tables: 0x1000-0x9FFF ✅
```

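The end addresses listed above follow directly from the page table sizing rules, so the check can be re-run for any guest size. The sketch below is not the verification tool that produced these results: `page_tables_end` and `all_sizes_safe` are hypothetical names, and the two high-half PD tables and the cap of four identity PD tables are assumptions read off the documented memory map.

```rust
const PD_ADDR: u64 = 0x4000;
const PAGE_TABLE_SIZE: u64 = 0x1000;
/// Assumed number of PD tables for the high-half kernel mapping.
const HIGH_PD_TABLES: u64 = 2;
/// Documented upper bound of the page table area (exclusive).
const PAGE_TABLE_AREA_END: u64 = 0xA000;

/// Last byte occupied by page tables for a guest with `mem_mb` MB of RAM.
fn page_tables_end(mem_mb: u64) -> u64 {
    let map_size = mem_mb * 1024 * 1024;
    let num_2mb_pages = (map_size + 0x1F_FFFF) / 0x20_0000;
    // One PD per 1GB of identity mapping, at least 1 and at most 4.
    let num_pd_tables = ((num_2mb_pages + 511) / 512).max(1).min(4);
    PD_ADDR + (num_pd_tables + HIGH_PD_TABLES) * PAGE_TABLE_SIZE - 1
}

/// True when every listed guest size keeps its page tables inside
/// the documented page table area.
fn all_sizes_safe(sizes_mb: &[u64]) -> bool {
    sizes_mb
        .iter()
        .all(|&mb| page_tables_end(mb) < PAGE_TABLE_AREA_END)
}
```

Calling `all_sizes_safe(&[128, 512, 1024, 2048, 4096, 8192, 16384])` exercises the same sizes as the results above; since every boot structure now starts at 0x1FFF0 or higher, staying below 0xA000 is sufficient to rule out a collision.
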
---

## Verification Checklist

- [x] ELF segments loaded to correct physical addresses
- [x] Entry point is virtual address (handled by page tables)
- [x] RSI contains boot_params pointer
- [x] High-half mapping: 0xffffffff80000000 → physical 0
- [x] High-half mapping: 0xffffffff81000000 → physical 0x1000000
- [x] **Memory layout has no overlaps** ← FIXED
- [x] Constants are consistent and documented ← FIXED

## Files Modified

1. `vmm/src/boot/mod.rs` - Updated layout constants, added documentation
2. `vmm/src/kvm/vcpu.rs` - Updated stack pointer from 0x8000 to 0x1FFF0
3. `docs/MEMORY_LAYOUT_ANALYSIS.md` - This analysis document