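Preface
A recent project of mine calls for virtualization, so I wanted to get familiar with an open-source virtualization framework. After some searching online, Xen, an open-source Type 1 hypervisor, caught my eye. ARM also officially supports Xen, and you know what that means (if you don't, never mind, neither do I), so let's get to know it first.
I. Advantages of Xen
I searched around a lot, and the material on Xen is honestly a mess. Pieced together from here and there, the advantages come down to the following: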
- Paravirtualized (PV) peripheral sharing
  - Displays
  - Audio interfaces
  - Touchscreens, keyboards, mice
  - Network interfaces
  - Disk drives
  - USB
- IMG GPU virtualization with up to eight guests supported
- ARM TrustZone virtualization access supported with OP-TEE
- Real-time scheduler support (RTDS, ARINC653, the latter coming from avionics)
- Full Android on Xen support
  - HALs for PV peripherals
  - OP-TEE gatekeeper/keymaster
  - Sensors/GNSS/Vehicle data via VIS
- Full open source support
- Functional safety support: IEC 61508 Route 3S
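Automotive requirements
The tables below list the challenges that GENIVI, based on in-vehicle scenarios, poses to Xen's capabilities, alongside the Xen Project's status for each: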

| Compute Requirements | Xen Project |
| --- | --- |
| Static resource partitioning and flexible on-demand resource allocation (CPU, RAM, GPU and IO) | Core functionality, multiple schedulers, GPU/co-processor sharing, memory ballooning, etc. |
| Memory/IO bus bandwidth allocation and rebalancing | WIP: Effort by several parties to enable Hard RT support on Xen |


| Peripherals Requirements | Xen Project |
| --- | --- |
| GPU and displays shall be shared between execution environments, supporting both fixed (each one talks to its own display or to a specified area on a single display) and flexible configurations (shape, z-order, position and assignment of surfaces from different execution environments may change at run time) | Via GPU sharing (and WIP coprocessor sharing), PV Drivers (PV DRM) |
| Inputs shall be routed to one or multiple execution environments depending on current mode, display configuration (for touchscreens), active application (for jog dials & buttons), etc. | Via PV Drivers (PV KBDFRONT) |
| Audio shall be shared between execution environments. Complex sound mixing policies for multiple audio streams and routing of dynamic source/sink devices (BT profiles, USB, speakers or microphones, etc.) shall be supported. | Via PV Drivers (PV SOUND) |
| Network shall be shared between execution environments. Virtual networks with different security characteristics shall be supported (e.g., traffic filtering and security mechanisms) | Via PV Drivers & Disaggregation, Xen Security Modules |
| Storage shall support static or shared allocation, together with routing of dynamic storage devices (USB mass storage) | Via PV Drivers |


| Security Requirements | Xen Project |
| --- | --- |
| Root of Trust and Secure boot shall be supported for all execution environments. | x86: TPM 2.0, Intel TXT, AMD SVM; Arm: supported with OP-TEE |
| Trusted Computing (discrete TPM, Arm TrustZone or similar) shall be available and configurable for all execution environments. | x86: in Xen, some extras in OpenXT; Arm: OP-TEE (WIP: upstreaming) |
| Hardware isolation shall be supported (cache, interrupts, IOMMUs, firewalls, etc.) | Core functionality (except firewalls) |


| Safety Requirements | Xen Project |
| --- | --- |
| System monitoring shall be supported to attest and verify that the system is correctly running. | Can be implemented through VMI in the hypervisor, agents outside, or through a hybrid |
| Restart shall be possible for each execution environment in case of failure. | Core functionality |
| Redundancy shall be supported for the highest level of fault tolerance with fall-back solutions available to react in case of failure. | WIP: This has to be analysed in scope of the "safety certification" initiative, as well as "dom0-less" Xen and "minimal" Kbuild |
| Real time support shall be guaranteed together with predictive reaction time. | Different scheduler options. WIP: benchmarks with recommendations and Hard RT support |


| Performance and Power Consumption Requirements | Xen Project |
| --- | --- |
| Virtualization performance overhead shall be minimal: 1-2% on CPU/memory benchmarks, up to 5% on GPU benchmarks. | Arm: fulfils requirements; x86: not verified |
| Predictability shall be guaranteed. Minimal performance requirements shall be met in any condition (unexpected events, system overload, etc.). | Different scheduler options with different trade-offs. Benchmarks with recommendations in progress. Possibly some code changes will be upstreamed. |
| Execution environments fast boot: less than 2 seconds for safety critical applications, less than 5 seconds for Instrument Cluster, and 10 seconds for IVI. Hibernate and Suspend to RAM shall be supported. | Arm: Proven by both GlobalLogic and EPAM |
| Execution environments startup order shall be predictable. | Core functionality |
| Advanced power management shall be implemented with flexible policies for each execution environment. | Arm: Partially implemented (not yet upstreamed). Further work by EPAM, Xilinx and Aggios planned |

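II. Xen architecture
The figure below shows Xen's architecture model, which can be roughly divided into three layers:
- The bottom layer is the physical hardware, including I/O devices, memory, CPUs and so on;
- The middle layer is the Xen Hypervisor, commonly called the VMM (Virtual Machine Monitor), which handles core work such as memory management, CPU management, interrupt handling and task scheduling;
- The top layer consists of the virtual machines created by Xen, including the privileged virtual machine Dom0 and the unprivileged virtual machines DomU.
Note:
The underlying I/O devices are not managed by the Xen Hypervisor, but by the kernel running in Dom0.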
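As the more detailed breakdown in the figure below shows, the Xen Hypervisor controls the core system resources:
- Memory management module
- Interrupt controller driver
- System timer driver
- Virtual machine scheduler
- Privileged register module
Device drivers and toolchains, which depend heavily on the Linux ecosystem, are instead placed in Dom0, which simplifies the implementation and improves compatibility.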
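Domain0 is a virtual machine created by Xen with special privileges:
- It has access to most devices, except those Xen reserves for itself
- It can call every hypercall API that Xen provides, which is what the toolchain relies on
- It can access the images and configuration files of other virtual machines
- It can create and destroy other virtual machines
- It has permission to reboot and shut down the whole physical host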
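To make the privilege split concrete, here is a minimal Python sketch of the idea. It is only a toy model: `ToyHypervisor`, `Domain` and `PermissionDenied` are hypothetical names invented for illustration, not Xen APIs, and real Xen's permission checks are considerably more fine-grained.

```python
class PermissionDenied(Exception):
    """Raised when an unprivileged domain attempts a privileged operation."""


class Domain:
    def __init__(self, domid, name, privileged=False):
        self.domid = domid
        self.name = name
        self.privileged = privileged  # True only for Dom0 in this toy model


class ToyHypervisor:
    """Hypothetical model: only the privileged domain may manage other domains."""

    def __init__(self):
        # Dom0 exists from boot and is the only privileged domain.
        self.domains = {0: Domain(0, "Dom0", privileged=True)}
        self._next_id = 1

    def _require_privilege(self, caller_id):
        caller = self.domains[caller_id]
        if not caller.privileged:
            raise PermissionDenied(f"{caller.name} may not manage domains")

    def create_domain(self, caller_id, name):
        self._require_privilege(caller_id)
        dom = Domain(self._next_id, name)
        self.domains[dom.domid] = dom
        self._next_id += 1
        return dom.domid

    def destroy_domain(self, caller_id, domid):
        self._require_privilege(caller_id)
        if domid == 0:
            raise ValueError("Dom0 cannot be destroyed in this model")
        del self.domains[domid]


xen = ToyHypervisor()
guest = xen.create_domain(caller_id=0, name="DomU-1")  # allowed: caller is Dom0
try:
    xen.create_domain(caller_id=guest, name="DomU-2")  # rejected: caller is a DomU
except PermissionDenied as err:
    print(err)
```

The point of the sketch is only that management operations are gated on the caller's identity, which is why the toolchain lives in Dom0.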
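The figure below shows Xen's I/O virtualization implemented with a Backend Driver in kernel space. In this approach, the Frontend Driver running in the DomU kernel receives the guest OS's I/O requests and passes them through a ring buffer to the Backend Driver in the Dom0 kernel, the so-called back end. The back end then works with the device drivers running in kernel space to access the underlying I/O hardware.
The figure below is a block diagram of Xen on the TI J6 system: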
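The frontend/backend handoff can be sketched as a small shared ring in Python. This is a simplified model under stated assumptions: `SharedRing` and `backend_service` are hypothetical names, the "device" is just a dict, and the sketch leaves out the shared-memory granting and inter-domain notification that a real Xen split driver needs.

```python
from dataclasses import dataclass, field


@dataclass
class SharedRing:
    """Toy request/response ring; requests and responses reuse the same slots."""
    size: int = 8
    slots: list = field(default_factory=list)
    req_prod: int = 0  # advanced by the frontend when it queues a request
    req_cons: int = 0  # advanced by the backend when it takes a request
    rsp_prod: int = 0  # advanced by the backend when it posts a response
    rsp_cons: int = 0  # advanced by the frontend when it reads a response

    def __post_init__(self):
        self.slots = [None] * self.size

    def push_request(self, req):
        # A slot is free only once its previous response has been consumed.
        if self.req_prod - self.rsp_cons >= self.size:
            return False
        self.slots[self.req_prod % self.size] = req
        self.req_prod += 1
        return True

    def pop_request(self):
        if self.req_cons == self.req_prod:
            return None
        req = self.slots[self.req_cons % self.size]
        self.req_cons += 1
        return req

    def push_response(self, rsp):
        # A response may only overwrite a request that was already taken.
        if self.rsp_prod >= self.req_cons:
            raise RuntimeError("no consumed request to respond to")
        self.slots[self.rsp_prod % self.size] = rsp
        self.rsp_prod += 1

    def pop_response(self):
        if self.rsp_cons == self.rsp_prod:
            return None
        rsp = self.slots[self.rsp_cons % self.size]
        self.rsp_cons += 1
        return rsp


def backend_service(ring, device):
    """Dom0 side: drain pending requests and post one response per request."""
    while (req := ring.pop_request()) is not None:
        op, sector, payload = req
        if op == "write":
            device[sector] = payload
            ring.push_response(("ok", None))
        else:  # treat anything else as a read
            ring.push_response(("ok", device.get(sector)))


# DomU side (frontend) queues requests, Dom0 side (backend) services them.
ring, disk = SharedRing(size=4), {}
ring.push_request(("write", 0, b"hello"))
ring.push_request(("read", 0, None))
backend_service(ring, disk)
print(ring.pop_response())  # ('ok', None)
print(ring.pop_response())  # ('ok', b'hello')
```

Using free-running producer/consumer counters and taking the slot index modulo the ring size keeps the full and empty checks simple, at the cost of one response per request in strict order.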
III. Xen performance
1. NXP i.MX8
- Boot times: Xen 0.8 sec, dom0 5.1 sec
- Interrupt Latency: 2.3 µsec
- Context Switch Overhead: ~0% to 0.6%
2. TI J6
- u-boot loads Xen device tree configuration and Dom0 kernel
- cold start to Xen start is less than 100ms
- domain configuration, memory map, IRQ map passed to Xen through device tree
- Xen boot time on J6 is 300ms
  - all printouts are disabled
  - RAM wipeout is disabled
- Dom0 kernel boots in 800ms
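As a rough sanity check, the J6 figures can be added up and compared with the 2-second fast-boot target for safety critical applications quoted in the GENIVI requirements. The sketch below assumes the three stages run strictly one after another and treats "less than 100ms" as a 100 ms upper bound; both are assumptions of mine, and the total stops at the Dom0 kernel, so it says nothing about how long a DomU guest takes to boot.

```python
# Stage durations in milliseconds, taken from the TI J6 list above.
J6_STAGES_MS = [
    ("cold start to Xen start", 100),  # upper bound: "less than 100ms"
    ("Xen boot", 300),
    ("Dom0 kernel boot", 800),
]


def total_boot_ms(stages):
    """Sum stage durations, assuming the stages do not overlap."""
    return sum(ms for _, ms in stages)


def meets_target(stages, target_ms):
    """True if the summed boot time fits within the target budget."""
    return total_boot_ms(stages) <= target_ms


print(total_boot_ms(J6_STAGES_MS))       # 1200
print(meets_target(J6_STAGES_MS, 2000))  # True
```

Under these assumptions, roughly 800 ms of the 2-second budget would remain for whatever has to start after the Dom0 kernel.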
Summary
This post briefly went over some of the industry's findings on Xen; the next one will walk through Xen's boot process.