Linux 3.0 will support Xen
After a relatively long road traveled with a few bumps along the way, as of yesterday, Linus’s mainline tree (2.6.39+) contains literally every component needed for Linux to run both as a management domain kernel(Dom0) and a guest(DomU).
Xen has always used Linux as the management OS (Dom0) on top of the hypervisor itself, to do the device management and control of the virtual machines running on top of Xen. And for many years, next to the hypervisor, there was a substantial linux kernel patch that had to be applied on top of a linux kernel to transform into this “Dom0″. This code had to constantly be kept in sync with the progress Linux itself was making and as such caused a substantial amount of extra work that had to be done.
Another bit of code, that’s been in the kernel for many years, were the paravirt drivers for xen in a guest VM (DomU). Linux has had this as part of the codebase for quite a few years, the xen network, block and xenbus drivers that are loaded when you run a hardware virtualized guest (hvm) on Xen with paravirt (pv) drivers. This is always referred to as pv-hvm.
A pure hardware virtualized kernel without any xen drivers, just emulated qemu devices is just simply called an “hvm guest”. This does not perform well as any type of network or block IO goes through many layers of emulation. As hardware virtualization has improved over the years in the chips, pv-hvm has become performant and is frequently used. The pv-drivers basically are highly optimized virtual devices that communicate through the hypervisor to do network or disk io, handled behind the scenes by the Dom0 kernel and what is called backend devices (netbk, blockbk).
A pure paravirtualized guest is/was an OS kernel that was totally modified to really be in sync with the hypervisor and let the hypervisor take care or own a number of tasks to be as optimized as possible. Performance and integration is the best with a paravirtualized kernel and this also allowed xen to run on x86 hardware, optimally, without hardware virtualization instruction support – this is referred to as pv-guest. The Dom0 kernel runs in pv mode (more on this later) and the DomU guests could run in hvm, pv-hvm or pv mode.
Over the years, a number of efforts were made to get these pv / dom0 patches submitted into the mainline kernel but at times the code was not considered acceptable by a number of the linux kernel maintainers and little progress was made. Over the last 2 years a renewed effort started to really convert the code into patches considerd acceptable and a set of people : Jeremy Fitzhardinge, Konrad Rzeszutek Wilk, Ian Campbell , Stefano Stabellini (and others not mentioned but obviously also important) focused on getting this stuff done once and for all… and so.. bit by bit. code was rewritten submitted for review, rewritten again until it was considered ok. In terms of timeline, a good chunk of code has gone in over time to handle Linux as a well behaved guest (DomU) first, then followed by all the work to make the Dom0 happy as well.
One change that happened in the Linux kernel to be able to better handle such an infrastructure in a virtual world for more than one hypervisor, was called pvops.
pvops, is a mode where the kernel can switch into pv, hvm or pvhvm at boot time. Instead of having multiple kernel binaries, there is just one and it will lay out its operations at boot time when it detects on what platform it runs. Linux as a DomU guest on Xen has had pvops support since 2.6.23/24 with good use starting around 2.6.27. So the frontend network and block drivers and running pvops on xen has been around also for quite some time. As this finalized the work focused more on preparing the Dom0 parts of integration and a migration from the old classical pure pv kernel mode to what’s now called pvops.
Late last year in 2.6.37, we had a mainline kernel that was able to actually run as the “Dom0″ for the Xen hypervisor. That was a big step, followed shortly by adding the remaining bits that were needed to really handle every area : memory management, grant table stuff, network pv driver backend and block pv driver backend code (and other misc components). The last remaining driver just got merged 2 days ago into 2.6.39+ mainline – the block backend driver blkback.









