Linux containers, a lighter-weight virtualization alternative to virtual machines, are gaining momentum. The High Performance Computing (HPC) community is eyeing Linux containers with interest, hoping that they can provide the isolation and configurability of virtual machines, but without the performance penalties.
In this article, I will show a simple example of libvirt-based container configuration in which I assign the container one of the ultra-low-latency (usNIC-enabled) Ethernet interfaces available in the host. This allows bare-metal performance of HPC applications, but within the confines of a Linux container.
Before we jump into the specific libvirt configuration details, let’s first quickly review the following points:
What “container” means in the context of this article.
What limitations make it impossible to rely solely on the available namespaces to assign host devices to containers while still guaranteeing some degree of isolation.
What tools can be used to bridge the above-mentioned gaps.
Introduction to Linux Containers
Fun fact: there is no formal definition of a Linux “container.” Most people identify a Linux container with keywords like LXC, libvirt, Docker, namespaces, cgroups, etc.
Some of those keywords identify user space tools used to configure and manage some form of containers (LXC, libvirt, and Docker). Others identify some of the building blocks used to define a container (namespaces and cgroups).
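As a quick illustration (not from the original article), those building blocks are directly observable on any modern Linux host via the proc filesystem:

```shell
# The namespaces a process belongs to are exposed as symlinks under
# /proc/<pid>/ns, and its cgroup membership under /proc/<pid>/cgroup.
ls -l /proc/self/ns      # net, mnt, pid, uts, ipc, ... one entry per namespace type
cat /proc/self/cgroup    # the cgroup hierarchies this process is attached to
```

Tools like LXC, libvirt, and Docker ultimately manipulate exactly these kernel objects on your behalf.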
Even in the Linux kernel, there is no definition of a “container.”
However, the kernel does provide a number of features that can be combined to define what many people call a “container.” None of these features are mandatory, and depending on what level of sharing or isolation you need between containers — or between the host and containers — the definition/configuration of a “container” will (or will not) make use of certain features.
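To make that concrete, here is a minimal sketch (mine, not the article's) of combining a few of those kernel features with unshare(1); which namespaces you pick is exactly the configuration choice described above:

```shell
# Combine UTS, PID, and mount namespaces into a minimal "container".
# None of these is mandatory; add or drop namespaces as needed.
# -r maps the current user to root inside a new user namespace, so this
# works without real root on kernels with unprivileged user namespaces.
unshare -r --uts --pid --fork --mount-proc \
    sh -c 'hostname container0; hostname; echo "container PID: $$"'
# The inner shell sees hostname "container0" and its own PID numbering,
# while the host's hostname and PID namespace are untouched.
```

A full container manager (LXC, libvirt, Docker) does the same thing with many more namespaces, cgroup limits, and filesystem setup layered on top.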
In the context of this article, I will focus on the assignment of usNIC-enabled devices in libvirt-based LXC containers. For simplicity, I will ignore all security-related aspects.
Network namespaces, PCI, and filesystems
Given the relationship between devices and the filesystem, I will focus on filesystem-related aspects and ignore the other commonly configured parts of a container, such as CPU, generic devices, etc.
Assigning containers their own view of the filesystem, with different degrees of sharing between the host filesystem and the container filesystem, is already possible and easy to achieve (see the mount namespace documentation). However, it is still not possible to partition or virtualize (i.e., make namespace-aware) certain parts of the filesystem.
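For example (a sketch of mine, not from the article), a private mount namespace gives a process its own view of the mount table:

```shell
# Mounts created inside a private mount namespace are invisible outside
# of it. unshare makes mount propagation private by default, and -r
# supplies a user namespace so no real root is required.
mkdir -p /tmp/ns-demo
unshare -r --mount sh -c '
    mount -t tmpfs none /tmp/ns-demo
    grep ns-demo /proc/self/mounts   # visible inside the namespace
'
grep ns-demo /proc/self/mounts \
    || echo "mount not visible in the host namespace"
```

This per-container view is what works well today; the gaps discussed next are in the parts of the filesystem that cannot (yet) be virtualized this way.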
Filesystem elements such as the virtual filesystems commonly mounted in /proc, /sys, and /dev are examples that fall into that category. These special filesystems provide a lot of information and configuration knobs that you may not want to share between the host and all containers, or between containers.
Also, a number of device drivers place special files in /dev that user space can use to interact with the devices via the device driver.
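To illustrate (device names below are examples, not taken from the article): device nodes are ordinary filesystem entries, so a container that shares the host's /dev sees the host's devices:

```shell
# Device nodes in /dev are plain filesystem entries, not namespace-aware
# kernel objects: a container sharing the host's /dev can see (and open)
# the host's devices unless it is given a private /dev.
ls -l /dev/null
# RDMA-capable drivers such as usNIC expose verbs device files here;
# these exist only on hosts with the matching hardware and driver:
ls -l /dev/infiniband 2>/dev/null \
    || echo "no RDMA (uverbs) device files on this host"
```

It is precisely these special files that a container needs access to in order to drive a usNIC-enabled interface at user level.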
More details: http://blogs.cisco.com/performance/usnic-inside-linux-containers