Containers Unveiled: Understanding the Basics
Containers rely on multiple Linux features like namespaces, cgroups, and layered filesystems. Docker Engine uses lower-level tools like runc to set up containers based on Docker images, which encapsulate the filesystem and configuration. Namespaces provide process isolation, but significant additional logic is needed to create complete containers.
Containers are more than namespace isolation. They also rely on control groups (cgroups) for resource limits, a union filesystem such as OverlayFS for filesystem layers, and an image format (the Docker image) for portability, among other things. Namespaces are just one key mechanism.
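To make the cgroups part concrete, here is a minimal sketch of the values a runtime might write into a cgroup v2 directory to cap memory, CPU, and process count. The interface files memory.max, cpu.max, and pids.max are part of cgroup v2; the group name "demo", the helper name, and the chosen limits are assumptions made for illustration. The function only computes the settings and writes nothing.

```python
from pathlib import PurePosixPath


def cgroup_v2_settings(name, memory_bytes, cpu_fraction, max_pids,
                       root="/sys/fs/cgroup"):
    """Return {file path: value} for a cgroup v2 group. Nothing is written."""
    group = PurePosixPath(root) / name      # "name" is a hypothetical group
    period_us = 100_000                     # 100 ms scheduling period
    quota_us = int(cpu_fraction * period_us)
    return {
        str(group / "memory.max"): str(memory_bytes),
        str(group / "cpu.max"): f"{quota_us} {period_us}",
        str(group / "pids.max"): str(max_pids),
    }


# Illustrative limits: 256 MiB of memory, half a CPU, at most 100 processes.
settings = cgroup_v2_settings("demo", 256 * 1024 * 1024, 0.5, 100)
for path, value in settings.items():
    print(f"{path} <- {value}")
```

A real runtime would create the directory, write these values, and then add the container's process ID to the group's cgroup.procs file. Docker picks its own group names, so the path above is only a stand-in.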
There are several types of namespaces, including pid, net, mnt, uts, ipc, user, and cgroup. However, namespaces alone do not create containers; additional configuration is needed.
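On a Linux host you can see which namespaces a process belongs to by reading the symbolic links under /proc/<pid>/ns. The sketch below lists them for the current process. The function names are my own, and on a system without /proc the listing is simply empty.

```python
import os
import re

# Link targets look like "pid:[4026531836]": namespace type, then an inode.
NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Parse a link target such as 'pid:[4026531836]' into (type, inode)."""
    match = NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def list_namespaces(pid="self"):
    """Return {entry name: inode}; empty if /proc/<pid>/ns is unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return result
    for entry in sorted(entries):
        try:
            target = os.readlink(os.path.join(ns_dir, entry))
            _kind, inode = parse_ns_link(target)
        except (OSError, ValueError):
            continue  # skip entries we cannot read or do not recognise
        result[entry] = inode
    return result


for name, inode in list_namespaces().items():
    print(f"{name:20s} {inode}")
```

Two processes that share a namespace show the same inode number for that entry. A process inside a container typically shows different inodes from a process on the host for the namespaces the runtime created.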
Docker uses a container runtime (runc by default) that handles creating namespaces, configuring cgroups, and setting up the container's root filesystem. Docker Engine does not call runc directly; it goes through containerd, which invokes runc to set up each container.
runc is not just "a mixture of all the namespaces". It carefully sets up namespaces, cgroups, filesystems, networking, etc., in a coordinated way to create an isolated container.
The Docker image provides the filesystem layers and container configuration. runc then sets up the container based on this specification.
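The specification handed to runc is a JSON file, config.json, defined by the OCI runtime specification. As a rough sketch, the snippet below builds a stripped-down version of that file. The field names follow the OCI runtime spec, but the values (hostname "demo", the memory limit, the rootfs path) are illustrative assumptions, and a real file generated by a tool such as `runc spec` contains many more fields, including mounts and capabilities.

```python
import json


def minimal_oci_config(rootfs="rootfs", args=("/bin/bash",)):
    """Build a heavily reduced OCI runtime config. Illustrative only."""
    namespace_types = ("pid", "network", "ipc", "uts", "mount")
    return {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),                 # the command to run
            "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
        },
        "root": {"path": rootfs, "readonly": False},
        "hostname": "demo",                     # assumed value
        "linux": {
            # One entry per namespace the runtime should create.
            "namespaces": [{"type": t} for t in namespace_types],
            # cgroup limits; 256 MiB is an assumed example value.
            "resources": {"memory": {"limit": 268435456}},
        },
    }


config = minimal_oci_config()
print(json.dumps(config, indent=2))
```

Reading the file this way shows how the pieces connect: the image supplies the root filesystem and the default command, while the "linux" section tells the runtime which namespaces and cgroup limits to apply.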
Inside the Container Lifecycle: A Step-by-Step Guide
On the command line, you run the following command, instructing the Docker Engine to create an Ubuntu container:
docker run ubuntu
The Docker client makes an API request to the Docker daemon to create a container, communicating via the Docker API over a Unix socket.
The Docker Engine looks for the Ubuntu image locally or pulls it from a registry if not found.
The Ubuntu image contains a root filesystem with Ubuntu binaries and libraries, plus a JSON configuration specifying metadata such as the default command, environment variables, and exposed ports.
The Docker daemon requests container creation from the containerd process via the gRPC protocol.
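containerd prepares an OCI bundle (the unpacked root filesystem plus a config.json) and starts a shim process for the container, which invokes the runc binary on that bundle. containerd and the shim talk over a lightweight RPC protocol (ttrpc); runc itself is a command-line tool and does not expose a protobuf API.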
A new network namespace is set up for the container, and Docker's networking layer attaches a virtual Ethernet (veth) interface to it, providing network isolation.
runc creates a new PID namespace to isolate the container's processes from the host.
A new IPC namespace is created for interprocess communication isolation.
runc sets up a new mount namespace and makes the image's layered filesystem, assembled with a union filesystem such as OverlayFS, the container's root, providing filesystem isolation.
User namespaces can be used to map UIDs/GIDs from the container to the host.
Control groups (cgroups) are configured by runc for resource limitations.
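A small shim process, started by containerd rather than by runc, acts as the intermediary between containerd and the container's process. It holds the container's standard streams and collects its exit status, so the container can keep running if containerd or the Docker daemon restarts.
Within the namespaces, runc starts the process defined in the config. Nothing boots in the usual sense: there is no kernel or init system startup, only the image's default command (a shell, for the ubuntu image) running as PID 1 inside the container.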
runc exits once the container process is running, and the shim reports container creation success to containerd.
containerd communicates success to the Docker daemon.
The running Ubuntu container is now isolated from the host through the combination of namespaces, cgroups, and a union filesystem, although it still shares the host kernel.
The Docker Engine monitors the running container, providing an access point to manage its lifecycle.
The Docker daemon provides high-level management APIs for starting, stopping, and removing containers.
In short, Docker Engine relies on containerd and runc to configure Linux namespaces, cgroups, and filesystems, setting up the isolated container environment described by the Docker image.
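The first hop in the steps above, from client to daemon, is plain HTTP carried over a Unix socket. The sketch below builds the raw request a client could send to ask the daemon to create a container. The endpoint POST /containers/create and the "Image" field come from the Docker Engine API; the helper names are my own, the socket path is the usual default and may differ on your system, and the example only prints the request so it runs without a Docker daemon. Note that `docker run` is more than this single call: the client creates the container and then starts it with a second request.

```python
import json
import socket


def build_http_request(method, path, body=None):
    """Build a raw HTTP/1.1 request; body, if given, is sent as JSON."""
    payload = b"" if body is None else json.dumps(body).encode("utf-8")
    lines = [
        f"{method} {path} HTTP/1.1",
        "Host: docker",            # placeholder host for a Unix socket
        "Connection: close",
    ]
    if body is not None:
        lines.append("Content-Type: application/json")
    lines.append(f"Content-Length: {len(payload)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + payload


def send_over_unix_socket(request, socket_path="/var/run/docker.sock"):
    """Send a raw request to the daemon's socket and return the raw reply.

    The default path is an assumption; this function is defined but not
    called below, so nothing here touches a real daemon.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(request)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)


create_request = build_http_request(
    "POST", "/containers/create", {"Image": "ubuntu"}
)
print(create_request.decode("ascii"))
```

Calling send_over_unix_socket(create_request) on a machine with Docker running, and with permission to access the socket, would return the daemon's HTTP response, whose JSON body includes the new container's ID.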
Container Revolution: Separating Myth from Reality
Myth: Containers are just namespaces.
Truth: Containers are built upon multiple technologies, including namespaces, cgroups, union filesystems, container runtimes like runc, and image formats like Docker images. Namespaces provide process isolation, but they are just one component. Cgroups limit resources, the union filesystem stacks image layers, and the runtime ties everything together.
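The layered filesystem is the piece that is hardest to picture, so here is a toy model of how a union filesystem resolves a path. It is not how OverlayFS is implemented (the kernel works with real directories and marks deletions with special whiteout files); the dictionaries, file contents, and the WHITEOUT marker are assumptions that only illustrate the lookup rule: the topmost layer containing a path wins, and a deletion in an upper layer hides the file below.

```python
WHITEOUT = object()  # stand-in marker meaning "deleted in this layer"


def resolve(path, layers):
    """Look up path in layers ordered from top (writable) to bottom (base)."""
    for layer in layers:
        if path in layer:
            entry = layer[path]
            if entry is WHITEOUT:
                # Deleted in an upper layer: lower copies stay hidden.
                raise FileNotFoundError(path)
            return entry
    raise FileNotFoundError(path)


# Hypothetical layers: a base image, an application layer on top of it,
# and the container's own writable layer on top of both.
base = {"/etc/os-release": "Ubuntu", "/etc/motd": "welcome"}
app = {"/app/run.sh": "#!/bin/sh"}
writable = {"/etc/os-release": "Ubuntu (edited)", "/etc/motd": WHITEOUT}
layers = [writable, app, base]

print(resolve("/etc/os-release", layers))  # the edited copy shadows the base
print(resolve("/app/run.sh", layers))      # found in the middle layer
```

The image layers (base and app here) are never modified, which is why many containers can share one image: each container only adds its own thin writable layer on top.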
Myth: runc is just a mishmash of namespaces.
Truth: runc is a sophisticated tool that configures many aspects of a container environment. It sets up namespaces, cgroups, the root filesystem, and more in a coordinated manner to provide isolation and resource management for containers. It serves as a fundamental building block for containerization.
Myth: Containers are lightweight VMs.
Truth: Unlike virtual machines (VMs), containers do not virtualize hardware. Instead, they share the host kernel and resources, making them more lightweight and efficient. However, it is important to recognize that VMs offer stronger isolation between workloads, which can be critical in specific scenarios.
Myth: Docker is required for containers.
Truth: Docker played a pivotal role in popularizing containers, but it is not the only option. Runtimes such as containerd and CRI-O can run containers without the Docker daemon, and orchestrators such as Kubernetes drive them directly. Docker is often used as a developer tool, while other runtimes are common in production environments.
Myth: Containers are completely isolated and secure.
Truth: Containers provide useful process-level isolation, but they are not immune to security risks. All containers on a host share its kernel, so a kernel exploit can let a process escape its container. To enhance container security, additional measures such as seccomp, AppArmor, and vulnerability scanning are strongly recommended.
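As one example of such a measure, a seccomp profile restricts which system calls a container may make. The sketch below generates a small allowlist profile in the JSON layout Docker accepts. The keys "defaultAction", "syscalls", "names", and "action" and the SCMP_ACT_* values are part of that format, but the helper name and the three-syscall list are assumptions chosen for brevity; a real program needs far more syscalls than this to start.

```python
import json


def allowlist_profile(allowed_syscalls):
    """Build a seccomp profile that denies everything not listed."""
    return {
        # Any syscall not matched below fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Illustrative list only; duplicates are removed by the helper.
profile = allowlist_profile(["read", "write", "exit_group", "read"])
print(json.dumps(profile, indent=2))
```

Saved to a file, a profile like this can be passed to Docker with `docker run --security-opt seccomp=profile.json ubuntu`. In practice it is usually easier to start from Docker's default profile and remove syscalls than to build an allowlist from scratch.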
Myth: Containers are only used for microservices.
Truth: Containers are versatile and can deploy a wide range of workloads, from microservices to monolithic applications. The modular nature of containers makes them well-suited for microservices architectures, but they are not limited to that use case.
Myth: Containers are only for developers/ops.
Truth: Containers benefit both developers and operations teams. They enable Continuous Integration/Continuous Deployment (CI/CD) pipelines, fostering collaboration between development and operations. Additionally, containers provide advantages like portability and consistency across development, quality assurance, and production environments.
Containerization Explored: Key Takeaways and Next Steps
The Docker client uses the Docker API to send requests to the Docker daemon (Docker Engine).
The Docker daemon uses gRPC to communicate with containerd.
containerd starts a shim process that invokes the runc binary on an OCI bundle; runc has no protobuf API of its own.
runc configures namespaces, cgroups, and filesystems to create a container.
The Docker image supplies the filesystem and configuration that runc works from.
The outcome is an isolated container environment.
An Open Invitation for Feedback
Thank you for reading this overview of how containers work under the hood. I aimed to provide an accurate technical explanation of the key components involved in creating Docker containers. However, container technology is complex and constantly evolving, so there may be aspects I missed or described incorrectly. If you notice any errors or have suggestions to improve the accuracy or clarity of this article, please don't hesitate to contact me. I welcome feedback from readers to help strengthen my own container knowledge, as we all continue learning together in this rapidly changing ecosystem. Constructive comments allow me to correct any mistaken assumptions and become a better technical writer.