What is a container?
Containers are a more lightweight approach to virtualization. They work very similarly to a Virtual Machine (VM) in the sense that they have their own file system and their own network adapters, which separate them from the host. The distinction from a VM lies in the fact that containers do not have their own kernel; instead, they share the host’s OS kernel. This gives them a far lower performance overhead compared to regular virtualization.
But this also has some drawbacks, namely that the container can only run the same OS as the host. So Linux-Containers will only run on Linux and Windows-Containers will only run on Windows (of course, you can virtualize a Linux kernel on Windows and then run Linux-Containers).
The other problem is that sharing a kernel inherently creates more security risk. While Container-Engines usually have very good protections to keep the host and containers separate, we have learned time and time again that no protection is perfect.
Containers are usually available as “Images” in repositories, the most popular one being the DockerHub Image-Repository; these Images can then be instantiated as Containers. All data inside a container is ephemeral, so in order to use persistent configurations or to store database files, it is required to mount directories inside the Container to an external persistent data store. Some options are:
- Mounting to a host directory
- Using a volume (Storage managed by the Container-Engine)
- Using network storage such as NFS
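As a sketch, the three options map to Docker CLI flags roughly as follows (image name, paths and the NFS server address are placeholders):

```shell
# 1. Bind-mount a host directory into the container
docker run -v /srv/app-config:/etc/app my/image

# 2. Use a named volume managed by the Container-Engine
docker volume create app-data
docker run -v app-data:/var/lib/app my/image

# 3. Use network storage, e.g. an NFS-backed volume
docker volume create --driver local \
  --opt type=nfs --opt o=addr=192.168.1.10,rw \
  --opt device=:/exports/app-data nfs-data
docker run -v nfs-data:/var/lib/app my/image
```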
Container-Engines (CEs) are programs that run the Containers and manage their persistent data. Common CEs are Docker, Podman, LXC, containerd and CRI-O. The best-known CE is probably Docker: it has been widely adopted in the industry in recent years, serving as an easy entrypoint for new developers while simultaneously backing the largest Container-Orchestrators like Docker-Swarm, Nomad and Kubernetes (although Kubernetes deprecated Docker as a runtime with v1.20).
Podman emerged as a daemon-less alternative to Docker, developed by Red Hat. A notable feature is that it doesn’t necessarily require root privileges to run containers; it can also run pods, which are groups of containers that share the same resource pool.
CRI-O is the newest of the listed CEs, developed specifically for Kubernetes as a lightweight alternative to Docker.
Running as root
When building container images, it is important to consider that the kernel is shared between container and host. This means that if a process running as root is compromised in the container, it also has root access to any resources assigned from the host, such as bind-mounts. It is advisable to always run the processes inside containers as a user with the least amount of privileges possible.
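A minimal Dockerfile sketch of this idea (user/group names, binary name and paths are placeholders): create an unprivileged user and switch to it before the process starts.

```dockerfile
FROM alpine:latest
# Create a system group and user with no login privileges
RUN addgroup -S app && adduser -S -G app app
# Copy the application binary, owned by the unprivileged user
COPY --chown=app:app ./server /usr/local/bin/server
# All following instructions and the running process use this user
USER app
CMD ["server"]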
Mounting as read-only
For container images that only need to access static config files, it is good practice to mount those as read only.
docker run -v volume-name:/path/in/container:ro my/image
This can protect them from unwanted changes from the container or even from ransomware.
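Beyond read-only mounts, Docker can also make a container's entire root filesystem read-only; a sketch (image name is a placeholder), with a tmpfs mounted for paths the process still needs to write to:

```shell
# Read-only root filesystem, writable scratch space only under /tmp
docker run --read-only --tmpfs /tmp my/image
```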
Limiting resources
CEs and Container-Orchestrators usually have the ability to limit resources for specific containers. This can be useful, for example, if a container is misbehaving or if it is under a Denial of Service (DoS) attack that aims to exhaust resources to disrupt service. While the container itself may need to be recreated, the rest of your system is safe from resource exhaustion.
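With Docker, for instance, such limits can be set per container at start time (image name and the concrete values are placeholders):

```shell
# Cap the container at 512 MB of RAM and half a CPU core
docker run --memory=512m --cpus=0.5 my/image
```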
Limiting kernel capabilities
Limiting kernel capabilities is another way to reduce the risk from a compromised or misbehaving container. Some of the capabilities that can be limited are:
- deny all “mount” operations;
- deny access to raw sockets (to prevent packet spoofing);
- deny access to some filesystem operations, like creating new device nodes, changing the owner of files, or altering attributes (including the immutable flag);
- deny module loading;
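In Docker this is done with the `--cap-drop` and `--cap-add` flags; a common hardening sketch (image name is a placeholder) is to drop everything and add back only what the process actually needs:

```shell
# Drop every capability, then re-add only the one required
# (here: binding to ports below 1024)
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE my/image

# Alternatively, block specific risky operations, e.g. mount
# operations (SYS_ADMIN) and raw sockets (NET_RAW)
docker run --cap-drop=SYS_ADMIN --cap-drop=NET_RAW my/image
```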
Using multi-stage builds
Multi-stage builds can be used to reduce complexity in images: they separate the “building” part from the actual running part of the container.
In the example below, we first use the golang image to build our binary. Since the actual deployment does not need all the functionality of the golang image, we then copy the compiled binary into a much smaller and less complex alpine-based image. This newly created, smaller image containing the compiled binary is then used in the actual deployment.
# syntax=docker/dockerfile:1
FROM golang:1.16
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go ./
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app ./
CMD ["./app"]
Healthchecks
Healthchecks are usually small scripts that run periodically to check whether a container is behaving correctly. One of the simplest examples of a healthcheck can be found below.
HEALTHCHECK --interval=5m --timeout=3s \
  CMD curl -f http://localhost/ || exit 1
Here, a webserver is queried every 5 minutes; if it does not respond within 3 seconds, the container is considered “unhealthy”. The data from the healthchecks can then be further utilized, for example, to automatically recreate failing containers.
Keeping stuff updated
Of course, the same general security measures that apply to conventional software also apply to containers, especially keeping your CEs, Orchestrators and container images up to date.
One container that can help you keep your containers updated is Watchtower. It can automatically detect whether newer versions of your containers' images are available and then replace your old containers with new ones.
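A typical way to run Watchtower is as a container itself, with access to the Docker socket so it can manage the other containers (a sketch based on Watchtower's documented setup):

```shell
# Watchtower needs the Docker socket to inspect and replace containers
docker run -d --name watchtower \
  -v /var/run/docker.sock:/var/run/docker.sock \
  containrrr/watchtower
```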
Containers can be a great way to dynamically and securely deploy and scale your workloads across your infrastructure. But it is still important to remember that while containers are virtualized, vulnerable ones still pose a risk to your systems as a whole. So keeping up with best practices to ensure maximum security remains important.