
Never log in to your production containers


We have to stop treating production (virtual) machines as if they were desktops – and that means resisting the urge to log in to the running machine once it begins its production lifecycle. Later in this post I will contrast the way VMs are managed with the way Linux containers are managed by solutions like Docker.

Every time you log in to a machine, no matter for what reason, you create side-effects that you are unaware of. Over time, the state of the machine will deviate from the desired state in which it started its lifecycle. This is the root cause of many nasty problems, some of which can be very difficult to diagnose. Seemingly innocuous commands typed in the machine’s console can wreak havoc right away (if you are lucky) or linger for months before they create disruption (if you are unlucky). A surprisingly common example is changing the permissions on a directory to enable some other operation, then forgetting to change it back. There was one such situation reported at Google many years ago, when someone removed the executable permission on the directory containing the Linux dynamic loader, causing several machines to lose the ability to exec() any binaries, including their own health monitoring agents. Fortunately Google had a sufficiently resilient design that the impact of this disruption was not noticed by end users. But I have been in several customer crit-sits (critical situations) where similar accidentally introduced problems have taken business applications down.

But how can we avoid logging in to a production VM? Don’t we need to install/update software inside it and start/stop services within it? Yes and yes, but you don’t need to log in to the production VM to do it.

If you need to install/update software in a production VM, follow these steps instead:

  1. Start a maintenance instance of the VM image you created the production VM instance from. This assumes you are following the best practice I blogged about in an earlier post, so that every uniquely configured VM instance in your production environment has a corresponding VM image from which it was deployed.
  2. Install/update and test this maintenance VM instance. Then shut it down and capture it as a new, updated VM image. This would be a good time to version the VM image so you can compare/diff it, track the provenance of changes you made to it, etc.
  3. Deploy an updated VM instance from this new image, in place of the original production VM instance. You may need a small downtime window to do this swap, depending on how your application is set up. A command-line sketch of this workflow follows below.
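
As a concrete illustration, here is a hedged sketch of this maintenance workflow using the AWS EC2 command line as one example environment; the image IDs, instance IDs, instance type, and image name are hypothetical placeholders, and other clouds and hypervisors have equivalent operations.

# 1. Start a maintenance instance from the image behind the production VM.
aws ec2 run-instances --image-id ami-0prod0001 --count 1 --instance-type m3.medium

# 2. Apply and test the update on that maintenance instance, then stop it and
#    capture it as a new, versioned image.
aws ec2 stop-instances --instance-ids i-0maint0001
aws ec2 create-image --instance-id i-0maint0001 --name "app-frontend-v42"

# 3. Launch a replacement production instance from the new image and retire
#    the old one during a short maintenance window.
aws ec2 run-instances --image-id ami-0prod0002 --count 1 --instance-type m3.medium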

To start/stop/restart services inside the VM, a better approach is to install a utility that listens on a designated port for start/stop/restart commands and executes them locally.
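
A minimal sketch of such a listener, assuming the service is managed by systemd; the port, unit name, and netcat usage are hypothetical choices, and a real deployment would restrict the port to a trusted management network and add authentication.

#!/bin/sh
# Accept one-line commands on port 9000 and map them onto systemctl actions
# for a single service. Netcat option syntax varies between netcat variants.
while true; do
  cmd=$(nc -l -p 9000 | head -n 1 | tr -d '\r')
  case "$cmd" in
    start|stop|restart) systemctl "$cmd" myapp.service ;;
    status)             systemctl is-active myapp.service ;;
    *)                  echo "ignoring unknown command: $cmd" >&2 ;;
  esac
done

An operator or orchestration tool can then drive the service from outside, for example with echo restart | nc vm-host 9000, without ever opening a shell inside the VM.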

The key point here is this. Think of a VM image as the handoff between the Dev and Ops halves of your application lifecycle. It should contain all of the software environment necessary for execution, maintenance, and monitoring. No externally induced side-effects should be permitted once a VM image is instantiated as a running VM instance. Following this simple rule can improve the manageability of your operational environment a lot more than you think.

A good analogy is to think about a VM image as the a.out binary produced by a compilation process. When you run the binary, you get a process. You don’t “log in” to a running process – indeed there is no such capability. And that is a good thing, because the process’s runtime behavior is then governed by the state of that exact a.out binary, which in turn is governed by the exact source code version that was used to build it.

I hate to say this, but deployment tools like Chef and Puppet violate this simple principle. They make the deployment process that is under their control more repeatable and robust, but they induce side effects on the system that are not modeled in the deployment recipe and therefore remain invisible to the tool chain. The right way to use these tools is to integrate them with VM image build tools like Ubuntu VM-Builder so that executing a deployment recipe results in a VM image, not a running VM instance. That VM image then represents the fully realized system image produced by a deployment recipe, in exactly the sense that an a.out binary corresponds to the source code from which it was compiled.
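
For example, here is a hedged sketch of baking a Chef recipe into an image with Ubuntu vm-builder; the option names are indicative of vm-builder at the time of writing, and the cookbook files and script names are hypothetical.

# bake-chef.sh is executed by vm-builder inside the new image's chroot, so the
# recipe converges the image being built, not a live machine.
cat > bake-chef.sh <<'EOF'
#!/bin/sh
chef-solo -c /tmp/solo.rb -j /tmp/node.json
EOF
chmod +x bake-chef.sh

# --copy takes a file listing "source destination" pairs, used here to place
# the cookbooks, solo.rb, and node.json into the image before the script runs.
sudo vmbuilder kvm ubuntu \
  --suite=trusty --arch=amd64 \
  --addpkg=chef \
  --copy=files-to-copy.txt \
  --execscript=./bake-chef.sh

The output is a VM image with the recipe’s end state baked in, which you can then version and instantiate directly instead of converging a freshly booted instance.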

How Docker got it right

I have been tinkering with Linux containers and Docker recently, and one thing that really struck me was how Docker has followed this simple principle with (Docker) images and (Docker) containers. You can technically “log in” to a Docker container (get a tty into it) with this command:

docker run -t -i <myimage> <myshell>

But there is little need to ever do this, because a variety of Docker commands allow you to peek and poke the container from outside, without ever logging in to a shell within the container.  For example, you can stop/start the processes within a container (docker stop, docker start), watch events within the container (docker events, docker top), get logs from a container (docker logs), peek at a container’s current configured state (docker inspect), and even see what files changed since you started a container (docker diff).
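
For illustration, here is a short session that drives a container entirely from the outside; the image and container names are hypothetical.

docker run -d --name web myimage      # start the container in the background
docker top web                        # processes currently running inside it
docker logs web                       # output from its main process
docker inspect web                    # its full configured state, as JSON
docker diff web                       # files added/changed/deleted since start
docker stop web && docker start web   # restart it, still without a shell inside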

These utilities make Docker more than just a user-friendly wrapper around Linux containers. It is a fundamentally different abstraction for managing the lifecycle of applications – one where the image is treated as the immutable contract between the Dev (e.g. docker build) and Ops (e.g. docker run) halves of the DevOps lifecycle. This has the potential to disrupt the VM-instance centric ecosystem of tools and platforms that are in vogue today.

References

  1. Chef – IT Automation for Speed and Awesomeness
  2. Docker – Build, Ship, Run Any App Anywhere
  3. Linux containers

Index and version your VM images


Most people think about VM images as black boxes, whose contents only matter when the image is instantiated as a VM instance. Even virtualization-savvy customers treat a VM image as nothing more than a disk-in-a-file, more of a storage and transportation nuisance than anything of significant value to their IT operations. In fact, it is common practice to use VM images only for the basic OS layer: all of the middleware and applications are installed using deployment tools (like Chef, Puppet, etc.) after the OS image is instantiated. Thus, a single “master” VM image is used to create many VM instances that each have a different personality. Occasionally, VM images are used to snapshot the known good state of a running VM. But even so, the snapshot images are archived as unstructured disks, and management tools are generally unaware of the semantically rich file system level information locked within.

There is a smarter way to use VM images that can result in many improvements in the way a data center environment is managed. Instead of a 1:N mapping of images to instances (a single image from which N uniquely configured instances are created), consider for a moment what would happen if we had an N:N mapping. In order to create a uniquely configured VM instance, you first create a VM image that contains that configuration (OS, middleware, and applications, all fully configured to give the image that unique personality), then you instantiate it. If you need to create many instances of the same configuration, you can start multiple instances of the unique VM image containing that configuration, as before. The invariant you want to enforce is that for every uniquely configured machine in your data center, you have a VM image that contains that exact configuration.
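
As a sketch of this N:N discipline, again using the AWS CLI as one example environment (all names and IDs hypothetical): every unique configuration gets its own captured image, and instances are only ever launched from those images.

aws ec2 create-image --instance-id i-0builder01 --name "web-tier-v7"   # one image per unique configuration
aws ec2 create-image --instance-id i-0builder02 --name "db-tier-v3"

aws ec2 run-instances --image-id ami-0webtier7 --count 4 --instance-type m3.medium   # N instances of one configuration
aws ec2 run-instances --image-id ami-0dbtier3 --count 1 --instance-type m3.large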

This is very useful for a number of reasons:

  1. Your VM images are a concrete representation of the “desired state” you intended each image’s VM instances to have when you first instantiated them. This is valuable in drift detection: understanding if any of those instances have deviated from this desired state, and therefore may need attention. The VM image provides a valuable reference point for problem diagnosis of running instances (see the sketch after this list).
  2. You can index the file system contents of your VM images without perturbing the running VM instances that were launched from them. This is useful in optimizing compliance and security scanning operations in a data center. For example, if a running VM instance only touches 2% of the originally deployed file system state, then you only need to do an online scan of this 2% in the running VM instance. The offline scan results for the remaining 98% of the file system can be taken from the VM image that the instance was started from. This could result in smaller maintenance windows. The same optimization also applies to the indexing of other file system state, such as the contents of important configuration files within VM instances.
  3. You can version VM images just like you version source code. VM image build and update tools can work with branches, tag versions, compare/diff versions, etc. These are very useful in determining the provenance of changes made to a system. The ability to track the evolution of images may also be useful in determining how a security problem manifested itself over time.
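
To make the drift-detection point concrete, here is a hedged sketch using the libguestfs tooling discussed later in this post; the file names are hypothetical, a live disk copied this way is not crash-consistent, and the exact options may vary by version.

# Copy the running instance's disk to a scratch snapshot, then compare it
# file-by-file against the versioned image it was deployed from.
qemu-img convert -O qcow2 /var/lib/libvirt/images/app01.qcow2 /tmp/app01-now.qcow2
virt-diff -a image-library/app01-v3.qcow2 -A /tmp/app01-now.qcow2
# Any files reported differ from the desired state captured in the image.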

Many years ago, my team developed a system called Mirage, which was designed to be a VM image library providing these capabilities. At the lowest level, you could think of Mirage as a Git for VM images: it used a similar design to reduce the storage required to keep thousands of VM images, by exploiting the file-level redundancies that exist across images. In addition, it provided Git-like version control APIs, enabling operations like compare, branching, tagging, and so on.

Here is a diagram showing the use of VM image version control:

[Figure: VM image version control, with OS, middleware, and application teams collaborating on one image]

The scenario above shows three different people, whose roles are to maintain and update three different layers of the software stack. This is a common situation in many Enterprises and IT Services organizations. Traditionally, only the “OS Admin” team creates VM images – the others merely instantiate that image and then install/configure their respective software layer within the running instance. With Mirage, there is an incentive for all three teams to collaboratively develop a VM image, similar to the way a development team with different responsibilities creates a single integrated application. Working with large VM images is very simple and fast with Mirage, because most operations are performed on image manifests, which are metadata about an image’s file system contents automatically extracted by Mirage.

The key insight in engineering Mirage was the realization that a block-level representation of a VM image is much clunkier than a file-level representation. The former is good for transporting an image to a host to be instantiated as a running VM instance (for example, you can use copy-on-write to demand-page disk blocks into a local cache kept on the host). But the latter is better for installation and maintenance operations, because it exposes the internal file system contents contained within the disk image.
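
As an example of what the file-level view enables, the libguestfs tools described below can list and read files inside a disk image without booting it; the image name and paths are hypothetical.

virt-ls -lR -a app01-v3.qcow2 /etc/nginx             # list a directory tree inside the image
virt-cat -a app01-v3.qcow2 /etc/nginx/nginx.conf     # read a config file straight from the image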

When an image is imported, Mirage indexes the file system contents of the image’s disk. The libguestfs library is an excellent utility over which such a capability can be built today (at the time we built the first Mirage prototype, this library was in its infancy). Here is an overview of how the indexing process works:

[Figure: overview of the Mirage image indexing process]

The file system metadata (including the disk, partition table, and file system structure) is preserved as a stripped-down VM image in which every file is truncated to zero length. Mirage indexes this content-less structure into an image metadata manifest that it consults to provide various services. The contents of each file are first hashed (we used SHA1); if a hash is not already known, the corresponding contents are stored. Such a content-addressed store is similar to the one used by systems like Git, which achieves storage efficiency by exploiting file content redundancy. The mapping between file path names and their corresponding hashes is maintained in the image metadata manifest.
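
Here is a minimal sketch of this per-file content-addressed scheme, assuming the image’s file tree has already been extracted to a local directory; the directory layout and manifest format are hypothetical simplifications, not Mirage’s actual on-disk format.

#!/bin/sh
# Walk an extracted image tree, store each file's contents once under its
# SHA1 hash, and record the path-to-hash mapping in a manifest.
IMAGE_ROOT=./extracted-image
STORE=./content-store
MANIFEST=./image-manifest.txt

mkdir -p "$STORE"
find "$IMAGE_ROOT" -type f | while read -r f; do
  hash=$(sha1sum "$f" | cut -d' ' -f1)
  # Store contents only if this hash has not been seen before (dedup across
  # files, and across images sharing the same store).
  [ -e "$STORE/$hash" ] || cp "$f" "$STORE/$hash"
  printf '%s %s\n' "${f#$IMAGE_ROOT}" "$hash" >> "$MANIFEST"
done

Identical files across thousands of images then occupy the store only once, which is where most of the storage savings come from.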

The Mirage VM image library was a very successful project at IBM. It now forms the core of the IBM Research Compute Cloud, the cloud infrastructure used by thousands of IBM Research employees around the world (four data centers spread across multiple geographic zones). It is also the nucleus of the IBM Virtual Image Library, a product used by many enterprise customers to manage large VM environments.

Fast forward to today, and we see Linux containers emerging as a viable alternative (some would argue a complement) to VMs as a vehicle to encapsulate and isolate applications. Tools like Docker that build on Linux containers are taking the right direction here. With Docker, you build a separate Docker image per uniquely configured Docker container. This allows Docker to provide valuable image-level utilities (e.g. docker diff). What Docker needs now is a Git for Docker images – something like Mirage, except for Linux container images rather than VM images. Many of the core concepts used in Mirage would also be useful here.
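
In Docker terms, the image-per-configuration pattern and the image-level utilities it enables look like this; the names and tags are hypothetical.

docker build -t myapp:v1 .          # one immutable image per unique configuration
docker run -d --name web1 myapp:v1
docker diff web1                    # drift relative to the image is directly visible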

References

  1. Virtual Machine Images as Structured Data: the Mirage Image Library. Glenn Ammons, Vasanth Bala, Todd Mummert, Darrell Reimer, Xiaolan Zhang. USENIX HotCloud. 2011.
  2. Libguestfs – tools for accessing and modifying Virtual Machine disk images.