Parallelising the Unparallelisable

Launchpad has a lot of tests: almost 20,000. Some make sure the internals work as expected, some verify that the JavaScript works in web browsers, and the rest cover everything in between. In a perfect world those tests would take only seconds to run. In this world they take hours: six hours on our current continuous integration machines, for instance.

These long-running tests significantly slow down the development and deployment of changes to Launchpad. We would like to improve the situation.

Given that the test cases are theoretically independent of one another, the obvious thing to do is to run the tests in parallel on a multi-core machine. Unfortunately many of the tests interact with the environment (databases, memcached, temporary directories, etc.) and conflict if run simultaneously.

Enter LXC

What we need is a way to isolate the test processes from one another. Virtual machines would allow us to do that, but their overhead and heavyweight setup make them unappealing. That’s where LXC (Linux Containers) comes in handy. LXC allows the easy creation of “containers” that are isolated from the “host” machine without the performance overhead of VMs.

For example, to create a new container use lxc-create:

lxc-create -n test -t ubuntu

The container can then be started:

lxc-start -n test -d

And we can connect to it via SSH (using the default username and password shown during creation, if applicable):

ssh ubuntu@test
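
A container started in the background needs a few moments to boot, so an SSH connection attempted straight after lxc-start may be refused. Here is a minimal sketch of one way to wait for it. The retry helper is hypothetical and not part of LXC, and the example assumes the host can resolve the container name "test":

```shell
# retry ATTEMPTS DELAY COMMAND...
# Hypothetical helper: run COMMAND until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds after each failed try.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Example (not run here): poll until sshd in the "test" container answers.
# retry 30 2 ssh -o ConnectTimeout=2 ubuntu@test true
```

Once the polling command succeeds, the interactive ssh session shown above should connect without being refused.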

There are many options for customising the containers, including mounting a portion of the host’s file system in the container so sharing files between the two is easy.
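
One way to share a directory is a bind-mount entry in the container's configuration file. The sketch below only prints such an entry. The directory /srv/launchpad and the helper name bind_mount_entry are illustrative assumptions, and the config path in the final comment is the usual default location rather than something stated in this post:

```shell
# bind_mount_entry HOST_DIR
# Hypothetical helper: print an lxc.mount.entry line that bind-mounts
# HOST_DIR at the same path inside the container. The target is written
# without a leading slash, so it is relative to the container's rootfs.
bind_mount_entry() {
    host_dir=$1
    target=${host_dir#/}
    printf 'lxc.mount.entry = %s %s none bind 0 0\n' "$host_dir" "$target"
}

# Example (not run here; needs root and an existing "test" container):
# bind_mount_entry /srv/launchpad | sudo tee -a /var/lib/lxc/test/config
```

The target directory generally has to exist inside the container before it is started.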

Getting Ephemeral

All this works well for isolated, parallel test runs, but setting up and managing eight or more containers (one per core) is off-putting, so we have used (and improved) a new LXC feature: “ephemeral” containers, created with lxc-start-ephemeral.

Ephemeral containers are “clones” of a base container. Each can have a temporary file system that reflects the contents of the base container, but any writes are stored in memory rather than written to disk. This allows us to install Launchpad in a single base container and then spawn many ephemeral containers, each with its own list of tests to run.

The ephemeral containers can then write to their local file systems without interfering with the others running simultaneously. The containers may also benefit from faster I/O because file system changes are stored in memory.
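
The fan-out described above can be sketched roughly as follows. This is not our actual tooling: the base container name lp-base, the run-tests command with its --load-list option, and the file names are all hypothetical, and the lxc-start-ephemeral options shown (-o for the base container, -n for the new name) should be checked against the installed LXC version.

```shell
# partition_tests LIST_FILE WORKERS OUT_DIR
# Deal the test names in LIST_FILE (one per line) round-robin into
# OUT_DIR/worker-1.txt ... OUT_DIR/worker-N.txt, so that each ephemeral
# container gets its own list of roughly equal length.
partition_tests() {
    list_file=$1
    workers=$2
    out_dir=$3
    mkdir -p "$out_dir"
    awk -v n="$workers" -v dir="$out_dir" '
        { print > (dir "/worker-" ((NR - 1) % n + 1) ".txt") }
    ' "$list_file"
}

# spawn_commands WORKERS LISTS_DIR
# Print (rather than run) one lxc-start-ephemeral command per worker.
# Assumes LISTS_DIR is visible inside the containers, for example via
# a bind mount from the host.
spawn_commands() {
    workers=$1
    lists_dir=$2
    i=1
    while [ "$i" -le "$workers" ]; do
        echo "lxc-start-ephemeral -o lp-base -n lp-worker-$i --" \
             "run-tests --load-list $lists_dir/worker-$i.txt"
        i=$((i + 1))
    done
}
```

To run the workers in parallel, each printed command would be started in the background from the host shell, followed by a single wait. Dealing tests out round-robin is the simplest split; balancing by recorded test duration would probably even out the finish times better.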

Results

We are still working out the kinks in our approach and wrestling with the occasional LXC bug, as well as bugs in the Launchpad test suite itself. Even so, we have already shortened a full test run on an eight-core EC2 instance to 45 minutes, a substantial improvement over the current six hours.
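
For scale, those two figures work out to an eightfold reduction in wall-clock time. The comparison is only rough, since the six-hour figure comes from the existing continuous integration machines rather than from the same EC2 instance. The helper name below is illustrative:

```shell
# speedup BEFORE_MINUTES AFTER_MINUTES
# Print the integer speed-up factor (rounded down).
speedup() {
    echo $(( $1 / $2 ))
}

speedup $((6 * 60)) 45   # six hours versus 45 minutes; prints 8
```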

(Image by Tolka Rova, Creative Commons license)

6 Responses to “Parallelising the Unparallelisable”

  1. Julian Edwards Says:

    Fabulous work guys!

  2. Ludovic Claude Says:

    Could juju be used here to manage the containers and deploy services on top of them? It would be great to use juju to test your services on your local machine, then keep the same tool and scripts to deploy those services on your servers or on a cloud.

  3. Benji York Says:

    Ludovic Claude wrote:
    > Could juju be used here to manage the containers and deploy services
    > on top of them?

    It could be, and is! Juju has been great to use for this project.

  4. Launchpad Blog Says:

    […] « Parallelising the Unparallelisable […]

  5. Ubuntu Weekly Newsletter Issue 259 | Ubuntu Linux FAQs Says:

    […] http://blog.launchpad.net/general/parallelising-the-unparallelisable […]

  6. LXC – Linux Container « Tommy Brander Says:

    […] http://blog.launchpad.net/general/parallelising-the-unparallelisable […]
