Sunday, 23 August 2015

Mongrel and Docker: the power of containers

Last time, we looked at how we got the Mongrel2 webserver started in a Docker container. It was a very simple setup, with a single container running an instance of Mongrel2 and a few bits and pieces of static content.

This time, we're going to look at what makes Mongrel2 so interesting, and why I think Docker suits it perfectly as a deployment mechanism. We'll run over the basics of handlers, I'll summarise the handler that I created, and we'll see how Docker helps to deploy the whole thing.

Handlers

Mongrel2 doesn't deploy applications in the same way that, say, servlet containers do, and it doesn't do any processing of code itself in the way that PHP applications might. Instead, it has a construct called a handler. These are specific paths defined in the server configuration that, when requested, cause Mongrel2 to construct a message for the ZeroMQ messaging framework and push it onto a socket. A dedicated application reads the message from that socket, takes any necessary action, and then responds to the Mongrel2 server by publishing a ZeroMQ message back on a second socket.

Handler application: "thought for the day"

In this case, I've only constructed one handler - an incredibly simple one that could easily have been managed in other ways, but we'll test the water slowly. It's a simple "thought for the day" generator that returns a JSON object containing a quotation and a source for that quotation.

The code for this handler isn't checked into GitHub yet, but it's very simple. There's a single, looping process that waits for messages and returns one of a random set of quotations whenever it receives one. It's just a little jar file that gets executed and stays up until terminated.
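
Purely as an illustration of the shape of it - the real handler isn't published yet - here's a minimal sketch in Scala, assuming the JeroMQ bindings for ZeroMQ; the hostname, which port does what, and the wire-format details are my assumptions rather than a copy of the real thing:

    import org.zeromq.{SocketType, ZContext}
    import scala.util.Random

    object ThoughtHandler extends App {
      // A couple of canned thoughts; the real handler has its own set.
      val thoughts = Seq(
        """{"quotation": "Simplicity is the ultimate sophistication.", "source": "Leonardo da Vinci"}""",
        """{"quotation": "Premature optimisation is the root of all evil.", "source": "Donald Knuth"}"""
      )

      val ctx = new ZContext()
      val requests  = ctx.createSocket(SocketType.PULL) // Mongrel2 pushes requests to us here
      val responses = ctx.createSocket(SocketType.PUB)  // and listens for our replies here
      // Ports match the ones exposed in the compose file later in this post; the hostname,
      // and the handler identity the real protocol sets on the PUB socket, are assumptions.
      requests.connect("tcp://mongrel2:5557")
      responses.connect("tcp://mongrel2:5558")

      while (true) {
        // A request frame starts "SENDER_UUID CONN_ID PATH ..."; we only need the first two fields.
        val parts = new String(requests.recv()).split(" ", 4)
        val (sender, connId) = (parts(0), parts(1))
        val body = thoughts(Random.nextInt(thoughts.size))
        val http = s"HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: ${body.length}\r\n\r\n$body"
        // Reply framing (sender uuid, netstring of connection ids, then the raw HTTP response)
        // as I understand it from the Mongrel2 manual - worth double-checking there.
        responses.send(s"$sender ${connId.length}:$connId, $http")
      }
    }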

Accessing the handler

Accessing the handler from the frontend is relatively easy: I've wired up a simple AngularJS controller that just grabs the JSON object from the /thought path and plugs it into some HTML on the front page. Nothing too fancy.


Putting it all together

So now we have an infrastructure that looks like this:
Mongrel2 and the handler process are both running in docker containers. The communication ports for the two of them are exposed within the docker engine, but not outside of it. Mongrel's main access point (port 6767, in this case) is mapped to port 80 on the virtual machine and exposed to the outside world.

However, we're still not quite done yet. To make things even easier, we can use Docker Compose to describe this entire setup, and suddenly we're able to build, deploy and start all of our containers from a single command. Again, we're not at a particularly complex level yet: this single file

mongrel2:
  build: ./mongrel2-main
  ports:
    - "80:6767"
  expose:
    - "5557"
    - "5558"
samplehandler:
  build: ./sample-handler
  links:
    - mongrel2

will use the information in the two dockerfiles to build — from scratch — the entire application above and deploy it.
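
Running it is then just a couple of commands from the directory containing that file (up will build the images itself if they aren't there yet, but being explicit does no harm):

    docker-compose build
    docker-compose up -d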

And that is an amazing tool, one which will give us the ability to add new pieces to our infrastructure quickly and easily.

Saturday, 22 August 2015

Getting into docker: the simple case

So last time I'd been left in the situation of moving from Vagrant over to Docker. And I found myself really appreciating what Docker was doing, and beginning to get my head around what it's capable of. There's still a long way to go in order to use it properly, but I think I'm beginning to get the basics.

I ended up in a situation where I just about had the Mongrel2 webserver building from a Dockerfile and starting up in a container.

I've managed to take that a little further in the right direction now.

Step 1: Getting Mongrel to start properly — and keep running
I touched on it last time, but the thing you really have to nail to get Docker to work is the ability to start a single process in a container and keep it running. This really isn't as easy as it could be, given a few of the limitations of the way Docker runs things.

A Docker container will only keep running as long as the process started as PID 1 keeps running. There are a few hacky ways of ensuring this, like piping together a series of shell scripts and finishing by tailing a log file, but that's best avoided. The best way I've found so far is to use the supervisor daemon, which runs as PID 1 and then spins up the processes that you deem necessary. It actually turned out to be easier, with Mongrel2, to have supervisor fire up a second supervising process called procer to manage the startup of the Mongrel2 server itself. There might be a better way of doing it, but this seems to work, and (in theory) gives a layer of resiliency to the Mongrel2 process by granting automatic restarts in case the process dies on us.
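
For illustration, the supervisor side of that looks something like the config below. This is a rough sketch only: it skips the procer step, and the paths and the server UUID placeholder are mine, not the contents of the repo.

    [supervisord]
    ; keep supervisord in the foreground so it stays on PID 1
    nodaemon=true

    [program:mongrel2]
    ; <server-uuid> stands in for the server UUID defined in the Mongrel2 config
    command=/usr/local/bin/mongrel2 /app/config.sqlite <server-uuid>
    directory=/app
    ; bring Mongrel2 back up if it dies on us
    autorestart=true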

Step 2: Getting some static content in there
The next step was to get some static content onto the site. That was pretty easy, and we ended up with an infrastructure that looked a bit like this.

The Dockerfile controlling the Mongrel2 server pulled all the necessary files to install Mongrel2 and the dependencies, copied in a set of static files, and finally started the server.
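
In outline it looks something like this - a sketch only, with an assumed base image, package list and layout rather than the exact contents of the repo:

    FROM debian:jessie

    # build dependencies for ZeroMQ and Mongrel2 (the list here is approximate)
    RUN apt-get update && apt-get install -y build-essential git libsqlite3-dev uuid-dev

    # fetch and build ZeroMQ and Mongrel2 from source, and install supervisor (steps elided)

    # copy in the static site content and the supervisor config
    COPY static/ /app/static/
    COPY supervisord.conf /etc/supervisord.conf

    # supervisor stays on PID 1 and keeps Mongrel2 running
    CMD ["supervisord", "-n", "-c", "/etc/supervisord.conf"]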

Next steps:
This is all very well, but it seems like a lot of effort to go to in order to get some static content served up by a webserver - and it is. Next up is the first handler. Handlers are the independent programs that make Mongrel2 interesting, and I believe they are perfectly suited to running on a containerised platform.

Thursday, 6 August 2015

Docker vs Vagrant - Round 1

Having kicked around Vagrant (https://docs.vagrantup.com/v2/) for a while, I was finally persuaded by a colleague to try out Docker instead, just to see what the competition was like.
... so, I spent a while working out ways of setting up a virtual machine / container to run an instance of the mongrel2 webserver, just to see how they compared.
The results are now in and ... well, it's a bit of a mixed bag, to be honest.
The Dockerfile and associated project are up and available on github, at https://github.com/nihilogist/docker-mongrel2, for those interested.

Setup

How easy are the two different solutions to set up?

Vagrant 

It's really easy. You write a few scripts to download and install the software you need - in this case it's a little more involved, as you need to grab ZeroMQ and then build Mongrel2 from source, but it's not hard at all. You copy what you need over to the VM and start the server. Great.
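
The guts of the provisioning script boil down to something like this - a rough sketch from memory, so package names and build steps may need adjusting:

    # install the build tools, then build and install ZeroMQ from source (steps elided)
    apt-get update && apt-get install -y build-essential git libsqlite3-dev uuid-dev
    # then build Mongrel2 itself
    git clone https://github.com/mongrel2/mongrel2.git
    cd mongrel2 && make && make install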

Docker

It took a while, I have to say. I found it a good deal harder to get my head around the way that a container works. Especially coming from Vagrant, which is purely and simply a way of managing VMs, it took some getting used to the idea that the container only persists as long as the process running as PID 1 is running. This means there were a few extra steps needed to ensure that PID 1 keeps running, but happily there are plenty of tools to help with this.

Running

How do the two containers run?

Vagrant

You have a full VM running. You can ssh into it. You can see it running in VirtualBox (if that's the provider you're using). You can use it in exactly the ways you'd use any other virtual machine. But it is pretty heavy: it takes a while to boot up, though it seems pretty solid once it's up, as you'd expect.

Docker

It's just a container - there are lots of limits on what you can do with it and what you can't. Well, not so much limits as recommendations. Using Vagrant, I think you could easily be tempted to start up a whole load of extra processes on the VM, just because it's easy, and if you have the machine running, then why not? Docker, on the other hand, really wants you to dedicate each container to a single process. It's lightweight, and it starts in an instant.

Which do I prefer?

Well, it's a little early in the day, but I think that Docker has won me over. I've still got a heck of a lot to explore, like getting data volumes to work properly, but I'm really enjoying using it.
I'm also keen to explore the way that mongrel2 wants you to use many small services, and I think that Docker is perfectly aligned with that - each container running a single service, but running it well.

Wednesday, 27 May 2015

Automated deployments: What are the components of a system?

I've been thinking a lot recently about software deployments, particularly in my current project, which is a (relatively) standard Java EE application: application server, database, messaging framework, blah blah blah. As I'm sure a lot of projects do, we have a hotch-potch of assorted build scripts, deployment scripts and semi-automated configuration scripts.

Two or three members of the project team try to manage deployments to test systems, and have currently managed to write several thousand lines of code which act as a universal way of deploying the artifacts to the various environments which ... doesn't work. These issues seem to crop up on every project - certainly every developer I've worked with recounts stories of projects where weeks of time were lost trying to figure out why code that apparently worked perfectly would not even deploy to a target system.

The upshot of this was that I started trying to think about the process in a more modular fashion; to break it down into smaller components and see if there's a standard pattern that we could apply. And I think I've come up with a few things - some of which are obvious, but we may as well start with the obvious and see where it leads us.

The System
To create this mythical working system we need three things:

  • Environment
  • Build artifacts
  • Deployment processes


Environment: the environment is the actual machines (either physical or virtual) that the software will run on. It includes all third party additions required to run the software, such as databases, web / application servers, messaging systems and so on. It also includes all necessary network configurations such as load balancers, firewalls and host configurations.

Build artifacts: the build artifacts are the final output of building the source code ready for deployment. They may be tailored to fit a specific environment by setting runtime variables, but other than that the source code itself should not need to know about environment-specific details.

Deployment processes: the deployment processes are the description of steps needed to take the build artifacts, transfer them to the relevant environments, and initialise them such that the system is available for use.

These three aspects of the system can - and should - be treated separately when it comes to automating the overall process of creating and deploying environments. For instance, we should not be trying to write code in our build artifacts that generates application servers that are then deployed alongside the application files; they are part and parcel of the environment, and that should be as static as possible for the lifetime of the system. Also, the deployment process should not be part of building the artifacts from the source code: it should be possible to build individual artifacts and deploy them as necessary.
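
As a tiny illustration of the "tailored by setting runtime variables" point above, the artifact itself can stay environment-agnostic by reading its settings when it starts up. A sketch only, with invented names:

    object AppConfig {
      // Values come from the environment at deployment time, not from the build,
      // so the same artifact can be dropped into any environment unchanged.
      val dbUrl: String  = sys.env.getOrElse("APP_DB_URL", "jdbc:h2:mem:dev")
      val dbUser: String = sys.env.getOrElse("APP_DB_USER", "dev")
    }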

All these principles stem from the idea of the separation of concerns. Our build process should not interfere with a deployment; our environments should not define the way our software is built from source code.

Monday, 25 May 2015

Testing times

I'm very excited about the upcoming Dev/Test Lab feature that's coming to preview soon in Azure.

A lot of the time in an agile team is spent on testing. Even with a dedicated QA engineer, a moderate-to-major portion of any development work is spent writing tests, running tests, and just flat out checking that the stuff you've written works.

It's not always easy to do that just on your local workstation, even if you're lucky enough to have a stupidly powerful machine at your disposal (spoiler alert for employers: this really does make a difference :) )

On the other hand, you can't really have a bunch of test machines, with all the associated maintenance and other costs, available for the development team to use on a whim, or because the QA team are running extended regression tests on an earlier build.

Dev/Test Lab might just help with that.

There are a bunch of tools that can be used with public or private clouds to simplify the provisioning and deployment of virtual machines and - given that many production environments now run on similarly virtualised equipment - this is a very reasonable way to run your test environments. What really has my interest in this offering from Azure is the tooling around it - making it easy to bring up and shut down systems, as well as manage a team's budget. (Sadly, this is reality, and your managers don't tend to appreciate you racking up $50,000 in VM fees because you forgot to shut down the load test system before going on holiday).

So yeah - I'm very much looking forward to kicking the tires on the preview. I might even find time to write about it.

Sunday, 19 April 2015

Building sandcastles part 4: Azure

Continuing the adventures in continuous integration, I'm now looking into how services like AWS or Azure could really help dev teams along. A lot has been written about this before now, but while there seem to be lots of places talking about how easy AWS makes their dev / test / deploy cycle, I haven't seen too many places talking about all the components you might need.

So, looking at it from the ground up, we need, in the simplest terms:

  • Somewhere to store our code
  • Somewhere to build our code
  • Somewhere to test our builds and artifacts
  • Somewhere to host our final builds


This translates to:

  • Some kind of SCM system
  • Some kind of continuous integration system
  • A private set of machines to deploy our applications to (let's assume that we're building a web application, so we'll need a webserver and a database, minimum)
  • A public set of machines to deploy the applications to



After mucking around with the trial offerings of Azure and AWS, I decided to run a few experiments with Azure, and have so far ended up meeting the first two requirements, with virtual machines running GitLab and TeamCity respectively.
I created a rather simple web application - no database yet - pushed it up to GitLab, and then connected TeamCity to it to run a couple of builds.

So far, so easy.

The next step - which feels like a pretty big one - is to build a test system. Now, it's very easy to set up a virtual machine, set up a webserver and then have a script to deploy the build artifacts from TeamCity over to it, but in an ideal world - where it's not just me working on it - it would be nice to be a lot more flexible than that. What I really want to achieve is to be able to deploy a whole test environment with the click of a button.

That means that we need to start looking at the automated provisioning of virtual machines, because the last thing that anyone wants to do is set up a whole environment every time they want a build. Happily, there are a bunch of tools that can help us with this.

For a first experiment, I'm using Vagrant. It seems to be generally straightforward, and - in theory - relatively portable. It's been somewhat of a pain to get working with Azure, but now that it's running, I seem to be able to deploy machines relatively easily. Next steps are to actually get servers running on them.
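
For anyone wanting to try the same thing, the rough shape of it - glossing over the subscription and certificate setup - is the vagrant-azure provider plugin:

    vagrant plugin install vagrant-azure
    vagrant up --provider=azure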

Monday, 16 February 2015

Building sandcastles, part 3: Scala

(aka: And Now For Something Completely Different)

So, this is less about the continued experimentation in integrating a bunch of SCM and CI systems, and more about random experimentation, because sometimes I just roll that way ;)

Pretty much on a whim, I brought Scala into the mix of technologies used. Following the recommendations of a couple of folk I've spoken to about this, I brought it in to the unit testing layer first, so that I can give it a go without too much impact on the actual written code. To be honest, it felt like a bit of a slog (though, having said that, I did manage to get the first unit test written and passing within an hour, after a beer, so I guess it can't be that hard).

First of all, we had to install Scala and the various IntelliJ plugins. That wasn't so bad. Second was to research how to run a JUnit test in Scala. Again, not too bad - the suggestion is to use ScalaTest and then have your unit tests extend the JUnitSuite class. Writing the unit test? No worries. Well ... not for the really simple ones, anyway. Running the test? Now that was the trick. First of all, you need to make sure that the version of ScalaTest you are using is compatible with the version of Scala that you're using. Then you have to make sure that all the Scala setup you did in the IDE also matches the version of Scala you installed. Finally, you might need to tweak the build steps in the pom.xml file. Sadly, very few of the errors that get thrown when you have these things wrong make much sense.
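
For reference, the shape of a JUnit-style test in ScalaTest is something like this (the class and assertion are invented purely for illustration):

    import org.junit.Test
    import org.junit.Assert.assertEquals
    import org.scalatest.junit.JUnitSuite

    class SimpleMathTest extends JUnitSuite {

      // A plain JUnit test method, written in Scala; runs under the normal JUnit runner.
      @Test def additionShouldWork(): Unit = {
        assertEquals(4, 2 + 2)
      }
    }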

For future reference (for myself, but maybe someone else will find this useful as well), this build configuration in maven seems to work quite well:
    <build>
        <plugins>
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <id>scala-compile</id>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                        <configuration>
                            <args>
                                <arg>-dependencyfile</arg>
                                <arg>${project.build.directory}/.scala_dependencies</arg>
                            </args>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
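
On the dependency side, the ScalaTest artifact's suffix has to match your Scala binary version - something like the pairing below, though check the versions against whatever Scala you've actually installed:

    <dependencies>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>2.11.7</version>
        </dependency>
        <dependency>
            <!-- the _2.11 suffix must match the Scala version above -->
            <groupId>org.scalatest</groupId>
            <artifactId>scalatest_2.11</artifactId>
            <version>2.2.5</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>test</scope>
        </dependency>
    </dependencies>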



Something to be aware of is that the -make:transitive configuration argument seems to be redundant as of Scala 2.11, and actually breaks the compilation. I've seen it mentioned in a couple of places, but taking it out made everything magically work for me... YMMV.

So anyway, I now have a very very simple test committed, which is currently passing. We'll see how things go as I try to rack up the complexity, and start on mocking out various interfaces.