How many processors?

January 31st, 2017

Reading Daniel Bryant’s O’Reilly publication Containerizing Continuous Delivery in Java reminded me of the challenge of determining how processors are available to you when running in a container. In the case of Java, a call to Runtime.getRuntime().availableProcessors() should show this all important information. A quick check reveals that, when called in an unconstrained container, this correctly returns the number of cores on my physical hardware (Docker on Linux) or assigned to the VM containing the Docker Engine (Docker Toolbox or Docker for Windows/Mac). If I used the --cpuset-cpus option on docker run to constrain the cores available to the container then this is also correctly reflected in the value returned. The difficulty arises when access to those CPUs is constrained in other ways.

Take, for example, the new --cpus option in Docker 1.13. Setting this to two on a four-way box, I still get four back from a call to availableProcessors() and rightly so: there are four processors and I may get simultaneous access to all four of them even if the cgroup is then going to make sure that I don’t get that access for more than half of the time. Another potential constraint is a highly multi-tenant environment. If I deploy my test application to Bluemix it tells me that there are 48 processors. That’s great but I’m pretty sure I’m not going to get exclusive access to all of those!

One example we’ve seen where this becomes a real problem is in native memory usage. By default, WebSphere Liberty uses the number of available processors to decide the number of parallel threads it should support. Each of those executors utilises a thread and each of those threads takes up space in native memory. In a containerized environment where total memory is typically constrained (Bluemix containers are sold by the GB/hour) and some generic heuristic is often used to determine the heap size to allocate to the JVM, that can lead to memory exhaustion. That’s why you’ll see a GitHub issue from my colleague Erin that, among other things, proposes setting hard-coding a maximum on the number of threads for the executor service in our Docker images.

Docker 1.13 is out

January 22nd, 2017

Docker 1.13 finally made it out the door earlier this week and I found some time to play around with it this weekend. I shan’t enumerate all of the new features here as the introductory post from Docker does a good job of that (or you can see the release notes for the gory details). Instead I’ll talk a little bit about some of the features that are of particular interest to me.

Top of the list has to be CLI backwards compatibility. It has been a frustration for some time that you’ve had to set the DOCKER_API_VERSION environment variable in order to have a client to talk an engine using an older version of the API. I almost always hit this following an upgrade or when accessing remote engines. It also made it difficult to have an image containing a Docker client, for example to talk to the engine it was running on. You ended up either having to create an image for each API version or trying to work out the engine’s version so you could set the variable appropriately. It’s a shame that compatibility only goes back as far as 1.12 but it’s a step in the right direction.

Another feature that I’ve been holding out for for some time is the --squash option on docker build. The way it has been implemented, this will squash all of the layers from the current build down in to one, preserving the image history in the process. This means that you no longer have to jump through hoops to make sure temporary files introduced in the build are created, used and deleted all in the same Dockerfile command.

I tested the option out on the build for our websphere-liberty images and was initially surprised that it didn’t reduce the size at all. I know that some files get overwritten in subsequent layers but unfortunately that happens across different images e.g. the javaee7 image overwrites some files in the webProfile7 image. Likewise, for our websphere-traditional images we currently have a two-step build process to avoid getting Installation Manager (IM) in the final image. I had hoped that we’d just be able to uninstall IM and then squash the layers but this would only work if we didn’t install IM in a separate base image. Hopefully the squash flag will gain some options in future to control just how many layers are squashed.

Another space-saving feature is the docker system prune command. Yes, pretty much every Docker user probably already had a script to do this using a host of nested commands but, as with the corresponding docker system df command, it’s good to see Docker making this that bit easier for everyone,.

The area of restricting CPU usage for containers has also been something of a black art involving shares, cpusets, quotas and periods. (I should know as we’ve given quite some consideration as to what this means for IBM’s PVU and vCPU pricing models.) It’s therefore great to see the --cpus option being added to docker run to radically simplify this area.

Perhaps the biggest feature in Docker 1.13 has to be the introduction of the Docker Compose V3 file format and the ability to deploy these Compose files directly to a swarm using docker stack deploy. This was a glaring hole when swarm mode was introduced in 1.12. It still sits a little uneasily with me though. Docker Compose started out as a tool for the developer. Despite exposing much the same Docker API, there were a few holes that started to creep in when trying to use the same YAML with a classic Swarm. For example, you really had to be using images from a repository for each node to be able to access them and the inability to specify any sort of scaling in the file meant it wasn’t really of use for actual deployment. The latter problem is, at least, resolved with V3 and swarm mode but only at the expense of moving away from something that feels like it is also of use to the developer. Perhaps experience will show that a combination of Compose file extensibility and Distributed Application Bundles will enable reuse of artifacts between development and deployment.

I don’t wish to end on a negative note though as, all in all, there’s a lot of good stuff in this release. Roll on Docker 1.14!

Using the Docker remote API to retrieve container CPU usage

November 28th, 2016

For reasons that I won’t go in to here, I’ve been interested in the CPU accounting aspect of cgroups for a while and I recently found some time to have a poke at what information is available in the Docker remote API. I was interested in getting hold of the actual CPU time used by a container versus the elapsed time that the container has been running for (where the former would be smaller if the container is not CPU intensive and would potentially be much larger if it’s chewing through multiple cores).

The CLI doesn’t expose the information that I was looking for so my first pass was to define an image with curl and jq:

Build it:

And then run it with a script as follows:

I started out with an Alpine based image but the version of date it comes with wasn’t capable of parsing the ISO format dates returned by the API. This was an interesting exercise in the use of curl with Unix sockets and jq for parsing JSON on the command line but I thought I could do better.

Next step was a rendering of the script above in to golang which you can find over on GitHub. You’ll have to forgive my poor golang – I wouldn’t claim to know the language; this is just a cut-and-shut from numerous sources around the internet. Perhaps the only part worth mentioning is that I explicitly pass an empty version string to the golang Docker library so that you don’t get client-server version mismatch errors.

Having compiled this up in to a static binary I could then build a small image from scratch. I then wanted to build this using Docker Hub automated builds and a binary release on GitHub. This raises the thorny issue of how you make the binary executable once you’ve used ADD to download it in to the image. There is one solution here that adds a very small C binary that can be used to perform the chmod. Having initially employed this method it reminded me of another issue that I’d hit. I’d inadvertently doubled the size of our websphere-traditional images to over 3GB with a recursive chmod (the files get copied in to a new layer with the modified permissions). So, in the end I caved in and checked the binary in to GitHub so I could use a COPY and pick up the correct permissions.

The resulting image, weighing it at just over 4MB, is on Docker Hub. As the instructions say, it can be run with the command:

To test out the image, let’s spin up a container that should burn up the two cores allocated to my Docker for Mac VM:

If we leave it for a few minutes we see an output along the following lines:

The total CPU usage is, as we’d expect, twice the elapsed time. Let’s try again but this time run two containers and use cpuset to constrain them both to a single core:

This time, the results show that each container is getting half of the CPU time:

(Actually, you can see that the one that has been running longer has slightly more than half as it got the CPU to itself for a couple of seconds before the other container started!) Finally, and just for interest, let’s spin up an unconstrained WebSphere Liberty server:

After a minute, we see that it’s used just over 20 seconds of CPU time to start up:

And if we check again after half an hour, we see that without any load, the server has consumed very little extra CPU:

Barcelona Break

October 31st, 2016

BarcelonaIn general, we’re not very good at combining business trips with pleasure but at half term I was due to be in a conference in Madrid for the latter part of the week and Christine was about to start a new collaboration based in Barcelona so we decided to take the children over to Spain for a few days. Things didn’t get off to a great start with a three-hour delay on our Easyjet flight to Barcelona. To be fair, they did let us know of the delay before we left home and thankfully we’d already made arrangements for late arrival at our apartment.

On Sunday we took the metro to the Sagrada Familia, only to discover that it was sold out for the day. We therefore slowly made our way to Park Güell where we had booked in advance for a late afternoon entrance. Christine went off to the University on Monday whilst the children and I headed to the beach. Unfortunately you could barely see the beach for the mist, let alone the cable car across the harbour that we were intending to take. Luckily, as we waited to board the cable car the mist started to clear and by the time we arrived at Montjuïc the sun was out in force.

We spent some time in the Fort which became quite atmospheric when the mist rolled in again off the sea. Our walk down Plaza d’Espanya was cut short when Duncan failed to clear the large muddy puddle at the bottom of a very steep slide!

Christine was working again on Tuesday. Sadly the mist had turned to drizzle and I headed to the Museu Blau with the children (located dangerously close to the OpenStack summit that was kicking off that day!). For a very modern natural history museum, it seem to specialise in glass cases with large numbers of exhibits in them which wasn’t particularly child friendly. The visit was saved by the temporary National Geographic Spinosaurus exhibition.

In the afternoon, we headed back to the Sagrada Familia having booked our tickets in advance this time. The cathedral has gained a very impressive ceiling since I last entered the building about 10 years ago. Although the rain had stopped by this point, unfortunately the damp conditions meant that we weren’t permitted to ascend the towers.

Having handed the children over to Christine on a metro platform, I took the fast train to Madrid, arriving just in time for the speaker dinner. The rest of the family flew back to the UK the following morning.

Running Weekend

October 16th, 2016

Christine XCIt’s been a weekend for running. On Saturday Christine ran at the first of this year’s Hampshire Cross-Country races at Farley Mount. I didn’t feel 100% when I woke up so decided to save myself for Sunday. Although I felt much better by the time the races came round it was probably still a wise decision (not least to reserve some energy for a barn dance in the evening!).

StringerOn Sunday it was Totton RC’s Stinger which meant a return to Ocknell. It had been raining heavily during the night and it was still going as we drove to the event. The sun had come out by the start so, although wet underfoot, it was actually quite warm.

Ocknell MudI was slightly alarmed to be in the lead for the first couple of miles but around the three mile mark, three runners made a move (although I’m puzzled because the results that were posted suggest four). Most of the next four miles were spent racing around the gravel tracks in the Inclosure. The first two runners started to pull away and I had to work hard to stay in contact with the third placed runner (or was it fourth?!). I started to make some ground as we left the tracks and worked our way back along the edge of the Inclosure but didn’t have the energy left to haul him in on the final climb up towards the finish (the sting).

Christine, meanwhile, had take the children for a walk through a marsh which meant they were covered in almost as much mud as me!

Elevation Profile
Speed Profile
20161016 Stinger

Ocknell Orienteering

October 10th, 2016

Saturday saw our second orienteering outing of the season, SOC’s event at Ocknell. We were running the children’s activity which, as it involved finding a randomly scattered selection of controls, didn’t involve too much in the preparation but seemed to be enjoyed by the kids nevertheless. My apologies go to the little girl who turned up at the end when we’d run out of prizes!

Pete Davis had put on an excellent set of courses – it’s just a shame there weren’t more people there to enjoy them. I ran the brown and, despite knowing the area well, still made a few errors of judgement. As RouteGadget shows, it took me a couple of controls to realise that straight was almost always best and, as 13 and 14 show, you have to be careful in amongst the gorse bushes around the old runway. Although the November Classic is on an adjacent area, sadly I don’t think these lessons will really help there!

Duncan @ 7

October 4th, 2016

Coco @ Duncan's BirthdayI’ve been telling people that my children are seven and nine for some months now but Duncan has now actually reached the first of these milestones. His party was this weekend and, having spent the past month or so engrossed in Asterix books, that was the theme of his party (despite most of his friends having no idea who Asterix is!). The class teddy was brought to the party and is featured here in his Obelix plaits. The other attendees seemed to prefer the lively party games that Christine had contrived, from hunt the Roman to newspaper snowball fights and stuffing the balloons in to Obelix’s XL thermal tights!

Duncan @ 7Shield CakeWhilst the internet is full of Asterix themed birthday cakes, I wasn’t keen to embark on the necessary sugar craft to recreate any of the characters. We eventually settled on a Roman shield – a job made easier by large quantities of coloured icing. We got away with just one cake this year as it was sufficiently large to last the couple of days until his actual birthday!

Duncan opening presentsDuncan’s doesn’t like to set his sights low and suggested two ideas for presents: a drone and a metal detector. He actually ended up with two of the latter although I don’t think either is going to allow him to find any Roman hoard. The drone was also probably not what he was expecting as it’s all of 2-inches across. First attempts suggest that there’s certainly some skill (that neither of us possessed) required to fly it!

Prometheus and WebSphere Liberty

October 3rd, 2016

It’s been on my to-do list for some time to try setting up Prometheus to monitor WebSphere Liberty. There is a JMX Exporter which makes the job pretty simple even if there ended up being more steps than I had originally hoped.

My first pass was to try to configure the exporter as a Java agent but sadly the current Java client attempts to use some com.sun packages that don’t work with an IBM JRE. I started down the path of rebuilding our Liberty image on OpenJDK but, when I discovered that the Java agent actually uses Jetty to expose its HTTP endpoint I decided that I really didn’t want that bolted on to the side of my Liberty process! Ideally I’d get the Java client fixed and then create a Liberty feature to expose the HTTP endpoint but that will have to wait for another day… This time round I decided to configure the exporter as an HTTP server in a side-car container.

The first step was to create a Liberty image with monitoring enabled using the following Dockerfile:

And then build and run the image and extract the JMX URL:

Note that, in addition to the normal HTTP and HTTPS ports, we’ve exposed a port (5556) that the exporter container is going to use.

Next we need to build the JMX exporter JAR file using maven:

And we also need a config file for the exporter that uses the JMX_URL that we extracted from the Liberty image earlier:

The pattern here is subscribing us to all the available MBeans. The following Dockerfile constructs an image with these two artifacts based on the openjdk image from Docker Hub:

Note that we tell the exporter to run on the same port that we exposed from the Liberty container earlier. Now we build and run the image. We use the network from our Liberty container so that the exporter can connect to it on localhost. The curl should retrieve the metrics being exposed by the exporter.

The last step is to run Prometheus. Create a prometheus.yml file to provide the scrape configuration:

We can then run the standard Prometheus image from Docker Hub:

You can then access the Prometheus UI in your browser on port 9090 of the host where your Docker engine is running. If you’re new to Prometheus, try switching to the Graph tab, entering the name of a metric (e.g. WebSphere_JvmStats_ProcessCPU) and then hit Execute. If all is well, you should see something along the following lines:

Prometheus UI

If the metrics don’t look all that exciting then try applying a bit of load to the server, such as using the siege tool: