Using the Docker remote API to retrieve container CPU usage

November 28th, 2016

For reasons that I won’t go into here, I’ve been interested in the CPU accounting aspect of cgroups for a while and I recently found some time to have a poke at what information is available in the Docker remote API. I was interested in getting hold of the actual CPU time used by a container versus the elapsed time that the container has been running for (where the former would be smaller if the container is not CPU intensive and would potentially be much larger if it’s chewing through multiple cores).

The CLI doesn’t expose the information that I was looking for so my first pass was to define an image with curl and jq:
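The original Dockerfile isn’t shown here, but a minimal sketch might look like the following (the base image and file names are my choices rather than anything prescribed; Ubuntu is used for the GNU date mentioned below):

```sh
# Hypothetical sketch: an image containing curl and jq (plus GNU date)
cat > Dockerfile <<'EOF'
FROM ubuntu:16.04
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl jq && \
    rm -rf /var/lib/apt/lists/*
EOF
```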

Build it:
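Something along these lines (the containercpu tag is an arbitrary choice):

```sh
docker build -t containercpu .
```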

And then run it with a script as follows:
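The script itself isn’t reproduced here; the sketch below shows the general shape of the approach: hit the remote API over the Unix socket with curl, pull out the start time and cgroup CPU usage with jq, and let the container’s GNU date do the ISO date parsing. File names and jq paths are my assumptions.

```sh
cat > cpu.sh <<'EOF'
#!/bin/sh
# Compare elapsed time with cgroup CPU time for each running container,
# talking to the Docker remote API over the mounted Unix socket.
SOCK=/var/run/docker.sock
API=http://localhost

now=$(date +%s)
for id in $(curl -s --unix-socket $SOCK $API/containers/json | jq -r '.[].Id'); do
  info=$(curl -s --unix-socket $SOCK $API/containers/$id/json)
  name=$(echo "$info" | jq -r '.Name | ltrimstr("/")')
  started=$(echo "$info" | jq -r '.State.StartedAt')
  elapsed=$(( now - $(date -d "$started" +%s) ))
  # total_usage comes from the cgroup CPU accounting and is in nanoseconds
  cpu_ns=$(curl -s --unix-socket $SOCK "$API/containers/$id/stats?stream=false" \
    | jq -r '.cpu_stats.cpu_usage.total_usage')
  echo "$name: elapsed ${elapsed}s, cpu $(( cpu_ns / 1000000000 ))s"
done
EOF

# Run the script inside the image so that it picks up curl, jq and GNU date
docker run --rm \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v "$PWD/cpu.sh:/cpu.sh" \
  containercpu sh /cpu.sh
```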

I started out with an Alpine-based image but the version of date it comes with wasn’t capable of parsing the ISO-format dates returned by the API. This was an interesting exercise in the use of curl with Unix sockets and jq for parsing JSON on the command line but I thought I could do better.

The next step was a rendering of the script above into golang which you can find over on GitHub. You’ll have to forgive my poor golang – I wouldn’t claim to know the language; this is just a cut-and-shut from numerous sources around the internet. Perhaps the only part worth mentioning is that I explicitly pass an empty version string to the golang Docker library so that you don’t get client-server version mismatch errors.

Having compiled this up into a static binary I could then build a small image from scratch. I then wanted to build this using Docker Hub automated builds and a binary release on GitHub. This raises the thorny issue of how you make the binary executable once you’ve used ADD to download it into the image. There is one solution here that adds a very small C binary that can be used to perform the chmod. Having initially employed this method, I was reminded of another issue that I’d hit: I’d inadvertently doubled the size of our websphere-traditional images to over 3GB with a recursive chmod (the files get copied into a new layer with the modified permissions). So, in the end I caved in and checked the binary into GitHub so that I could use a COPY and pick up the correct permissions.

The resulting image, weighing in at just over 4MB, is on Docker Hub. As the instructions say, it can be run with the command:
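Roughly as follows, with a placeholder for the actual repository name on Docker Hub:

```sh
# <user>/containercpu stands in for the real Docker Hub repository
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock <user>/containercpu
```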

To test out the image, let’s spin up a container that should burn up the two cores allocated to my Docker for Mac VM:
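For example, a throwaway container (the name burn is arbitrary) running two busy loops, one backgrounded and one in the foreground:

```sh
docker run -d --name burn alpine sh -c \
  'while :; do :; done & while :; do :; done'
```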

If we leave it for a few minutes we see an output along the following lines:

The total CPU usage is, as we’d expect, twice the elapsed time. Let’s try again but this time run two containers and use cpuset to constrain them both to a single core:
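For example (the container names are arbitrary):

```sh
docker run -d --name burn1 --cpuset-cpus 0 alpine sh -c 'while :; do :; done'
docker run -d --name burn2 --cpuset-cpus 0 alpine sh -c 'while :; do :; done'
```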

This time, the results show that each container is getting half of the CPU time:

(Actually, you can see that the one that has been running longer has slightly more than half as it got the CPU to itself for a couple of seconds before the other container started!) Finally, and just for interest, let’s spin up an unconstrained WebSphere Liberty server:
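For example, using the official websphere-liberty image from Docker Hub (the container name is arbitrary):

```sh
docker run -d --name liberty websphere-liberty
```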

After a minute, we see that it’s used just over 20 seconds of CPU time to start up:

And if we check again after half an hour, we see that without any load, the server has consumed very little extra CPU:

Barcelona Break

October 31st, 2016

Barcelona

In general, we’re not very good at combining business trips with pleasure but at half term I was due to be at a conference in Madrid for the latter part of the week and Christine was about to start a new collaboration based in Barcelona so we decided to take the children over to Spain for a few days. Things didn’t get off to a great start with a three-hour delay on our Easyjet flight to Barcelona. To be fair, they did let us know of the delay before we left home and thankfully we’d already made arrangements for late arrival at our apartment.

On Sunday we took the metro to the Sagrada Familia, only to discover that it was sold out for the day. We therefore slowly made our way to Park Güell where we had booked in advance for a late afternoon entrance. Christine went off to the University on Monday whilst the children and I headed to the beach. Unfortunately you could barely see the beach for the mist, let alone the cable car across the harbour that we were intending to take. Luckily, as we waited to board the cable car the mist started to clear and by the time we arrived at Montjuïc the sun was out in force.

We spent some time in the Fort which became quite atmospheric when the mist rolled in again off the sea. Our walk down Plaza d’Espanya was cut short when Duncan failed to clear the large muddy puddle at the bottom of a very steep slide!

Christine was working again on Tuesday. Sadly the mist had turned to drizzle and I headed to the Museu Blau with the children (located dangerously close to the OpenStack summit that was kicking off that day!). For a very modern natural history museum, it seemed to specialise in glass cases with large numbers of exhibits in them, which wasn’t particularly child-friendly. The visit was saved by the temporary National Geographic Spinosaurus exhibition.

In the afternoon, we headed back to the Sagrada Familia having booked our tickets in advance this time. The cathedral has gained a very impressive ceiling since I last entered the building about 10 years ago. Although the rain had stopped by this point, unfortunately the damp conditions meant that we weren’t permitted to ascend the towers.

Having handed the children over to Christine on a metro platform, I took the fast train to Madrid, arriving just in time for the speaker dinner. The rest of the family flew back to the UK the following morning.

Running Weekend

October 16th, 2016

Christine XC

It’s been a weekend for running. On Saturday Christine ran at the first of this year’s Hampshire Cross-Country races at Farley Mount. I didn’t feel 100% when I woke up so decided to save myself for Sunday. Although I felt much better by the time the races came round it was probably still a wise decision (not least to reserve some energy for a barn dance in the evening!).

Stinger

On Sunday it was Totton RC’s Stinger which meant a return to Ocknell. It had been raining heavily during the night and it was still going as we drove to the event. The sun had come out by the start so, although wet underfoot, it was actually quite warm.

Ocknell Mud

I was slightly alarmed to be in the lead for the first couple of miles but around the three mile mark, three runners made a move (although I’m puzzled because the results that were posted suggest four). Most of the next four miles were spent racing around the gravel tracks in the Inclosure. The first two runners started to pull away and I had to work hard to stay in contact with the third placed runner (or was it fourth?!). I started to make some ground as we left the tracks and worked our way back along the edge of the Inclosure but didn’t have the energy left to haul him in on the final climb up towards the finish (the sting).

Christine, meanwhile, had taken the children for a walk through a marsh which meant they were covered in almost as much mud as me!

20161016 Stinger

Ocknell Orienteering

October 10th, 2016

Saturday saw our second orienteering outing of the season, SOC’s event at Ocknell. We were running the children’s activity which, as it involved finding a randomly scattered selection of controls, didn’t involve too much preparation but seemed to be enjoyed by the kids nevertheless. My apologies go to the little girl who turned up at the end when we’d run out of prizes!

Pete Davis had put on an excellent set of courses – it’s just a shame there weren’t more people there to enjoy them. I ran the brown and, despite knowing the area well, still made a few errors of judgement. As RouteGadget shows, it took me a couple of controls to realise that straight was almost always best and, as 13 and 14 show, you have to be careful in amongst the gorse bushes around the old runway. Although the November Classic is on an adjacent area, sadly I don’t think these lessons will really help there!

Duncan @ 7

October 4th, 2016

Coco @ Duncan’s Birthday

I’ve been telling people that my children are seven and nine for some months now but Duncan has now actually reached the first of these milestones. His party was this weekend and, since he has spent the past month or so engrossed in Asterix books, that was the theme (despite most of his friends having no idea who Asterix is!). The class teddy was brought to the party and is featured here in his Obelix plaits. The other attendees seemed to prefer the lively party games that Christine had contrived, from hunt the Roman to newspaper snowball fights and stuffing the balloons into Obelix’s XL thermal tights!

Duncan @ 7

Shield Cake

Whilst the internet is full of Asterix-themed birthday cakes, I wasn’t keen to embark on the necessary sugar craft to recreate any of the characters. We eventually settled on a Roman shield – a job made easier by large quantities of coloured icing. We got away with just one cake this year as it was sufficiently large to last the couple of days until his actual birthday!

Duncan opening presents

Duncan doesn’t like to set his sights low and suggested two ideas for presents: a drone and a metal detector. He actually ended up with two of the latter although I don’t think either is going to allow him to find any Roman hoard. The drone was also probably not what he was expecting as it’s all of two inches across. First attempts suggest that there’s certainly some skill (that neither of us possessed) required to fly it!

Prometheus and WebSphere Liberty

October 3rd, 2016

It’s been on my to-do list for some time to try setting up Prometheus to monitor WebSphere Liberty. There is a JMX Exporter which makes the job pretty simple even if there ended up being more steps than I had originally hoped.

My first pass was to try to configure the exporter as a Java agent but sadly the current Java client attempts to use some com.sun packages that don’t work with an IBM JRE. I started down the path of rebuilding our Liberty image on OpenJDK but, when I discovered that the Java agent actually uses Jetty to expose its HTTP endpoint I decided that I really didn’t want that bolted on to the side of my Liberty process! Ideally I’d get the Java client fixed and then create a Liberty feature to expose the HTTP endpoint but that will have to wait for another day… This time round I decided to configure the exporter as an HTTP server in a side-car container.

The first step was to create a Liberty image with monitoring enabled using the following Dockerfile:
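The actual Dockerfile isn’t reproduced here; a sketch along the following lines should give the idea. It drops a config snippet into configDropins to enable the monitor MXBeans and a JMX connector (the choice of connector feature and the file names are my assumptions):

```sh
cat > monitor.xml <<'EOF'
<server>
    <featureManager>
        <feature>monitor-1.0</feature>
        <feature>localConnector-1.0</feature>
    </featureManager>
</server>
EOF

cat > Dockerfile <<'EOF'
FROM websphere-liberty
COPY monitor.xml /config/configDropins/overrides/
EOF
```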

And then build and run the image and extract the JMX URL:
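Something like the following, where the image tag, container name and the location of the local JMX connector address file are all assumptions on my part:

```sh
docker build -t liberty-monitor .
docker run -d --name liberty -p 9080:9080 -p 9443:9443 -p 5556:5556 liberty-monitor

# Once the server is up, read the JMX URL written by the local connector
JMX_URL=$(docker exec liberty cat /logs/state/com.ibm.ws.jmx.local.address)
echo "$JMX_URL"
```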

Note that, in addition to the normal HTTP and HTTPS ports, we’ve exposed a port (5556) that the exporter container is going to use.

Next we need to build the JMX exporter JAR file using Maven:
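For example, building from the exporter’s GitHub repository:

```sh
git clone https://github.com/prometheus/jmx_exporter.git
cd jmx_exporter
mvn package
# the standalone HTTP server variant ends up under jmx_prometheus_httpserver/target/
```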

And we also need a config file for the exporter that uses the JMX_URL that we extracted from the Liberty image earlier:
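A minimal sketch of such a config, assuming the JMX_URL variable captured above and a catch-all rule so that every available MBean attribute is exposed:

```sh
cat > config.yaml <<EOF
jmxUrl: ${JMX_URL}
rules:
  - pattern: ".*"
EOF
```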

The pattern here is subscribing us to all the available MBeans. The following Dockerfile constructs an image with these two artifacts based on the openjdk image from Docker Hub:
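A sketch of that Dockerfile (the exact jar name depends on the exporter version built above, and the Dockerfile.exporter file name is just to keep it apart from the Liberty one):

```sh
cat > Dockerfile.exporter <<'EOF'
FROM openjdk:8-jre
COPY jmx_prometheus_httpserver/target/jmx_prometheus_httpserver-*-jar-with-dependencies.jar /jmx_exporter.jar
COPY config.yaml /config.yaml
# listen on the port that was published from the Liberty container
ENTRYPOINT ["java", "-jar", "/jmx_exporter.jar", "5556", "/config.yaml"]
EOF
```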

Note that we tell the exporter to run on the same port that we exposed from the Liberty container earlier. Now we build and run the image. We use the network from our Liberty container so that the exporter can connect to it on localhost. The curl should retrieve the metrics being exposed by the exporter.
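Roughly as follows; the exporter joins the Liberty container’s network namespace and port 5556 is reachable on the Docker host because it was published when the Liberty container was started:

```sh
docker build -t jmx-exporter -f Dockerfile.exporter .
docker run -d --name exporter --net container:liberty jmx-exporter

curl http://localhost:5556/metrics
```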

The last step is to run Prometheus. Create a prometheus.yml file to provide the scrape configuration:
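A minimal example, assuming we let Prometheus resolve the Liberty container by name (see the --link in the run command below); the job name and scrape interval are arbitrary:

```sh
cat > prometheus.yml <<'EOF'
scrape_configs:
  - job_name: 'liberty'
    scrape_interval: 15s
    static_configs:
      - targets: ['liberty:5556']
EOF
```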

We can then run the standard Prometheus image from Docker Hub:
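For example, mounting the config over the default location in the prom/prometheus image:

```sh
docker run -d --name prometheus -p 9090:9090 --link liberty:liberty \
  -v "$PWD/prometheus.yml:/etc/prometheus/prometheus.yml" \
  prom/prometheus
```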

You can then access the Prometheus UI in your browser on port 9090 of the host where your Docker engine is running. If you’re new to Prometheus, try switching to the Graph tab, entering the name of a metric (e.g. WebSphere_JvmStats_ProcessCPU) and then hitting Execute. If all is well, you should see something along the following lines:

Prometheus UI

If the metrics don’t look all that exciting then try applying a bit of load to the server, such as using the siege tool:
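For example, ten concurrent users for five minutes against the server’s HTTP port (the URL is illustrative):

```sh
siege -c 10 -t 5M http://localhost:9080/
```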

WebSphere Liberty admin center in Docker

September 27th, 2016

The content of the WebSphere Liberty Docker images currently matches the runtime install zips that we make available for download from WASdev.net. One consequence of this is that none of them contain the admin center. Adding it is very simple though, as the following Dockerfile shows:
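The original Dockerfile isn’t reproduced here, but a sketch covering the pieces described in the next paragraph might look like this (file names, password and base tag are my assumptions):

```sh
cat > admincenter.xml <<'EOF'
<server>
    <featureManager>
        <feature>adminCenter-1.0</feature>
    </featureManager>
    <quickStartSecurity userName="wsadmin" userPassword="wsadminpwd"/>
    <remoteFileAccess>
        <writeDir>${server.config.dir}</writeDir>
    </remoteFileAccess>
</server>
EOF

cat > Dockerfile <<'EOF'
FROM websphere-liberty
COPY admincenter.xml /config/configDropins/overrides/
RUN /opt/ibm/wlp/bin/installUtility install --acceptLicense adminCenter-1.0
EOF
```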

This Dockerfile adds a snippet of server XML under the configDropins directory that adds the adminCenter-1.0 feature. It then uses installUtility to install that feature. The admin center requires a user registry to be defined for authentication and here we use the quickStartSecurity stanza to define a wsadmin user. We’ll come back to remoteFileAccess in a moment.

We can then build and run this image as follows:
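For example (the admin container name matters because it’s referenced below):

```sh
docker build -t liberty-admin .
docker run -d --name admin -P liberty-admin
docker port admin 9443
```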

Once the server has started you should then be able to access /adminCenter on the HTTPS port returned by docker port admin 9443 using the credentials defined in the Dockerfile.

Liberty Admin Center

If you then click on the Explore icon in the toolbox you’ll find information about any applications that are (or are not) deployed to the server, the server configuration, and server-level metrics. The last of these may be of particular interest when trying to determine suitable resource constraints for a container.

Liberty Admin Center Monitoring

In a single-server setup, it’s not currently possible to deploy an application via the admin center. For a simple application you could just place it in the dropins directory but, for argument’s sake, let’s say that we need to provide some extra configuration. I’m going to assume that you have ferret-1.2.war in the current directory. We then copy the file into the container:
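Something along these lines, assuming the default apps directory under the server configuration directory:

```sh
docker exec admin mkdir -p /config/apps
docker cp ferret-1.2.war admin:/config/apps/
```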

In the admin center, we then navigate to Configure > server.xml, click Add child under the Server element, select Application and click Add. Fill in the location as ferret-1.2.war and the context root as ferret, then click Save. It is the remoteFileAccess stanza that we added earlier that allows us to edit the server configuration on the fly.

Add Application

If you return to the applications tab you should see the application deployed and you can now access the ferret application at /ferret!

Ferret application installed

Obviously modifying the server configuration in a running container is at odds with the idea of an immutable server but it may still be of use at development time or for making non-functional updates, e.g. changing the trace specification for a server.

Docker swarm mode on IBM SoftLayer

September 26th, 2016

Having written a few posts on using the IBM Containers service in Bluemix I thought I’d cover another option for running Docker on IBM Cloud: using Docker on VMs provisioned from IBM’s SoftLayer IaaS. This is particularly easy with Docker Machine as there is a SoftLayer driver. As the docs state, there are three required values which I prefer to set as the environment variables SOFTLAYER_USER, SOFTLAYER_API_KEY and SOFTLAYER_DOMAIN. The instructions to retrieve/generate an API key for your SoftLayer account are here. Don’t worry if you don’t have a domain name free – it is only used as a suffix on the machine names when they appear in the SoftLayer portal so any valid value will do. With those variables exported, spinning up three VMs with Docker is as simple as:
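For example (the node names are arbitrary; the driver picks the three SoftLayer values up from the environment):

```sh
for node in manager worker1 worker2; do
  docker-machine create --driver softlayer "$node"
done
```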

Provisioning the VMs and installing the latest Docker engine may take some time. Thankfully, initialising swarm mode across the three VMs with a single manager and two worker nodes can then be achieved very quickly:
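A sketch of those steps, using docker-machine to run the commands on each VM:

```sh
# Initialise the swarm on the manager, advertising its public IP
docker-machine ssh manager \
  docker swarm init --advertise-addr "$(docker-machine ip manager)"

# Join the two workers using the worker token
TOKEN=$(docker-machine ssh manager docker swarm join-token -q worker)
for node in worker1 worker2; do
  docker-machine ssh "$node" \
    docker swarm join --token "$TOKEN" "$(docker-machine ip manager):2377"
done
```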

Now we can target our local client at the swarm and create a service (running the WebSphere Liberty ferret application):
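For example, with a placeholder for whichever Liberty-based image contains the ferret application:

```sh
eval "$(docker-machine env manager)"
docker service create --name ferret -p 9080:9080 <liberty-image-with-ferret>
```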

Once service ps reports the task as running we can, thanks to the routing mesh, call the application via any of the nodes:
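For example, trying each node in turn:

```sh
for node in manager worker1 worker2; do
  curl "http://$(docker-machine ip $node):9080/ferret/"
done
```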

Scale up the number of instances and wait for all three to report as running:
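For example, then checking where the tasks have been placed:

```sh
docker service scale ferret=3
docker service ps ferret
```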

With the default spread strategy, you should end up with a container on each node:

Note that the image has a healthcheck defined which uses the default interval of 30 seconds so expect it to take some multiple of 30 seconds for each task to start. Liam’s WASdev article talks more about the healthcheck and also demonstrates how to roll out an update. Here I’m going to look at the reconciliation behaviour. Let’s stop one of the worker nodes and then watch the task state again:
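For example, stopping the second worker:

```sh
docker-machine stop worker2
watch docker service ps ferret
```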

You will see the swarm detect that the task is no longer running on the node that has been stopped and reschedule it on one of the two remaining nodes:

(You’ll see that there is a niggle here in the reporting of the state of the task that has been shut down.)

This article only scratches the surface of the capabilities of both swarm mode and SoftLayer. For the latter, I’d particularly recommend looking at the bare metal capabilities where you can benefit from the raw performance of containers without the overhead of a hypervisor.