The lifetime of a Docker container is tied to the lifetime of the PID 1 process executed when the container was started. WebSphere Liberty has a convenient server run command to run the application server in the foreground. Sadly, that's not the case with traditional WebSphere's startServer.sh script, which simply starts the server process in the background and then exits. To ensure that the container didn't exit as well, we started out with a script along the following lines:
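
    # Starts the server in the background and returns
    startServer.sh
    # Give the server time to create its PID file
    sleep 10
    # Keep the container alive for as long as the PID file exists
    while [ -f "server1.pid" ]; do
      sleep 5
    done
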
where server1.pid is a file created by the server process (but not immediately, hence the initial sleep). That successfully kept the container alive but failed to allow it to shut down cleanly! A docker stop, for example, would wait for the default timeout period and then kill the process. Not great for any in-flight transactions! The solution was simple enough: add a trap to catch any interrupt and issue the command to stop the server:

    # Stop the server cleanly when the container is interrupted or terminated
    trap stopServer.sh INT TERM
    startServer.sh
    sleep 10
    while [ -f "server1.pid" ]; do
      sleep 5
    done

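To see the trap mechanism in isolation, here is a minimal sketch that needs neither Docker nor WebSphere. The stop_server function is a hypothetical stand-in for stopServer.sh, and the script signals itself where docker stop would send SIGTERM to the container's PID 1.

```shell
#!/bin/sh
# stop_server is a hypothetical stand-in for stopServer.sh.
stopped=no
stop_server() {
  stopped=yes
  echo "caught signal, stopping server"
}

# Run the handler instead of dying when INT or TERM arrives.
trap stop_server INT TERM

# docker stop would deliver SIGTERM to PID 1; here the script signals itself.
kill -s TERM $$

# The handler has run by the time the next command executes.
echo "stopped=$stopped"
```

One thing to be aware of: the shell only runs a trap handler once the command it is currently waiting on has returned, so in the loop above the handler may be delayed until the current sleep finishes.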
All was well with the world until we then enabled server security by default. Unfortunately, with security enabled, the stopServer.sh script requires credentials to be provided and there is no way to get those credentials to the script. The solution was to switch to sending the interrupt signal to the server process. I also disliked that initial sleep, so I decided to retrieve the process ID via ps (something that's safer in a container given the limited process tree) and then wait whilst the process's directory exists in /proc. The resulting code looked along the following lines:
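
    # Send the interrupt signal to the server process
    stop() { kill -s INT $PID; }
    trap stop INT TERM
    startServer.sh
    # PID of the server's java process, with ps's padding stripped
    PID=$(ps -C java -o pid= | tr -d " ")
    # Wait for as long as the process has an entry in /proc
    while [ -e "/proc/$PID" ]; do
      sleep 5
    done
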
Note the use of a function so that $PID is not evaluated at the point the trap is set up.
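The difference in evaluation time can be seen in a small sketch. Nothing is actually killed here: the strings only record what the command would have been, and 4242 is a made-up PID standing in for the one discovered after startServer.sh returns.

```shell
#!/bin/sh
PID=""   # nothing known yet, as when the trap is installed

# Double quotes expand $PID immediately, so the empty value is baked in.
eager="kill -s INT $PID"

# A function body is only expanded when the function is called.
stop() { echo "kill -s INT $PID"; }

PID=4242   # hypothetical PID, assigned after the handler was defined

lazy=$(stop)
echo "eager: '$eager'"
echo "lazy:  '$lazy'"
```

Single-quoting the trap action, as in trap 'kill -s INT $PID' INT TERM, should defer the expansion in the same way, since the shell evaluates the action string when the signal arrives.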
Another disadvantage with having the server process in the background is the lack of output in the container logs. I decided to rectify that whilst I was at it by adding calls to tail the server log files:

    stop() { kill -s INT $PID; }
    trap stop INT TERM
    startServer.sh
    # Look up the PID before starting tail so that --pid receives a value
    PID=$(ps -C java -o pid= | tr -d " ")
    # Stream both server logs to the container's stdout and stderr
    tail -F SystemOut.log --pid $PID -n +0 &
    tail -F SystemErr.log --pid $PID -n +0 >&2 &
    while [ -e "/proc/$PID" ]; do
      sleep 5
    done

The significance of the tail parameters is as follows. The capital F indicates that the attempts to follow the log file should be retried. This ensures that we continue to follow the latest file when the logs roll over. The --pid parameter ensures that the background tail processes exit along with the server process. The -n +0 indicates that the output should start at the beginning of the file so that entries output whilst the startServer.sh script is running are not lost. As previously noted, Docker preserves stderr across the remote API so we make sure to direct the output from SystemErr.log there.
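The effect of -n +0 is easy to check on a static file. The sketch below assumes GNU tail and uses a temporary file of fifteen numbered lines as a stand-in for a log that already has content by the time tail starts. It leaves out -F and --pid so that it terminates immediately.

```shell
#!/bin/sh
# Hypothetical stand-in for SystemOut.log, written before tail starts.
log=$(mktemp)
seq 1 15 > "$log"

# By default tail prints only the last ten lines of the file.
default_lines=$(tail "$log" | wc -l | tr -d " ")

# With -n +0, GNU tail starts at the beginning, so no earlier entries are lost.
all_lines=$(tail -n +0 "$log" | wc -l | tr -d " ")

echo "default: $default_lines lines, -n +0: $all_lines lines"
rm -f "$log"
```

Without -n +0, anything the server wrote beyond the last ten lines before tail attached would never reach the container logs.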