Kubernetes App Not Going Into Ready State

Today I noticed that one of my apps had been running for a day but had been restarted over 300 times and had never reached a ready state.

kubectl get pods
NAME                           READY     STATUS             RESTARTS   AGE
..
yourApp-123456789-a2bcd        0/1       CrashLoopBackOff   305        1d
yourApp-db-012345678-defgh     1/1       Running            0          3d
..

I checked the logs for the previous run:

kubectl logs yourApp-123456789-a2bcd -p
...
...
...
----------------------------------------------------------
        Application 'yourApp' is running! Access URLs:
        Local:          http://localhost:8088
        External:       http://1.2.3.4:8088
        Profile(s):     [swagger, dev]
----------------------------------------------------------
2018-05-18 09:46:09.278  INFO 1 --- [           main] your.organisation.YourApp             :
----------------------------------------------------------

This is a Spring Boot app, and as far as these logs go nothing looks wrong; I cannot see any logs indicating that a shutdown hook was ever called.
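One extra check that can help at this point, using the pod name from the listing above, is to ask Kubernetes how the previous container terminated; the last-state fields record a termination reason and exit code even when the app itself logged nothing:

kubectl get pod yourApp-123456789-a2bcd -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason} {.status.containerStatuses[0].lastState.terminated.exitCode}'

The same information also appears under Last State in the output of kubectl describe pod.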

In situations where the logs tell you nothing, you have to describe the pod to see which lifecycle events have occurred:

kubectl describe pod yourApp-123456789-a2bcd
Name:           yourApp-123456789-a2bcd
...
...
...
Events:
  Type     Reason      Age                  From                      Message
  ----     ------      ----                 ----                      -------
  Warning  FailedSync  5m (x3585 over 1d)   kubelet, your-hostname    Error syncing pod
  Warning  Unhealthy   34s (x6089 over 1d)  kubelet, your-hostname    Readiness probe failed: Get http://1.2.3.4:8080/management/health: dial tcp 1.2.3.4:8080: getsockopt: connection refused

At first glance I could not work out what was going on. I could see that the readiness probe was failing, but it was not immediately obvious why. I asked a colleague to take a look as a second pair of eyes, and he spotted it.

It turns out this was due to a configuration mismatch. If you look carefully at the app logs, you can see that the app is listening on port 8088, but the deployment's readiness and liveness probes are pointed at port 8080. The probes could never succeed, which is why Kubernetes kept restarting the pod and why it never became ready, yet no shutdown hook output ever appeared in the app logs.
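A quick way to confirm this kind of mismatch (the deployment name below is assumed from the pod name) is to pull the probe and port configuration out of the deployment spec and compare it with the port the app reports in its logs:

kubectl get deployment yourApp -o yaml | grep -E -B 2 -A 6 'containerPort|readinessProbe|livenessProbe'

The fix is to point the probes at the port the application actually serves on. Below is a minimal sketch of the relevant part of the Deployment's pod template; the image and timings are illustrative, and only the ports and the health path come from this incident:

containers:
  - name: yourApp
    image: your-registry/yourApp:latest   # illustrative image
    ports:
      - containerPort: 8088               # the port the Spring Boot app listens on
    readinessProbe:
      httpGet:
        path: /management/health
        port: 8088                        # was 8080, must match the app's port
      initialDelaySeconds: 30
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /management/health
        port: 8088                        # was 8080
      initialDelaySeconds: 60
      periodSeconds: 10

After applying the change with kubectl apply -f and waiting for kubectl rollout status deployment/yourApp to finish, the pod should pass its readiness check and the restart count should stop climbing.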