The Kubernetes docs give a good explanation of what the probes are, but how do they best translate to your own app?
Think of each probe as a question that kubelet asks your app; below are some suggestions on how you should answer. Note that kubelet keeps asking the readiness and liveness questions for as long as the container runs, not just at startup.
Can this pod serve traffic?
Currently I limit this to “can I connect to my database?” That is the absolute must-have: if the database is unreachable, this pod is not ready and should not receive traffic.
One decision you need to make is about partial availability. Some microservices have multiple dependencies. If only a subset of them fails (for example, one circuit breaker is open), should your entire app stop serving traffic?
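One way to make the partial availability decision explicit is to split dependencies into required and optional ones. The sketch below is only an illustration using plain JDK types: `ReadinessCheck`, `addRequired`, `addOptional` and `degraded` are hypothetical names, not part of Spring Boot or Kubernetes, and each check is assumed to be a cheap call such as a connection test.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical helper that turns several dependency checks into one readiness answer. */
final class ReadinessCheck {
    private final Map<String, BooleanSupplier> required = new LinkedHashMap<>();
    private final Map<String, BooleanSupplier> optional = new LinkedHashMap<>();

    /** A failing required dependency (for example the database) makes the pod not ready. */
    ReadinessCheck addRequired(String name, BooleanSupplier check) {
        required.put(name, check);
        return this;
    }

    /** A failing optional dependency is reported but does not affect readiness. */
    ReadinessCheck addOptional(String name, BooleanSupplier check) {
        optional.put(name, check);
        return this;
    }

    /** Ready only when every required dependency passes. */
    boolean isReady() {
        return required.values().stream().allMatch(ReadinessCheck::passes);
    }

    /** Names of optional dependencies that are currently failing, e.g. for logging. */
    List<String> degraded() {
        List<String> failing = new ArrayList<>();
        optional.forEach((name, check) -> {
            if (!passes(check)) {
                failing.add(name);
            }
        });
        return failing;
    }

    /** Treats an exception thrown by a check the same as a failed check. */
    private static boolean passes(BooleanSupplier check) {
        try {
            return check.getAsBoolean();
        } catch (RuntimeException e) {
            return false;
        }
    }
}
```

With this split, an open circuit to an optional dependency shows up in `degraded()` while `isReady()` stays true, so the pod keeps serving the requests it can still handle. Whether a given dependency belongs in the required group is still a decision you have to make per service.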
Should this pod be restarted?
The general question you need to ask is “will a restart fix the problem?”
- If the answer is NO then the liveness probe should continue to succeed.
- If the answer is YES then a failing liveness probe tells kubelet to restart the container.
This generally means memory or CPU exhaustion, so this check is largely determined by your tech stack.
- In a Java/Spring Boot app is there a garbage collection issue?
- Is there a thread deadlock? (visible in a thread dump)
- Are there too many open file descriptors?
In reality, though, I am not certain how to implement these checks. If my app suffers an OutOfMemory error it's already set to restart, so there is nothing left for a probe to do, and if it suffers high CPU use then the liveness probe will probably time out anyway! If you have suggestions on how to implement this in a Java app then please share.
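As one possible starting point for the deadlock and file descriptor questions, the JDK's management beans expose both. This is a minimal sketch, not a tested production check: `LivenessCheck` and the 90% threshold are my own assumptions, and the file descriptor counts come from `com.sun.management.UnixOperatingSystemMXBean`, which is JDK-specific and only available on Unix-like platforms.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.management.ThreadMXBean;

/** Hypothetical liveness helper built on the JDK management beans. */
final class LivenessCheck {
    // Assumed threshold: report "not alive" once 90% of the descriptor limit is in use.
    static final double MAX_FD_USAGE = 0.90;

    /** True when the JVM reports deadlocked threads. */
    static boolean hasDeadlock() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks, where supported.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        return ids != null && ids.length > 0; // both methods return null when nothing is found
    }

    /** Fraction of the file descriptor limit in use, or -1 when the platform does not expose it. */
    static double fdUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            long max = unix.getMaxFileDescriptorCount();
            return max > 0 ? (double) unix.getOpenFileDescriptorCount() / max : -1;
        }
        return -1;
    }

    /** An unknown descriptor usage (-1) is treated as healthy rather than forcing a restart. */
    static boolean isAlive() {
        return !hasDeadlock() && fdUsage() < MAX_FD_USAGE;
    }
}
```

The endpoint behind the liveness probe could then return an error status whenever `isAlive()` is false. Both conditions are ones that a restart should actually clear, which matches the "will a restart fix the problem?" test. Deadlock detection walks every thread, so it may be worth caching the result for a few seconds rather than recomputing it on each probe.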
Should kubelet start interrogating the above two probes?
This is generally for “slow to start” containers, and unless your Spring Boot app is doing some heavy work on startup it should not be necessary. One example would be an app that loads a lot of data from disk into a cache before it can do anything useful. Personally I have not used this one yet.
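If you did need it, the app side could be as small as a flag that flips once the slow work has finished. This is a sketch under that assumption only; `StartupGate` and `runStartup` are hypothetical names, and the endpoint that exposes the flag to the startup probe is left out.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical flag that a startup probe endpoint could report. */
final class StartupGate {
    private final AtomicBoolean started = new AtomicBoolean(false);

    /** Runs the slow initialisation and marks startup complete only if it did not throw. */
    void runStartup(Runnable warmUp) {
        warmUp.run();
        started.set(true);
    }

    /** False until runStartup has completed successfully. */
    boolean isStarted() {
        return started.get();
    }
}
```

The startup probe endpoint would answer with a failure until `isStarted()` returns true, so kubelet holds back the other two probes while the cache is still loading.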