Service gets restarted if the services it depends on are stable.
Service does not get restarted if it is unstable.
Scale a service to more instances than fit in the environment (this also happens if any of the instances are not healthy, but that is more difficult to reproduce at will).
digitransit-deployer has logic that checks whether a service or the services it depends on are healthy before doing a restart. However, it should only check whether the services it depends on are stable.
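The intended check can be sketched roughly as follows. This is an illustration only, not the deployer's actual code: the `Service` record, the `is_stable` definition (all desired instances healthy), and treating an unknown dependency as unstable are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Service:
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    desired: int
    healthy: int
    depends_on: List[str] = field(default_factory=list)


def is_stable(service: Service) -> bool:
    # Assumption: "stable" means every desired instance is healthy.
    return service.healthy >= service.desired


def can_restart(name: str, services: Dict[str, Service]) -> bool:
    # Only the dependencies are checked, not the service itself,
    # so an unstable service can still be restarted.
    service = services[name]
    for dep_name in service.depends_on:
        dep = services.get(dep_name)
        # Assumption: a dependency we know nothing about blocks the restart.
        if dep is None or not is_stable(dep):
            return False
    return True
```

With this shape, a service that is itself unhealthy is still eligible for a restart as long as everything it depends on is stable.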
Noticed that raildigitraffic2gtfsrt was in a restart loop because deployer checks the age of the oldest pod (which is from a previous deployment in this case) and decides that a restart is needed. The restart then causes the new pod to be replaced in the deployment, and the loop continues forever. We should either change the logic for counting the age of a deployment (it can be complex to take into account only pods from the latest deployment) or configure a wait time since the last deployment (this could be done by checking the timestamp from the label added in the previous deployment, if one exists).
Fixed it by using the timestamp of the previous deployment by deployer as the first choice and, if it is missing, falling back to the oldest pod’s start time.
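A minimal sketch of that fallback order, for illustration only. The function names are hypothetical, and the label format (epoch milliseconds stored as a string) is an assumption; the real label written by deployer may look different.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional


def restart_reference_time(
    last_deploy_label: Optional[str],
    pod_start_times: Iterable[datetime],
) -> Optional[datetime]:
    # First choice: timestamp of the previous deployment made by deployer.
    # Assumption: the label holds epoch milliseconds as a string.
    if last_deploy_label:
        try:
            return datetime.fromtimestamp(
                int(last_deploy_label) / 1000, tz=timezone.utc
            )
        except ValueError:
            pass  # Unparseable label: fall through to the pod start times.
    # Fallback: start time of the oldest pod, or None if there are no pods.
    return min(pod_start_times, default=None)


def needs_restart(
    reference: Optional[datetime], now: datetime, max_age: timedelta
) -> bool:
    # Restart only once the reference time is at least max_age old.
    return reference is not None and now - reference >= max_age
```

When the label is present, a leftover pod from an earlier deployment no longer makes the deployment look old, so the restart is not triggered again right after a fresh deployment.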