Because of how Docker works, a new hostname is created whenever a container terminates and a new one spins up. In our Docker environment it is not possible to assign static node IDs to all containers, because the number of replicas is not fixed. Only one "Debug" instance has a fixed ID.
The containers restart or are recreated whenever the health check URL stops responding, and during the weekly maintenance.
This leads to two major problems:
1) The "Nodes" list under Server Tasks gets polluted with "Turned Off" nodes that will never come back alive (I once had 1600 nodes in the list!).
2) Once the number of dead nodes starts to increase, midPoint has difficulty starting because of hundreds of networking-related exceptions (see attached log snippet).
As far as I understand, midPoint tries to invalidate the objects on all other nodes after they have been imported, which will obviously fail for many of them. This increases the startup time, which eventually exceeds the startup timeout of 3 minutes. Docker then kills the container, but it is already registered as a node. The next container to start then has one more dead node to check and exceeds the startup timeout again. There we go: circle of death!
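To illustrate the loop described above, here is a rough model of it. The base startup time and the cost per dead node are made-up numbers for illustration only, not measurements from midPoint; only the 3-minute timeout comes from our setup.

```python
STARTUP_TIMEOUT_S = 180  # the 3-minute startup timeout from our Docker setup


def startup_time(dead_nodes, base_s=60.0, per_dead_node_s=0.5):
    """Estimated startup time: a base cost plus one failed call per dead node.

    base_s and per_dead_node_s are assumed values, not measured ones.
    """
    return base_s + dead_nodes * per_dead_node_s


def simulate(initial_dead_nodes, restarts):
    """Return a list of (dead_nodes, startup_seconds, killed) per start attempt."""
    history = []
    dead = initial_dead_nodes
    for _ in range(restarts):
        seconds = startup_time(dead)
        killed = seconds > STARTUP_TIMEOUT_S
        history.append((dead, seconds, killed))
        if killed:
            # The killed container is already registered, so it becomes
            # one more dead node for the next container to check.
            dead += 1
    return history


if __name__ == "__main__":
    for dead, seconds, killed in simulate(241, 3):
        print(dead, seconds, "killed" if killed else "started")
```

With these assumed numbers, every attempt past the threshold fails and makes the next one slower, so the system cannot recover by itself.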
I then need to wipe the m_node table in the database.
The situation gets even worse when multiple nodes start up in parallel (e.g. after automatic server maintenance): the nodes see each other before they can accept those cache invalidation calls. I once disabled the health check, but the startup still never completed because of this deadlock.
Maybe you can improve the handling of Docker nodes by adding a cleanup feature for old nodes, and have a look at the exceptions.
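As a sketch of what I mean by a cleanup feature: select the nodes that are not running and have not checked in for longer than some limit, and skip the ones with fixed IDs. The record fields, the function name and the one-day limit below are hypothetical and do not reflect midPoint's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class NodeRecord:
    # Hypothetical fields, not midPoint's real node schema.
    name: str
    running: bool
    last_check_in: datetime


def select_stale_nodes(nodes, now, max_age=timedelta(days=1), keep=()):
    """Return names of nodes that are down and have been silent longer than max_age.

    Names listed in `keep` (e.g. an instance with a fixed ID) are never selected.
    """
    return [
        node.name
        for node in nodes
        if not node.running
        and node.name not in keep
        and now - node.last_check_in > max_age
    ]


if __name__ == "__main__":
    now = datetime(2024, 1, 10, 12, 0)
    nodes = [
        NodeRecord("debug", False, datetime(2024, 1, 1)),            # fixed ID, kept
        NodeRecord("a1b2c3", False, datetime(2024, 1, 2)),           # dead for days
        NodeRecord("d4e5f6", True, datetime(2024, 1, 10, 11, 59)),   # alive
    ]
    print(select_stale_nodes(nodes, now, keep=("debug",)))
```

A node that was killed during startup would be picked up by this as soon as it is older than the limit, so the list could not grow without bound.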