On 17 February 2020, we put an upgraded server back into production. To keep this process automated and reliable, we deploy our servers via a configuration management system. Although this procedure has been used in production for some time, in this case the configuration management system deployed an incorrect configuration to the new instance, which caused an interruption in our cluster.
Actions taken:
We took the new node offline again and configured the active servers to ignore traffic from it. Once we had isolated the newly deployed host, we adjusted the configuration management system and provisioned the node again. During the next maintenance window, we rejoined the server to the existing servers, and no issues have occurred since.
Preventive measures:
We improved our server setup procedure to prevent incorrect configurations from being deployed. Because the configuration management is versioned, it already contains the fixes for future deployments of new nodes.
We sincerely apologise for the trouble this has caused.
Your IDnow DevOps Team