European Data-Center - Partial System Unavailability
Incident Report for IDnow GmbH
Postmortem

On 17 February 2020 we put an upgraded server back into production. To keep this process automated and reliable, we deploy our servers via a configuration management system. Although this procedure has been used in production for some time, in this case the configuration management system deployed a wrong configuration to the new instance, which caused an interruption in our cluster.

Actions taken:

We took the new node back offline and adjusted the active servers to ignore traffic from it. Once we had isolated the newly deployed host, we corrected the configuration management and provisioned the node again. At the next maintenance window, we joined the server back to the existing servers, and no issues have occurred since.

Preventive Measures for the future:

We improved our server setup procedure to prevent wrong configurations from being deployed. Because the configuration management is versioned, it already contains the fixes for future deployments of new nodes.
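
As an illustration only, one way to guard against a wrongly configured node joining a cluster is to compare its rendered configuration with that of a known-good node before the join. The sketch below is a minimal example under stated assumptions and does not describe our actual tooling: the function names, the dictionary representation of a configuration, and the keys `cluster_name` and `protocol_version` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass

# Sentinel that distinguishes "key absent" from "key present with value None".
_MISSING = object()


@dataclass(frozen=True)
class ConfigMismatch:
    """One setting on which the candidate node differs from the reference."""
    key: str
    expected: object
    actual: object


def find_mismatches(reference: dict, candidate: dict,
                    required_keys: list[str]) -> list[ConfigMismatch]:
    """Return every required key whose value is absent or differs.

    `reference` is the configuration of a known-good node, `candidate`
    the rendered configuration of the node about to join (both are
    hypothetical flat dictionaries in this sketch).
    """
    mismatches = []
    for key in required_keys:
        expected = reference.get(key, _MISSING)
        actual = candidate.get(key, _MISSING)
        # A required key that is absent on the candidate always counts
        # as a mismatch, even if the reference lacks it as well.
        if actual is _MISSING or expected != actual:
            mismatches.append(ConfigMismatch(key, expected, actual))
    return mismatches


def may_join_cluster(reference: dict, candidate: dict,
                     required_keys: list[str]) -> bool:
    """Allow the join only when no required setting differs."""
    return not find_mismatches(reference, candidate, required_keys)


if __name__ == "__main__":
    # Hypothetical values, chosen only to demonstrate the check.
    good = {"cluster_name": "prod-eu", "protocol_version": 3}
    new_node = {"cluster_name": "prod-eu", "protocol_version": 2}
    for m in find_mismatches(good, new_node, list(good)):
        print(f"{m.key}: expected {m.expected!r}, got {m.actual!r}")
```

A check like this would run after provisioning and before the node is allowed to receive cluster traffic, so that a mismatch blocks the join instead of affecting the active servers.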

We sincerely apologise for the trouble this has caused.

Your IDnow DevOps Team

Posted Mar 11, 2020 - 09:46 CET

Resolved
We monitored the system closely. The system has been working within normal parameters during the extended monitoring phase.

Incident start: 2020-02-17 14:50 UTC+1
Incident end: 2020-02-17 15:35 UTC+1

Impact: During this time, the system served calls with delays and in some cases could not answer them within the defined timeouts. Not all video calls and/or eSignings could be completed.
Posted Feb 17, 2020 - 18:33 CET

Monitoring
A fix has been implemented and we are monitoring the results.
Posted Feb 17, 2020 - 16:17 CET

Identified
We identified the root cause of the service unavailability. Most of the services are already working as expected.
Posted Feb 17, 2020 - 15:16 CET

Investigating
We are currently experiencing a partial service unavailability in our European data-center. We are investigating the issue.
Posted Feb 17, 2020 - 15:02 CET

This incident affected: IDnow Solutions (Video-Ident, eSigning QES, eSigning AES, API, AutoIdent).