All Collections
Crash reports
Incident report 11th May 2022
Incident report 11th May 2022

Incident report on the slowdowns and crash of the 11th May 2022

Félix Vercouter avatar
Written by Félix Vercouter
Updated over a week ago

Summary

During the night of Tuesday May 10th to Wednesday May 11th 2022, we deployed a new logging tool on the infrastructure. This tool, which passed our staging and pre-production validations, turned out to be misconfigured for the production volume. The developers collected as much data as possible on the cause of the problem and the configuration needed to solve it.


Cause

Implementation of a tool

A new logging tool allowing us to track errors, failed actions, and unexpected problems on the platform was deployed. This tool analyzes everything that happens on the portal and digests an extremely large amount of data. This data transit has been tested on our two test environments and has passed validation. In practice, the production was finally slowed down and even some intermittent service interruptions.


Solution

Rollback

We did a rollback of the production to cancel the production release of the tool. This solved the problem immediately.

Corrective mesures

The developers are correcting the configuration of the tool and its integration on the infrastructure so that it no longer encroaches on the data traffic.


Conclusion

The Dokeos team sincerely apologizes for the inconvenience caused on Wednesday, May 11, 2022. The corrective actions described above are already being implemented as you read this.

Did this answer your question?