Service outage
Incident Report for Nosto
Resolved
We experienced increased latency that caused response timeouts for Search, Category Merchandising 2.0, and Suggestion requests. The resulting downtime lasted around 9 minutes.

The downtime was caused by a faulty recovery process that is intended to automatically restart nodes when they run out of resources. While the restart itself worked as intended, the affected node caused issues when it came back online.

We are investigating why the restart caused issues even though our infrastructure provides enough redundancy to cover faulty nodes. Moving forward, we are also extending our monitoring to detect resource shortages before they can affect the system's availability.
Posted Nov 29, 2023 - 12:51 UTC