Cloud Egress service degraded in India
Incident Report for LiveKit
Postmortem

Cloud Egress service was degraded in India for 2.5 hours due to a sudden, large influx of Egress requests in the region. We discovered a bottleneck in the number of DB connections per Cloud Egress instance, which caused prolonged downtime after the automatic scale-up. We have increased the DB worker pool size and the number of Egresses per CPU core. We are confident these changes will mitigate the issue should we receive another large burst of Cloud Egress requests.
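
To illustrate why a fixed DB worker pool can prolong recovery from a burst, here is a minimal back-of-the-envelope sketch. It is not LiveKit's implementation. The function name, the burst size, the pool sizes, and the per-operation time are all hypothetical, and the model assumes every DB operation takes the same time and each worker handles one operation at a time.

```python
import math


def drain_time_seconds(burst_size: int, pool_size: int, seconds_per_op: float) -> float:
    """Estimate how long a fixed-size worker pool needs to clear a burst.

    Simplified model: each worker handles one DB operation at a time and
    every operation takes `seconds_per_op`. Real workloads will vary.
    """
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    # The pool works through the burst in waves of at most `pool_size` operations.
    waves = math.ceil(burst_size / pool_size)
    return waves * seconds_per_op


# Hypothetical numbers, not taken from the incident.
small_pool = drain_time_seconds(burst_size=1000, pool_size=10, seconds_per_op=0.5)
large_pool = drain_time_seconds(burst_size=1000, pool_size=50, seconds_per_op=0.5)
print(small_pool, large_pool)
```

Under these assumptions, a larger pool clears the same burst in proportionally fewer waves, which is the intuition behind increasing the DB worker pool size.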

Posted Sep 20, 2024 - 15:34 PDT

Resolved
Cloud Egress service was degraded in India for 2.5 hours due to a large influx of Egress requests in the region. We are continuing to investigate and explore mitigation strategies.
Posted Sep 20, 2024 - 06:00 PDT