Cloud dashboard is offline

Incident Report for LiveKit

Postmortem

17:45 UTC: A new deployment of the cloud dashboard service was triggered. Its configuration contained decommissioned database endpoints, so the new service failed to establish connections to them. Subsequent pings on the failed connections caused panics, and the service went offline.

18:15 UTC: We identified the decommissioned database endpoints and updated the configuration, and the service came back online.

Remediation steps:

  1. Add checks to prevent pings on failed database connections
  2. Read and sync configuration from the central configuration repository so that deployments do not pick up stale configuration
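
As an illustration of the first remediation step, here is a minimal sketch of a guarded ping. It is not the dashboard's actual code: the report does not name the service's language or database driver, so Python's standard sqlite3 module stands in for the real database, and the function names connect and ping are hypothetical. The idea is that a failed connection attempt yields None rather than a half-initialized handle, and the health check reports failure instead of raising.

```python
import sqlite3
from typing import Optional


def connect(path: str) -> Optional[sqlite3.Connection]:
    """Return a connection, or None if the endpoint cannot be opened.

    sqlite3 is only a stand-in for the real database (an assumption).
    """
    try:
        # mode=rw makes the open fail if the database file does not exist,
        # instead of silently creating a new one.
        return sqlite3.connect(f"file:{path}?mode=rw", uri=True)
    except sqlite3.Error:
        return None


def ping(conn: Optional[sqlite3.Connection]) -> bool:
    """Report whether the connection is usable, without ever raising."""
    if conn is None:
        # The connection attempt failed earlier, so there is nothing to ping.
        return False
    try:
        conn.execute("SELECT 1")
        return True
    except sqlite3.Error:
        # A closed or broken connection counts as an unhealthy endpoint.
        return False
```

With this shape, a caller can treat ping(connect(endpoint)) returning False as a signal to skip or retry the endpoint, rather than crashing on a connection that was never established.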
Posted Feb 08, 2024 - 13:00 PST

Resolved

This incident has been resolved.
Posted Feb 08, 2024 - 10:39 PST

Monitoring

A fix has been applied, and we are monitoring the service.
Posted Feb 08, 2024 - 10:19 PST

Identified

The issue has been identified and a fix has been applied.
Posted Feb 08, 2024 - 10:18 PST

Investigating

We are currently investigating the issue.
Posted Feb 08, 2024 - 09:49 PST
This incident affected: Cloud Dashboard (cloud.livekit.io).