Intermittent errors for Google Gemini models via Inference

Incident Report for LiveKit

Resolved

This incident has been resolved. The errors were caused by an upstream issue at our model provider (Google) that incorrectly triggered an account-level usage cap on our Gemini API access. As a result, a subset of Gemini requests routed through LiveKit Inference returned rate-limit (429) errors. Google identified the cause, rolled back the change on their side, and raised our account limits to prevent recurrence. Gemini requests are now being served normally and error rates have returned to baseline. Other models and providers were unaffected throughout.

Posted Jun 11, 2026 - 16:47 PDT

Identified

We have identified the cause as an upstream limit on our Google Gemini API account. Automatic failover to an alternate Gemini deployment is in place and has reduced the impact, but a subset of requests to Google Gemini models may still intermittently return errors when failover capacity is exceeded. We are actively engaged with Google support to restore full capacity and will share further updates as we have them. Other models and providers remain unaffected.

Posted Jun 11, 2026 - 12:09 PDT

Investigating

We are currently investigating intermittent 429 errors for Google Gemini models routed through LiveKit Inference.

Posted Jun 11, 2026 - 10:52 PDT

This incident affected: Global Inference.
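
For callers who saw the intermittent 429 errors described above, one common client-side mitigation is to retry with capped exponential backoff and jitter. The sketch below is illustrative only and is not LiveKit's or Google's API: `RateLimitError`, `call_with_backoff`, and all parameter defaults are hypothetical names and values chosen for the example, and `request` stands in for whatever function issues your model call.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on HTTP 429."""


def call_with_backoff(request, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call request(), retrying on RateLimitError.

    Uses capped exponential backoff with full jitter. The defaults are
    assumptions for illustration, not recommended production values.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: sleep a random fraction of the ceiling so that
            # many clients do not retry in lockstep.
            sleep(rng() * ceiling)
```

The `sleep` and `rng` parameters exist only so the behavior can be exercised without real waiting. Retries of this kind smooth over short bursts of rate limiting; they do not help when the upstream limit stays exhausted for an extended period.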