The SLA badge issue for Zendesk Pod 13 was resolved by redeploying the malfunctioning Kubernetes pod. Redeploying the pod restored the missing SLA events, which were then backfilled. However, the backfill inadvertently removed SLA data on closed tickets, which now show 'Null' SLA data in Explore. The redeployment served as the quick fix to restore the main service; a detailed investigation then followed in a safe testing environment to understand the root cause.
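The backfill side effect described above (existing SLA data on closed tickets replaced by 'Null') is the kind of problem a guarded merge can avoid. The following is a minimal sketch of that idea, not Zendesk's actual backfill logic: the function name `merge_sla_backfill`, the record layout, and the field names are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict


def merge_sla_backfill(
    existing: Dict[str, Any],
    recomputed: Dict[str, Any],
    status: str,
) -> Dict[str, Any]:
    """Merge recomputed SLA fields into a ticket record without erasing data.

    Hypothetical sketch: field names and the "closed" status value are
    assumptions, not part of any real Zendesk API.
    """
    # Assumption: closed tickets are treated as immutable, so the backfill
    # never touches them.
    if status == "closed":
        return dict(existing)

    merged = dict(existing)
    for field, value in recomputed.items():
        # Only write values the backfill actually produced; a None result
        # must not overwrite data that is already present.
        if value is not None:
            merged[field] = value
    return merged
```

Under these assumptions, a closed ticket keeps its stored values even when the recomputed payload contains `None`, while an open ticket only gains fields the backfill was able to compute.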
The SLA badges missing on February 9, 2024, were caused by a malfunction in one of the Kubernetes pods in Pod 13. The pod restarted unexpectedly, which disrupted the 'redis' host, a critical component of the Metric Event Service (MES)…
Following the SLA badge incident in Pod 13, several remediation steps were taken. These included exploring better ways to organize and pass environment variables to ensure readiness during system restarts, improving the turnaround time for fixing…
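One way to picture the environment-variable remediation mentioned above is a startup check that refuses to report a service as ready until the variables it depends on are present. This is only an illustrative sketch under stated assumptions: the variable names `REDIS_HOST` and `REDIS_PORT` and the helper functions are hypothetical and do not come from the incident report.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Hypothetical variable names, used only for illustration.
REQUIRED_VARS = ("REDIS_HOST", "REDIS_PORT")


def missing_env_vars(
    env: Optional[Mapping[str, str]] = None,
    required: Sequence[str] = REQUIRED_VARS,
) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def is_ready(env: Optional[Mapping[str, str]] = None) -> bool:
    """Report readiness only when every required variable is present."""
    return not missing_env_vars(env)
```

A readiness endpoint built on a check like `is_ready()` would keep a freshly restarted service out of rotation until its configuration is complete, instead of letting it start with a missing host setting.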
The SLA badge issue had a significant impact on closed tickets in Zendesk. During the resolution process, the backfill/restoration of data inadvertently removed SLA data on closed tickets, resulting in 'Null' SLA data in Explore. This was an…
For more information about Zendesk system incidents, you can check the system status page. This page provides current system status information and usually includes a summary of post-mortem investigations a few days after an incident has ended. If…