blocks and users APIs experiencing higher-than-expected latencies and error rates

Resolved
Partial outage
Started 2 months ago
Lasted 4 minutes

Affected

APIs
blocks.dopt.com
users.dopt.com
Updates
  • Resolved

    At 10:59 AM, we re-deployed the services that had deadlocked while waiting for other services to come up. This resolved the higher-than-expected latencies and error rates.

    At the peak (~10:58 AM), fewer than 3% of requests had to be retried. All systems have been back to normal since the re-deploy.

    We're actively working on mitigating these deadlocks and improving k8s coordination.

  • Investigating

    We're investigating an incident that was automatically triggered by a health-check failure.

    Starting at 10:57 AM, we received reports of higher-than-expected p99 latency and HTTP 500 errors on both the blocks and users APIs.
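
For API consumers: the updates above mention that a small share of requests had to be retried during the incident. A client can absorb brief 5xx spikes like this with retries and exponential backoff. The following is only a minimal sketch, not part of any Dopt SDK; `TransientError`, `call_with_retries`, and all parameter names and defaults are hypothetical, and `request_fn` stands in for whatever call your client makes to the blocks or users API.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for a retryable failure, such as an HTTP 5xx response."""


def call_with_retries(request_fn, max_attempts=4, base_delay=0.2, sleep=time.sleep):
    """Call request_fn, retrying on TransientError with exponential backoff and jitter.

    All names and defaults here are illustrative assumptions.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts; surface the error to the caller
            # Base delay doubles after each failure; jitter spreads out retries
            # so that many clients do not retry at the same instant.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            sleep(delay)
```

Capping the number of attempts keeps a longer outage from turning into unbounded retry traffic; the `sleep` parameter exists only so the delay can be replaced in tests.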