Webhook retries
Overview
This section describes how webhook deliveries are retried after a failure. The retry mechanism is designed to recover from temporary issues while spacing attempts out so that the client's server is not overwhelmed.
Retry conditions
The system retries the webhook under the following circumstances:
Connection errors
Any connection error triggers a retry, for example:
- ECONNABORTED
- ECONNREFUSED
- ECONNRESET
- ETIMEDOUT
- ENOTFOUND
HTTP status codes
The following HTTP response status codes trigger a retry:
- 403: Forbidden
- 429: Too Many Requests
- 502: Bad Gateway
- 503: Service Unavailable
- 504: Gateway Timeout
Most of these status codes indicate that the receiving server is overloaded or temporarily unavailable, so a later attempt may succeed.
In these cases, the service resends the webhook after a delay.
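The retry conditions above can be sketched as a single predicate. This is a minimal illustration, not the service's actual code: the `DeliveryOutcome` type and the `shouldRetry` function are hypothetical names introduced here, and the sketch assumes that every connection error is retried regardless of its code.

```typescript
// Hypothetical shape for the result of one delivery attempt (an assumption for illustration).
type DeliveryOutcome =
  | { kind: "connectionError"; code: string } // e.g. "ECONNRESET", "ETIMEDOUT"
  | { kind: "response"; status: number };

// HTTP status codes listed above as retryable.
const RETRYABLE_STATUS_CODES: readonly number[] = [403, 429, 502, 503, 504];

// Returns true when the webhook should be sent again after a delay.
function shouldRetry(outcome: DeliveryOutcome): boolean {
  if (outcome.kind === "connectionError") {
    // Any connection error is retried, whatever its code.
    return true;
  }
  return RETRYABLE_STATUS_CODES.indexOf(outcome.status) !== -1;
}
```

Under these assumptions, a response with any other status code, such as 200 or 500, is not retried.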
Retry logic
Overview
Retries use exponential backoff with jitter: the delay between retries grows exponentially with each retry attempt, and a random component spreads retries out so that many retries do not reach the server at the same moment.
- Maximum retry cycle is 24h: we keep trying to resend a webhook for 24 hours, then stop retrying.
- Base delay is 10s: the first retry happens 10 seconds after the initial webhook attempt fails.
- Maximum delay is 8h: because the delay increases exponentially, it is capped at 8 hours.
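As a rough illustration of the 24-hour retry cycle, the check below decides whether another retry may still be scheduled. It is a sketch under an assumption the text does not state explicitly: the 24 hours are measured from the first failed delivery attempt. The names `MAX_RETRY_WINDOW_MS` and `canScheduleRetry` are hypothetical.

```typescript
// 24 hours in milliseconds (the maximum retry cycle).
const MAX_RETRY_WINDOW_MS = 24 * 60 * 60 * 1000;

// Assumption: the retry window starts at the first failed delivery attempt.
// Both arguments are timestamps in milliseconds, e.g. values from Date.now().
function canScheduleRetry(firstFailureAtMs: number, nextAttemptAtMs: number): boolean {
  return nextAttemptAtMs - firstFailureAtMs <= MAX_RETRY_WINDOW_MS;
}
```

A retry whose scheduled time falls outside the window is dropped, which ends the retry cycle for that webhook.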
Delay calculation
The delay between retries is calculated as follows:
delayMs = min(maxDelayMs, baseDelayMs * 2 ** retryAttempt); // maxDelayMs = 8h, baseDelayMs = 10s
halfDelayMs = delayMs / 2;
finalDelayMs = halfDelayMs + random(0, halfDelayMs); // jitter: result lies between halfDelayMs and delayMs
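The calculation above could be written in TypeScript roughly as follows. This is a sketch rather than the service's implementation: `computeRetryDelayMs` is a hypothetical name, `retryAttempt` is assumed to start at 0 for the first retry, and the random source is passed in as a parameter (defaulting to `Math.random`) so the result can be made deterministic in tests.

```typescript
const BASE_DELAY_MS = 10 * 1000; // 10s
const MAX_DELAY_MS = 8 * 60 * 60 * 1000; // 8h

// Assumption: retryAttempt is zero-based (0 for the first retry).
// `random` must return a number in [0, 1), like Math.random.
function computeRetryDelayMs(
  retryAttempt: number,
  random: () => number = Math.random,
): number {
  // Exponential growth, capped at the maximum delay.
  const delayMs = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** retryAttempt);
  // Keep half of the delay fixed and randomize the other half (jitter).
  const halfDelayMs = delayMs / 2;
  return halfDelayMs + random() * halfDelayMs;
}
```

With a zero-based counter, the first retry lands between 5 and 10 seconds after the failure. The 8-hour cap takes effect from `retryAttempt` 12 onward, because 10s * 2^12 = 40,960s exceeds 28,800s, while 10s * 2^11 = 20,480s does not.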