Retries

Handle transient failures with automatic retry logic and backoff strategies. Make your application resilient to network issues and server errors.

Transient network issues or server overloads can cause temporary failures. Caller's built-in retry system helps make your application more resilient by automatically retrying failed requests with configurable backoff strategies.


Basic Usage

Enable retries on any request using .retry(maxAttempts, options?):

// Retry up to 3 times (4 total attempts including the initial one)
const result = await api.get('/health')
  .retry(3)
  .execute();

The first parameter maxAttempts is the number of retry attempts — the initial request is always made, so total attempts = 1 + maxAttempts.
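
To make the counting concrete, here is a minimal sketch of a retry loop. `runWithRetries` is a hypothetical helper, not Caller's implementation, and it deliberately omits delays and status-code checks. It only illustrates that `maxAttempts` counts retries on top of the initial call.

```typescript
// Hypothetical helper for illustration only; not part of Caller.
// Runs `operation` once, then retries up to `maxAttempts` more times on failure.
function runWithRetries<T>(maxAttempts: number, operation: (attempt: number) => T): T {
  let lastError: unknown;
  // attempt 0 is the initial request; attempts 1..maxAttempts are retries
  for (let attempt = 0; attempt <= maxAttempts; attempt++) {
    try {
      return operation(attempt);
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}

// An operation that always fails, used to count how many calls are made.
let calls = 0;
try {
  runWithRetries(3, () => {
    calls++;
    throw new Error('service unavailable');
  });
} catch {
  // All attempts failed: 1 initial call + 3 retries
}
console.log(calls); // 4
```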


Configuration

Customize retry behavior with the second parameter:

await api.get('/external-service')
  .retry(5, {
    baseDelay: 500,           // Initial delay in ms (default: 1000)
    backoff: 'exponential',   // 'linear' or 'exponential' (default: 'linear')
    retryOnStatus: [502, 503], // Only retry on these status codes
  })
  .execute();

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| maxAttempts | number | (required) | Number of retry attempts. 3 = up to 4 total calls. |
| backoff | 'linear' or 'exponential' | 'linear' | Delay strategy between retries |
| baseDelay | number | 1000 | Initial delay in milliseconds |
| retryOnStatus | number[] | [408, 429, 500, 502, 503, 504] | HTTP status codes that trigger retry |
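
One way to picture how the defaults combine with the options you pass is the sketch below. The names `RetryOptionsSketch`, `DEFAULT_RETRY_OPTIONS`, and `resolveRetryOptions` are assumptions made for illustration; Caller's real types and internals may differ.

```typescript
// Hypothetical types mirroring the documented options; not Caller's actual types.
type BackoffStrategy = 'linear' | 'exponential';

interface RetryOptionsSketch {
  backoff?: BackoffStrategy;
  baseDelay?: number; // milliseconds
  retryOnStatus?: number[];
}

// Default values as documented for each option.
const DEFAULT_RETRY_OPTIONS: Required<RetryOptionsSketch> = {
  backoff: 'linear',
  baseDelay: 1000,
  retryOnStatus: [408, 429, 500, 502, 503, 504],
};

// Merge user-supplied options over the defaults.
function resolveRetryOptions(options: RetryOptionsSketch = {}): Required<RetryOptionsSketch> {
  return { ...DEFAULT_RETRY_OPTIONS, ...options };
}

const resolved = resolveRetryOptions({ baseDelay: 500, backoff: 'exponential' });
console.log(resolved.baseDelay);     // 500
console.log(resolved.backoff);       // 'exponential'
console.log(resolved.retryOnStatus); // [408, 429, 500, 502, 503, 504]
```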

Backoff Strategies

Linear Backoff

Delay increases linearly: baseDelay × attempt

Attempt 1 (retry): 1000ms
Attempt 2 (retry): 2000ms
Attempt 3 (retry): 3000ms

await api.get('/api')
  .retry(3, { backoff: 'linear', baseDelay: 1000 })
  .execute();
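
The linear formula can be checked with a few lines of arithmetic. `linearDelay` is a hypothetical name used only for this sketch; it is not exported by Caller.

```typescript
// Delay before a given retry under linear backoff: baseDelay × attempt.
// `attempt` is the 1-based retry number.
function linearDelay(baseDelay: number, attempt: number): number {
  return baseDelay * attempt;
}

// Schedule for .retry(3, { backoff: 'linear', baseDelay: 1000 })
const linearSchedule = [1, 2, 3].map((attempt) => linearDelay(1000, attempt));
console.log(linearSchedule); // [1000, 2000, 3000]
```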

Exponential Backoff

Delay doubles each attempt: baseDelay × 2^(attempt - 1)

Attempt 1 (retry): 1000ms  (1000 × 1)
Attempt 2 (retry): 2000ms  (1000 × 2)
Attempt 3 (retry): 4000ms  (1000 × 4)

await api.get('/api')
  .retry(3, { backoff: 'exponential', baseDelay: 1000 })
  .execute();
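
The same formula extended to five retries shows how quickly the cumulative wait grows. `exponentialDelay` is a hypothetical helper for this sketch, not a Caller export.

```typescript
// Delay before a given retry under exponential backoff: baseDelay × 2^(attempt - 1).
// `attempt` is the 1-based retry number.
function exponentialDelay(baseDelay: number, attempt: number): number {
  return baseDelay * 2 ** (attempt - 1);
}

// Schedule for .retry(5, { backoff: 'exponential', baseDelay: 1000 })
const exponentialSchedule = [1, 2, 3, 4, 5].map((attempt) => exponentialDelay(1000, attempt));
console.log(exponentialSchedule); // [1000, 2000, 4000, 8000, 16000]

// Total time spent waiting between attempts if every retry is needed.
const totalDelay = exponentialSchedule.reduce((sum, delay) => sum + delay, 0);
console.log(totalDelay); // 31000
```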

When to use exponential

Use exponential backoff for upstream services that may be overwhelmed. It gives them time to recover without adding more load.


Default Retry Status Codes

Caller retries on these status codes by default:

| Status | Meaning | Why retry? |
| --- | --- | --- |
| 408 | Request Timeout | Server timed out waiting |
| 429 | Too Many Requests | Rate limit; retry after a delay |
| 500 | Internal Server Error | Transient server error |
| 502 | Bad Gateway | Upstream server error |
| 503 | Service Unavailable | Server overloaded |
| 504 | Gateway Timeout | Upstream timeout |
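
The status check itself amounts to a membership test. The sketch below assumes a hypothetical `shouldRetry` helper and covers status codes only; how Caller treats network-level errors that produce no status code is not described here.

```typescript
// Default list as documented for retryOnStatus.
const DEFAULT_RETRY_STATUSES: number[] = [408, 429, 500, 502, 503, 504];

// Hypothetical predicate: retry only if the status is in the configured list.
function shouldRetry(status: number, retryOnStatus: number[] = DEFAULT_RETRY_STATUSES): boolean {
  return retryOnStatus.includes(status);
}

console.log(shouldRetry(503));             // true
console.log(shouldRetry(400));             // false: client errors are not in the list
console.log(shouldRetry(500, [502, 503])); // false: a custom list replaces the defaults
```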

Patterns

Critical Operations with Retry + Fallback

const { data } = await api.post('/payments')
  .body(paymentData)
  .headers({ 'X-Idempotency-Key': idempotencyKey })
  .retry(3, {
    backoff: 'exponential',
    baseDelay: 500,
    retryOnStatus: [408, 429, 502, 503, 504],
  })
  .fallback(() => ({ status: 'pending', retryLater: true }))
  .execute();

Health Checks with Short Retry

const { data } = await api.get('/health')
  .retry(2, {
    baseDelay: 200,
    backoff: 'linear',
    retryOnStatus: [502, 503],
  })
  .timeout(3000)
  .execute();

External API with Aggressive Retry

const { data } = await api.get('/third-party/data')
  .retry(5, {
    backoff: 'exponential',
    baseDelay: 1000,
  })
  .timeout(30_000)
  .execute();

Idempotency Considerations

Safety First — Non-Idempotent Requests

Be very careful when retrying non-idempotent requests such as POST and PATCH. If the server processes a request but the response is lost, a retry could create duplicate resources.

Always use idempotency keys for critical operations:

// ✅ Safe — with idempotency key
await api.post('/orders')
  .headers({ 'X-Idempotency-Key': `order-${userId}-${Date.now()}` })
  .body(orderData)
  .retry(3)
  .execute();

// ⚠️ Risky — without idempotency key
await api.post('/orders')
  .body(orderData)
  .retry(3) // Could create duplicate orders!
  .execute();

Safe to retry without extra precautions:

  • GET — reads are idempotent by definition
  • HEAD — no body, safe to retry
  • DELETE — idempotent (deleting something twice has the same effect)
  • PUT — idempotent by definition (replaces entire resource)

Use idempotency keys for:

  • POST — creating resources
  • PATCH — partial updates (may not be idempotent)
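
The two lists above can be encoded as a small guard that runs before retries are enabled. `needsIdempotencyKey` and `SAFE_TO_RETRY` are hypothetical names for this sketch; Caller does not necessarily perform such a check for you.

```typescript
type HttpMethod = 'GET' | 'HEAD' | 'DELETE' | 'PUT' | 'POST' | 'PATCH';

// Methods listed above as safe to retry without extra precautions.
const SAFE_TO_RETRY: readonly HttpMethod[] = ['GET', 'HEAD', 'DELETE', 'PUT'];

// Returns true when a request should carry an idempotency key before being retried.
function needsIdempotencyKey(method: HttpMethod): boolean {
  return !SAFE_TO_RETRY.includes(method);
}

console.log(needsIdempotencyKey('POST'));  // true
console.log(needsIdempotencyKey('PATCH')); // true
console.log(needsIdempotencyKey('PUT'));   // false
console.log(needsIdempotencyKey('GET'));   // false
```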

Observability

Every retry attempt emits a telemetry event:

// Event: igniter.caller.retry.attempt.started
// Attributes:
//   ctx.retry.attempt: number     — which attempt (1-based)
//   ctx.retry.maxAttempts: number — total allowed retries
//   ctx.retry.delayMs: number     — delay before this attempt
//   ctx.request.method: string    — HTTP method
//   ctx.request.url: string       — request URL

import { IgniterCallerTelemetryEvents } from '@igniter-js/caller/telemetry';

// Monitor retry frequency in production
telemetry.on('igniter.caller.retry.attempt.started', (event) => {
  metrics.increment('api.retries', {
    method: event.attributes['ctx.request.method'],
    attempt: event.attributes['ctx.retry.attempt'],
  });
});

Retry + Telemetry Example

import { IgniterTelemetry } from '@igniter-js/telemetry';
import { IgniterCallerTelemetryEvents } from '@igniter-js/caller/telemetry';
import { IgniterCaller } from '@igniter-js/caller';

const telemetry = IgniterTelemetry.create()
  .withService('my-service')
  .addEvents(IgniterCallerTelemetryEvents)
  .build();

const api = IgniterCaller.create()
  .withBaseUrl('https://api.example.com')
  .withTelemetry(telemetry)
  .build();

// Retries will emit: igniter.caller.retry.attempt.started
const { data } = await api.get('/critical-data')
  .retry(3, { backoff: 'exponential' })
  .execute();

Best Practices

Retry Guidelines

  • Use idempotency keys for POST/PATCH requests
  • Start with linear backoff — switch to exponential if the service is frequently overloaded
  • Set reasonable timeouts — combine .retry() with .timeout() to avoid hanging
  • Use fallback values: call .fallback() for non-critical data when all retries fail
  • Monitor retry rates — high retry frequency indicates a systemic issue
  • Keep maxAttempts low — 3-5 retries is usually enough; beyond that, the service is likely down

Common Mistakes

  • Don't retry indefinitely — set a reasonable maxAttempts
  • Don't retry 400 Bad Request — client errors won't fix themselves
  • Don't use exponential backoff with a very short baseDelay: the first retries fire almost immediately, which defeats the purpose
  • Don't retry without a timeout — the total wait could be very long
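
To see how long the total wait can get, the following sketch estimates a worst case from the documented backoff formulas. It assumes the timeout applies to each attempt separately, which this page does not confirm, and `worstCaseWaitMs` is a hypothetical helper rather than a Caller API.

```typescript
type Backoff = 'linear' | 'exponential';

// Delay before the given 1-based retry, using the documented formulas.
function retryDelay(backoff: Backoff, baseDelay: number, attempt: number): number {
  return backoff === 'linear' ? baseDelay * attempt : baseDelay * 2 ** (attempt - 1);
}

// Worst case: every attempt runs until it times out and every retry waits its full delay.
// Assumption: `timeoutMs` is enforced per attempt, not across the whole retry sequence.
function worstCaseWaitMs(
  maxAttempts: number,
  backoff: Backoff,
  baseDelay: number,
  timeoutMs: number,
): number {
  let total = timeoutMs; // the initial request
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    total += retryDelay(backoff, baseDelay, attempt) + timeoutMs;
  }
  return total;
}

// .retry(5, { backoff: 'exponential', baseDelay: 1000 }).timeout(30_000)
console.log(worstCaseWaitMs(5, 'exponential', 1000, 30_000)); // 211000

// .retry(2, { backoff: 'linear', baseDelay: 200 }).timeout(3000)
console.log(worstCaseWaitMs(2, 'linear', 200, 3000)); // 9600
```

Under these assumptions, the aggressive configuration can block for about three and a half minutes, while the short health-check configuration stays under ten seconds.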
