Exponential Backoff In Go

Exponential backoff is useful for operations that need to be retried, but at a controlled pace. Some cases where you might need it:

  • DynamoDB updates: If your queries use a ConditionExpression and several updates hit the same row within a span of 10-30ms, one of them might fail because of your expression. Retrying with exponential backoff helps ensure the entries get updated. A popular Go DynamoDB lib uses it by default.
  • Trying to set up a MySQL connection.
  • Retrying an API call.

We could implement an exponential backoff algorithm ourselves, but there is no need to when a library already covers the basic use case. We are going to discuss cenkalti's backoff lib.

We first need to understand some key parameters, which unfortunately are not explained well in the docs.

type ExponentialBackOff struct {
    InitialInterval     time.Duration
    RandomizationFactor float64
    Multiplier          float64
    MaxInterval         time.Duration
    MaxElapsedTime time.Duration
    Stop           time.Duration
    Clock          Clock

    currentInterval time.Duration
    startTime       time.Time
}
  • InitialInterval: The retry interval used for the first retry.

  • RandomizationFactor: When computing the wait before the next retry, the lib uses this factor to add randomness to the interval with the formula:

    Randomized interval = RetryInterval * (random value in [1 - RandomizationFactor, 1 + RandomizationFactor])
    so if RetryInterval = 2 and RandomizationFactor = 0.5
    Randomized interval = a random value in [1, 3]
    
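  • Multiplier: After each retry, RetryInterval is multiplied by Multiplier, and the randomization above is then applied to the new value. So with Multiplier = 2, the example above gives

    Next RetryInterval = 2 * 2 = 4
    Randomized interval = 4 * [0.5, 1.5] = [2, 6], i.e. between 2 and 6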
    
  • MaxInterval: Caps RetryInterval. The cap applies before randomization, so the randomized interval can still exceed it.

  • MaxElapsedTime: After MaxElapsedTime the ExponentialBackOff returns Stop. It never stops if MaxElapsedTime == 0.

With the lib's default settings, i.e. `backoff.Retry(yourRetryFn, backoff.NewExponentialBackOff())`, the retry intervals will be:

// The default max elapsed time is 15 minutes.
// The default retry intervals are shown below, in seconds.
//  #          RetryInterval           Randomized interval
//  1          0.5                     [0.25,   0.75]
//  2          0.75                    [0.375,  1.125]
//  3          1.125                   [0.562,  1.687]
//  4          1.687                   [0.8435, 2.53]
//  5          2.53                    [1.265,  3.795]
//  6          3.795                   [1.897,  5.692]
//  7          5.692                   [2.846,  8.538]
//  8          8.538                   [4.269, 12.807]
//  9         12.807                   [6.403, 19.210]
// ...

If you want the retries to give up after a short duration, lower MaxElapsedTime:

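b := backoff.NewExponentialBackOff()
b.MaxElapsedTime = 2 * time.Second // give up after about 2 seconds in total
err := backoff.Retry(cb, b)        // cb is your func() error; err is non-nil if the retries ran out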

You can also use a more custom set of configs, based on how fast the request you are retrying is.

- When requests are very fast.
// For a request with ~30ms latency, this will do at most 3~4 retries, with a max total latency
// of 250~380ms.
  b := &backoff.ExponentialBackOff{
      InitialInterval:     50 * time.Millisecond,
      RandomizationFactor: 0.5,
      Multiplier:          2,
      MaxInterval:         150 * time.Millisecond,
      MaxElapsedTime:      250 * time.Millisecond,
      Stop:                backoff.Stop, // left at its zero value, Retry never sees backoff.Stop
      Clock:               backoff.SystemClock,
  }

- When requests are medium fast.
// For a request with ~100ms latency, this will do at most 5 retries, with a max total latency
// of 1~1.5s.
  b := &backoff.ExponentialBackOff{
      InitialInterval:     100 * time.Millisecond,
      RandomizationFactor: 0.5,
      Multiplier:          1.5,
      MaxInterval:         500 * time.Millisecond,
      MaxElapsedTime:      1 * time.Second,
      Stop:                backoff.Stop, // left at its zero value, Retry never sees backoff.Stop
      Clock:               backoff.SystemClock,
  }
    
- When requests are slow.
// For a request with ~300ms latency, this will do at most 8-9 retries, with a max total latency
// of 10-14s.
  b := &backoff.ExponentialBackOff{
      InitialInterval:     100 * time.Millisecond,
      RandomizationFactor: 0.5,
      Multiplier:          2,
      MaxInterval:         3 * time.Second,
      MaxElapsedTime:      10 * time.Second,
      Stop:                backoff.Stop, // left at its zero value, Retry never sees backoff.Stop
      Clock:               backoff.SystemClock,
  }

I hope the above makes deciding on an ideal configuration less intimidating.

Thanks!