Rate limiting

How the 10 req/sec limit works, what 429 looks like, and how to retry with backoff.

Both the Platform API and APIchat cap requests at 10 per second per token. Calls beyond that limit return 429 Too Many Requests.

This is a hard ceiling — it isn't a soft target you can negotiate up. Design your integration to stay below it, and to back off gracefully when you do hit it.

What 429 looks like

HTTP/1.1 429 Too Many Requests
Content-Type: application/json

{
  "errors": {
    "rate": ["Too many requests"]
  }
}

(APIchat returns the simpler { "success": false, "error": "..." } shape — see Errors.)

There's no Retry-After header to read; you decide the wait yourself.
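Because the two APIs use different error shapes, it can help to normalize a 429 into a single message before logging or retrying. A minimal Python sketch, assuming the body has already been parsed from JSON; the function name and the fallback string are illustrative, not part of either API:

```python
def rate_limit_message(status_code, body):
    """Return a readable message if the response is a 429, else None.

    `body` is the already-parsed JSON payload (a dict), or None if the
    body was empty or not valid JSON.
    """
    if status_code != 429:
        return None
    if not isinstance(body, dict):
        return "Too Many Requests"  # assumed fallback text
    # Platform API shape: {"errors": {"rate": ["Too many requests"]}}
    errors = body.get("errors")
    if isinstance(errors, dict):
        messages = [
            str(m)
            for msgs in errors.values()
            if isinstance(msgs, list)
            for m in msgs
        ]
        if messages:
            return "; ".join(messages)
    # APIchat shape: {"success": false, "error": "..."}
    if body.get("success") is False and body.get("error"):
        return str(body["error"])
    return "Too Many Requests"  # assumed fallback text
```

With the `requests` library this could be called as `rate_limit_message(res.status_code, res.json())`, guarding the `res.json()` call in case the body isn't JSON.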

Retry with exponential backoff

The standard pattern: catch 429, sleep for a growing interval, retry. Cap the number of attempts so you don't loop forever.

async function callWithRetry(url, options, { maxAttempts = 5 } = {}) {
  let attempt = 0;
  while (true) {
    const res = await fetch(url, options);
    if (res.status !== 429) return res;
    if (++attempt >= maxAttempts) return res; // give up, caller handles
    const wait = Math.min(2 ** attempt * 100, 5000) + Math.random() * 100;
    await new Promise(r => setTimeout(r, wait));
  }
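}

import random
import time

import requests

def call_with_retry(method, url, *, max_attempts=5, **kwargs):
    for attempt in range(max_attempts):
        res = requests.request(method, url, **kwargs)
        if res.status_code != 429:
            return res
        if attempt == max_attempts - 1:
            break  # out of attempts; don't sleep before giving up
        # Exponential backoff capped at 5 s, plus up to 100 ms of jitter.
        wait = min(2 ** attempt * 0.1, 5) + random.random() * 0.1
        time.sleep(wait)
    return res  # still 429; the caller handles it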

Two details worth keeping:

  • Jitter. The + random() term spreads retries out when multiple clients hit the limit simultaneously. Without it, every client retries in lockstep and you keep getting 429.
  • A ceiling. 2 ** attempt grows fast. Cap it so a misbehaving service doesn't sleep for an hour.
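To see how quickly the ceiling kicks in, here is the Python wait formula from above with the jitter term left out. The helper name `backoff_delay` is only for illustration:

```python
def backoff_delay(attempt, base=0.1, cap=5.0):
    """Seconds to wait before retrying after failed attempt `attempt` (0-based), without jitter."""
    return min(2 ** attempt * base, cap)

schedule = [backoff_delay(a) for a in range(8)]
print(schedule)  # [0.1, 0.2, 0.4, 0.8, 1.6, 3.2, 5.0, 5.0]
```

The uncapped value for the seventh entry would be 6.4 seconds, so from that point on every wait is pinned at 5 seconds plus jitter.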

Staying under the limit

For bursty workloads, throttle on the client side. A simple pacer that spaces calls at a fixed interval gets you most of the way:

class RateLimit {
  constructor(perSecond = 10) {
    this.interval = 1000 / perSecond;
    this.last = 0;
  }
  async wait() {
    const now = Date.now();
    const delay = Math.max(0, this.last + this.interval - now);
    this.last = now + delay;
    if (delay) await new Promise(r => setTimeout(r, delay));
  }
}

const limit = new RateLimit(8); // headroom under the 10/s cap
for (const customer of customers) {
  await limit.wait();
  await sendMessage(customer);
}

Run a bit below the cap (8/sec is a reasonable working target) so transient bursts don't trip 429.
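The RateLimit class above spaces calls evenly, so it never lets a burst through. If small bursts matter, a token bucket is the usual alternative. The following is a minimal Python sketch, not a definitive implementation: the class name, the defaults, and the injectable `clock` and `sleep` parameters are all assumptions. How the server windows its 10/sec count isn't stated here, so the defaults keep `rate + capacity` at 10 to stay within the cap over any one-second span.

```python
import time

class TokenBucket:
    """Allow bursts of up to `capacity` calls, refilling at `rate` tokens per second."""

    def __init__(self, rate=8.0, capacity=2.0, clock=time.monotonic, sleep=time.sleep):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity      # start with a full bucket
        self.clock = clock
        self.sleep = sleep
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def acquire(self):
        """Block until one token is available, then consume it."""
        self._refill()
        while self.tokens < 1:
            # Sleep just long enough for the missing fraction of a token to refill.
            self.sleep((1 - self.tokens) / self.rate)
            self._refill()
        self.tokens -= 1
```

Call `bucket.acquire()` before each request. This sketch is not thread-safe; wrap `acquire` in a lock if several threads share one bucket.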

Patterns that hit the limit

  • Bulk send loops. Firing Promise.all over more than 10 customers from one process exceeds the cap immediately. Throttle to about 8/sec.
  • Tight polling. Don't poll GET /customers/ in a while (true) without sleeping; you'll burn the budget on no-op reads.
  • Webhooks fanning out into return calls. If your hook handler makes a Platform API call per inbound message and you receive bursty traffic, queue the calls and drain at ≤ 10/sec.
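For the webhook case, one way to queue and drain is a single worker that pulls jobs off a queue and paces the outgoing calls. This is a rough Python sketch under stated assumptions: `worker`, the `send` callable, and the `None` shutdown sentinel are hypothetical names, and `clock` and `sleep` are injectable only to make the pacing easy to test.

```python
import queue
import time

def worker(jobs, send, per_second=8, clock=time.monotonic, sleep=time.sleep):
    """Drain `jobs`, leaving at least 1/per_second seconds between calls to `send`."""
    interval = 1.0 / per_second
    next_slot = clock()
    while True:
        job = jobs.get()        # blocks until the webhook handler enqueues something
        if job is None:         # sentinel: stop the worker
            return
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)
        send(job)
        # Measure the gap from the end of this call, which errs on the slow side.
        next_slot = clock() + interval
```

The webhook handler then only does `jobs.put(payload)` and returns, while the worker runs in the background, for example via `threading.Thread(target=worker, args=(jobs, send), daemon=True).start()`.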

When 429 isn't transient

The retry pattern above assumes 429 is rare and clears within a few hundred milliseconds. If you keep hitting 429 for several seconds at a stretch, you're sending faster than 10/sec on average, and backoff won't help: you need to reduce your request rate at the source.
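To tell the two cases apart in practice, you could count recent 429s and raise an alert once they stop looking transient. A small Python sketch follows; the class name, the 5-second window, and the threshold of 10 are arbitrary assumptions to tune for your own traffic:

```python
from collections import deque

class ThrottleMonitor:
    """Track recent 429 responses and report when they look sustained."""

    def __init__(self, window=5.0, threshold=10):
        self.window = window        # seconds of history to keep
        self.threshold = threshold  # 429s inside the window that count as sustained
        self.hits = deque()         # timestamps of recent 429s, oldest first

    def record(self, status_code, now):
        """Note a response; only 429s are remembered."""
        if status_code == 429:
            self.hits.append(now)

    def sustained(self, now):
        """True if at least `threshold` 429s occurred in the last `window` seconds."""
        while self.hits and now - self.hits[0] > self.window:
            self.hits.popleft()
        return len(self.hits) >= self.threshold
```

Pass `time.monotonic()` as `now` in real code. When `sustained` returns True, lowering the client-side rate is the fix, not adding more retries.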