Free Tool By GitIntel

API Rate Limit Design Guide: Algorithms, Thresholds, and Implementation

Rate limiting is one of those features that seems simple until you implement it in production. The algorithm choice, threshold values, and response format each have concrete effects on user experience and API abuse resistance.

Four common algorithms:

Fixed window: count requests per time window (e.g., 100 requests per minute). Simple to implement and understand. Weakness: allows 2x the limit in a burst at window boundaries — 100 requests at minute 0:59, then 100 more at minute 1:00.
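
The boundary weakness is easier to see in code. The following is a minimal in-memory sketch, not a production limiter: the class name, the injectable `clock` parameter, and the `boundary_burst_demo` helper are all hypothetical names introduced for illustration.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each fixed window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so a window boundary can be simulated
        # (key, window index) -> count; old windows are never evicted in this sketch
        self.counts = defaultdict(int)

    def allow(self, key):
        window_index = int(self.clock() // self.window_seconds)
        bucket = (key, window_index)
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True


def boundary_burst_demo():
    """Send 100 requests at 0:59 and 100 more at 1:00; return how many pass."""
    now = [59.0]
    limiter = FixedWindowLimiter(limit=100, window_seconds=60, clock=lambda: now[0])
    accepted = sum(limiter.allow("user-1") for _ in range(100))
    now[0] = 60.0  # one second later, but already a new window
    accepted += sum(limiter.allow("user-1") for _ in range(100))
    return accepted
```

With a limit of 100 per minute, `boundary_burst_demo()` returns 200: both batches pass even though they arrive one second apart.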

Sliding window log: store the timestamp of each request, count requests in the trailing window. Accurate and burst-resistant, but memory-intensive at high scale (you store one entry per request per user).
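
As a rough illustration of the log approach, the sketch below keeps a deque of timestamps per key and drops entries that have aged out of the trailing window. It is a single-process sketch under assumed names (`SlidingWindowLog`, `clock`); a shared store would be needed across multiple servers.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLog:
    """Allow at most `limit` requests per key in any trailing window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.logs = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key):
        now = self.clock()
        log = self.logs[key]
        cutoff = now - self.window_seconds
        # Evict timestamps that are no longer inside the trailing window.
        while log and log[0] <= cutoff:
            log.popleft()
        if len(log) >= self.limit:
            return False
        log.append(now)  # one stored entry per accepted request
        return True
```

Because every accepted request is stored until it ages out, memory per key grows up to `limit` entries, which is the cost the description above refers to.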

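Sliding window counter: approximate the sliding window using weighted counts from the current and previous window. The estimate assumes the previous window's requests were evenly spread, so it can be slightly off, but it needs only two counters per user (O(1) memory). Cloudflare has described using this approach, and Upstash's rate limiting library offers it as an option.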
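
One way to sketch the weighting: the previous window's count is scaled by the fraction of it that still overlaps the trailing window, then added to the current count. The class and parameter names below are hypothetical, and the state is kept in process memory only.

```python
import time


class SlidingWindowCounter:
    """Approximate a sliding window with two counters per key."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.state = {}  # key -> (window index, current count, previous count)

    def allow(self, key):
        now = self.clock()
        index = int(now // self.window_seconds)
        elapsed_fraction = (now % self.window_seconds) / self.window_seconds
        stored_index, current, previous = self.state.get(key, (index, 0, 0))
        if stored_index == index - 1:
            previous, current = current, 0  # rolled over into the next window
        elif stored_index != index:
            previous, current = 0, 0  # idle for more than one full window
        # Weight the previous window by how much of it is still in view.
        estimated = previous * (1.0 - elapsed_fraction) + current
        allowed = estimated + 1 <= self.limit
        if allowed:
            current += 1
        self.state[key] = (index, current, previous)
        return allowed
```

With a limit of 100 and a 60-second window, 100 requests at 0:59 leave no room at 1:00 and room for 50 more at 1:30, when half of the previous window still counts.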

Token bucket: each user accumulates tokens at a fixed rate, and each request consumes one token. This allows bursts up to the bucket size, then enforces the steady-state rate. It is the most flexible choice for APIs where occasional bursts are acceptable. Stripe has described using a token bucket for its rate limiters.
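
A minimal single-bucket sketch follows. It refills lazily on each call rather than with a background timer, which is a common simplification; the names `TokenBucket`, `rate_per_second`, and `capacity` are assumptions, and a real service would keep one bucket per user.

```python
import time


class TokenBucket:
    """Refill at `rate_per_second` up to `capacity`; each request costs tokens."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full so an initial burst is allowed
        self.updated_at = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        elapsed = now - self.updated_at
        # Lazy refill: credit the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens < cost:
            return False
        self.tokens -= cost
        return True
```

With `rate_per_second=1` and `capacity=5`, a client can send five requests at once, after which it is held to one request per second.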

Response headers matter for developer experience (DX). Return X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset on every response, and return Retry-After on 429 responses. Well-implemented rate limit responses let API clients back off gracefully instead of guessing when to retry.
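
A sketch of how these headers might be assembled is shown below. The function name and parameters are hypothetical. The X-RateLimit-* headers are a convention rather than a standard, so the sketch assumes X-RateLimit-Reset carries a Unix timestamp in seconds (some APIs send a delta instead), and it sends Retry-After as a number of seconds.

```python
import math


def rate_limit_headers(limit, remaining, reset_at, now, rejected=False):
    """Build rate limit headers; `reset_at` and `now` are Unix timestamps in seconds."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(int(reset_at)),  # assumption: epoch seconds
    }
    if rejected:
        # Delay in seconds, rounded up so clients never retry too early.
        headers["Retry-After"] = str(max(0, math.ceil(reset_at - now)))
    return headers
```

A 429 response would be built with `rejected=True`; successful responses carry only the three X-RateLimit-* headers.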

Threshold guidance for REST APIs: 100-1000 requests/minute per authenticated user is a reasonable default for most use cases. Set burst limits 3-5x higher than sustained limits, and use separate limits for read and write endpoints.
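
One possible way to encode this guidance as configuration is sketched below. The specific numbers are illustrative picks from within the ranges above, the names `LimitPolicy` and `policy_for` are hypothetical, and treating the burst limit as a token bucket capacity is an assumption rather than the only interpretation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitPolicy:
    sustained_per_minute: int
    burst_multiplier: float = 3.0

    @property
    def burst_capacity(self) -> int:
        # Assumption: the burst limit maps to a token bucket's capacity.
        return int(self.sustained_per_minute * self.burst_multiplier)

    @property
    def refill_per_second(self) -> float:
        return self.sustained_per_minute / 60.0


# Illustrative values only: reads get a higher sustained rate than writes.
POLICIES = {
    "read": LimitPolicy(sustained_per_minute=600, burst_multiplier=3.0),
    "write": LimitPolicy(sustained_per_minute=100, burst_multiplier=5.0),
}


def policy_for(method: str) -> LimitPolicy:
    """Pick the read or write policy from the HTTP method."""
    is_read = method.upper() in {"GET", "HEAD", "OPTIONS"}
    return POLICIES["read" if is_read else "write"]
```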

Frequently Asked Questions

What's the difference between rate limiting and throttling?

Rate limiting rejects requests that exceed the threshold with a 429 status code. Throttling slows requests down rather than rejecting them — the API processes them but at a reduced rate. Rate limiting is more common for public APIs because it's simpler and gives clients clear feedback. Throttling is sometimes used for internal services where dropping requests is unacceptable.
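
The difference can be sketched as a single decision function. This is a simplified illustration that assumes a token-bucket-style budget; the function name, the `mode` values, and the return shape are hypothetical.

```python
from typing import Tuple


def decide(tokens: float, refill_per_second: float, mode: str) -> Tuple[int, float]:
    """Return (HTTP status, seconds until one token is available)."""
    if tokens >= 1.0:
        return 200, 0.0  # within budget: process immediately
    wait = (1.0 - tokens) / refill_per_second
    if mode == "throttle":
        # Throttling: hold the request for `wait` seconds, then process it.
        return 200, wait
    # Rate limiting: reject now; `wait` can be sent to the client as Retry-After.
    return 429, wait
```

In both modes the wait is the same; what changes is whether the server absorbs the delay or hands it back to the client.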

Should I rate limit by IP or by user?

Both. IP rate limiting provides a first layer of defense before authentication and protects against unauthenticated abuse. Per-user (API key or authenticated session) rate limiting is the main protection layer — it's fair to all users regardless of shared IPs (corporate NAT, mobile carriers). Apply stricter limits to unauthenticated endpoints.
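
A small sketch of how the counter key and limit might be chosen per request is shown below. The limit values and names are hypothetical, and the sketch covers only the key selection, not the counting itself.

```python
from typing import Optional, Tuple

# Hypothetical per-minute limits, chosen only for illustration.
UNAUTHENTICATED_IP_LIMIT = 60
AUTHENTICATED_USER_LIMIT = 600


def limit_for(client_ip: str, api_key: Optional[str]) -> Tuple[str, int]:
    """Return the (counter key, per-minute limit) that applies to a request."""
    if api_key is None:
        # No identity yet: the IP address is the only handle, so use the stricter limit.
        return f"ip:{client_ip}", UNAUTHENTICATED_IP_LIMIT
    # Authenticated: count per API key so users behind a shared NAT keep separate budgets.
    return f"user:{api_key}", AUTHENTICATED_USER_LIMIT
```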

How do I implement rate limiting in Redis?

Use Upstash Rate Limit (open-source, built for Upstash Redis) or implement a sliding window yourself. The sorted-set recipe is a sliding window log: a MULTI/EXEC block with ZREMRANGEBYSCORE, ZADD, and ZCARD on a sorted set keyed by user ID. Set a TTL on the key with EXPIRE so that sets for idle users are cleaned up and do not grow without bound. A sliding window counter needs only two integer keys per user, incremented with INCR. Upstash's hosted Redis is priced per command (about $0.20 per 100K commands), which keeps costs low at low-to-medium traffic.
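
The sorted-set approach might look like the sketch below. It assumes `client` is a redis-py `redis.Redis` instance (created elsewhere, for example with `redis.Redis()`), and the function name, the `ratelimit:` key prefix, and the `now` parameter are hypothetical. The cleanup of a rejected entry runs outside the transaction, so it is not strictly atomic under heavy concurrency.

```python
import time
import uuid


def allow_request(client, user_id, limit, window_seconds, now=None):
    """Sliding window log on one Redis sorted set per user.

    `client` is assumed to be a redis-py `redis.Redis` instance.
    """
    now = time.time() if now is None else now
    key = f"ratelimit:{user_id}"
    # Unique member so two requests with the same timestamp do not collide.
    member = f"{now}:{uuid.uuid4().hex}"

    pipe = client.pipeline(transaction=True)  # sent as MULTI ... EXEC
    pipe.zremrangebyscore(key, 0, now - window_seconds)  # drop aged-out entries
    pipe.zadd(key, {member: now})  # record this request, scored by time
    pipe.zcard(key)  # count entries in the trailing window
    pipe.expire(key, int(window_seconds) + 1)  # TTL cleans up idle users
    _, _, count, _ = pipe.execute()

    if count > limit:
        # Do not let rejected requests consume the budget.
        client.zrem(key, member)
        return False
    return True
```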
