Rate Limiting with Dependencies
Learn how to implement rate limiting using FastAPI's dependency injection system with in-memory request tracking.
🎯 What You'll Learn
- Understand rate limiting concepts and sliding window algorithms
- Implement a rate limiter using FastAPI dependencies
- Track request counts per client using in-memory storage
- Return appropriate HTTP 429 responses when limits are exceeded
Theory
Rate limiting restricts the number of requests a client can make to your API within a given time window. It protects your server from abuse, ensures fair usage, and prevents resource exhaustion.
Why Rate Limiting Matters
Without rate limiting, a single client can:
- Overwhelm your server with too many requests
- Consume all available resources, denying service to others
- Scrape your data or abuse your API endpoints
- Drive up infrastructure costs unexpectedly
Common Rate Limiting Algorithms
Fixed Window
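Counts requests in fixed time intervals (e.g., 100 requests per minute). Simple, but a client can burst at a window boundary: 90 requests at the end of one window and 90 at the start of the next both pass a 100-per-minute limit, even though 180 requests arrive within a few seconds.

    |---Window 1---|---Window 2---|
        90 requests 90 requests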
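The boundary burst can be reproduced with a small, framework-free counter. This is a minimal sketch, not part of the lesson's code: the `FixedWindowLimiter` class, its `allow` method, and the explicit `now` argument (used in place of a real clock so the run is repeatable) are all illustrative assumptions.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Hypothetical helper: counts requests per client in fixed, aligned windows."""

    def __init__(self, max_requests: int, window_seconds: int) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # (client_id, window_index) -> number of requests accepted in that window.
        # Old windows are never purged here; a real limiter would need to do that.
        self.counts = defaultdict(int)

    def allow(self, client_id: str, now: float) -> bool:
        # Every timestamp inside the same interval maps to one window index
        window_index = int(now // self.window_seconds)
        key = (client_id, window_index)
        if self.counts[key] >= self.max_requests:
            return False
        self.counts[key] += 1
        return True


limiter = FixedWindowLimiter(max_requests=100, window_seconds=60)

# 90 requests in the last second of window 1 (t = 59 s)
late = sum(limiter.allow("client-a", now=59.0) for _ in range(90))
# 90 more in the first second of window 2 (t = 60 s)
early = sum(limiter.allow("client-a", now=60.0) for _ in range(90))

print(late + early)  # 180 requests accepted within about one second
```

Each window stays under its own limit, so nothing is rejected even though the short-term rate is nearly double the intended one.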
Sliding Window
Tracks individual request timestamps and counts requests within a rolling time period. Smoother than fixed windows and prevents boundary bursts.
    |------60 second window------|
     t1  t2  t3  ...  tN  [new request]
Token Bucket
A bucket holds tokens that are consumed per request and refilled at a steady rate. Allows short bursts while maintaining a long-term average rate.
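A token bucket can be sketched in a few lines. The `TokenBucket` class below is a hypothetical illustration rather than code from this lesson; it assumes one token per request and takes the current time as an explicit `now` argument so the behavior is easy to follow.

```python
class TokenBucket:
    """Hypothetical helper: capacity bounds the burst, refill_rate the average."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0) -> None:
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start full so an initial burst is allowed
        self.last_refill = now

    def allow(self, now: float) -> bool:
        # Credit tokens for the time elapsed since the last call, capped at capacity
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(capacity=5, refill_rate=1.0)

burst = [bucket.allow(now=0.0) for _ in range(6)]
print(burst)                  # [True, True, True, True, True, False]
print(bucket.allow(now=2.0))  # True: two tokens were refilled after 2 seconds
```

The capacity controls how large a burst is tolerated, while the refill rate sets the sustained rate a client can maintain.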
Implementing Rate Limiting with Dependencies
FastAPI's dependency injection system is ideal for cross-cutting concerns like rate limiting. A dependency function can:
- Inspect the incoming request
- Check against stored rate limit data
- Raise an exception if the limit is exceeded
- Return useful information (like remaining requests)
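    from fastapi import HTTPException, Request

    def rate_limiter(request: Request) -> int:
        client_ip = request.client.host if request.client else "unknown"
        # Placeholder values: a real limiter looks these up for client_ip
        over_limit, remaining_requests = False, 10
        if over_limit:
            raise HTTPException(status_code=429, detail="Rate limit exceeded")
        return remaining_requests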
The Sliding Window Approach
In this lesson, we use a sliding window algorithm with in-memory storage:
- Store timestamps - Keep a list of request times per client IP
- Clean expired entries - Remove timestamps older than the window
- Check the count - If the window already holds the maximum number of requests, reject with 429
- Record the request - Otherwise, add the current timestamp to the list
    import time

    from fastapi import HTTPException, Request

    # Maps each client IP to the timestamps of its requests in the current window
    request_counts: dict[str, list[float]] = {}

    def rate_limiter(request: Request) -> int:
        client_ip = request.client.host if request.client else "unknown"
        current_time = time.time()
        window = 60  # window length in seconds (1 minute)
        max_requests = 10

        # Keep only the timestamps that are still inside the window
        recent = [
            t for t in request_counts.get(client_ip, [])
            if current_time - t < window
        ]
        request_counts[client_ip] = recent

        if len(recent) >= max_requests:
            raise HTTPException(status_code=429, detail="Rate limit exceeded")

        # Record this request and report how many remain in the window
        recent.append(current_time)
        return max_requests - len(recent)
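To see how the window behaves over time, it can help to run the same four steps without FastAPI and with explicit timestamps. The sketch below is an illustration under assumptions: `make_sliding_window` and `check` are hypothetical names, `None` stands in for the 429 response, and a `deque` replaces the list so expired entries can be dropped from the front.

```python
from collections import defaultdict, deque
from typing import Optional


def make_sliding_window(max_requests: int, window_seconds: float):
    """Return a check(client_id, now) function with its own private state."""
    # client_id -> timestamps of accepted requests, oldest first
    history = defaultdict(deque)

    def check(client_id: str, now: float) -> Optional[int]:
        timestamps = history[client_id]
        # Drop timestamps that have fallen out of the window
        while timestamps and now - timestamps[0] >= window_seconds:
            timestamps.popleft()
        if len(timestamps) >= max_requests:
            return None  # over the limit: the caller would respond with 429
        timestamps.append(now)
        return max_requests - len(timestamps)  # requests remaining

    return check


check = make_sliding_window(max_requests=3, window_seconds=60)

print([check("10.0.0.1", now=t) for t in (0, 10, 20, 30)])  # [2, 1, 0, None]
print(check("10.0.0.1", now=60))  # 0: the request from t=0 has expired
print(check("10.0.0.2", now=30))  # 2: each client has its own window
```

Because the limits are parameters, the same factory can produce limiters with different settings. In a FastAPI app, a thin dependency could call `check(client_ip, time.time())`, raise `HTTPException(status_code=429)` when it returns `None`, and be attached to an endpoint with `Depends()`.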
Using Dependencies for Cross-Cutting Concerns
Dependencies are powerful for implementing concerns that span multiple endpoints:
| Concern | Dependency Pattern |
|---|---|
| Rate Limiting | Track requests, enforce limits |
| Authentication | Validate tokens, return user |
| Authorization | Check permissions, raise 403 |
| Logging | Record request metadata |
| Caching | Check cache before processing |
HTTP 429 Too Many Requests
The HTTP 429 status code tells clients they have sent too many requests. Best practice is to include information about when they can retry:
    raise HTTPException(
        status_code=429,
        detail="Rate limit exceeded",
        headers={"Retry-After": "60"},  # seconds the client should wait before retrying
    )
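One way to tell clients when they can retry is the standard `Retry-After` header, whose value can be a number of seconds. For a sliding window, a reasonable value is the time until the oldest tracked request expires. The helper below is a hypothetical sketch; `seconds_until_retry` and its parameters are not part of the lesson's code.

```python
import math


def seconds_until_retry(timestamps: list, now: float, window: float) -> int:
    """Whole seconds until the oldest in-window request expires (at least 1)."""
    oldest = min(timestamps)
    return max(1, math.ceil(oldest + window - now))


# Oldest request at t=100 s, window of 60 s, rejected at t=130.5 s
wait = seconds_until_retry([100.0, 110.0, 125.0], now=130.5, window=60)
headers = {"Retry-After": str(wait)}
print(headers)  # {'Retry-After': '30'}
```

The resulting dictionary can be passed to `HTTPException` through its `headers` argument, so the 429 response carries the wait time alongside the error detail.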
Key Concepts
- Sliding Window - A rolling time period that smoothly tracks request rates
- Depends() - FastAPI's mechanism for injecting dependency return values into endpoints
- request.client.host - Accesses the client's IP address from the request
- HTTP 429 - The standard status code for rate limit exceeded responses
- In-Memory Storage - Using Python dictionaries to track state (suitable for single-process apps)
- Window Expiration - Cleaning out old timestamps to maintain accurate counts
Best Practices
- Always identify clients reliably (IP address, API key, or authentication token)
- Return informative error messages with 429 responses so clients know when to retry
- Clean expired entries on every request, and periodically evict clients with no recent requests, so the in-memory store does not grow without bound
- Use in-memory storage for development; consider Redis or similar for production
- Apply rate limiting selectively: not every endpoint needs the same limits
- Consider different rate limits for different user tiers or API key levels
- Document your rate limits in your API documentation so consumers can plan accordingly
- Test rate limiting behavior with concurrent requests to verify correctness
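The last point matters because FastAPI runs regular `def` dependencies in a thread pool, so several requests can touch the shared state at once. The sketch below shows one way to check correctness under concurrency, assuming a hypothetical `LockedSlidingWindow` class that guards the check-and-record step with a lock and tracks a single client for brevity.

```python
import threading
from collections import deque
from concurrent.futures import ThreadPoolExecutor


class LockedSlidingWindow:
    """Hypothetical single-client sliding window with an atomic check-and-record."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()  # accepted request times, oldest first
        self.lock = threading.Lock()

    def allow(self, now: float) -> bool:
        # Without the lock, two threads could both pass the length check
        # before either appends, letting more than max_requests through.
        with self.lock:
            while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
                self.timestamps.popleft()
            if len(self.timestamps) >= self.max_requests:
                return False
            self.timestamps.append(now)
            return True


limiter = LockedSlidingWindow(max_requests=10, window_seconds=60)

# Fire 200 "requests" at the same instant from 20 worker threads
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(lambda _: limiter.allow(now=0.0), range(200)))

print(sum(results))  # 10
```

A lock only protects state within one process. With multiple worker processes or servers, each would keep its own counts, which is one reason a shared store such as Redis is suggested for production.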
💡 Hint
Create a dependency function that tracks request timestamps per IP in a dictionary. Clean expired entries, check the count against the limit, and raise HTTPException(status_code=429) if exceeded.