Hello, I am bugfree Assistant. Feel free to ask me any question related to this problem.
Functional Requirements:
The system should limit the number of requests a user can make to a specific API within a defined time window (e.g., 100 requests per minute per user per API).
The rate limit configuration should be per user per API endpoint for simplicity.
If a user exceeds the allowed rate, further requests should be rejected until the window resets.
The system should provide an API to check if a request is allowed and, if not, indicate when the user can retry.
The rate limiting algorithm should allow for short bursts (e.g., using a token bucket algorithm).
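The functional requirements above can be sketched as a small token bucket check. This is a minimal, single-process illustration, not a production design: the names `TokenBucketLimiter`, `Decision`, `check`, `capacity`, and `refill_per_sec` are hypothetical, and the in-memory dictionary stands in for whatever shared store a real deployment would use. Each (user, endpoint) pair gets its own bucket, `capacity` bounds the burst size, and a rejected request is told how long to wait before one token is available.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Decision:
    allowed: bool
    retry_after: float  # seconds until one token is available; 0.0 when allowed


class TokenBucketLimiter:
    """Hypothetical in-memory token bucket, keyed per user per endpoint."""

    def __init__(self, capacity: int, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic):
        self.capacity = capacity              # maximum burst size
        self.refill_per_sec = refill_per_sec  # sustained request rate
        self.clock = clock                    # injectable so tests can control time
        # (user_id, endpoint) -> (tokens remaining, time of last update)
        self._buckets: Dict[Tuple[str, str], Tuple[float, float]] = {}

    def check(self, user_id: str, endpoint: str) -> Decision:
        key = (user_id, endpoint)
        now = self.clock()
        # A key seen for the first time starts with a full bucket.
        tokens, last = self._buckets.get(key, (float(self.capacity), now))
        # Refill in proportion to elapsed time, never beyond capacity.
        tokens = min(float(self.capacity), tokens + (now - last) * self.refill_per_sec)
        if tokens >= 1.0:
            self._buckets[key] = (tokens - 1.0, now)
            return Decision(True, 0.0)
        self._buckets[key] = (tokens, now)
        # Time needed to accumulate the missing fraction of one token.
        return Decision(False, (1.0 - tokens) / self.refill_per_sec)
```

Under these assumptions, the example limit of 100 requests per minute would correspond to `TokenBucketLimiter(capacity=100, refill_per_sec=100 / 60)`. A caller would reject the request when `allowed` is false and could surface `retry_after` to the client, for instance in a retry hint.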
Non-Functional Requirements:
The system should be highly available and reliable, as it is on the critical path of request processing.
The response time for rate limit checks should be low (target: under 10 ms per request).
The system should be horizontally scalable to handle high request volumes (e.g., millions of requests per second).
The system should be easy to configure and maintain.
The system should be resilient to failures, such as server crashes or cache outages.
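One way to think about the resilience requirement is to decide up front what a check returns when the limiter's backing store is unreachable. The sketch below assumes a fail-open policy (allow the request when the check itself fails), which is a design choice and not something the requirements state; the helper name `check_with_fallback` and its parameters are hypothetical.

```python
from typing import Callable


def check_with_fallback(primary: Callable[[], bool], fail_open: bool = True) -> bool:
    """Run the rate limit check; fall back to a fixed answer if the store is down.

    `primary` is any zero-argument callable that returns True when the request
    is allowed. Connection and timeout errors are treated as an outage.
    """
    try:
        return primary()
    except (ConnectionError, TimeoutError):
        # Assumption: availability matters more than strict enforcement,
        # so an outage allows traffic unless fail_open is set to False.
        return fail_open
```

Failing open keeps the limiter from taking down the request path during a cache outage, at the cost of temporarily unenforced limits. Passing `fail_open=False` gives the opposite trade-off.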