Google Cloud: Rate-limiting strategies and techniques
- Even when rate limiting is implemented entirely on the server side, the client should be engineered to react appropriately.
- Decisions about failing open or failing closed are mostly relevant on the server side, but knowing which retry techniques clients use after a failed request might influence your decisions about server behavior.
- HTTP services most commonly signal that they are applying rate limiting by returning a 429 (Too Many Requests) status code in the HTTP response. A 429 response can provide additional details about why the limit is applied (for example, a freemium user has a lower quota, or the system is undergoing maintenance).
- Build your system with robust error handling in case some part of your rate-limiting strategy fails, and understand what users of your service will receive in those situations. [...]
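
As a rough illustration of the points above, the following Python sketch shows one way a client might react to a 429 response and one way a server might choose between failing open and failing closed when its limiter is unavailable. The names `retry_delay` and `allow_request`, the `base` and `cap` defaults, and the use of exponential backoff with jitter are assumptions made for this example and do not come from the original text. The sketch handles only the delay-in-seconds form of the standard `Retry-After` header, not the HTTP-date form, and expects the header name in exactly that casing.

```python
import random
from typing import Callable, Mapping, Optional


def retry_delay(status: int, headers: Mapping[str, str], attempt: int,
                base: float = 0.5, cap: float = 30.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if the response is not a 429.

    Hypothetical helper: `base` and `cap` are illustrative defaults.
    """
    if status != 429:
        return None
    retry_after = headers.get("Retry-After", "").strip()
    if retry_after.isdigit():
        # The server said how long to wait (in seconds); honor that value.
        return float(retry_after)
    # No usable hint: exponential backoff with full jitter, bounded by `cap`.
    return random.uniform(0, min(cap, base * 2 ** attempt))


def allow_request(check_limit: Callable[[], bool], fail_open: bool) -> bool:
    """Ask the limiter whether to admit a request.

    If the limiter itself raises (for example, its backing store is down),
    fall back to the configured policy: fail open admits the request,
    fail closed rejects it.
    """
    try:
        return check_limit()
    except Exception:
        return fail_open
```

A caller would sleep for the value returned by `retry_delay` before resending, and stop retrying after some maximum number of attempts. Whether `fail_open` should be true depends on whether overloading the backend or rejecting legitimate traffic is the more costly failure for the service in question.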