Book title: TypeScript Microservices
Author: Parth Ghiya
Chapter length: 1030 words
Last updated: 2021-06-25 21:48:39

Proxy routing and throttling

Use this pattern when you have multiple microservices that you want to expose through a single endpoint, with that endpoint routing each request to the appropriate service as needed. It is helpful when you need to handle transient failures by retrying a failed operation, which improves the stability of the application. It is also helpful when you want to control the consumption of resources used by a microservice.

This pattern is used to meet the agreed SLAs and to keep resource consumption under control even when an increase in demand places extra load on resources:

Problem: When a client has to consume a multitude of microservices, challenges soon turn up, such as the client having to manage each endpoint and set up separate connections to each one. If you refactor any part of the code in any service, the client must also be updated, because the client is in direct contact with the endpoint. Further, as these services are in the cloud, they have to be fault tolerant. Faults include temporary loss of connectivity or unavailability of services. These faults should be self-correcting. For example, a database service that is handling a large number of concurrent requests should throttle further requests until the memory load and resource utilization have decreased. When the request is retried, the operation completes. The load on any application varies drastically over time. For example, a social media chatting platform will have very little load during peak office hours, and a shopping portal will have extreme load during festive season sales. For a system to perform efficiently, it has to meet the agreed SLA; once the load exceeds the agreed limit, subsequent requests need to be stopped until the load has decreased.
Solution: Place a gateway layer in front of the microservices. This layer includes the throttle component as well as the retry-on-failure component. With the addition of this layer, the client needs to interact only with the gateway rather than with each individual microservice. The gateway abstracts backend calls from the client and thus keeps the client side simple. Any number of services can be added without changing the client at any point in time. This pattern can also be used to handle versioning effectively. A new version of a microservice can be deployed in parallel, and the gateway can route to it based on the input parameters passed. New changes can be easily maintained by just a configuration change at the gateway level. This pattern can be used as an alternative strategy to auto-scaling. The gateway layer should allow network requests only up to a certain limit, then throttle further requests and retry them once the resources have been released. This helps the system to maintain its SLAs. The following points should be considered while implementing the throttle component:
One of the parameters to consider for throttling is the number of requests per user or per tenant. If a specific tenant or user triggers the throttle, it is reasonable to assume that there is some issue with that caller.
Throttling doesn't necessarily mean stopping the requests. Lower quality resources, if available, can be served instead, for example, a mobile-friendly site, a lower quality video, and so on. Google does the same.
Maintaining priority among microservices. Based on their priority, requests can be placed in the retry queue. As an ideal solution, three queues can be maintained: cancel, retry, and retry after some time.
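
The gateway layer described above can be illustrated with a small sketch. The following TypeScript is only one possible shape, written under several assumptions: the names Route, resolveRoute, TokenBucket, TenantThrottle, and Decision are hypothetical, the token-bucket algorithm is chosen here for illustration (the text does not prescribe a throttling algorithm), and all state is kept in memory for a single gateway instance. It shows version-based routing and a per-tenant throttle that can signal a degraded response instead of a rejection.

```typescript
// Illustrative sketch only: every name below is hypothetical, not from a library.

interface Route {
  prefix: string;  // URL prefix exposed by the gateway
  version: string; // e.g. taken from a request header or query parameter
  target: string;  // internal address of the microservice
}

// Pick the backend for a request; returns undefined when nothing matches.
function resolveRoute(routes: Route[], path: string, version: string): string | undefined {
  const match = routes.find((r) => path.startsWith(r.prefix) && r.version === version);
  return match ? match.target : undefined;
}

type Decision = "allow" | "degrade" | "reject";

// Token bucket: allows bursts up to `capacity`, refilled at `refillPerSecond`.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly refillPerSecond: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefillMs = nowMs;
  }

  // Refill according to elapsed time, then try to consume one token.
  tryTake(nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per tenant, so a noisy tenant cannot starve the others.
class TenantThrottle {
  private readonly buckets = new Map<string, TokenBucket>();
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = () => Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock (an assumption made to keep the sketch testable)
  }

  // "degrade" tells the gateway it may serve a cheaper variant instead of rejecting.
  check(tenantId: string, canDegrade: boolean): Decision {
    const nowMs = this.now();
    let bucket = this.buckets.get(tenantId);
    if (!bucket) {
      bucket = new TokenBucket(this.capacity, this.refillPerSecond, nowMs);
      this.buckets.set(tenantId, bucket);
    }
    if (bucket.tryTake(nowMs)) return "allow";
    return canDegrade ? "degrade" : "reject";
  }
}
```

A gateway handler might call resolveRoute first and then throttle.check(tenantId, canDegrade) before forwarding the request. Because the gateway should run in multiple instances, a real deployment would likely need the counters in a shared store rather than in process memory; that part is left out of this sketch.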

Take care of: The following are some of the most common pitfalls that we can come across while implementing this pattern:
The gateway can be a single point of failure. Proper steps have to be taken during development to ensure that it is fault tolerant. It should also be run in multiple instances.
The gateway should have proper memory and resource allocation; otherwise, it will become a bottleneck. Proper load testing should be done to ensure that failures do not cascade.
Routing can be done based on IP, header, port, URL, request parameter, and so on.
The retry policy should be crafted very carefully based on the business requirements. In some places it is acceptable to show a "please try again" message rather than having waiting periods and retries. The retry policy may also affect the responsiveness of the application.
To be effective, this pattern should be combined with the Circuit Breaker pattern.
A service should be retried if, and only if, it is idempotent. Retrying other services may have unhealthy consequences. For example, if there is a payment service that waits for responses from other payment gateways, the retry component may conclude that the call has failed and send another request, and the customer gets charged twice.
The retry logic should treat different exceptions differently, depending on the type of exception.
Retry logic should not disturb transaction management. The retry policy should be chosen accordingly.
All failures that trigger a retry should be logged and handled properly for future scenarios.
An important point to consider is that this pattern is no replacement for exception handling. Exception handling should always be given first priority, as it does not introduce an extra layer or add latency.
Throttling should be added early in the system as it's difficult to add once the system is implemented; it should be carefully designed.
Throttling should be performed quickly. It should be smart enough to detect an increase in activity and react accordingly by taking appropriate measures.
The choice between throttling and auto-scaling should be made based on business requirements.
The requests that are throttled should be effectively placed in a queue based on priority.
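
The retry-related points above can be sketched as follows. This is a minimal example under stated assumptions, not a definitive implementation: the names Operation, RetryOptions, isTransient, shouldRetry, backoffDelayMs, and withRetry are hypothetical, the exponential backoff schedule is a choice made for this example, and errors are assumed to carry a transient flag so that different exceptions can be treated differently. Only idempotent operations are retried, and every failure that triggers a retry is logged.

```typescript
// Illustrative sketch only: every name below is hypothetical, not from a library.

interface Operation<T> {
  name: string;
  idempotent: boolean; // only idempotent operations are ever retried
  run: () => Promise<T>;
}

interface RetryOptions {
  maxAttempts: number;                    // total attempts, including the first
  baseDelayMs: number;                    // delay before the first retry
  isTransient: (error: unknown) => boolean;
  sleep: (ms: number) => Promise<void>;   // injectable so tests need not wait
  log: (message: string) => void;
}

// Assumption: transient faults are marked with a `transient: true` property.
function isTransient(error: unknown): boolean {
  return typeof error === "object" && error !== null &&
    (error as { transient?: boolean }).transient === true;
}

// Exponential backoff: base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * Math.pow(2, attempt - 1);
}

// Retry only idempotent operations, only on transient errors, and only
// while attempts remain.
function shouldRetry(
  op: { idempotent: boolean },
  error: unknown,
  attempt: number,
  opts: RetryOptions,
): boolean {
  return op.idempotent && opts.isTransient(error) && attempt < opts.maxAttempts;
}

async function withRetry<T>(op: Operation<T>, opts: RetryOptions): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await op.run();
    } catch (error) {
      // Non-retryable failures go straight back to normal exception handling.
      if (!shouldRetry(op, error, attempt, opts)) throw error;
      const delay = backoffDelayMs(attempt, opts.baseDelayMs);
      opts.log(`${op.name}: attempt ${attempt} failed, retrying in ${delay} ms`);
      await opts.sleep(delay);
    }
  }
}

// Default sleep for real use; tests can substitute a no-op.
const realSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));
```

With this shape, a payment call would be declared with idempotent: false, so a timeout is surfaced to the caller immediately instead of risking a double charge, while a read-only lookup can be retried safely.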

When to use: This pattern is very handy in the following scenarios:
To ensure that agreed SLAs are maintained.
To prevent a single microservice from consuming the majority of the resource pool and depleting resources.
To handle sudden bursts in consumption of microservices.
To handle transient and short-lived faults.

When not to use: In the following scenarios, this pattern should not be used:
Throttling shouldn't be used as a means to handle exceptions.
When faults are long-lasting. Applying this pattern in that case will severely affect the performance and responsiveness of the application.