HTTP Timeouts


When you make a RESTful HTTP call, some frameworks force you to specify timeouts, while others let you leave them unspecified. Define them in any case!
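As a minimal sketch of what setting both timeouts explicitly can look like, here is a version that uses only the Python standard library. The function name `fetch` and the default values of 1 and 5 seconds are illustrative assumptions, not recommendations from any particular framework.

```python
import http.client


def fetch(host, port, path, connect_timeout=1.0, read_timeout=5.0):
    """GET http://host:port/path with separate connect and read timeouts.

    Returns (status, body). Raises OSError (which includes timeout errors)
    when a timeout expires or the connection fails.
    """
    conn = http.client.HTTPConnection(host, port, timeout=connect_timeout)
    try:
        # connect() uses the timeout that was passed to the constructor.
        conn.connect()
        # From here on, every blocking socket operation (send/receive)
        # is limited by the read timeout instead.
        conn.sock.settimeout(read_timeout)
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Note that in this sketch the read timeout limits each individual socket operation, not the total time for the whole response. Many HTTP client libraries behave the same way, so check what your framework actually measures.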

But how long should the timeout be?

Connect Timeout (or Connection Timeout)

The connect timeout is the time allowed for establishing a network connection between your client and the server.

What are the reasons for a server not answering in time?

Your server is far away, on the other side of the planet.

Then your packets will travel over many hops, and establishing a connection takes a long time.

Usually this is not the case in a microservice environment. Your server will sit next door in the same network.

In most microservice environments the connection should be established within a few milliseconds.

The server is under heavy load and cannot answer in time.

What will be the result of a high connect timeout?

Your client service will start more and more HTTP calls, and they will wait a long time and eat up all the threads.

Then your client service will be infected by the high load of the server and become unstable itself.

In this scenario, high connect timeouts might make your whole microservice environment unresponsive!

The server is not there at all; nobody is responding.

Will a longer connect timeout help? No!

So in a typical microservice environment you should set a very small connect timeout. One second might be the upper limit.
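The thread-exhaustion scenario can be made concrete with a rough back-of-the-envelope estimate. While the server is not answering, every outgoing call blocks a thread for the full connect timeout, so the number of blocked threads is roughly the call rate multiplied by the timeout (Little's law). The call rate of 50 per second and the pool size of 200 are made-up numbers for illustration, and the model assumes that no call returns before its timeout expires.

```python
def blocked_threads(calls_per_second, connect_timeout_s):
    """Rough number of threads stuck waiting while the server does not answer."""
    return calls_per_second * connect_timeout_s


def seconds_until_pool_exhausted(pool_size, calls_per_second, connect_timeout_s):
    """Time until every thread is blocked, or None if the pool never runs dry."""
    if blocked_threads(calls_per_second, connect_timeout_s) < pool_size:
        return None
    return pool_size / calls_per_second


# Hypothetical client: 50 outgoing calls per second, 200 worker threads.
for timeout_s in (30.0, 1.0, 0.1):
    print(timeout_s,
          blocked_threads(50, timeout_s),
          seconds_until_pool_exhausted(200, 50, timeout_s))
```

Under these assumptions a 30 second connect timeout exhausts the pool of 200 threads after about 4 seconds, while a 1 second timeout never blocks more than about 50 threads.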

And how much time exactly?

Colin Scott has collected some numbers that you can use to calculate your initial settings when you begin programming your service: Latency Numbers Every Programmer Should Know.

Later you should tune them according to the numbers that you collect from your monitoring.

Is your service slow, and does it use an SQL database? Have a look at The Cost of JDBC Server Roundtrips by Lukas Eder.
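One possible way to turn such latency numbers into a first guess is sketched below. A TCP connection needs about one network round trip before the client considers it established, so the sketch takes a generous multiple of the round trip time and caps it at one second. The two round trip figures are the classic values from the latency list; the safety factor of 20 and the function name are assumptions for illustration only.

```python
# Classic figures from the "Latency Numbers Every Programmer Should Know" list.
ROUND_TRIP_SAME_DATACENTER_S = 0.0005   # about 0.5 ms
ROUND_TRIP_CA_NETHERLANDS_CA_S = 0.150  # about 150 ms


def initial_connect_timeout(round_trip_s, safety_factor=20, upper_limit_s=1.0):
    """First guess: a generous multiple of one round trip, capped at one second."""
    return min(round_trip_s * safety_factor, upper_limit_s)


print(initial_connect_timeout(ROUND_TRIP_SAME_DATACENTER_S))
print(initial_connect_timeout(ROUND_TRIP_CA_NETHERLANDS_CA_S))
```

For a server in the same datacenter this gives roughly 10 milliseconds; for the intercontinental case the cap of one second applies. Treat the result only as a starting point until monitoring data is available.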

Response Timeout

The response timeout is the time allowed for receiving the whole answer from the server.

The timeout might be set to several seconds. How many seconds? That largely depends on your business case:

Operations: Overall timeouts on the network

Your operators will usually set an overall timeout for HTTP requests on the network, depending on what they need in order to operate it.

This timeout is your upper limit.

Circuit Breakers

Maybe you use a Circuit Breaker. A circuit breaker's settings depend on statistics and maths. It usually does not open after a single failing request, because the network is by definition unreliable. Instead, a defined percentage of requests has to fail within a given period of time before the circuit opens. The longer your request timeout is, the longer the circuit breaker needs to detect that something is wrong. Meanwhile, more requests keep going into the failing service. The statistics problem gets worse when the circuit breaker is in the half-open state.
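To illustrate the statistics involved, here is a minimal, single-threaded sketch of a circuit breaker that opens on a failure rate within a sliding time window. It is not the API of any particular library; the class name, the parameters and their default values are assumptions chosen for the example.

```python
import time
from collections import deque


class CircuitBreaker:
    """Opens when the failure rate within a sliding time window is too high."""

    def __init__(self, failure_rate=0.5, window_s=10.0, min_calls=10,
                 open_s=5.0, clock=time.monotonic):
        self.failure_rate = failure_rate  # share of failures that opens the circuit
        self.window_s = window_s          # length of the sliding window
        self.min_calls = min_calls        # do not judge on fewer calls than this
        self.open_s = open_s              # how long the circuit stays fully open
        self.clock = clock
        self.results = deque()            # (timestamp, succeeded) pairs
        self.opened_at = None             # None means the circuit is closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # Half-open: after the open period, trial requests are let through.
        return self.clock() - self.opened_at >= self.open_s

    def record(self, succeeded):
        now = self.clock()
        if self.opened_at is not None:
            # Result of a half-open trial request.
            if succeeded:
                self.opened_at = None
                self.results.clear()
            else:
                self.opened_at = now
            return
        self.results.append((now, succeeded))
        # Drop results that have fallen out of the window.
        while self.results and now - self.results[0][0] > self.window_s:
            self.results.popleft()
        failures = sum(1 for _, ok in self.results if not ok)
        if (len(self.results) >= self.min_calls
                and failures / len(self.results) >= self.failure_rate):
            self.opened_at = now
```

A call that times out is only recorded as a failure after the full timeout has elapsed, so with a long timeout the window fills with failures slowly and the circuit opens late. In the half-open state the decision rests on very few trial requests, which makes the statistics even thinner.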

Add Monitoring

Keep track of the actual response times in your production environment. When they start increasing, you need to act. Give yourself enough time to fix the problem! You will probably not be able to solve it within the last day before the response times hit the timeout limit!
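A simple way to get an early warning is to compare a high percentile of the measured response times with the configured timeout. The sketch below assumes that the response times are available as a list of seconds; the function names and the warning threshold of 50 percent of the timeout are illustrative assumptions.

```python
import statistics


def p99(response_times_s):
    """99th percentile of the observed response times (needs at least 2 samples)."""
    return statistics.quantiles(response_times_s, n=100)[98]


def needs_attention(response_times_s, timeout_s, warn_ratio=0.5):
    """True when the 99th percentile has used up warn_ratio of the timeout."""
    return p99(response_times_s) >= warn_ratio * timeout_s
```

Warning at a fraction of the timeout, rather than at the timeout itself, leaves time to react before requests actually start failing.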

User Experience

When you are shopping on the internet, will you wait for 45 seconds for a page to show up? Surely not!

Jakob Nielsen described 3 time limits for user interaction in his book “Usability Engineering” (1993).

0.1 second: “instantaneous”.

This is irrelevant in our case. If the request travels from the user’s browser to your service, then to another service, and then all the way back, it would take magic to do that in 0.1 seconds. Usually a web application can only meet this limit if no HTTP request is involved.

“1.0 second is about the limit for the user’s flow of thought to stay uninterrupted, even though the user will notice the delay.”

This limit can be met by web applications that make HTTP requests, but only under the best conditions. Under worse conditions the delay will go up to 2 to 5 seconds and more. Usually you should add a loading indicator to the user interface to show that something is going on.

“10 seconds is about the limit for keeping the user’s attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish, so they should be given feedback indicating when the computer expects to be done. Feedback during the delay is especially important if the response time is likely to be highly variable, since users will then not know what to expect.”

You should spend some effort to keep the overall response time below 10 seconds. You may exceed it only if users understand that something really important is going on, know that it is important to wait, and know approximately how long they need to wait. A simple spinner will not be an appropriate loading indicator in this case.

Source: https://www.nngroup.com/articles/response-times-3-important-limits/

Different use cases

Usually you will have different use case classes in your web application. Some requests are called very often and they must be really fast. Some requests are done only a few times a day.

Thinking about a web shop:

Really often and really fast: Browsing the catalogue.

Your users will not wait 10 seconds for an article to show up; they will simply go to another shop.

Medium: The final click in the checkout process.

Up to 10 seconds might be ok with a good loading indicator on the user interface.

Only a few times a day: Changing product prices.

Maybe it’s enough to advise your colleagues to wait.

So you should discuss the non-functional requirement of response time for each of those use case classes with the business people. Then you will need to set up monitoring, alerting and timeouts for each of those classes individually.

Examples

Browsing the catalogue:

Really small timeouts, tight monitoring, circuit breakers and caching. Every second costs money.

The final click in the checkout process:

Better to implement asynchronous execution. The user interface can poll the order status.

Changing product prices:

Configure the maximum timeout that is allowed in the network.
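The idea of one timeout setting per use case class, capped by the overall network timeout, could be sketched as follows. All names and all numbers are made up for illustration, including the assumed network limit of 30 seconds; the real values have to come from your business people, your operators and your monitoring.

```python
from dataclasses import dataclass

NETWORK_MAX_TIMEOUT_S = 30.0  # assumed overall limit set by operations


@dataclass(frozen=True)
class TimeoutPolicy:
    connect_s: float
    response_s: float


# Hypothetical policies for the three use case classes of the web shop.
POLICIES = {
    "browse_catalogue": TimeoutPolicy(connect_s=0.1, response_s=1.0),
    "checkout": TimeoutPolicy(connect_s=0.5, response_s=10.0),
    "change_prices": TimeoutPolicy(connect_s=1.0, response_s=60.0),
}


def effective_policy(use_case):
    """Look up the policy for a use case and cap it at the network-wide limit."""
    policy = POLICIES[use_case]
    return TimeoutPolicy(
        connect_s=min(policy.connect_s, NETWORK_MAX_TIMEOUT_S),
        response_s=min(policy.response_s, NETWORK_MAX_TIMEOUT_S),
    )


print(effective_policy("change_prices"))
```

Here the price change asks for 60 seconds but is capped at the assumed network limit of 30 seconds, since the overall network timeout is the upper limit for every class.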

More general thoughts on this topic: Fallacies of Distributed Systems

Update 2022-03-20: Add link to https://colin-scott.github.io/personal_website/research/interactive_latency.html

Update 2022-04-30: Add link to The Cost of JDBC Server Roundtrips

Update 2022-07-20: Add link to Fallacies of Distributed Systems

Any comments or suggestions? Leave an issue or a pull request!