Dan, a loader.io user, emailed recently to ask about some interesting load test results. The test configuration was fairly simple:
- 250 clients per second for 60 seconds
- a single URL (a simple landing page for his app)
- 50-second timeout
- 100% error threshold (errors are ignored)
loader.io split up Dan’s test and ran it on several load generator servers, hitting the target URL with a combined 250 requests per second over the 60-second duration of the test. A bit of quick math shows that we should see 250 * 60 = 15,000 requests total.
Let’s look at the results:
The green line, representing active clients, climbs quickly up to about 1,500 and stays there for most of the test. What gives? There were only supposed to be 250 per second! 1,500 requests for almost 50 seconds is nearly 75,000 requests. No wonder there are so many errors, right?
The answer lies in how we define the “active clients” represented by that line. Active clients during a given timeframe (in this case, 1 second of the test) are not the same as requests being made. An active client is one that made a new request during that timeframe, or that is still waiting for a response to a request it made previously.
If responses take a long time, active clients waiting for responses can stack up; the steep climb of that green line shows this happening. Stacking clients is an indication that the server is not handling requests fast enough. The steeper the climb, the more clients are waiting too long for a response.
For this type of test, we want to see a flat line, the closer to the number of clients specified in the test the better. In this case, we want to see a flat green line at 250, meaning that each client makes a request and gets a prompt response.
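To make that definition concrete, here is a minimal Python sketch of how active clients could be counted per second. It is only an illustration of the definition above, not loader.io's actual implementation; the `active_clients` function, the `(sent, answered)` tuple format, and the sample data are all made up for this example.

```python
def active_clients(requests, second):
    """Count clients that are active during one second of the test.

    `requests` is a list of (sent, answered) pairs in whole seconds.
    `answered` is None while the client is still waiting for a response.
    """
    count = 0
    for sent, answered in requests:
        still_waiting = answered is None or answered >= second
        if sent <= second and still_waiting:
            count += 1
    return count


# Hypothetical data: 3 new requests per second for 3 seconds.
fast = [(s, s) for s in range(3) for _ in range(3)]     # answered in the same second
slow = [(s, None) for s in range(3) for _ in range(3)]  # never answered

print([active_clients(fast, s) for s in range(3)])  # flat line
print([active_clients(slow, s) for s in range(3)])  # clients stack up
```

With prompt responses the count stays flat at the request rate; with no responses it grows every second, which is the shape of the green line in Dan's test.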
Let’s have a look at the request/response details graph. This one shows some more details about requests and responses during the test:
Using this graph, we can get a pretty good idea of what happened. Understanding this graph will help us understand the first, and hopefully point us to ways we can improve the application being tested. Let’s take it a few seconds at a time. There are four major sections that we want to examine: the first 6 seconds, 6 to 10 seconds, 10 to 52 seconds, and 52 to 57 seconds. We’ll ignore the last few seconds this time, since they aren’t really important here.
The first 6 seconds
- loader made 250 requests per second (light blue area on the details graph)
- loader got ~50 responses per second (green area on the details graph)
The request rate is showing exactly what it should be for this test: 250 requests per second. However, the low rate of responses is a Bad Thing. It does explain the rapid growth of the green line in the first graph though: 250 clients are making requests each second, and only 50 are getting responses. That means the number of active clients is growing by about 200 clients per second. We’ll see what happens to all those active clients towards the end.
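The arithmetic behind that growth can be sketched in a few lines of Python. This is a deliberately simplified model: it assumes constant request and response rates and ignores errors and timeouts, and `simulate_backlog` is a hypothetical helper, not part of loader.io.

```python
def simulate_backlog(request_rate, response_rate, seconds):
    """Return the number of clients still waiting at the end of each second."""
    waiting = 0
    history = []
    for _ in range(seconds):
        waiting += request_rate                  # new requests this second
        waiting -= min(waiting, response_rate)   # responses received this second
        history.append(waiting)
    return history


# Rates from the first 6 seconds of Dan's test.
print(simulate_backlog(250, 50, 6))   # [200, 400, 600, 800, 1000, 1200]

# What we would like to see: every request answered promptly.
print(simulate_backlog(250, 250, 6))  # [0, 0, 0, 0, 0, 0]
```

Under these assumptions the backlog passes 1,000 waiting clients within five seconds, which is in line with the steep climb on the first graph.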
6 to 10 seconds
- 500 errors grow from 0 to ~245 per second (yellow)
- 200 responses decline to between 1 and 6ish per second (green)
Over the course of four seconds, we see successful responses drop quickly to under 10 per second, and 500 errors rise to over 240 per second. We don’t really know why this happens, but it would be really interesting to see what was happening on the server at this point. Something seems to have hit a critical level; Dan didn’t say what exactly was on fire, but *something* definitely was.
This is where you should check your monitoring systems and logs to see what is going on, and is a perfect example of how loader.io can show you what your app’s limits are.
Normally loader would abort the test after a few seconds of this, but since the error threshold was set to 100% it keeps chugging away.
10 to 52 seconds
- steady stream of 500 errors
- between 1 and 11 successful responses per second
Nothing extremely interesting here, but I do want to note that since almost all requests are being promptly responded to with a “503 Service Unavailable” error, the number of active clients (the green line) stops growing as quickly, and mostly flattens out at ~1,400. This also provides us with nice-looking but misleadingly low response times.
52 to 57 seconds
- timeouts grow quickly to 250 per second (orange)
- error responses decrease slightly (yellow)
- total responses received grows (purple)
Timeouts start to spike here because of the 50-second timeout period: all of these timeouts are clients that have been waiting for responses since the first few seconds of the test. This spike in timeouts corresponds to a drop in the number of active clients, which is expected.
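A quick way to check that reading is to work backwards from the moment a timeout is reported to the moment the request was sent. The sketch below assumes a timeout is reported exactly 50 seconds after the request went out; `sent_at` is a hypothetical helper, and the second boundaries are approximate readings from the graph.

```python
TIMEOUT_SECONDS = 50  # timeout configured for Dan's test


def sent_at(timeout_second, timeout=TIMEOUT_SECONDS):
    """Return the second a request was sent, given when it timed out."""
    return timeout_second - timeout


# The timeout spike runs from roughly 52 to 57 seconds into the test.
print(sent_at(52), sent_at(57))  # 2 7
```

So the clients timing out in this window sent their requests about 2 to 7 seconds into the test, around the time successful responses were collapsing.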
The most interesting part of this period is the drop in error responses. It looks like as clients time out and tear down their connections, the server frees up some resources. It is then able to accept a few requests, instead of immediately responding with the 503 error. This is a hint that the server may have simply reached an open socket limit or run out of worker threads to process requests. If the test ran for another 50 seconds past this, we might see another small spike in timeouts.
The total response count grows in the same shape as the timeout spike. loader.io counts timeouts along with other responses, even though a timeout is not a valid HTTP response with a response code and all. A timeout is a type of response; it says “the server is busy”, but not in so many words.
Understanding your loader.io test results is an important step in identifying, understanding, and improving your application’s performance. There are important lessons for you in these charts, if you take the time to understand them.
There are three important things to take away from this case study in particular:
- Use all the graphs. There is important data on each one, and you should not just be worried about response times. You can still have great response times if all responses are 500 errors, for example. Even that bandwidth graph can shed some light from time to time.
- Know your test type. Depending on the type of test, you should expect to see different behavior on that “active clients” line:
  - “Clients per test” and “Clients per second” should have flat green lines; steep growth is a sign of trouble.
  - “Maintain load” tests should have lines growing as specified by the “from” and “to” fields in your test.
- Watch your logs and monitoring tools. When loader.io starts fires on your system, you will want to know exactly when, where, and why so that you can take steps to keep that fire from starting in the first place.
- Bonus tip! Adjusting error threshold and timeout configurations can make a big difference. If Dan had left the timeout at the default of 10 seconds, we might have seen more frequent but smaller spikes of timeouts. If you set the timeout longer than the test duration, you will never see any timeout responses. Take these values into account as you evaluate your own test results, and adjust them as necessary in future tests.
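As a rough sanity check for the bonus tip, the following Python sketch estimates the earliest point at which a timeout could show up for a given configuration. It assumes that a timeout can only be reported once a full timeout period has elapsed and that nothing is recorded after the test duration; `first_timeout_second` is a hypothetical helper, not a loader.io setting or API.

```python
def first_timeout_second(timeout, duration):
    """Return the earliest second a timeout could be reported, or None.

    None means the timeout is longer than the test, so no timeout
    can be reported before the test ends.
    """
    if timeout > duration:
        return None
    return timeout


# A 60-second test with three different timeout settings.
for timeout in (10, 50, 90):
    print(timeout, first_timeout_second(timeout, duration=60))
```

With the default 10-second timeout, trouble becomes visible 10 seconds in; with Dan's 50-second timeout it only shows up near the end; with a 90-second timeout it would not show up at all.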