So, you’ve decided that you’d like to do some load testing on your website. What’s an acceptable response time?
I like to think of response time as a service-level agreement (SLA): the level of responsiveness that you commit to offering your users. According to usability research, a response that takes longer than 1 second interrupts the user’s flow of thought, and one that takes longer than 10 seconds loses the user’s attention.
We also need to think about our SLA in terms of statistics, since we are dealing with multiple responses. We don’t want to just use the mean, because outliers are quite common. Your site may respond to most requests in under 300 milliseconds, but once in a while it may take 1.5 seconds to respond.
Percentiles are a useful statistical tool to avoid the problems of outliers. For example, we might say that we want at least half of the requests to respond in under 400 ms, at least 80% of the requests to respond in under 500 ms, and at least 95% of the requests to respond in under 1 second.
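To make the difference between the mean and percentiles concrete, here is a minimal Python sketch. The sample response times and the nearest-rank `percentile` helper are illustrative assumptions, not something Loader or New Relic Insights provides, and Insights may calculate its percentiles differently.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds: most requests are fast,
# but one in ten is a 1.5 second outlier.
response_times_ms = [250] * 90 + [1500] * 10

# SLA targets from the example above: percentile -> limit in milliseconds.
sla_targets_ms = {50: 400, 80: 500, 95: 1000}

mean_ms = sum(response_times_ms) / len(response_times_ms)
print(f"mean: {mean_ms:.1f} ms")

# Record whether each percentile stays within its SLA limit.
sla_results = {}
for pct, limit_ms in sla_targets_ms.items():
    observed_ms = percentile(response_times_ms, pct)
    sla_results[pct] = observed_ms <= limit_ms
    status = "ok" if sla_results[pct] else "MISSED"
    print(f"p{pct}: {observed_ms} ms (limit {limit_ms} ms) {status}")
```

With this made-up data the mean is 375 ms, which looks healthy, yet the 95th percentile is 1500 ms and misses the 1 second target. That gap is the reason to state the SLA in percentiles rather than as an average.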
Now let’s see how we can use Loader along with New Relic Insights to check if we are hitting our SLA.
Configuring Loader to push data to New Relic Insights
You’ll need to enable New Relic Insights integration in the integrations menu and then specify your New Relic Insights credentials:
Defining a test
For this blog post, I used a load test of 1000 clients over one minute against a machine in one of our test environments.
Loader data available in New Relic Insights
The simplest way to see what data is available in New Relic Insights is to run a test, and then do a NRQL query like this:
SELECT * from loaderio
For this blog post, we’ll focus on avg_time (the average response time for a data interval) and clients (the number of test clients active during a time interval).
You can find more detailed info about these fields on the Loader New Relic Insights Integration documentation page.
Setting up a New Relic Insights Dashboard
On my dashboard, I like to see the maximum number of active clients during the test interval, as well as a plot of the number of clients over time. This info lets me know that the test is actually doing what I think it is.
I like how the New Relic Insights dashboard will let me peg the views to certain time intervals. When I’m watching the test live, I do:
SINCE 1 minute ago
After I’ve run my tests and want to review the results, I specify the exact time interval in which the test took place:
SINCE '2014-10-06 19:53:00' UNTIL '2014-10-06 19:54:30'
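If you run many tests, typing these timestamps by hand gets tedious. Here is a small Python sketch that builds the clause from a test's start time and duration. The `nrql_window` helper and its `padding_s` parameter are hypothetical names made up for this example, and the sketch assumes the start time is already in the time zone that Insights expects.

```python
from datetime import datetime, timedelta


def nrql_window(start, duration_s, padding_s=0):
    """Build a SINCE ... UNTIL clause covering a test run.

    padding_s adds extra seconds after the test ends, so that late
    data points still fall inside the window.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    end = start + timedelta(seconds=duration_s + padding_s)
    return f"SINCE '{start.strftime(fmt)}' UNTIL '{end.strftime(fmt)}'"


# A one-minute test that started at 19:53:00, with 30 seconds of padding.
test_start = datetime(2014, 10, 6, 19, 53, 0)
window_clause = nrql_window(test_start, duration_s=60, padding_s=30)
print(window_clause)
```

You can append the resulting clause to any of the queries in this post in place of the live-view clause.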
My screenshots will show views pegged to a specific time interval, but in my NRQL examples I’ll assume you’re viewing live data.
SELECT max(clients) from loaderio SINCE 1 minute ago
SELECT average(clients) FROM loaderio TIMESERIES SINCE 1 minute ago
Next, I put up some displays that show the SLA metrics we’re tracking. I’m going to use the same numbers as the example I mentioned earlier: at least 50% of the requests should respond in under 400 ms, at least 80% in under 500 ms, and at least 95% in under 1000 ms. I used the critical threshold feature so that the numbers turn red if they exceed these thresholds.
SELECT percentile(avg_time, 50) from loaderio SINCE 1 minute ago
SELECT percentile(avg_time, 80) from loaderio SINCE 1 minute ago
SELECT percentile(avg_time, 95) from loaderio SINCE 1 minute ago
We can see that we’re hitting our 50th percentile and 80th percentile metrics, but not our 95th percentile, which suggests that we have some outliers.
I also like to visualize how these SLA metrics vary over time, as well as the distribution of response times across the entire test period:
SELECT average(avg_time), percentile(avg_time, 50, 80, 95) from loaderio SINCE 1 minute ago TIMESERIES AUTO
Finally, here’s what the whole dashboard looks like:
New Relic Insights Integration is available to all Loader users, including those on the Free plan.