February 2, 2009

Apache Benchmark Tool

Sick of hearing people complain about how long your web site takes to load? First off, congratulations on creating a website with a big enough user base that users are actually complaining! Now that you have a good number of people interested, it's probably a good idea to develop a strategy for testing your site's response times from a user's perspective. I've found that the Apache HTTP server benchmarking tool is pretty handy for this purpose.

Take a look inside your Apache server installation's bin directory and you should find the Apache benchmarking tool, a program named "ab". You can run ab from the command line to simulate requests to your site's URLs and get a sense of how much traffic your server can handle. Use -n to tell ab how many requests to send in total, and use -c to control how many of those requests are sent at the same time ("concurrently"). For example, to simulate 10 people all trying to load the main page on my site at exactly the same time, I used the following command:

      # ./ab -n 10 -c 10 http://upgradingdave.com/

Here's an example of what the output looks like:

   This is ApacheBench, Version 2.3 <$Revision: 1663405 $>
   Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
   Licensed to The Apache Software Foundation, http://www.apache.org/
   Benchmarking upgradingdave.com (be patient).....done
   Server Software:        Apache/2.4.12
   Server Hostname:        upgradingdave.com
   Server Port:            80
   Document Path:          /
   Document Length:        234 bytes
   Concurrency Level:      10
   Time taken for tests:   0.555 seconds
   Complete requests:      10
   Failed requests:        0
   Non-2xx responses:      10
   Total transferred:      4510 bytes
   HTML transferred:       2340 bytes
   Requests per second:    18.02 [#/sec] (mean)
   Time per request:       554.957 [ms] (mean)
   Time per request:       55.496 [ms] (mean, across all concurrent requests)
   Transfer rate:          7.94 [Kbytes/sec] received
   Connection Times (ms)
                 min  mean[+/-sd] median   max
   Connect:       71   73   1.9     75      75
   Processing:    82  201 117.4    184     482
   Waiting:       81  188  99.2    184     428
   Total:        153  275 117.5    258     555
   WARNING: The median and mean for the initial connection time are not within a normal deviation
   These results are probably not that reliable.
   Percentage of the requests served within a certain time (ms)
   50%    258
   66%    274
   75%    307
   80%    355
   90%    555
   95%    555
   98%    555
   99%    555
   100%    555 (longest request)

I think "Requests per second" and the mean "Time per request" are the two most useful pieces of information in the generated output. ab also shows you how many requests failed ("Failed requests") and how many came back with an HTTP status outside the 2xx range ("Non-2xx responses"), such as a 404 Not Found. In the run above, all 10 responses were non-2xx, so it's worth checking that line before trusting the timings.
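
If you want to pull those numbers out of a saved ab report, here is a minimal Python sketch. It assumes the report uses the same line labels as the sample output above; the `summarize` function and the `SAMPLE` text are illustrative names, not part of ab. It also checks how ab derives the two headline figures: requests per second is complete requests divided by total time, and the mean time per request is concurrency times total time divided by complete requests.

```python
import re

# A few lines copied from the sample report above.
SAMPLE = """\
Concurrency Level:      10
Time taken for tests:   0.555 seconds
Complete requests:      10
Failed requests:        0
Requests per second:    18.02 [#/sec] (mean)
Time per request:       554.957 [ms] (mean)
Time per request:       55.496 [ms] (mean, across all concurrent requests)
"""


def summarize(ab_output):
    """Extract the headline numbers from the text of an ab report."""

    def grab(pattern):
        match = re.search(pattern, ab_output)
        if match is None:
            raise ValueError("pattern not found: " + pattern)
        return float(match.group(1))

    return {
        "concurrency": grab(r"Concurrency Level:\s+(\d+)"),
        "seconds": grab(r"Time taken for tests:\s+([\d.]+) seconds"),
        "complete": grab(r"Complete requests:\s+(\d+)"),
        "failed": grab(r"Failed requests:\s+(\d+)"),
        "requests_per_second": grab(r"Requests per second:\s+([\d.]+)"),
        # Matches only the first "Time per request" line, the one ending in "(mean)".
        "mean_ms": grab(r"Time per request:\s+([\d.]+) \[ms\] \(mean\)"),
    }


summary = summarize(SAMPLE)

# Recompute the headline figures from the raw counts.
derived_rps = summary["complete"] / summary["seconds"]
derived_mean_ms = summary["concurrency"] * summary["seconds"] * 1000 / summary["complete"]

print(summary["requests_per_second"], round(derived_rps, 2))
print(summary["mean_ms"], derived_mean_ms)
```

The recomputed mean time comes out slightly different from the reported one because the report prints the total time rounded to three decimals.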

Don't be scared to really give your server a good shakedown. As a rule of thumb, I think 2 seconds is already way too long to wait for any web page. So, increase the number of requests and the concurrency until the results show a mean time of more than 3 seconds per request; by that point the server is clearly past what it can comfortably handle. This should give you a very rough estimate of the amount of traffic your web server can handle. At the time of this writing, it looks like the threshold for my site is around 50 concurrent users, that is, 50 users all accessing the site at exactly the same time.
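
The ramp-up can be scripted. What follows is a rough Python sketch, not a polished tool: it assumes ab is on your PATH and prints the "Time per request ... (mean)" line shown in the sample output, and the names `mean_time_ms`, `run_ab`, and `find_threshold` are made up for illustration. The measuring function is passed in as a parameter so the search logic can be tried without hitting a real server.

```python
import re
import subprocess


def mean_time_ms(ab_output):
    """Return the mean 'Time per request' in milliseconds from an ab report."""
    match = re.search(r"Time per request:\s+([\d.]+) \[ms\] \(mean\)", ab_output)
    if match is None:
        raise ValueError("no 'Time per request' line found")
    return float(match.group(1))


def run_ab(url, concurrency, ab_path="ab"):
    """Run ab with one request per simulated user and return the mean time in ms."""
    # -n is the total number of requests, -c is how many are sent concurrently.
    result = subprocess.run(
        [ab_path, "-n", str(concurrency), "-c", str(concurrency), url],
        capture_output=True,
        text=True,
        check=True,
    )
    return mean_time_ms(result.stdout)


def find_threshold(measure, levels, limit_ms=3000.0):
    """Return the first concurrency level whose mean time exceeds limit_ms.

    `measure` is any function that takes a concurrency level and returns
    a mean time in milliseconds. Returns None if no level crosses the limit.
    """
    for level in levels:
        if measure(level) > limit_ms:
            return level
    return None


# Example call against a real server (not executed here):
# find_threshold(lambda c: run_ab("http://upgradingdave.com/", c), range(10, 201, 10))
```

Stepping the concurrency in increments of 10 is an arbitrary choice; smaller steps give a finer estimate at the cost of more test runs.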

Keep in mind that the number of concurrent users you test with doesn't translate directly into the number of people visiting your web site. For example, say that 100 different people visit your site on a Saturday, all at different times during the day. You had 100 users, but none of them were hitting the site at the same time (concurrently). On the other hand, take a site like eBay. If eBay puts out a button labeled "place a bid" and 100 people all click the button at the same time, that roughly translates into 100 concurrent users. So, normally, the total number of users visiting your site is going to be much greater than the number of concurrent users (unless you have a page that you expect lots of users to access at the exact same time).
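
One rough way to put numbers on this is Little's law: average concurrency is about the arrival rate multiplied by the time each request takes. The Python sketch below assumes visits are spread evenly over the time window, which is my simplification and not something ab measures. The function name is illustrative, and 0.275 seconds is the mean total time from the sample output above.

```python
def average_concurrency(visits, seconds_per_request, window_seconds):
    """Estimate how many requests are in flight at once, on average.

    Little's law: concurrency = arrival rate * time per request,
    assuming visits are spread evenly across the window.
    """
    arrival_rate = visits / window_seconds  # requests per second
    return arrival_rate * seconds_per_request


# 100 visitors spread over a whole day (86400 seconds), one page load each.
spread_out = average_concurrency(100, 0.275, 86400)

# 100 people clicking within the same second, each request taking 1 second.
burst = average_concurrency(100, 1.0, 1.0)

print(spread_out)
print(burst)
```

The first case comes out far below one concurrent request, while the second is the 100-at-once situation that `-c 100` simulates.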

Also remember that ab is just a simulation. Since you typically run ab from the command line on the same computer that's acting as a server, ab is using some of the server's resources to run the tests. In the real world, the server would only be replying to requests and wouldn't have to worry about running ab.

It's also not normal for someone to open up 1000 browsers and load the same exact web page every 3 seconds back to back to back. Another reason to take ab results with a grain of salt.

Even though it's not precise, ab can be helpful for getting a ballpark estimate of how much traffic your web site can handle and for troubleshooting performance. Take some time to run a few tests and record the results, then re-run the same tests each week or each month to track your site's performance over time.
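
For the record-keeping part, a small CSV log is probably enough. Here is a minimal Python sketch, assuming you copy the numbers from each ab report by hand or with a parser of your own. The file layout, column names, and the `append_result` and `read_results` functions are all hypothetical choices for illustration.

```python
import csv
import os

FIELDS = ["date", "concurrency", "requests_per_second", "mean_ms"]


def append_result(path, date, concurrency, requests_per_second, mean_ms):
    """Append one benchmark result to a CSV file, writing a header if the file is new."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(
            {
                "date": date,
                "concurrency": concurrency,
                "requests_per_second": requests_per_second,
                "mean_ms": mean_ms,
            }
        )


def read_results(path):
    """Return every recorded result as a list of dicts (all values are strings)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


# Example (not executed here):
# append_result("ab_history.csv", "2009-02-02", 10, 18.02, 554.957)
```

Keeping the concurrency level in each row matters, since results are only comparable between runs that used the same -n and -c settings.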

Tags: sysadmin tools tech