Website Performance tip for Apache
I’ve been meaning to write a post titled something along the lines of “10 things you can do to improve the performance of your web application” for the last few months, but so far I’ve not had the time! There are lots of articles out there on the web that tell you in general terms what you should do, but I like to provide examples when I do these things so I’m afraid you’ll have to wait for that one.
In the meantime, here’s a handy tip to reduce some of the load from your web site.
Quick Tip
I go on about caching a lot on this blog. Whether it’s server side caching of generated content (I really dig Memcached at the moment) or sending the appropriate HTTP headers to tell a client that they can cache something for a given length of time at their end (client side caching), these all help reduce the work your web site has to do. And less work is a good thing!
I’m quite pleased to see that, apparently by default (or at least on a standard install), Apache will generate ETag and Last-Modified response headers. Here’s a request/response example:
Request*
GET /images/icon_britblog_80x15.gif HTTP/1.1
Accept: */*
User-Agent: Mozilla/5.0
Host: www.britblog.com
Response*
HTTP/1.1 200 OK
Last-Modified: Tue, 15 Jun 2004 09:57:43 GMT
ETag: "4000b7-116-40cec817"
Content-Type: image/gif
This is all very useful, but if I reload a page with the same image on it, my browser will still make a request to the server. This time though, the request will have some additional information:
Request*
GET /images/icon_britblog_80x15.gif HTTP/1.1
Accept: */*
If-Modified-Since: Tue, 15 Jun 2004 09:57:43 GMT
If-None-Match: "4000b7-116-40cec817"
User-Agent: Mozilla/5.0
Host: www.britblog.com
Response*
HTTP/1.1 304 Not Modified
Etag: "4000b7-116-40cec817"
You can see that the browser supplied two additional headers: If-Modified-Since and If-None-Match. Together, these two headers allow the web server to check if the content that it would normally serve is any different to the copy that the client already has locally. As my image hasn’t changed on the server, Apache knows to return a 304 Not Modified response, meaning that my browser doesn’t need to re-download the image.
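To make the check concrete, here is a minimal Python sketch of the decision a server makes when it sees those two headers. It is not how Apache implements it; the function name `conditional_status` and the dict-of-headers shape are made up for illustration. It assumes the rule from the HTTP specification that If-None-Match is evaluated first and If-Modified-Since is only consulted when no ETag was supplied, and it compares ETags as plain strings.

```python
from email.utils import parsedate_to_datetime


def conditional_status(request_headers, etag, last_modified):
    """Return 304 if the client's cached copy still matches, else 200.

    request_headers: dict of request header names to values (hypothetical shape).
    etag, last_modified: what the server would send for the resource right now.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The header may carry several ETags, or "*" to match any version.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if "*" in candidates or etag in candidates else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        client_time = parsedate_to_datetime(if_modified_since)
        server_time = parsedate_to_datetime(last_modified)
        # Unchanged since the client's copy was fetched: nothing to resend.
        return 304 if server_time <= client_time else 200

    # No validators supplied, so send the full response.
    return 200
```

Fed the headers from the second request above, along with the image's current ETag and Last-Modified values, this returns 304; change the ETag on the server side and it returns 200.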
This is all good stuff, and this kind of technique should be employed by more web applications than currently do. You can save a huge amount of bandwidth, plus you reduce the load on your web server as client connections are dropped more rapidly. But I digress - that’s another story …
So the next question is “how can we make this even more efficient?” When testing some web servers recently, I noticed that one site running IIS would return an additional header with their images:
Cache-Control: "max-age=900"
This header tells my browser that the copy of an image it has just downloaded can be cached locally and is to be considered valid (fresh) for the next 900 seconds. This means that the next time my browser needs that image, provided we are still within that 900-second window, it can use its local copy quite happily, safe in the knowledge that it will look right. All this happens without going anywhere near the web server to verify the validity of this component (i.e. without using the If-None-Match and If-Modified-Since headers).
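As a rough illustration of that freshness window, here is a small Python sketch of the check a browser cache might make before reusing a local copy. The function names are hypothetical, and real browsers consider other Cache-Control directives (no-cache, no-store and so on) that this ignores.

```python
import re


def max_age_seconds(cache_control):
    """Extract the max-age value from a Cache-Control header, or None."""
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None


def can_use_local_copy(cache_control, seconds_since_download):
    """True while the cached copy is still inside its max-age window."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and seconds_since_download < max_age
```

With a header value of max-age=900, a copy downloaded 300 seconds ago can be reused without contacting the server, while one downloaded 1200 seconds ago cannot.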
This struck me as a great idea, so I went off to find out how I can force Apache to do this. Now, I’m no Apache expert so there may well be a better solution to this problem, but the one I’ve come up with seems to work just fine.
For this to work, you need to load the mod_headers module in your Apache configuration (it is also available for Apache 2.0):
LoadModule headers_module /usr/lib/apache2/modules/mod_headers.so
Once that’s done, you can bung the following somewhere in your config (mine is in the VirtualHost block):
<FilesMatch "\.(gif|jpe?g|png)$">
    <IfModule mod_headers.c>
        # Set max-age for images to 30 minutes
        Header add Cache-Control "max-age=1800"
    </IfModule>
</FilesMatch>
This looks for any request made that is for an image (using the regular expression \.(gif|jpe?g|png)$), and adds our new Cache-Control header as needed.
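If you want to check which file names that pattern catches before touching your config, a quick Python sketch like the one below will do. Python's `re` module stands in for Apache's regex engine here, so treat it as an approximation; `is_image` is just a name made up for the example.

```python
import re

# The pattern from the config: a literal dot, an image extension, end of name.
IMAGE_PATTERN = re.compile(r"\.(gif|jpe?g|png)$")

# The same pattern without the backslash: "." now matches any character.
UNESCAPED_PATTERN = re.compile(r".(gif|jpe?g|png)$")


def is_image(filename):
    """True if the file name ends in .gif, .jpg, .jpeg or .png."""
    return IMAGE_PATTERN.search(filename) is not None
```

The backslash matters: with it, a name such as notapng is left alone, whereas the unescaped version would treat it as an image because the dot matches the "a".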
So back to my first example! Now if I request that image I get the following response:
HTTP/1.1 200 OK
Last-Modified: Sun, 21 Jan 2007 15:26:07 GMT
Etag: "1f46c6-116-905261c0"
Cache-Control: max-age=1800
And now if my browser needs the same image again (within the next 30 minutes), it knows that it doesn’t need to go anywhere near the web server for the image because its local copy is fine. And this means one less hit on the box!
Notes
There are a few things I should point out:
- You may have seen an Expires: header sometimes. The Cache-Control: header carries more weight than this, so if a browser sees both it should ignore the Expires value. The Expires header also carries a specific date/time rather than a relative lifetime in seconds. Because clients and servers can have widely different clock times, the use of this header for caching should be avoided.
- Depending on your browser, if a web server doesn’t return a Cache-Control: max-age header, the browser can decide on its own how long the local copy of a component is valid for.
- If all your images (and other static content) are located in the same directory, you could use a <Directory> configuration directive rather than a <FilesMatch> directive. This is probably more efficient than using a regular expression.
- I’m not an Apache guru by any stretch of the imagination. You should refer to the Apache documentation for more information.
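The first two notes can be summed up in a short Python sketch of how a cache might work out how long a response stays fresh. This is a simplification under my own assumptions: the function name and dict-of-headers shape are hypothetical, and when only Expires is present it is measured against the server's own Date header, which is the calculation the HTTP specification describes.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Seconds a response may be reused, or None if the server gave no hint.

    headers: dict of response header names to values (hypothetical shape).
    """
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        # max-age takes precedence over Expires and needs no clock comparison.
        return int(match.group(1))
    if "Expires" in headers and "Date" in headers:
        # Expires is a point in time, so turn it into a lifetime in seconds.
        expires = parsedate_to_datetime(headers["Expires"])
        served = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - served).total_seconds()))
    # Neither header present: the browser falls back on its own heuristics.
    return None
```

A response carrying both max-age=1800 and an Expires value an hour ahead yields 1800, since max-age takes precedence.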
More reading
- HTTP 1.1 specification - especially the section on caching and the header definition section.
- Apache HTTP server
- Apache’s Mod_Headers for Apache 1.3 and Apache 2.0.
* The request and responses have been shortened to show just the pertinent information.









