This morning I’ve been looking at the mod_gzip Apache module, and have had some impressive results!
mod_gzip is, for those of you who don’t know, an Apache module that compresses content before it is sent to the client’s web browser. Compressing content before sending it means there is less data to transfer, which equates to faster download times. The time taken to compress the content at the server end, transfer the compressed content to the client, and then restore it at the client end is usually significantly shorter than the time taken to transfer the original uncompressed HTML file across the wire.
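To make that round trip concrete, here is a minimal Python sketch of what happens at each end of the wire. It uses Python's standard gzip module rather than mod_gzip itself, and the page content is a made-up stand-in, not real BritBlog markup.

```python
import gzip

# Hypothetical page body: repetitive markup, much like a directory listing.
page = "".join(
    f'<li><a href="/blog/{n}">Blog number {n}</a></li>\n' for n in range(500)
).encode("utf-8")

# Server side: compress the content before sending it.
wire_bytes = gzip.compress(page)

# Client side: the browser restores the original content.
restored = gzip.decompress(wire_bytes)

print(f"original:    {len(page)} bytes")
print(f"on the wire: {len(wire_bytes)} bytes")
print("round trip intact:", restored == page)
```

The exact sizes depend entirely on the content, so treat the numbers it prints as illustrative only.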
The graph below shows the download time for the London blogs page over at BritBlog, measured at a download speed of 40 Kbps:

This morning, at around 11.00 AM, I configured mod_gzip, and you can see straight away the effect it has had on the download time. At first I only enabled compression for HTML content, but as I have a hefty CSS file and a large JavaScript file, I enabled compression for this content too. (I know this may cause problems for some browsers; we’ll have to see what happens.)
The following two graphs show the component breakdown for these pages:
Before content compression:

After content compression:

As you can see, the text-based content comes down much more quickly. In fact, it knocks nearly two-thirds off the download time!
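A rough back-of-the-envelope calculation shows why shrinking the text content matters so much on a slow link. The sketch below is only an illustration: the 60,000 byte payload and the 60% saving are assumed figures, not measurements from the graphs, and it ignores server latency and image downloads.

```python
def transfer_seconds(size_bytes: int, kbps: float) -> float:
    """Time to move size_bytes over a link running at kbps kilobits per second."""
    return (size_bytes * 8) / (kbps * 1000)

LINK_KBPS = 40        # the download speed used for the measurements
TEXT_BYTES = 60_000   # hypothetical size of the HTML, CSS and JavaScript
SAVING = 0.60         # assumed fraction of the text removed by compression

before = transfer_seconds(TEXT_BYTES, LINK_KBPS)
after = transfer_seconds(round(TEXT_BYTES * (1 - SAVING)), LINK_KBPS)

print(f"before: {before:.1f}s, after: {after:.1f}s")
```

With these made-up numbers the text transfer drops from 12 seconds to 4.8 seconds.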
What are the cons?
The downside with mod_gzip is the additional load on the server. The table below shows some of the data that went into plotting the top graph, and you can see that the data start time goes up once mod_gzip is enabled (between 10:28 and 10:38). (Note: my mod_gzip configuration wasn’t correct straight away, which is why the download time came down gradually over the next few tests.)
The data start time on my server isn’t that impressive at the best of times, and that’s without any normal server load. It will be interesting to see what happens once the server goes live. I’m also planning on building some sort of page cache into the site: the pages don’t change that rapidly, so building content on the fly like this isn’t really necessary.
mod_gzip Configuration
I’m not going to go into this in much depth; if you’re using mod_gzip for the first time then this is a useful example configuration file - you should read through it.
The configuration snippet below shows the main part of the mod_gzip configuration. This is the bit that specifies which content should get compressed and which shouldn’t. The best candidate for content compression is text, as you can easily knock 60% off the amount of data that needs to be transferred.
The configuration below allows all PHP, HTML, JavaScript and CSS files to be compressed. Images are already compressed, so trying to compress them any further is a waste of time. For this reason, images are excluded from the compression process.
# Include text content by file extension (the dot is escaped so it matches literally)
mod_gzip_item_include file \.php$
mod_gzip_item_include file \.html$
mod_gzip_item_include file \.js$
mod_gzip_item_include file \.css$
# Include text content by MIME type
mod_gzip_item_include mime ^text/html
mod_gzip_item_include mime ^text/plain
mod_gzip_item_include mime ^text/css$
mod_gzip_item_include mime ^application/x-javascript$
# Exclude images, which are already compressed
mod_gzip_item_exclude file \.gif$
mod_gzip_item_exclude file \.png$
mod_gzip_item_exclude file \.jpg$
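To see why images are left out, this small Python sketch compares how well text compresses with how well already-compressed data does. It is an illustration only: the markup is invented, random bytes stand in for image data (an assumption, since a real GIF or JPEG is not literally random), and it uses Python's zlib rather than mod_gzip.

```python
import os
import zlib

def compressed_fraction(data: bytes) -> float:
    """Compressed size as a fraction of the original size."""
    return len(zlib.compress(data)) / len(data)

# Hypothetical text content: repetitive markup compresses well.
text = b'<div class="entry"><a href="/blog">A London blog</a></div>\n' * 200

# Random bytes stand in for image data that is already compressed.
image_like = os.urandom(len(text))

print(f"text:       {compressed_fraction(text):.2f}")
print(f"image-like: {compressed_fraction(image_like):.2f}")
```

The text shrinks to a small fraction of its size, while the image-like data stays at roughly its original size, so the server would spend CPU time on it for nothing.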
When I was trying to get this to work, I ran into some problems as my PHP and HTML files weren’t being compressed. I did a GET -ed http://www.britblog.com/directory/.../london.html and discovered that the Content-Type header (mime type) was coming back with a charset parameter:
Content-Type: text/html; charset=iso-8859-1
So I removed the $ from the regexp for the mime filters (the $ anchors the match at the end of the string) so that they would allow for additional parameters, et voilà!
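The effect of that trailing $ is easy to reproduce. The sketch below uses Python's re module as a stand-in for mod_gzip's own regular expression matching, on the assumption that the two behave the same way for patterns this simple.

```python
import re

# The header value returned by the server, as shown above.
content_type = "text/html; charset=iso-8859-1"

# Original filter: the value must end immediately after "html".
anchored = re.compile(r"^text/html$")

# Fixed filter: only the start of the value is pinned down.
relaxed = re.compile(r"^text/html")

print("anchored matches:", bool(anchored.search(content_type)))
print("relaxed matches: ", bool(relaxed.search(content_type)))
```

The anchored pattern fails because of the "; charset=iso-8859-1" suffix, while the relaxed pattern matches.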
At the moment this is only running on the development site, but I’ll turn it on once we move to the new server.
If you are interested, the download data was captured and plotted by Site Confidence.