I’ve been playing with web service APIs quite a lot lately, especially the Amazon and Flickr ones. Anyway, my current tinkering led to … wait for it … wankr.co.uk.
It’s a pretty simple site: it displays recent photos tagged with the word ‘wankr’. However, the important thing for me was building a workable caching mechanism. The main advantage of web services is also their biggest problem: they typically run on remote web servers. This means that if you want to incorporate results from a web service call into a web page, there can be a noticeable delay for the users of your site while they wait for the page to be compiled server-side and returned to them.
So as I said, the caching mechanism was the important thing for me on this one, and so far it works quite well I think (famous last words…). The main cause of delay here is getting the tag details for each photo that is displayed. You have to run a separate query for each image, so in this particular example that is 1 query to get the list of images, plus 36 queries to get the tag details for each of the photos (37 requests in total).
That’s a lot of queries, and as you can see in the graph below (at 12:46:05), it takes quite a while. (The data start time shows how long it takes for the first byte of data to come back from the web server once the request has been made. This typically is the time taken for the web server to generate the page, pulling in content from sources like databases, and in this case, remote web services.)
The caching I’ve implemented stores tag details for each photo indefinitely. I’m guessing that photo tags won’t be updated that frequently — at least in the time they’re displayed on the wankr homepage — so this shouldn’t cause a problem. The remaining task here is to write a process to clean up old cache files.
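To make the idea concrete, here is a minimal sketch of that kind of per-photo cache in Python. It is not the site's actual code: the `fetch_tags` callable is a hypothetical stand-in for the Flickr tag lookup, and the `tags_<id>.json` file naming and the `purge_old_cache_files` helper are my own assumptions. It also assumes photo IDs are safe to use in file names.

```python
import json
import time
from pathlib import Path


def get_photo_tags(photo_id, fetch_tags, cache_dir):
    """Return tags for one photo, calling fetch_tags only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"tags_{photo_id}.json"  # assumed naming scheme
    if cache_file.exists():
        # Cache hit: no expiry check, entries are kept indefinitely.
        return json.loads(cache_file.read_text(encoding="utf-8"))
    # Cache miss: make the slow remote call (hypothetical stand-in).
    tags = fetch_tags(photo_id)
    cache_file.write_text(json.dumps(tags), encoding="utf-8")
    return tags


def purge_old_cache_files(cache_dir, max_age_seconds, now=None):
    """Delete tag cache files older than max_age_seconds; return the count."""
    now = time.time() if now is None else now
    removed = 0
    for path in Path(cache_dir).glob("tags_*.json"):
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed += 1
    return removed
```

With something like this in place, the 36 tag queries only happen the first time each photo is seen, and the clean-up function could be run from a scheduled job to cover the remaining task.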
In terms of the main API call (the one that gets all the photos tagged with ‘wankr’), this is effectively cached for 5 minutes. I actually cache the rendered HTML for five minutes, so the response time while the cache is still valid is quite good. On the graph below, the results at 12:46:42 show cached image tag data (so no need to make those 36 queries to Flickr) and a re-query of the Flickr API (to look for any new photos). Finally, the results at 12:47:19 show totally cached HTML (so no web service queries), and as you can see a nice quick data start time (the yellow)!
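The five-minute HTML cache could look roughly like the sketch below. Again, this is an illustration under my own assumptions rather than the real implementation: `render_page` is a hypothetical callable that does the expensive work (querying the API and building the HTML), and freshness is judged by the cache file's modification time.

```python
import time
from pathlib import Path

CACHE_TTL_SECONDS = 5 * 60  # five minutes, as described above


def get_cached_page(cache_file, render_page, ttl=CACHE_TTL_SECONDS, now=None):
    """Return cached HTML if it is younger than ttl, else re-render it."""
    cache_file = Path(cache_file)
    now = time.time() if now is None else now
    if cache_file.exists() and now - cache_file.stat().st_mtime < ttl:
        # Still fresh: serve the stored HTML, no web service calls at all.
        return cache_file.read_text(encoding="utf-8")
    # Missing or stale: rebuild the page and overwrite the cache file.
    html = render_page()
    cache_file.write_text(html, encoding="utf-8")
    return html
```

The three timings on the graph map onto this fairly neatly: a cold start, a re-render that reuses the cached tag data, and a straight read of the cached HTML.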
The cache model isn’t perfect (for example there is no file locking implemented), so things could get a bit messy if there were a few users on the site at once. I’ve added some cache-friendly HTTP headers too, which may improve the performance for some frequent visitors…
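One common way to take some of the mess out of the missing file locking is to avoid writing the cache file in place. The sketch below writes to a temporary file in the same directory and then swaps it in with `os.replace`, so a reader should see either the old file or the new one, never a half-written one. This is a swapped-in technique and not what the site currently does, the function name is made up for illustration, and it does not stop two requests from both doing the expensive rebuild at the same time.

```python
import os
import tempfile
from pathlib import Path


def write_cache_atomically(cache_file, content):
    """Write content to cache_file via a temp file and an atomic rename."""
    cache_file = Path(cache_file)
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the same directory so the rename stays on one
    # filesystem, which is what lets the replace happen in a single step.
    fd, tmp_name = tempfile.mkstemp(dir=str(cache_file.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(content)
        os.replace(tmp_name, cache_file)
    except BaseException:
        # Clean up the temp file if anything went wrong before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```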
Anyway, the site is just a bit of an experiment at the moment, but feel free to mention it to others. It’s got a tag cloud, so does that make it a Web 2.0 project?!