When Written: April 2006
Whilst we are on the subject of web server logs and activity, there is one group of products that is causing some problems, both through excess bandwidth usage and by making a web site appear much busier than it actually is. We have all seen adverts for ‘speed up your browsing’ products.
These often work simply by following the links on the web page you are currently looking at and downloading those other pages in the background as a form of local cache, so that when you click on a link the page has already been downloaded to your browser. This is called ‘pre-fetching’.
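To make the pre-fetching behaviour concrete, here is a minimal sketch in Python of the first step such an accelerator performs: parsing the current page and collecting every link, each of which it would then request in the background. The page content and URLs below are invented purely for illustration.

```python
# Sketch of how a pre-fetching 'accelerator' decides what to download:
# parse the current page's HTML and collect every link target.
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects href targets from <a> tags, resolved against a base URL."""
    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))

# Illustrative page content; a real accelerator would fetch the page you
# are viewing and then download each collected link in the background.
page = '<a href="/about.html">About</a> <a href="news/today.html">News</a>'
collector = LinkCollector("http://example.com/index.html")
collector.feed(page)
print(collector.links)
```

Every one of those collected links becomes an extra request to the server, whether or not the visitor ever clicks it, which is exactly where the bandwidth problem described below comes from.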
The problem with this behaviour, from a web site’s perspective, is that many hosting companies charge according to the amount of data downloaded to browsers, and this can suddenly increase with the use of such products. A web site can also be rendered less responsive because of the increased number of requests, the majority of which are not needed. This happened with the release of Fasterfox, which ‘pre-fetches’ web pages in the way I have just described.

However, at least with this ‘accelerator product’ there is a way of stopping it doing this on your web site. You are probably already aware of the trick of placing a text file called ‘robots.txt’ in the root of your web site, which can instruct most search engines to ignore certain folders. Well, placing an entry in this file will also stop Fasterfox from trawling your web site. The entry is as follows:
User-agent: Fasterfox
Disallow: /
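If you want to sanity-check a robots.txt rule before deploying it, Python’s standard library includes a robots.txt parser that applies the same matching rules a well-behaved client uses. This sketch confirms the entry above blocks Fasterfox without affecting other user agents (the example URL is, of course, a placeholder):

```python
# Check that the robots.txt entry blocks Fasterfox but nothing else,
# using Python's standard-library robots.txt parser.
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
rules.parse([
    "User-agent: Fasterfox",
    "Disallow: /",
])

# Fasterfox is blocked everywhere; other agents are unaffected.
print(rules.can_fetch("Fasterfox", "http://example.com/page.html"))
print(rules.can_fetch("Googlebot", "http://example.com/page.html"))
```

The first call prints False and the second True, showing the rule is scoped to Fasterfox alone.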
It’s as simple as that. However, if you wanted to stop a product like Google’s Web Accelerator, you can’t use this method, as it totally ignores the robots.txt file, against all convention. If you want to see whether your web site is suffering from this extra traffic, then look for ‘X-moz: prefetch’ in the log files. As for stopping it? Well, that might be a topic for another article!
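A rough way to gauge how much of your traffic is pre-fetching is to count log lines carrying that header. Note that the ‘X-moz’ request header only appears in your access log if your log format records it (for example via a custom header directive in your web server’s log configuration); the sample log lines below are invented purely for illustration.

```python
# Count how many logged requests were browser pre-fetches by looking
# for the 'X-moz: prefetch' marker. Sample log lines are illustrative;
# a real log would only contain this header if configured to record it.
sample_log = """\
192.0.2.1 - - [10/Apr/2006:10:00:01] "GET /index.html HTTP/1.1" 200 -
192.0.2.1 - - [10/Apr/2006:10:00:02] "GET /about.html HTTP/1.1" 200 X-moz: prefetch
192.0.2.2 - - [10/Apr/2006:10:00:05] "GET /news.html HTTP/1.1" 200 X-moz: prefetch
"""

prefetched = [line for line in sample_log.splitlines()
              if "X-moz: prefetch" in line]
print(f"{len(prefetched)} of {len(sample_log.splitlines())} requests were prefetches")
```

In this invented sample, two of the three requests were pre-fetches: bandwidth spent on pages the visitor may never have looked at.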
Article by: Mark Newton