When Written: Sept 2006
In a previous article I wrote about the problems of giving clients a way of extracting data from their web site onto their office machines. For example you have people registering on a web site and your client wants to load these people into their office database. There are several ways you might achieve this, from establishing VPN tunnels to cutting and pasting the data from a web page.
However the other week I stumbled across an extremely neat way of gathering data from almost any web page. The technique is so simple and yet so flexible I wondered if I was the only one to have missed it all this time. But everyone I have shown it to has expressed delight (we are sad like that here!) With such a useful method of extracting data in a usable format from web pages, it is embarrassing for me to learn that this feature has been in Microsoft’s Excel for some time but in Office 2003 it is particularly easy to use. But, hey, this is a real world column and in the real world we are humans who miss things, not great gods of knowledge who live in ivory towers dispensing gems of wisdom. So here goes.
Web Queries – a great way to get data from a web site
First find yourself a web page that contains some tabular data, this is where the information is contained within <TABLE> tags. Open your copy of Excel 2002 or newer. From the menu bar select Data | Import External Data | New Web Query. A window will open up showing a web page with an address bar. In this address bar enter the URL of the web page that contains the tabular data you want to import, or just browse to it.
You will notice small yellow boxes with arrows in them, these define areas that the Web Query has found that it can import (normally the start of a <TABLE> tag) . Click on the yellow box that is in the top left hand corner of the area that contains the data you are interested in, the box will change to green and will now containing a ‘tick’ mark. Next click the ‘Import’ button at the bottom of the web query window. The next dialog box asks where you want to put the data and also allows you to change the properties of the web query, but for now we will ignore these extra options as the defaults are fine for most uses. Click Ok and watch as Excel imports the data in to a spreadsheet.
The really neat thing about this way of doing things is that this data is placed in a ‘data area’ which can be made to refresh itself from the web page anytime you want, either manually or automatically. So if a client wants to get to the data on their web site, you just need to produce a simple dynamic web page which displays the data in a table and point an Excel Web Query at it. If the data is huge, then being Excel, you can program the process of a web query and get it to step through several pages gathering data, but if the data is this large, then perhaps you should be looking at replicating the data between an on-line and an off-line database, but for that you will need to put the ‘long trousers’ on.
Article by: Mark Newton
Published in: Mark Newton