When Written: Dec 2012
The other day we needed to produce a web application that an administrator could use to generate several thousand unique numbers and store them in a database. These numbers were destined for holographic serial-number labels, each carrying a unique ID, which would be attached to the product. The purchaser of the product could then enter this ID on a web site to confirm that the product was genuine and not a fake; in doing so the number would be marked as registered and could not be re-used. With more and more fakes being made, some almost indistinguishable from the originals, manufacturers are increasingly turning to such systems, both to reassure customers that they have a genuine product and to alert the manufacturer to any copies being produced.
With this requirement defined, we discovered from the client that they might need to produce up to a quarter of a million of these numbers at any one time. At first this did not seem to be an issue, but the problem with such a large batch is that the time the web script would need to run on the server would exceed the web server's timeout period. Not only would the user be presented with an unresponsive page, but the page would probably throw an error when the script timed out, which is always a consideration when writing web-based applications. Looking online for suggestions turned up ideas that mostly involved calling a command-line script to run in the background, but this would have meant changing the security settings on the web server to allow the ASP service to run operating-system commands, and that would leave a security hole large enough to float the Titanic in sideways! What was needed was a fast-running SQL script that could produce any number of records. I must admit I was stumped on this problem, and the technique I started with did not have the performance required. My SQL query looked something like:
insert into mydata (country) select 'uk' union all select 'uk' union all select 'uk'
This example would produce three records, each with 'uk' in their country field; the unique ID itself was created in another field whose default value generated a GUID. As you can see, this gets very clumsy when you want to add a lot of records, and the performance was not great, although perhaps surprisingly it would produce up to 10,000 records. However, we were aiming for 250,000.
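To see just how clumsy this gets, here is a minimal sketch of the same idea using Python and SQLite rather than the original ASP/SQL Server setup (the table and column names follow the query above, but the GUID default is faked with SQLite's randomblob, since SQLite has no GUID column default):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (id TEXT DEFAULT '', country TEXT)")

# Build one INSERT with N-1 UNION ALLs -- workable for small N, but the
# statement text grows linearly with the record count, and SQLite even
# caps compound SELECT terms at 500 by default.
n = 400
sql = ("INSERT INTO mydata (country) "
       + " UNION ALL ".join(["SELECT 'uk'"] * n))
conn.execute(sql)

# Stand-in for the GUID default: fill the id column afterwards.
conn.execute("UPDATE mydata SET id = hex(randomblob(16))")

count = conn.execute("SELECT COUNT(*) FROM mydata").fetchone()[0]
print(count)  # 400
```

The statement itself is already tens of kilobytes of SQL text for a few hundred rows, which is exactly why this approach falls over long before 250,000.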
About the same time as I was working on this problem, a very good friend, Dr Harvey Fox, who I have written about here in the past, was spending the weekend with us. Harvey lives and breathes databases and is always up for a challenge, so when we were talking about our current work projects he mentioned that he had already come across this problem in a large commerce system he wrote, where on a regular basis he needed to generate a set number of records in a SQL database. Harvey's solution was elegant and, more importantly, fast. The technique he used was the cross join, which normally occurs when you have a couple of database tables and forget to specify the join: every record in the first table is joined to every record in the second table, so if the first table has ten records and the second table has twenty records, you end up with an output table of 10 x 20, or two hundred records! This is normally not what you want when joining tables, and is often a 'doh!' moment as your database grinds to a halt, but in our case it can be put to good use. To show you how, let's say you have five tables with ten records in each of them. Then by using cross joins like:
SELECT * FROM Numbers AS Units, Numbers AS Tens, Numbers AS Hundreds, Numbers AS Thousands, Numbers AS TensOfThousands;
You will end up with 10 x 10 x 10 x 10 x 10, or 100,000 records. To make full use of this technique to produce our codes, instead of multiple tables we can use just one table and alias it for the others. So we create a small table with ten records and a single field called 'num', storing the numbers 0 - 9 in these records, and then with a query like:
SELECT [TensOfThousands].[num]*10000 + [Thousands].[num]*1000 + [Hundreds].[num]*100 + [Tens].[num]*10 + [Units].[num] AS Id
FROM Numbers AS Units, Numbers AS Tens, Numbers AS Hundreds, Numbers AS Thousands, Numbers AS TensOfThousands;
We can now output a recordset with 100,000 records in it, and obviously this number can be expanded or reduced as required. As with any query that adds a large number of records to a table, it is a good idea to switch any indexing off first, as this will speed the inserts up considerably; when the inserting is finished, switch the indexing back on. A very useful technique, and one that can be adapted to produce any number of new records. Thanks, Harvey!
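The whole technique can be sketched end to end, again as a minimal illustration in Python and SQLite rather than the original server environment (the Numbers table and the alias names follow the queries above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Numbers (num INTEGER)")
conn.executemany("INSERT INTO Numbers VALUES (?)",
                 [(i,) for i in range(10)])

# Five aliases of the same ten-row table, cross-joined together, give
# 10^5 = 100,000 combinations from a single, fast query; each alias
# supplies one decimal digit of the generated Id.
rows = conn.execute("""
    SELECT TensOfThousands.num*10000 + Thousands.num*1000 +
           Hundreds.num*100 + Tens.num*10 + Units.num AS Id
    FROM Numbers AS Units, Numbers AS Tens, Numbers AS Hundreds,
         Numbers AS Thousands, Numbers AS TensOfThousands
""").fetchall()

print(len(rows))        # 100000
ids = sorted(r[0] for r in rows)
print(ids[0], ids[-1])  # 0 99999
```

Each of the 100,000 Ids from 0 to 99,999 appears exactly once, and adding or dropping an alias scales the count by a factor of ten, which is how the record count can be tuned to a target such as 250,000.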
Article by: Mark Newton
Published in: Mark Newton