When Written: Feb 2013
A couple of enquiries cropped up the other week relating in different ways to using foreign characters in web sites. Normally, as long as you set the page encoding to ‘UTF-8’, all should be fine, but one of the enquiries wanted a web site converted for display in Libya, where the writing is in Arabic and as a result reads from right to left. Fearing a bit of a battle with getting this to work, I suggested that the client send me a Word document with the Arabic text and the English equivalent. I then copied the text from the Word document and, within Dreamweaver, pasted just the text without any formatting to avoid the horrendous Word-generated HTML.
Flowing text from right to left is surprisingly easy to do in HTML
The text went in perfectly and retained the Arabic characters. However, the text still flowed from left to right, as with most European languages. To fix this, all I had to do was edit the html tag to:
<html dir="rtl">
The text then flowed correctly from right to left. This included the menu bar, as it was a styled unordered list. If you want the menu positioned so that it reads in the same order as the English version of the web site, all you need to do is add the following to the enclosing <div> tag:
<div dir="ltr">
It really is as easy as that. A lot of work has been done over the years on multilingual support on the web and in the tools that we use, and it is really satisfying when it all comes together and works!
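To put the two attributes together, here is a minimal sketch of how such a page might be laid out. The `lang="ar"` attribute, the menu entries, the file names and the sample Arabic greeting are my own assumptions for illustration and are not taken from the client's site.

```html
<!DOCTYPE html>
<!-- dir="rtl" makes the whole page flow from right to left.
     lang="ar" is an assumption added for illustration. -->
<html lang="ar" dir="rtl">
<head>
  <!-- UTF-8 encoding so the Arabic characters are preserved -->
  <meta charset="utf-8">
  <title>Example page</title>
</head>
<body>
  <!-- dir="ltr" on the enclosing div keeps the menu items in the
       same left-to-right order as the English version.
       Menu entries and file names are hypothetical. -->
  <div dir="ltr">
    <ul>
      <li><a href="index.html">Home</a></li>
      <li><a href="about.html">About</a></li>
      <li><a href="contact.html">Contact</a></li>
    </ul>
  </div>

  <!-- Body text inherits dir="rtl" from the html tag -->
  <p>مرحبا</p>
</body>
</html>
```

Anything inside the div is laid out left to right, while everything else on the page inherits the right-to-left direction from the html tag.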
The second enquiry I had was about a web site that we are not involved in but a friend is. This site has just moved to a new server and is written in ASP.NET, using SQL Server to serve some of its pages as well as a content management system. The problem was searching for accented words from within a web page. The data was stored in the SQL database using nvarchar data types, which is correct, but when the database was created the wrong collation was used. The collation tells the database engine how to compare and sort character data. Changing the collation is a big job, meaning recreating the database, creating the tables and then bulk copying the data back in. Backups could not be used, as the collation is stored in the backup. Rather than rebuild this entire database just so that a single query would work, it was decided to change the query so that when a web user typed a person’s name to look for, the database engine would ignore not only the case but also the accents, so ‘José’ would be found if ‘jose’ was entered. The query was changed from:
SELECT Name FROM People WHERE Name = @userentry ORDER BY Name ASC
To
SELECT Name FROM People WHERE Name = @userentry collate SQL_Latin1_General_CP1_CI_AI ORDER BY Name ASC
The extra clause ‘collate SQL_Latin1_General_CP1_CI_AI’ tells SQL Server to ignore accents and casing when comparing the name with what the user entered. If you later want to change this behaviour, the CI part controls casing (CI is case-insensitive, CS is case-sensitive) and the AI part controls accents (AI is accent-insensitive, AS is accent-sensitive).
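To see what the CI and AI parts do in practice, the following is a small sketch that can be run in a query window. The table variable, the sample names and the search value are made up for illustration. Only the collation names are real SQL Server collations.

```sql
-- Hypothetical sample data held in a table variable
DECLARE @People TABLE (Name nvarchar(50));
INSERT INTO @People (Name) VALUES (N'José'), (N'Jose'), (N'JOSE');

-- Stands in for the value typed by the web user
DECLARE @userentry nvarchar(50) = N'jose';

-- CI_AI: ignores case and accents, so all three rows should match
SELECT Name FROM @People
WHERE Name = @userentry COLLATE SQL_Latin1_General_CP1_CI_AI
ORDER BY Name ASC;

-- CI_AS: ignores case but respects accents, so José should be excluded
SELECT Name FROM @People
WHERE Name = @userentry COLLATE SQL_Latin1_General_CP1_CI_AS
ORDER BY Name ASC;

-- CS_AS: respects both, so only an exact 'jose' would match (no rows here)
SELECT Name FROM @People
WHERE Name = @userentry COLLATE SQL_Latin1_General_CP1_CS_AS
ORDER BY Name ASC;

-- List the related collations available on the server
SELECT name, description
FROM sys.fn_helpcollations()
WHERE name LIKE 'SQL_Latin1_General_CP1%';
```

Because the collate clause sits in the query itself, the change only affects that one comparison and leaves the database's own collation untouched.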
Article by: Mark Newton
Published in: Mark Newton