Is there any software that can localize US English into UK English?
Thread poster: Catlin Fu

Catlin Fu
United Kingdom
Local time: 14:00
May 14, 2010

Hi,

One of our clients would like to localize 1 million words from US English spelling into UK English spelling. I wonder if there is any software which can automate this process.

Any suggestions?

Cat


 

Jack Doughty  Identity Verified
United Kingdom
Local time: 14:00
Member (2000)
Russian to English
Paul Tate's converter. May 14, 2010

By googling, I found that a guy called Paul Tate has developed software called "Briticizer" which you can download free from here: http://us2uk.eu/

I haven't tried it, but you might like to.


 

Catlin Fu
United Kingdom
Local time: 14:00
TOPIC STARTER
It works! May 14, 2010

Hi Jack,

I have tested out the site and it works!

Thank you very much!


 

xxxOlaf
Local time: 15:00
English to German
Variant Conversion Info (VARCON) perl package May 14, 2010

With 1 million words to convert you might prefer an offline solution like Varcon. It's a free Perl package that can be downloaded here:

http://downloads.sourceforge.net/wordlist/varcon-4.1.zip

If you don't have Perl installed, you can get it from ActiveState:
http://www.activestate.com/activeperl/downloads/

Since the Varcon script works on plain text, you'll need to convert the source files to text files first, unless they are .html or .xml files.
The actual conversion is done on the command line. For example, if you want to convert the American English file a.txt to the British English file b.txt, you'd enter:

perl translate american british < a.txt > b.txt

(Of course, you need to copy the files that you want to convert to the Varcon folder.)

You could automate this by creating a batch file with one line for each file that you need to convert. Since this script only replaces words, it cannot fix American English punctuation issues, e.g. periods and commas inside quotation marks etc.
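
As an alternative to a hand-written batch file, here is a rough Python sketch of the same loop. It assumes that `perl` is on the PATH and that the script is started from the Varcon folder, so that `translate` can be found. The function names and the folder names `us` and `uk` are made up for illustration.

```python
import subprocess
from pathlib import Path


def plan_jobs(src_dir, dst_dir):
    """Pair every .txt file in src_dir with an output path in dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    return [(src, dst_dir / src.name) for src in sorted(src_dir.glob("*.txt"))]


def convert_all(src_dir, dst_dir):
    """Run `perl translate american british < src > dst` for each planned job."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    for src, dst in plan_jobs(src_dir, dst_dir):
        with open(src, "rb") as fin, open(dst, "wb") as fout:
            # Same command as above; the files stand in for the < and > redirections.
            subprocess.run(
                ["perl", "translate", "american", "british"],
                stdin=fin, stdout=fout, check=True,
            )


# Hypothetical usage: convert_all("us", "uk")
```

Keeping the originals and the converted files in separate folders makes it easy to diff them afterwards and spot-check what was changed.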


 

opolt  Identity Verified
Germany
Local time: 15:00
English to German
Only the spelling? May 14, 2010

Does that only include the spelling (no change wrt the terminology)? That would strike me as rather odd, even ill-advised ...

If someone wants such a huge number of strings localized from en_US to en_GB, it will be mostly for cultural reasons, things like dustbin vs. trash can and the like -- because they are quite aware of the sensibilities of their British clients, or whoever strongly prefers UK over US spelling. In that case, wouldn't they want to change the terminology as well? I thought that's what proper l10n was all about ...

(I have been following US-UK localization in the Linux world for years, and it never really took off -- because it'd be a lot of work for very little gain.)

Just wondering. Anyway, the client is king.

-- opolt


 

Peter Linton  Identity Verified
Local time: 14:00
Member (2002)
Swedish to English
Differences USE : UKE May 15, 2010

There are many other hidden hazards in localisation. Can this software deal with the following:

Word order: in USE you might say "The result will likely be announced tomorrow". Not possible in UKE -- it needs something like "The result is likely to be announced tomorrow".

Capitalisation: whereas a headline in an American newspaper might say "Overhaul Plan for Vote System Will Be Delayed" (New York Times), in UKE it would have less capitalisation, probably "Overhaul plan for vote system will be delayed" or perhaps "Overhaul plan for Vote System will be delayed".

Hyphenation: USE hyphenates words more by sound, UKE by morphology. So in USE "triumphant" would be hyphenated "trium-phant", UKE would be based on the structure of the word, therefore "triumph-ant".

Dates: USE uses mm/dd/yy, UKE uses dd/mm/yy. A dangerous source of confusion. 1/2/2010 is either 1st February or 2nd January.
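
For what it's worth, the date swap can be sketched in a few lines of Python, but only under the assumption that every slash-separated date in the input really is in US month-first order. The function name is hypothetical, and anything whose first field cannot be a month is left alone rather than guessed at.

```python
import re

# d/d/yy or d/d/yyyy, with one- or two-digit day and month fields
DATE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{2}|\d{4})\b")


def us_to_uk_dates(text):
    """Swap month and day in mm/dd/yy(yy) dates, assuming US order on input."""
    def swap(match):
        month, day, year = match.groups()
        if not (1 <= int(month) <= 12 and 1 <= int(day) <= 31):
            return match.group(0)  # not a plausible US date; leave untouched
        return f"{day}/{month}/{year}"
    return DATE.sub(swap, text)
```

A string like 1/2/2010 is swapped blindly, so the result is only as reliable as the assumption that the source was consistently American.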

Telephone numbers may need to be changed. US texts often have just the US phone number without the international dialling code. Also, UKE numbers do not have hyphens, so they should be removed. Also US websites may give 800 numbers (no charge) but there may be a charge when dialling from outside the US, so that needs to be mentioned.
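
A minimal sketch of the phone number clean-up, assuming the text only contains US numbers written as NNN-NNN-NNNN without a country code. The function name and the wording of the 800-number note are invented for illustration; +1 is the US international dialling code.

```python
import re

# Three-three-four digits joined by hyphens, not part of a longer hyphenated number
US_PHONE = re.compile(r"(?<![\d-])(\d{3})-(\d{3})-(\d{4})(?!\d)")


def internationalise_phones(text):
    """Add the +1 dialling code, drop hyphens, and annotate 800 numbers."""
    def fix(match):
        area, exchange, line = match.groups()
        number = f"+1 {area} {exchange} {line}"
        if area == "800":
            number += " (may be charged outside the US)"
        return number
    return US_PHONE.sub(fix, text)
```

Numbers in other layouts, such as (212) 555-0123, are not touched by this pattern and would need their own rule.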

Addresses: USE often gives US states just as 2-letter abbreviations (MO for Missouri etc), and without mentioning the country. Should be expanded for the benefit of non-US postmen.
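
The state expansion could be sketched like this, assuming the abbreviation is always followed by a five-digit ZIP code, which keeps ordinary two-letter capitals from being mistaken for states. The lookup table is deliberately partial and the function name is hypothetical.

```python
import re

# Partial, illustrative table; a real one would list every state
STATES = {"MO": "Missouri", "NY": "New York", "CA": "California"}
STATE_ZIP = re.compile(r"\b([A-Z]{2}) (\d{5})\b")


def expand_state(text):
    """Spell out known state abbreviations before a ZIP code and add the country."""
    def fix(match):
        name = STATES.get(match.group(1))
        if name is None:
            return match.group(0)  # unknown abbreviation; leave as written
        return f"{name} {match.group(2)}, USA"
    return STATE_ZIP.sub(fix, text)
```

An address that already ends in the country name would get it twice, so that case would have to be checked for separately.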

Punctuation: USE uses full stops more, eg Mr. Smith, whereas in UKE just Mr Smith. USE might have 11:00 a.m., UKE normally just 11 am or 11.00 am. But both use etc. with a period (I mean full stop).

Quotation marks: If a whole sentence is in quotation marks, any punctuation stays inside in both USE and UKE, e.g. "Have a nice day," said the bus driver. But if only part of the sentence is quoted, then in USE it stays inside, but in UKE it goes outside.
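
Two of these punctuation rules are mechanical enough to sketch with regular expressions. This assumes a short, made-up list of titles and the 11.00 am form for times; the function name is hypothetical.

```python
import re

# Titles followed by a full stop and whitespace, e.g. "Mr. Smith"
TITLE = re.compile(r"\b(Mr|Mrs|Ms|Dr)\.(?=\s)")
# Times such as "11:00 a.m." or "3:15 p.m."
TIME = re.compile(r"\b(\d{1,2}):(\d{2}) ([ap])\.m\.")


def uk_punctuation(text):
    """Drop the full stop after titles and rewrite h:mm a.m. as h.mm am."""
    text = TITLE.sub(r"\1", text)
    return TIME.sub(r"\1.\2 \3m", text)
```

One known weakness: when "a.m." ends a sentence, its last full stop also closes the sentence, and this sketch removes it along with the others.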

You may feel that some of these points are minor. True, but they immediately give away whether a text has been properly localised or not. People can easily take offence if they feel they are not being addressed in their native language. So localisation matters.


 

us2uk
Local time: 17:00
Thanks for the plug Jack! May 15, 2010

Jack Doughty wrote:

By googling, I found that a guy called Paul Tate has developed software called "Briticizer" which you can download free from here: http://us2uk.eu/

I haven't tried it, but you might like to.


A friend of mine told me I'd received an honourable mention over here - so here I am!

In reply to Peter's questions there are a lot of things which my converter can't do, just yet. However my line of work (some of it, anyway) involves working to particular house style guides - mainly the UN & EU guides - so there are a number of things which I'm working on to try and make my own life a bit easier. Dates and numbers are an important thing to get right in documents for these types of organisations, as are abbreviations and acronyms, so I'm going to introduce some elements of this to my "Briticiser" though I'm thinking of developing a separate tool to allow people to check their document against their chosen style guide. I suppose I could build that in to the existing tool but I don't really want to over-burden it with too much code to run through. Right now it's very fast, reasonably reliable and can handle 70 pages or more. I developed it from a Word macro I wrote, which was better in some ways, but very slow and it didn't make you read the document as the online converter does which, for me, is a great way to get a first-feel for the document.

With regard to words such as "pants/trousers" & "trunk/boot" etc. the best it can do is to highlight those words and warn you about them - in fact I've just added a couple of examples to the converter and I'll be adding a lot more soon.
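
To make the highlight-and-warn idea concrete, here is a small Python sketch. It is not how the converter itself is implemented; the watch list only contains word pairs mentioned in this thread, and the function name is invented.

```python
import re

# Hypothetical watch list: US word -> likely UK equivalent (pairs from this thread)
WATCH = {"pants": "trousers", "trunk": "boot", "can": "tin"}


def flag_words(text, watch=WATCH):
    """Return (word, suggestion, offset) for each watch-list hit; nothing is replaced."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, watch)) + r")\b", re.IGNORECASE
    )
    return [
        (m.group(0), watch[m.group(0).lower()], m.start())
        for m in pattern.finditer(text)
    ]
```

Because it only reports hits, a sentence like "I can swim" is flagged too, and a human reader still has to decide whether the suggestion applies.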

"Word order" is a very difficult challenge - a lot depends on how consistent the various writers are with the way they say things. Trying to work out the context of the sentence where it finds a particular word is difficult too, for instance using the word "can" as a noun (as in a "can" of soup instead of a "tin" of soup) presents an obvious challenge.

I'm glad to hear that someone else has found it useful - I do read all of the comments which people send to me through the contact form at http://us2uk.eu (because there are so few!) so I'd be happy to converse with people personally if there are any improvements they'd like to see made to the converter.

/ advertisement

[Edited at 2010-05-15 13:40 GMT]


 

Craig Meulen  Identity Verified
United Kingdom
Local time: 14:00
German to English
Style Guides May 17, 2010

us2uk wrote:
However my line of work (some of it, anyway) involves working to particular house style guides - mainly the UN & EU guides - so there are a number of things which I'm working on to try and make my own life a bit easier. Dates and numbers are an important thing to get right in documents for these types of organisations, as are abbreviations and acronyms, so I'm going to introduce some elements of this to my "Briticiser" though I'm thinking of developing a separate tool to allow people to check their document against their chosen style guide.



A tool to check against a chosen style guide - that would be VERY useful!!


 

us2uk
Local time: 17:00
Re: Style Guides May 21, 2010

Craig Meulen wrote:

A tool to check against a chosen style guide - that would be VERY useful!!


But quite difficult to do as an all-in-one thing, though I have a couple of things in mind which will help to take the grind out of one or two of the processes - checking abbreviations/acronyms against a selected list, for instance. It's probably going to be beyond my capabilities to check document formatting - but formats of dates, checking that things such as "etc." are all preceded by a comma and a space, and checking that there are no "etc " or "Etc.", for example, will be relatively easy, and it should be possible to allow people to define their own personal sets of rules for those kinds of things.
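
A minimal sketch of what the "etc." checks might look like, assuming exactly the three rules mentioned above. The function name and the messages are made up, and a real tool would presumably load such rules from the chosen style guide instead of hard-coding them.

```python
import re


def check_etc(text):
    """Return (offset, message) pairs for each suspicious use of 'etc'."""
    problems = []
    for m in re.finditer(r"\b[Ee]tc\b\.?", text):
        start, token = m.start(), m.group(0)
        if token[0] == "E":
            problems.append((start, "capitalised 'Etc'"))
        if not token.endswith("."):
            problems.append((start, "missing full stop after 'etc'"))
        if text[max(0, start - 2):start] != ", ":
            problems.append((start, "'etc' not preceded by comma and space"))
    return problems
```

Returning offsets rather than rewriting the text keeps the tool in the role of a checker, which fits the idea of reading through the document yourself.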


 

