ATA Presentation 2007: Roll your own: Web-based Translation Portals and Translation Management Systems
Copyright © ProZ.com and the author, 1999-2013. All rights reserved.
One of the hottest trends in the translation industry is the migration of language services to the Web. Large service providers are able to reap additional efficiencies by offering web portals for online translation as well as translation management. In this article I explore the possibilities of using high-quality free and open source web portals for online translation and translation management. Learn how you can integrate your translation workflow and project management system as well as create an online translation portal for team translations.
Many large language service providers and translation tools vendors have adopted web-based translation as another arrow in their quiver. Many of these organizations have created proprietary online translation portals and offer them as a value-added service that lets customers streamline project management. In addition, they may require contracted service providers to use their particular offering. This article describes the implementation and usage of Pootle, a free and open source translation portal and translation management system.
History of Web-Based Translation
Web-based translations have their origins in the Open Source gettext utilities of the Free Software Foundation’s GNU project and date back as far as 1995. A group of dedicated individuals formed the Translation Project and provided a framework for internationalization and translation in order to make Native Language Support (NLS) for GNU software possible.
The GNU gettext toolset helps programmers and translators produce, update, and use translation files, mainly plain-text files that can be edited directly. The vehicle the gettext utilities use to make translation of text resources possible is the .po file; the letters PO stand for Portable Object. This paradigm, as well as the PO file format, was inspired by the NLS standard developed by Uniforum and first implemented by Sun in their Solaris operating system.
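To make the PO format more concrete, here is a minimal sketch of how its msgid/msgstr pairs could be read in Python. The sample text and the function names (parse_po, _unquote) are made up for illustration; the sketch ignores plural forms and message contexts and is not the parser used by gettext or Pootle.

```python
# Illustrative only: a tiny reader for the msgid/msgstr pairs of a PO file.
# It skips comments and does not handle plural forms or msgctxt.

SAMPLE_PO = '''
# Translator comment
#: src/menu.c:12
msgid "File"
msgstr "Datei"

msgid "Open a document"
msgstr ""
'''


def _unquote(token):
    """Strip the surrounding double quotes and decode two common escapes."""
    body = token.strip()[1:-1]
    return body.replace('\\n', '\n').replace('\\"', '"')


def parse_po(text):
    """Return a list of (source, target) tuples found in PO-formatted text."""
    entries, entry, field = [], {}, None
    # The extra empty line makes sure the last entry is flushed.
    for raw in text.splitlines() + [""]:
        line = raw.strip()
        if line.startswith("msgid "):
            field = "msgid"
            entry[field] = _unquote(line[len("msgid "):])
        elif line.startswith("msgstr "):
            field = "msgstr"
            entry[field] = _unquote(line[len("msgstr "):])
        elif line.startswith('"') and field:
            # Continuation line: append to the field currently being read.
            entry[field] += _unquote(line)
        elif not line:
            # A blank line ends the current entry.
            if "msgid" in entry and "msgstr" in entry:
                entries.append((entry["msgid"], entry["msgstr"]))
            entry, field = {}, None
    return entries


if __name__ == "__main__":
    for source, target in parse_po(SAMPLE_PO):
        print(repr(source), "->", repr(target))
```

An empty msgstr, as in the second sample entry, marks a string that has not been translated yet.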
A step further in the evolution of these projects is the WordForge project, whose aim is to build standards-based localization tools for Free and Open Source Software (FOSS). The package enabling web-based translation is a toolset called Pootle. The acronym Pootle is derived from the term PO-based Online Translation / Localization Engine.
What is Pootle?
According to its README file, Pootle is a web translation and translation management engine. What does that mean? Well, two things, really.
The first is web translation, or more precisely, web-based translation: translation that uses the Web to facilitate the translation activity. Using Pootle, translators can translate a document from within their favorite web browser. All they need to translate anywhere in the world is a computer with an Internet connection and a web browser; an offline component is currently in development. Like any modern Translation Environment Tool (TET), Pootle can use suggestions from a translation memory. In suggestion mode, contributions can be checked by translators with sign-off authority before being accepted into the final copy, which is ideal for bug reporting. Pootle also performs terminology matching from glossaries in the terminology project. Pootle uses Unicode throughout, so it works with any target language, including right-to-left display for those languages that need it. Clickable characters help users who cannot type all the characters of their languages. Furthermore, Pootle allows for highlighting errors in translation and translator comments. The latter is especially useful in a translation team environment and as part of an online review process.
The second part, translation management, refers to the web-based management of translations. Unlike most other major TETs, Pootle does not rely on a backend database. Instead, it relies on Lucene, a fast text indexing engine that enables Pootle’s terminology matching and translation memory recall.
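The idea of translation memory recall can be sketched in a few lines of Python. Note that this example swaps in the standard library's difflib for the matching step; it does not use Lucene and does not reflect how Pootle ranks matches. The memory contents and the function name tm_suggestions are hypothetical.

```python
import difflib

# Hypothetical translation memory: previously translated source -> target.
MEMORY = {
    "Open file": "Datei öffnen",
    "Close file": "Datei schließen",
    "Print": "Drucken",
}


def tm_suggestions(source, memory, limit=3, cutoff=0.6):
    """Return up to `limit` (stored source, stored target) pairs whose
    source text is similar to `source`, best match first.

    Similarity is difflib's ratio, used here as a stand-in for a real
    text index such as Lucene.
    """
    matches = difflib.get_close_matches(source, list(memory), n=limit, cutoff=cutoff)
    return [(match, memory[match]) for match in matches]


if __name__ == "__main__":
    print(tm_suggestions("Open files", MEMORY, limit=1))
```

A linear scan like this is fine for a handful of strings; a dedicated index becomes worthwhile once the memory grows large, which is presumably why Pootle relies on one.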
Updates to previously translated documents are facilitated through integration into a version control system such as CVS or its successor, Subversion. By integrating Pootle with the Translate Toolkit, translation managers can access statistics with string and word counts. The Toolkit also allows for translation of files in XLIFF format in addition to the aforementioned PO files, making it compatible with many other tools.
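As a rough illustration of what string and word counts look like, the following sketch computes them from a list of (source, target) pairs. The function name and the dictionary keys are assumptions made for this example; the Translate Toolkit's own statistics are more detailed and count words more carefully than a simple whitespace split.

```python
def translation_stats(entries):
    """Count strings and source words, in total and translated.

    `entries` is a list of (source, target) pairs; a pair counts as
    translated when its target is not empty.
    """
    stats = {
        "total_strings": 0,
        "translated_strings": 0,
        "total_words": 0,
        "translated_words": 0,
    }
    for source, target in entries:
        words = len(source.split())  # naive whitespace-based word count
        stats["total_strings"] += 1
        stats["total_words"] += words
        if target.strip():
            stats["translated_strings"] += 1
            stats["translated_words"] += words
    return stats


if __name__ == "__main__":
    sample = [("File", "Datei"), ("Open a document", "")]
    print(translation_stats(sample))
```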
Furthermore, Pootle can handle many different file formats by converting to PO and back again, including OpenOffice, Mozilla, CSV, Qt, plain text, HTML, XLIFF, Java properties, TMX, and TBX. Translation managers can set goals for translations teams and assign work to various translators with permissions for different functions.
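To show what converting another format to PO involves, here is a small sketch that turns two-column CSV text (source, target) into PO entries. It is an illustration written for this article, not the Toolkit's own converter; the function names and the two-column layout are assumptions.

```python
import csv
import io


def _quote(text):
    """Wrap text in double quotes, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def csv_to_po(csv_text):
    """Convert rows of (source, target) into PO-formatted text.

    Rows that do not have exactly two columns are skipped.
    """
    blocks = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue
        source, target = row
        blocks.append(f"msgid {_quote(source)}\nmsgstr {_quote(target)}\n")
    # Entries in a PO file are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(csv_to_po('File,Datei\n"Save ""as""",\n'))
```

Converting back works the same way in reverse: read the msgid/msgstr pairs and write them out as rows.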
How does Pootle work?
On a low level, the concept works as follows: the translation engine works directly on PO files rather than on records stored in a database. To allow quick access to statistics, they are cached in .stats files, so that the results of most checks can be computed in advance and then iterated over quickly.
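The caching idea can be sketched as follows: compute the statistics once, store them next to the PO file, and reuse them until the PO file changes. The JSON layout, the modification-time check, and the function names are assumptions for illustration; the actual format of Pootle's .stats files is not described here.

```python
import json
import os


def count_strings(po_path):
    """Count msgid lines; a stand-in for a more expensive analysis."""
    with open(po_path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith("msgid "))


def cached_stats(po_path):
    """Return (stats, from_cache) for a PO file.

    The cache lives in `<po_path>.stats` and is reused as long as it is
    at least as new as the PO file itself.
    """
    stats_path = po_path + ".stats"
    if (os.path.exists(stats_path)
            and os.path.getmtime(stats_path) >= os.path.getmtime(po_path)):
        with open(stats_path, encoding="utf-8") as handle:
            return json.load(handle), True
    # Cache missing or stale: recompute and store the result.
    stats = {"strings": count_strings(po_path)}
    with open(stats_path, "w", encoding="utf-8") as handle:
        json.dump(stats, handle)
    return stats, False
```

The second value in the returned pair only exists to make the cache behavior visible; a production version would simply return the statistics.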
On a high level, upon logging in, the translator is presented with a webpage containing navigational features at the top and available languages and projects, as well as account information, in the upper right. After the source language is selected, the relevant source files are presented. Once a source file has been selected, the pertinent project data and two columns containing source and target segments appear and make up the body of the page. Segments can range from short phrases to several sentences, depending on how the PO files were generated and parsed.
When the translator is ready to start translating, he/she moves the mouse over the selected segment and clicks on the Edit button that appears next to the selected segment. This makes the target segment editable via form input and provides links for navigation, sizing of the form fields, and two ways to submit the translated segment: by way of suggestion and by submitting the translated segment.
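The difference between the two ways of submitting a segment can be modeled with a small data structure: a suggestion waits for review, while a submission changes the target text right away. The class and method names below are hypothetical and only model the workflow described above, not Pootle's internal code.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One source string with its current translation and pending suggestions."""
    source: str
    target: str = ""
    suggestions: list = field(default_factory=list)  # (user, text) pairs

    def suggest(self, user, text):
        """Queue a translation for review without touching the target."""
        self.suggestions.append((user, text))

    def submit(self, text):
        """Store a translation directly as the target text."""
        self.target = text

    def accept(self, index):
        """Reviewer step: promote a pending suggestion to the target text."""
        user, text = self.suggestions.pop(index)
        self.target = text
        return user


if __name__ == "__main__":
    segment = Segment("File")
    segment.suggest("anna", "Datei")
    print(segment.target == "")   # the suggestion is still pending
    segment.accept(0)
    print(segment.target)
```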
Pootle depends on several software packages to provide low-level functionality. These are all FOSS, but since there is no unified installer for the whole package, this makes installing a working copy of Pootle somewhat less than trivial.
The first requirement for Pootle is the Translate Toolkit. This software package provides a range of functions used by the Pootle software package under the hood. The Translate Toolkit is hosted by the largest FOSS repository in existence at sourceforge.net. Be sure to check the README file in the Pootle software package for the required version of Translate Toolkit in order to avoid any incompatibilities.
Both Pootle and the Translate Toolkit require Python to be installed. Python is a dynamic object-oriented programming language and is the language in which Pootle is written. Again, check the README file in the Pootle and Translate Toolkit software packages for the required version of Python. Depending on your operating system, you may have to install Python from source code so the Translate Toolkit will be accessible in your Python Path.
For the web interface of the package, Pootle requires jToolkit, also available from sourceforge.net. jToolkit is the server engine that enables the server- and web-based portions of Pootle. It can run applications either via the most popular web server on the planet, Apache, or standalone from the command line. For reasons I won’t go into here, at the moment Pootle works best run from the command line. It’s also much easier! Downloads of PO file bundles are compressed by another software package called ZIP, which enables users to download ZIP files of directories for translation offline.
For integration with the version control systems CVS and/or Subversion, the relevant version control client must be installed and accessible from the respective directory contained in the PATH variable of the user profile.
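Before starting the installation proper, it can help to check whether the prerequisites are visible to Python and to the shell. The sketch below does that with the standard library only. The module name "translate" for the Translate Toolkit is an assumption, as is the helper name check_prerequisites; adjust both lists to whatever the README files require.

```python
import importlib.util
import shutil


def check_prerequisites(modules, executables):
    """Return (missing_modules, missing_executables).

    A module is missing if Python cannot find it on its path; an
    executable is missing if it is not found in any PATH directory.
    """
    missing_modules = [
        name for name in modules if importlib.util.find_spec(name) is None
    ]
    missing_executables = [
        name for name in executables if shutil.which(name) is None
    ]
    return missing_modules, missing_executables


if __name__ == "__main__":
    # "translate" is assumed to be the Toolkit's import name.
    modules, tools = check_prerequisites(["translate"], ["cvs", "svn"])
    print("Missing Python modules:", modules or "none")
    print("Missing executables:", tools or "none")
```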
The templating functionality of Pootle is taken care of by a software package by the name of Kid, which in turn requires the software package ElementTree for XML processing.
Installation of the software package PyLucene is optional, but helps to speed up searching.
PyLucene uses the text-indexing engine Lucene to provide an effective and very fast way of indexing PO files.
Once installation of all required packages is complete, Pootle itself can be installed using the instructions from the above-mentioned Pootle README file.
Once Pootle is installed, you should edit the preferences file and specify your languages and projects. The server is started by entering the PootleServer command in a terminal client window.
At this point you are ready to begin translating. Just point your web browser to the local address http://localhost:8080/ and you should see the projects and languages page where you can register. After registering, you may log in and start translating.
If you would like to have a look at a Pootle server in action, please visit the good folks at the WordForge project and point your Web browser to http://pootle.wordforge.org/. There you can participate in various localization projects in (at the time of this writing) 94 languages and 12 projects, including the popular web browser Firefox and the office suite OpenOffice.
Now we are going to dive into actually translating online using a Pootle server, the web-based translation portal. When I wrote the first part in this three-part series, I had planned on using the Pootle server I installed on my own machine. After some consideration, however, I decided to use another, active public server for an existing project. The advantages of this will become evident shortly.
First of all, I had not installed any projects except a small test project that does not offer much in terms of available source text. Choosing a server for an already existing project gives me access to a large collection of source texts. Moreover, it allows a better demonstration of the complete feature set of the latest version of Pootle. Lastly, these efforts actually contribute to an existing open source software translation project, thereby helping the user community at large use a free and open source piece of software in their language of choice, a fabulous side effect.
The project I chose is the OpenOffice.org office suite of productivity software, an alternative to the offering from Microsoft. The Pootle server for this project is available off the Pootle page listing public servers at . It can be accessed at :
This server hosts a couple of different projects and language combinations. We will be focusing on the OpenOffice.org 2.3 UI project in German since that is my particular forte. If you would like to find out more about the particular Pootle server you are working with, just click the "About this Pootle server" link, and this is what you will see:
Once you have zeroed in on your project of choice, you need to register with the project in order to be given access:
You will be redirected to the registration page:
Once registered, you can log in using the activation code you received in the registration e-mail to access the project:
You will once again be redirected, this time to the login page:
You will be greeted by a warning that your login failed; don't let that deter you, just enter your login credentials:
Upon logging in, you are asked to select your language and a project:
The options window lets you customize not just your language and project, but also a few other parameters to suit your working environment:
Projects and Languages
Now it's time to select your language:
After the source language is selected, you are presented with navigational features at the top as well as account information in the upper right. Below that you will find a list of available projects as active links:
Once you click on the project of choice, the relevant source folders are presented on the individual project page:
For this exercise, I picked the source folder because it was close to being 100 % translated:
The available source files appear once you click on the source folder:
Once a source file has been selected, the pertinent project data and two columns containing source and target segments appear and make up the body of the page. Segments can range from short phrases to several sentences, depending on how the PO files were generated and parsed. Now we're ready to enter an actual translation in the translation window that contains the original text in the source language, and the translation or an Edit button in the Translation area for your translation in the target language:
When the translator is ready to start translating, he/she moves the mouse over the selected segment and clicks on the Edit button that appears next to the selected segment. This makes the target segment editable via form input and provides links for navigation, sizing of the form fields, and two ways to submit the translated segment: by way of suggestion and by submitting the translated segment. The Edit field sports some navigational features and a copy button to copy the source text into the translation field:
When you're done translating, click on the submit button:
Since this was the last string to be translated, we are redirected to the End-of-batch page:
Back on the index page we can now see that 100% of this file has been translated:
For the statistics geeks out there, there are statistics about the available project folders, as well:
And individual file statistics give project managers a better handle on managing a particular translation project:
The documentation page sports links to a user guide for Pootle, as well as some additional resources to address issues that you might have:
The howto part of the documentation gives a short overview of the general process for using Pootle as an online translation portal. This is not exhaustive and serves as a reference point rather than an extensive tutorial:
Project administrators can assign work in the same project to different people. Different files from the same project can be assigned, or even a single file can be split between different people.
Pootle introduces the concept of goals, which are used much like translation projects. At the start of a project, no files are part of a goal; the project manager assigns files and translators to a given goal in edit mode.
Assignments can be made per given source file directory, file, or even string that is part of a goal by adding a user to the goal. You can even differentiate between translated and untranslated strings. This implies that assignments can be made for the same directory, file, and string to more than one translator.
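The relationship between goals, files, and translators can be pictured with a small model. This is a sketch under stated assumptions: the class name, the method names, and the three state labels ("all", "translated", "untranslated") are invented for illustration and do not mirror Pootle's implementation.

```python
class Goal:
    """A named goal that groups files and records who works on what."""

    STATES = ("all", "translated", "untranslated")

    def __init__(self, name):
        self.name = name
        self.files = set()
        self.assignments = []  # (user, filename, state) triples

    def add_file(self, filename):
        """Make a file part of this goal."""
        self.files.add(filename)

    def assign(self, user, filename, state="all"):
        """Assign a user to a file, optionally limited to one string state."""
        if filename not in self.files:
            raise ValueError(f"{filename} is not part of goal {self.name}")
        if state not in self.STATES:
            raise ValueError(f"unknown state: {state}")
        self.assignments.append((user, filename, state))

    def work_for(self, user):
        """List the (filename, state) pairs assigned to a user."""
        return [(f, s) for u, f, s in self.assignments if u == user]


if __name__ == "__main__":
    goal = Goal("release")
    goal.add_file("menu.po")
    # The same file can go to two people: one translates, one reviews.
    goal.assign("anna", "menu.po", "untranslated")
    goal.assign("ben", "menu.po", "translated")
    print(goal.work_for("anna"), goal.work_for("ben"))
```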
Pootle is a web-based translation portal and translation management system for use in production-level environments. It enables online translation and review, allows for highly granular work assignment, gives statistics and even allows for offline translation. Pootle can be run as an Internet server for remote distribution or managed in-house on an Intranet. Its ability to use existing TMX translation memories and TBX glossaries with suggestion capability as well as its ability for integration into version control systems make it a very powerful option for translation service providers and even individual translators.