Measuring PO files - word counts, etc.
Thread poster: Carl Carter

Carl Carter
Germany
German to English
Mar 18, 2010

Dear colleagues,

I've received over 70 PO files to translate and have been asked to make a quote for localising them now. My usual CAT tool, DVX (Déjà Vu X), can import and count them, which I have done, but I'd like to ask anyone who has already translated files like these whether they can recommend any other tools for doing a word count.

I don't want to do any "workarounds" or file conversions in this case, but just use a tool that includes this type of counting functionality. POedit is a great little application for translating POs with, but it doesn't have a counting facility, for example. My OS is Windows Vista (Business). What I'd like to count is the number of words that have to be translated, and then use this figure to calculate the cost of the project for the customer.

The reason I'm asking is that I'd like to verify DVX's word count before I make my quote, as the project is quite a big one and I want the quotation to be accurate. I don't trust DVX's usual word counts for Microsoft Office formats, as they differ a fair bit from those of other standard word count tools, and I've just discovered that DVX's line and word counts for some of the PO files are higher than the number of lines and words I counted manually while the files were open in POedit.

Any suggestions, please?

Thanks in advance.

Carl

Amper Translation Service
Carl Carter
Bahnhofstr. 2
D-82256 Fürstenfeldbruck

Tel. 08141/36379-65 // Fax 08141/36379-63

E-Mail: carl.carter@ampertrans.de // Web: http://www.ampertrans.de

Mitglied des deutschen Übersetzerverbandes BDÜ, www.bdue.de

Proz profile: http://www.proz.com/profile/774269

================================


Didier Briel
France
Member (2007)
English to French
OmegaT Mar 18, 2010

Carl Carter wrote:

I've received over 70 PO files to translate and have been asked to make a quote for localising them now. My usual CAT tool, DVX (Déjà Vu X), can import and count them, which I have done, but I'd like to ask anyone who has already translated files like these whether they can recommend any other tools for doing a word count.


OmegaT can load PO files, and then provide statistics.

Didier


Eric Le Carre
France
Member (2004)
English to French
The Translate Toolkit Mar 18, 2010

Hi Carl,

I once used 'The Translate Toolkit' available at: http://translate.sourceforge.net/wiki/toolkit/index.

It includes a very useful PO counting tool called pocount (see here: http://translate.sourceforge.net/wiki/toolkit/pocount) that I managed to run from the Windows command line.

I wrote 'I managed' because this is definitely not a low-tech solution. You need to read the accompanying documentation carefully to get it to run, but it worked for me in the end.

Good luck.

Eric


Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
Working with PO files Mar 18, 2010

Carl Carter wrote:
I've received over 70 PO files to translate and have been asked to make a quote for localising them now.


The Translate Toolkit has a tool for counting words and strings in PO files. This is a command line tool but works on directories also. So if your PO files are in the folder "clientfiles", you'd issue this command:

pocount.exe clientfiles > clientfiles_count.txt

...and it will create a report for each file and for all the files combined. The report gives the number of words and the number of strings (both source and target counts) for the translated, fuzzy, untranslated and total strings.
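
For anyone who would rather cross-check a tool's figures with a short script, here is a minimal Python sketch of the kind of count described above. It is not pocount and will not necessarily agree with it: it assumes UTF-8 files, counts words simply by splitting on whitespace, ignores fuzzy flags and obsolete entries, and skips the header entry. The names parse_entries, words, count_po_text and count_folder are made up for this illustration.

```python
import re
from pathlib import Path

# A keyword line, e.g.  msgid "text"  or  msgstr[1] "text"
KEYWORD = re.compile(r'^(msgctxt|msgid_plural|msgid|msgstr(?:\[\d+\])?)\s+"(.*)"$')
# A continuation line, e.g.  "more text"
CONTINUATION = re.compile(r'^"(.*)"$')


def parse_entries(po_text):
    """Yield one dict per PO entry, mapping keyword -> concatenated string."""
    entry, key = {}, None
    for raw in po_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments (including obsolete "#~" entries)
        m = KEYWORD.match(line)
        if m:
            key, text = m.groups()
            # A msgctxt, or a second msgid, marks the start of the next entry.
            if entry and (key == "msgctxt" or (key == "msgid" and "msgid" in entry)):
                yield entry
                entry = {}
            entry[key] = text
            continue
        m = CONTINUATION.match(line)
        if m and key is not None:
            entry[key] += m.group(1)
    if entry:
        yield entry


def words(text):
    """Whitespace-token count after turning common escapes into plain text."""
    text = text.replace("\\n", " ").replace("\\t", " ").replace('\\"', '"')
    return len(text.split())


def count_po_text(po_text):
    """Return (strings, source_words, target_words), skipping the header entry."""
    strings = source = target = 0
    for entry in parse_entries(po_text):
        if not entry.get("msgid"):
            continue  # the header entry has an empty msgid
        strings += 1
        for key, text in entry.items():
            if key in ("msgid", "msgid_plural"):
                source += words(text)
            elif key.startswith("msgstr"):
                target += words(text)
    return strings, source, target


def count_folder(folder):
    """Print counts for every .po file under folder, then a grand total."""
    totals = [0, 0, 0]
    for path in sorted(Path(folder).rglob("*.po")):
        # "utf-8-sig" drops a leading byte order mark if one is present
        counts = count_po_text(path.read_text(encoding="utf-8-sig"))
        print(f"{path}: strings={counts[0]} source={counts[1]} target={counts[2]}")
        totals = [t + c for t, c in zip(totals, counts)]
    print(f"TOTAL: strings={totals[0]} source={totals[1]} target={totals[2]}")
    return tuple(totals)
```

Calling count_folder("clientfiles") prints one line per file and a total. The "target" figure is the msgstr count, which is the one that matters when existing translations are to serve as the source for a further translation.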

Carl Carter wrote:
I don't want to do any "workarounds" or file conversions in this case, but just use a tool that includes this type of counting functionality.


There is a deliberate bug in the Translate Toolkit in that it refuses to read UTF-8 files that have a byte order mark. So if your PO files have it, you have to remove it first. You can do so in bulk using Rainbow from Okapi tools:

Select the files, go to the Output tab, deselect the option to add a new file extension, and then go Utilities > Byte-order-mark conversion.
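
If installing Rainbow is not an option, the byte order mark can also be removed with a few lines of Python. This is only a sketch: the function names strip_bom and strip_bom_in_folder are invented here, and the files are rewritten in place, so it is safer to run it on a copy of the folder.

```python
import codecs
from pathlib import Path


def strip_bom(data):
    """Return the bytes without a leading UTF-8 byte order mark, if any."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_folder(folder):
    """Rewrite, in place, every .po file under folder that starts with a BOM.

    Returns the list of files that were changed.
    """
    changed = []
    for path in sorted(Path(folder).rglob("*.po")):
        data = path.read_bytes()
        cleaned = strip_bom(data)
        if cleaned != data:
            path.write_bytes(cleaned)
            changed.append(path)
    return changed
```

For example, strip_bom_in_folder("clientfiles") would process the folder used in the pocount command above and leave files without a byte order mark untouched.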

You can also use OmegaT to do the counting, as Didier had indicated. OmegaT's support for PO is not 100% up to scratch (no support for plurals, no support for msgctxt, etc), but it is quite good nonetheless.


Didier Briel
France
Member (2007)
English to French
Simple plurals are supported Mar 18, 2010

Samuel Murray wrote:

OmegaT's support for PO is not 100% up to scratch (no support for plurals, no support for msgctxt, etc), but it is quite good nonetheless.


Simple plurals (i.e., entries with a single plural form) are supported, e.g.:
msgid "Delete File"
msgid_plural "Delete Files"
msgstr[0] "Supprimer le fichier"
msgstr[1] "Supprimer les fichiers"

Didier
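
To see how a plural entry like the one above feeds into a word count, here is a small Python sketch. It assumes each field sits on a single line, as in the example, and counts words by splitting on whitespace. The name plural_entry_counts is made up, and a given CAT tool may well count plural forms differently.

```python
import re

# One field per line: msgid "...", msgid_plural "..." or msgstr[n] "..."
FIELD = re.compile(
    r'^(msgid_plural|msgid|msgstr\[\d+\])\s+"(.*)"[ \t]*$', re.MULTILINE
)

EXAMPLE = '''msgid "Delete File"
msgid_plural "Delete Files"
msgstr[0] "Supprimer le fichier"
msgstr[1] "Supprimer les fichiers"
'''


def plural_entry_counts(entry_text):
    """Return (source_words, target_words) for one entry, one field per line."""
    source = target = 0
    for key, text in FIELD.findall(entry_text):
        n = len(text.split())
        if key.startswith("msgid"):
            source += n  # singular and plural source strings
        else:
            target += n  # every msgstr[n] form
    return source, target


print(plural_entry_counts(EXAMPLE))  # (4, 6)
```

Counted this way, the entry contributes four source words and six target words, so a tool that ignores msgid_plural or msgstr[1] would report lower figures for the same file.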


Achim Herrmann
English to German
Use a software localization tool for software files Mar 19, 2010

Hello Carl,

I would recommend using a real software localization tool that supports PO files. The benefit is that you can automatically align the files if they already contain translations, and you can easily update the localization project if your client sends a new drop.

The update features in software localization tools can detect which files have changed and incorporate the changed and new strings into the project. There is then no need to restart the translation from your TM; you only translate the new and changed strings.

Our own tool SDL Passolo has a special PO add-in that also supports plural cases.

Hope this helps.

Achim Herrmann


Carl Carter
Germany
German to English
TOPIC STARTER
Everyone's suggestions Mar 19, 2010

Thanks very much for all your suggestions and advice! That was a very quick response with lots of useful information.

Regards

Carl



Ivaylo Ivanov
Luxembourg
Member (2005)
English to Bulgarian
PO files with Trados Studio? Mar 19, 2010

Does anyone know whether or not it is possible to translate PO files using Trados Studio 2009?

Carl Carter
Germany
German to English
TOPIC STARTER
What's a "real" software localisation tool? Mar 19, 2010

Achim Herrmann wrote:

Hello Carl,

I would recommend using a real software localization tool that supports PO files.


Hi Achim,

I appreciate your comments, but I think it should be said in all fairness that tools like POedit ARE "real software localisation tools"; they may not be as elaborate or feature-rich as Passolo, but they still work well and include a TM component as well.


Achim Herrmann wrote:
Our own tool SDL Passolo has a special PO add-in that also supports plural cases.


Actually, all I need the tool for is to make a reliable word count. The actual translation is going to be done by someone else using either POedit or LocFactory.

You didn't say what sort of counting is possible using Passolo. Does it measure "msgstr" as well as "msgid"? I need to find out the number of words in each "msgstr" line, which holds the translation. The reason is that the files have already been translated once (into German) and the customer wants us to translate the German version into French. Can Passolo do this? Déjà Vu X can.

Regards

Carl

P.S. Just as an aside, why isn't it possible to download a trial version of Passolo without having to fill in a form first of all? I don't like the thought of a sales rep contacting me at some point as a result of me disclosing my personal data. As I've said, all I want to do is count the words in our PO files. Perhaps your firm should reconsider its sales approach. After all, people who want to buy the product will get in touch with you at some point anyway, won't they?

Carl Carter
Germany
German to English
TOPIC STARTER
The Translate Toolkit Mar 19, 2010

Eric Le Carre wrote:

Hi Carl,

I once used 'The Translate Toolkit' available at: http://translate.sourceforge.net/wiki/toolkit/index.

It includes a very useful PO counting tool called pocount.



Hi Eric,

Thanks again for your helpful suggestion. I took a look at this today, but decided not to proceed with it any further because the PO counting tool is basically just a small utility in a very big software package; it seems you (now) have to download three applications as a bundle to get The Translate Toolkit. Installing the programs and then having to read the documentation to use the count tool via the Windows command line - which I'm not very familiar with either - seems like a lot of work to me.

A colleague of mine did a word count using LocFactory (for Apple Macs), which worked, and I managed to do one pretty easily using OmegaT today, which I haven't used before; the results were OK, I guess, but it doesn't show you which strings have been measured - I expect it counted "msgid", but I need to count "msgstr" (as I've explained in my reply to Achim's message).

Anyhow, although I haven't got much further with my problem, I have learnt a bit more about the programs used for this type of work.

Regards

Carl

Didier Briel
France
Member (2007)
English to French
OmegaT counts both msgid and msgstr Mar 19, 2010

Carl Carter wrote:
I managed to do one pretty easily using OmegaT today, which I haven't used before; the results were OK, I guess, but it doesn't show you which strings have been measured - I expect it counted "msgid", but I need to count "msgstr" (as I've explained in my reply to Achim's message).


Since 2.0.4, OmegaT reads existing translations in PO files.

That means that both msgid and msgstr are counted.

Didier


Juan Pagola
Argentina
English to Spanish
Use POedit Jan 9, 2014

Hi Didier,

Here's an easy way to count the words in a .PO file.

1. Download POedit at: http://www.poedit.net/download.php (it's free).

2. Open your .PO file and then go to File>Export as HTML.

3. Open the exported HTML file and copy all the text to translate.

4. Paste the text into a new Word document and count the words as you would usually do.


xxxvslavik
Czech Republic
Re: Use Poedit Jan 11, 2014

Juan Pagola wrote:
1. Download POedit at: http://www.poedit.net/download.php (it's free).
2. Open your .PO file and then go to File>Export as HTML.
3. Open the exported HTML file and copy all the text to translate.
4. Paste the text in a new Word document and count the words as you would usually do

Hi Juan, Poedit developer here. Let me chime in with a brief remark -- you don't have to jump through these hoops anymore: Poedit now (finally!) has word count statistics in its Pro version. Sorry it took me so long to fix this.

