Different word count when opening pdf and idml file in trados
Thread poster: Phuong Nguyen

Phuong Nguyen
Vietnam
Local time: 19:01
Member (2016)
English to Vietnamese
Jan 12

Hello,
My client sent me a PDF and an IDML version of the same file. However, when I opened the two files in Trados Studio 2017, the word counts were quite different: more than 2,400 words for the PDF file, but only 2,300 words for the IDML file. Can anyone tell me the reason for this, and which file I should use for word count analysis?
Thanks and best regards,


 

Roy Oestensen  Identity Verified
Norway
Local time: 14:01
Member (2010)
English to Norwegian (Bokmal)
Could you expand a little? Jan 12

I understand that you let Studio count the words of the IDML file, but how did you count the words in the PDF? If you copied the contents into Word, you should be aware that Word and Studio do not count words the same way. There is no generally agreed definition of what constitutes a word when it comes to counting. Take "in-house", for instance: one word or two? One counting engine will count it as two, another as one.
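To make the "in-house" point concrete, here is a minimal Python sketch of two counting rules. Both functions are hypothetical illustrations and do not reproduce the actual rules used by Word, Studio or any other tool; they only show how one small rule change shifts the total.

```python
import re

def count_whitespace(text):
    # Rule A (assumed): a word is any run of characters between whitespace.
    return len(text.split())

def count_hyphen_split(text):
    # Rule B (assumed): hyphens also separate words.
    return len([token for token in re.split(r"[\s-]+", text) if token])

sample = "Our in-house team"
print(count_whitespace(sample))    # "in-house" counted as one word
print(count_hyphen_split(sample))  # "in-house" counted as two words
```

On a document with many hyphenated terms, a difference like this adds up quickly, which is why counts from two different tools rarely match exactly.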

You will find this subject discussed on the web, by the way. Just search for "word count" or something similar. There are places where Word, Déjà Vu and Studio are compared and the differences are discussed.

If you imported the PDF file into Studio (but why would you?), then of course I would expect basically the same number.

Roy


 

Phuong Nguyen
Vietnam
Local time: 19:01
Member (2016)
English to Vietnamese
TOPIC STARTER
I counted both of them using the same version of Trados Jan 12

I used Trados to count both of them. My client asked me for a quote, and at first I based it on the word count of the PDF file. However, they then said that their word count analysis was different and asked me to recheck. I created a project with the IDML file to count the words, and it gave me a different result! I also tried saving the IDML file in PDF format and counting it again in Trados, and this gave a word count totally different from those of the other two files! So weird!


 

Jaime Oriard  Identity Verified
Mexico
Local time: 07:01
Member (2005)
English to Spanish
Image Jan 12

This is the first thing that comes to mind. Maybe there is an image with editable text embedded in the InDesign file. In this case, the text can be extracted from the PDF directly, but will not be included in the IDML.

Hope this helps.


 

Phuong Nguyen
Vietnam
Local time: 19:01
Member (2016)
English to Vietnamese
TOPIC STARTER
Problem identified Jan 12

Thank you! I checked, and found that when counting words in the PDF file, Trados does take into account the text in the header and footer of the document, but it did not do so for the IDML file. OMG, it took me the whole day to figure it out!
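For anyone who wants to check this kind of thing outside Trados, here is a rough Python sketch. It assumes that the IDML file is a ZIP package with story XML files under Stories/ and master spreads under MasterSpreads/, and that the header and footer text sits in text frames on master pages; headers and footers placed directly on document pages would end up in the "body" figure instead. The function name idml_word_counts is made up for this example, and the whitespace-based count will not match Studio's own figures.

```python
import zipfile
import xml.etree.ElementTree as ET

def idml_word_counts(idml_file):
    """Count words in master-page stories and in all other stories."""
    counts = {"master": 0, "body": 0}
    with zipfile.ZipFile(idml_file) as pkg:
        names = pkg.namelist()
        # Collect the IDs of stories referenced by text frames on master spreads.
        master_ids = set()
        for name in names:
            if name.startswith("MasterSpreads/") and name.endswith(".xml"):
                root = ET.fromstring(pkg.read(name))
                for frame in root.iter("TextFrame"):
                    story_id = frame.get("ParentStory")
                    if story_id:
                        master_ids.add(story_id)
        # Sum the words of every <Content> element, grouped by story type.
        for name in names:
            if name.startswith("Stories/") and name.endswith(".xml"):
                root = ET.fromstring(pkg.read(name))
                for story in root.iter("Story"):
                    words = sum(len((c.text or "").split())
                                for c in story.iter("Content"))
                    key = "master" if story.get("Self") in master_ids else "body"
                    counts[key] += words
    return counts
```

Calling idml_word_counts("brochure.idml") (the file name is a placeholder) returns a dictionary with the two totals. If the "master" figure is close to the gap between the PDF and IDML counts, the master-page text is the likely culprit.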


 

