Strange html-file?
Thread poster: Heinrich Pesch

Heinrich Pesch  Identity Verified
Finland
Local time: 08:01
Member (2003)
Finnish to German
+ ...
Nov 22, 2007

Yesterday a customer sent me a six-page PDF. I had done a large job for them last summer, when they delivered the tables in Excel and the texts as RTF. This time the text and the tables were in the same file.
I scanned the pages into a doc-file and counted 2100 words (Word statistics). But I was not sure whether the customer would accept the doc format, so I asked if he could send the DTP file directly. It turned out to be PageMaker. Because I cannot handle PageMaker files (Trados and SDLX cannot import them directly, I believe), I asked for an HTML export.
When I analysed this HTML file in Workbench, I got a result of about 10 000 words in total, with 61 % repetitions and 4900 new words.

I always believed the Trados word count would be lower than Word's, because WB does not count numbers, so this result astonished me. I knew I could not trust it, because 10 000 words on 6 pages is far too many.

So I finally created a project in SDLX from my doc-file, manually confirmed the segments that contain only numbers, and got down to 1500 words.

When I look at the HTML file in TE, the segmentation is very strange. The same happens in SDLX if I use the HTML file, and the statistics report more than 10 000 untranslated words.

I always thought translating HTML was child's play. What could be wrong?

(SDL Trados 2006)

Heinrich

[Edited at 2007-11-22 16:34]


Margreet Logmans  Identity Verified
Netherlands
Local time: 07:01
English to Dutch
+ ...
Tags not recognised? Nov 22, 2007

Hi Heinrich,

all I can think of is that, probably because of all these conversions, the tags and formatting are not recognised correctly.

I also found this article, perhaps it is of some use to you:
http://ell.proz.com/translation-articles/articles/22/1/How-to-collect-stories-with-PageMaker-for-Mac-(and-PC-too)-without-Trados-Story-Collector

Like you, I always thought HTML is child's play - let's not get worried yet.

Good luck!

Margreet



Vito Smolej
Germany
Local time: 07:01
Member (2004)
English to Slovenian
+ ...
"When I analysed this html-file in Workbench" Nov 23, 2007

It would be safer to import it into TagEditor and then look at the ttx. I would guess the HTML codes got counted as well, and the real stuff is 4900 words (minus the HTML codes of course).

Regards

Vito
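For a rough cross-check outside Trados, here is a minimal sketch (not how Workbench or TagEditor count internally) that strips the markup from an HTML export and counts the visible words, once with and once without number-only tokens. The names `TextExtractor`, `count_words` and `NUMBER_ONLY` are made up for this illustration, and the sample table row is invented.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes only; content of <script> and <style> is skipped."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


# Assumption: a token counts as "number-only" if it contains at least one
# digit and otherwise nothing but common numeric punctuation.
NUMBER_ONLY = re.compile(r"^(?=.*\d)[\d.,:%/+\-]+$")


def count_words(html):
    """Return (visible_words, visible_words_without_number_only_tokens)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Text nodes are joined with spaces, so every tag acts as a word boundary.
    tokens = " ".join(parser.chunks).split()
    without_numbers = [t for t in tokens if not NUMBER_ONLY.match(t)]
    return len(tokens), len(without_numbers)


# Invented sample: one table row with a header cell and two numeric cells.
sample = '<table><tr><td class="h">pump head</td><td>12,5</td><td>300</td></tr></table>'
print(count_words(sample))
```

If the first figure is already far below what the analysis reports, the tool is probably counting something other than visible text. If the two figures differ a lot, the tables account for much of the count.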



Heinrich Pesch  Identity Verified
Finland
Local time: 08:01
Member (2003)
Finnish to German
+ ...
TOPIC STARTER
Looks terrible Nov 23, 2007

Vito Smolej wrote:

It would be safer to import it into TagEditor and then look at the ttx. I would guess the HTML codes got counted as well, and the real stuff is 4900 words (minus the HTML codes of course).

Regards

Vito


I did look at it in TE, and it looks terrible. The segmentation is all wrong. What I do not understand is why there are segments consisting entirely of numbers (from the tables), and why the table headers are split. A table header could be "pump head", and Trados makes two segments of it: "pump" and "head". Abbyy Finereader at least did a better job on the pdf.

The customer will send me another format today; let's see what he comes up with.



