Problems with project files and segmentation
Thread poster: Viivi

Viivi
Finland
Local time: 04:36
Aug 17, 2010

Hello again!

I am a beginner in terms of OmegaT, and I apologise for my lack of tech-savviness. Anyway, I have now managed to get the glossaries to work and did a short test translation, which worked just fine.

Now I have a doc-file with lots of text boxes, pictures and whatnot. I converted it to an odt-file in Open Office Writer, created a new project in OmegaT and this is roughly what I get:

< f0 >U< /f0 >< f1 >n< /f1>< f2>a< /f2>< f3>u< /f3>< f4>t< /f4>< f5>h< /f5>< f6>o< /f6>< f7>r< /f7>< f8>i< /f8>< f9>z< /f9>< f10>e< /f10>< f11>d < /f11>< f12>c< /f12>< f13>ha< /f13>< f14>ng< /f14>< f15>e< /f15>< f16>s < /f16>< f17>o< /f17>< f18>r < /f18>< f19>mo< /f19>< f20>d< /f20>< f21>ifi< /f21>< f22>c< /f22>< f23>a< f24>ti< /f24>< f25>o< /f25>< f26>n < /f26>< f27>t< /f27>< f28>o < /f28>< f29>t< /f29>< f30>h< /f30>< f31>i< /f31>< f32>s < /f32>< f33>s< /f33>< f34>ys< /f34>< f35>t< /f35> ...

(I had to add some spaces after the < so you can see what it looks like to me.)

That particular segment/sentence is supposed to be about unauthorized changes etc. It is obvious I cannot translate like this.

Does anybody have any ideas what is wrong with my documents/project?

I am using OmegaT 2.1.7_1. The source language is EN-US and the target language is Finnish. I have four doc-files, which I have converted into odt-files. I have not yet tried to add glossaries or translation memories to the project. I simply added the project files. I am assuming the source files are too fancy somehow...? I suspect they might even have been originally something other than Word documents.

Is there any way to translate these with the help of OmegaT?



esperantisto  Identity Verified
Local time: 04:36
Member (2006)
English to Russian
Nothing wrong with OmegaT, actually Aug 17, 2010

Is your file really .doc? The text looks typical of .docx (MS Word 2007), which is a really lousy format. If you can obtain the original .docx file, do so and try translating it with the latest build of OmegaT (1.8.0). Otherwise, if you have Microsoft Office, try exporting the file to RTF and converting it back to .doc, then to ODT.

Also search the OmegaT Yahoo! group; the topic of tag reduction has been discussed there.



Susan Welsh  Identity Verified
United States
Local time: 21:36
Member (2008)
Russian to English
A couple of notes Aug 17, 2010

These things are known in the trade as "tag soup." Apart from what esperantisto wrote, let me add that you also get a lot of this junk if a document has been converted from a PDF, especially when there are lots of graphic elements, text boxes, etc., which yours has. Note that esperantisto was referring to build 2.1.8.0; he left out the 2, which might confuse you.
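
To illustrate what "tag soup" hides, here is a minimal Python sketch that strips the f-tags from a segment like the one quoted in the first post, so the underlying sentence can at least be read. This is not an OmegaT feature; the names `TAG_PATTERN`, `strip_tags` and `count_tags` are hypothetical, and the pattern assumes tags of the form `<fN>` / `</fN>`, optionally padded with spaces as in the quoted example. It only helps with inspecting the text and does not repair the source file.

```python
import re

# Assumption: tags look like <f0>, </f12>, or the space-padded variants
# quoted in this thread ("< f0 >", "< /f1>").
TAG_PATTERN = re.compile(r"<\s*/?\s*f\d+\s*>")


def strip_tags(segment: str) -> str:
    """Return the segment with every f-tag removed."""
    return TAG_PATTERN.sub("", segment)


def count_tags(segment: str) -> int:
    """Count opening and closing f-tags, a rough measure of tag soup."""
    return len(TAG_PATTERN.findall(segment))


# The first few characters of the segment quoted in the first post.
sample = "< f0 >U< /f0 >< f1 >n< /f1>< f2>a< /f2>< f3>u< /f3>< f4>t< /f4>< f5>h< /f5>"

print(strip_tags(sample))  # readable text without tags
print(count_tags(sample))  # number of tags that were wrapping it
```

A high tag count relative to the length of the readable text is a quick sign that the file should be cleaned up, or the original requested, before translating.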

If your document is not a .docx, but rather a conversion from PDF, and esperantisto's instructions don't help, you should ask the client for the original file. (This is one reason that people charge extra for translating from PDFs.) Even though I have ABBYY PDF Converter, which does a pretty good job, I have found that converting PDFs is no good for translating in a CAT tool, because of the tag soup. I usually strip it down to "text" (via Adobe Reader), translate it, and then reformat it. But something with as many graphics as yours has would be quite time-consuming for me, given my level of expertise with Word/OOo Writer.

good luck



esperantisto  Identity Verified
Local time: 04:36
Member (2006)
English to Russian
The last resort Aug 17, 2010

If no fancy formatting is required, reset everything to the style default formatting (select text, press Ctrl+M in OOo Writer).
