How to create uncleaned RTF without Trados
Thread poster: John Fossey

John Fossey  Identity Verified
Canada
Local time: 13:55
Member (2008)
French to English
Jan 1, 2009

I recently completed a project for a Trados agency. I don't have Trados, and don't intend to get it. I was given the RTF source files and a large TMX to maintain uniformity with the client's previous documents. Usually I work in Wordfast (unlicensed) but since this large TMX was too large for the max. 500 TU limitation, I did the translation in OmegaT, which worked without a hitch. Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files. So I have the original RTF, cleaned RTF, and a variety of TMXs. Does anyone know how I can produce an uncleaned RTF? There used to be a program RTFStyler by MaxPrograms (Swordfish) that claimed to be able to do this, but it seems to have been discontinued. Many thanks for any help!


Anna Sylvia Villegas Carvallo
Mexico
Local time: 12:55
English to Spanish
This will help Jan 1, 2009

http://en.wikipedia.org/wiki/OmegaT

Luck,
Tadzio.



Rodolfo Raya  Identity Verified
Local time: 14:55
English to Spanish
RTFStyler moved to Swordfish Jan 1, 2009

John Fossey wrote:
There used to be a program RTFStyler by MaxPrograms (Swordfish) that claimed to be able to do this, but it seems to have been discontinued. Many thanks for any help!


The functionality of RTFStyler has been incorporated in Swordfish.

Convert your source RTF to XLIFF using Swordfish selecting "Tagged RTF" as source format. Use your TMX file to recover as much as possible from your translations and then convert the translated XLIFF to RTF again. Swordfish will add the missing Trados markup and you will have an "uncleaned" RTF that your client can review in Trados.

Regards,
Rodolfo



Samuel Murray  Identity Verified
Netherlands
Local time: 19:55
Member (2006)
English to Afrikaans
+ ...
I'll try... Jan 2, 2009

John Fossey wrote:
Usually I work in Wordfast (unlicensed) but since this large TMX was too large for the max. 500 TU limitation...


Great, so at least you know what an uncleaned file looks like.

...I did the translation in OmegaT, which worked without a hitch. Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files.


I think I was able to figure out a way to help. I don't have time to test it, though.

==

Very brief instructions for creating uncleaned files after you've translated a file using OmegaT. Because ProZ.com's forum software mangles code, I've added spaces to the snippets below that should not be there; remove them before use. For example, "& g t ;" stands for the entity &gt; and "{ 0" for {0, so the finished markup reads {0&gt;source&lt;}0{&gt;target&lt;0}.

Open the project_save.tmx file in a text editor and do a regex find/replace to change this:

<tu>
<tuv lang="EN-US">
<seg>Your text here.</seg>
</tuv>
<tuv lang="AF">
<seg>Jou teks hier.</seg>
</tuv>
</tu>

into this:

<tu>
<tuv lang="EN-US">
<seg>Your text here.</seg>
</tuv>
<tuv lang="AF">
<seg> { 0 & g t ; Your text here. & l t ; } 0 { & g t ; Jou teks hier. & l t ; 0 } </seg>
</tuv>
</tu>

So basically, the target text should contain:

1. This: { 0 & g t ;
2. The source text
3. This: & l t ; } 0 { & g t ;
4. The target text, and
5. This: & l t ; 0 }
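
To make the five parts concrete, here is a minimal Python sketch that assembles such a segment. The function name is hypothetical and belongs to neither OmegaT nor Trados; the sketch assumes the source and target are plain, unescaped text, and it escapes the result so it can sit inside a TMX <seg> element.

```python
from xml.sax.saxutils import escape


def uncleaned_segment(source: str, target: str) -> str:
    """Wrap a source/target pair in the five-part markup listed above.

    Hypothetical helper for illustration only. The 0 values are copied
    as-is from the example in this post.
    """
    raw = "{0>" + source + "<}0{>" + target + "<0}"
    # Inside a TMX <seg>, '<', '>' and '&' must be written as entities.
    return escape(raw)


print(uncleaned_segment("Your text here.", "Jou teks hier."))
# prints: {0&gt;Your text here.&lt;}0{&gt;Jou teks hier.&lt;0}
```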

If you're using MS Word as your text editor, the following find/replace (with "Use wildcards" enabled; Word's wildcard syntax is similar to, but not the same as, regex) should do the trick:

Find what:
(\<tuv lang=\"EN-US\"\>)(*)(\<seg\>)(*)(\<\/seg)(*)(\<seg\>)(*)(\<\/seg)
Replace with:
\1\2\3\4\5\6\7 { 0 & g t ; \4 & l t ; } 0 { & g t ; \8 & l t ; 0 } \9

When using MS Word, remember to enable "Confirm conversion at open", use "File -> Open" to open the file, and open it as an Encoded Text file with the encoding Unicode (UTF-8).

Please let me know the equivalent syntax for other tools (e.g. jEdit) so that we can post the hack on the OmegaT mailing list.
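
For those without MS Word, the same find/replace could be sketched in Python roughly as follows. This is an untested-in-production sketch, not part of OmegaT: the function and pattern names are made up for illustration, and it assumes every <tu> holds exactly two <tuv> elements with the source first, as in the example above. It also assumes the segment text in the TMX is already XML-escaped, so it is copied unchanged.

```python
import re

# One translation unit: the source <seg> followed by the target <seg>.
# "<tu\b" matches "<tu>" but not "<tuv ...>".
TU_PATTERN = re.compile(
    r"(<tu\b[^>]*>.*?<seg>)(.*?)(</seg>.*?<seg>)(.*?)(</seg>.*?</tu>)",
    re.DOTALL,
)


def uncleanify(tmx_text: str) -> str:
    """Rewrite each target <seg> as {0>source<}0{>target<0} (entity-escaped)."""

    def wrap(match: re.Match) -> str:
        source, target = match.group(2), match.group(4)
        wrapped = "{0&gt;" + source + "&lt;}0{&gt;" + target + "&lt;0}"
        # Keep everything else in the unit exactly as it was.
        return match.group(1) + source + match.group(3) + wrapped + match.group(5)

    return TU_PATTERN.sub(wrap, tmx_text)


sample = (
    '<tu>\n<tuv lang="EN-US">\n<seg>Your text here.</seg>\n</tuv>\n'
    '<tuv lang="AF">\n<seg>Jou teks hier.</seg>\n</tuv>\n</tu>'
)
print(uncleanify(sample))
```

To use it on a real project, read project_save.tmx as UTF-8 text, pass the contents through uncleanify, and write the result back (on a copy of the file, to be safe). Running it twice on the same file would wrap the segments twice, so apply it only once.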

Then reload the project and create target documents. The resulting ODT file, converted to DOC, would possibly be accepted by Trados and Wordfast as an uncleaned file, even though it does not contain the correct styles.

Ideally, the "uncleaned" tags should be made purple, and not just any purple, but in the style tw4winMark. Doing that is beyond the scope of this post, but basically, you should create a style called tw4winMark and make it purple, and then use the "More" option in find/replace to add a style to the replace box. I'm not sure if a third-party tool can be used to do that. If you have a file translated in WF, you can try to copy one of the purple thingies into your document and then hopefully the tw4winMark style would come with it, so that you can use it in find/replace.



Samuel Murray  Identity Verified
Netherlands
Local time: 19:55
Member (2006)
English to Afrikaans
+ ...
Allow me to fix the smilies... Jan 2, 2009

Samuel Murray wrote:
Find what:
(\<tuv lang=\"EN-US\"\>)(*)(\<seg\>)(*)(\<\/seg)(*)(\<seg\>)(*)(\<\/seg)
Replace with:
\1\2\3\4\5\6\7 { 0 & g t ; \4 & l t ; } 0 { & g t ; \8 & l t ; 0 } \9


Perhaps in ten years' time, the ProZ.com forum software will work. Until then, let me add more spaces:

Find what:
( \ & l t ; t u v l a n g = \ " E N - U S \ " \ & g t ; ) ( * ) ( \ & l t ; s e g \ & g t ; ) ( * ) ( \ & l t ; \ / s e g ) ( * ) ( \ & l t ; s e g \ & g t ; ) ( * ) ( \ & l t ; \ / s e g )
Replace with:
\ 1 \ 2 \ 3 \ 4 \ 5 \ 6 \ 7 { 0 & g t ; \ 4 & l t ; } 0 { & g t ; \ 8 & l t ; 0 } \ 9



Vito Smolej
Germany
Local time: 19:55
Member (2004)
English to Slovenian
+ ...
If you send me the material Jan 2, 2009

Many thanks for any help!

... I'll patch it up with Trados. Use it to mop up NOPs on my machine instead of Seti (g).

Regards

Vito



John Fossey  Identity Verified
Canada
Local time: 13:55
Member (2008)
French to English
TOPIC STARTER
Thanks for all the suggestions. Jan 2, 2009

What I did, which seems to have worked (fingers crossed): I found that I had fortunately saved TMs of sections of the job, each with fewer than 500 TUs; it was the client's master TM that was way too big. So I was able to import the source file into Wordfast, point it to the TMs, and run TranslateUntilNoMatch, which produced an uncleaned RTF. So far the client has accepted it.

But I would like to try the OmegaT hack suggested, because it seems to me it would be beneficial to be able to produce Trados-acceptable work with open source software. Maybe a simple macro could be recorded to do it.

Many thanks for all the suggestions, which gave me a much better understanding of the whole situation.


FarkasAndras
Local time: 19:55
English to Hungarian
+ ...
500 TUs Jan 2, 2009

If you run into similar issues in the future, just chop up your TMs.
TMX is just a text file, made up of a header and the TUs in sequence after it.
Keep the header, delete surplus TUs until the file fits under 500, add the closing tags (</body> and </tmx>) at the end, and away you go.
Make as many small TMs out of a big one as you need.
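
The chopping described above could be automated with a short Python sketch like the one below. The function name and the default chunk size are assumptions for illustration (500 is taken from the limit mentioned in this thread; lower it if the tool still complains), and the sketch assumes a well-formed TMX with a single <body> element.

```python
import re


def split_tmx(tmx_text: str, max_tus: int = 500) -> list[str]:
    """Split one TMX into several, each with the original header and at most max_tus TUs."""
    body_start = tmx_text.index("<body>") + len("<body>")
    body_end = tmx_text.rindex("</body>")
    head = tmx_text[:body_start]   # XML declaration, <tmx>, <header>, <body>
    tail = tmx_text[body_end:]     # </body></tmx>
    # "<tu\b" matches "<tu>" but not "<tuv ...>"; "</tu>" does not match "</tuv>".
    tus = re.findall(r"<tu\b.*?</tu>", tmx_text[body_start:body_end], re.DOTALL)
    return [
        head + "\n" + "\n".join(tus[i:i + max_tus]) + "\n" + tail
        for i in range(0, len(tus), max_tus)
    ]
```

Each string in the returned list can be written to its own .tmx file (as UTF-8), giving as many small TMs as needed.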



Samuel Murray  Identity Verified
Netherlands
Local time: 19:55
Member (2006)
English to Afrikaans
+ ...
Editing is disabled after 24 hours... Jan 3, 2009

...so I can't fix things unless I repost.

Find what:
(\< t u v l a n g = \ " E N - U S \ " \ > ) ( * ) ( \< s e g \> ) ( * ) ( \< \ / s e g ) ( * ) ( \< s e g \> ) ( * ) ( \< \ / s e g )
Replace with:
\ 1 \ 2 \ 3 \ 4 \ 5 \ 6 \ 7 { 0 & g t ; \ 4 & l t ; } 0 { & g t ; \ 8 & l t ; 0 } \ 9



Samuel Murray  Identity Verified
Netherlands
Local time: 19:55
Member (2006)
English to Afrikaans
+ ...
Try this script I wrote Jan 4, 2009

John Fossey wrote:
Now the agency has asked for the "uncleaned" RTF file for proofreading in Trados, but OmegaT only produces cleaned target files.


I've turned my solution from a few posts back into an AutoIt script (with an accompanying EXE file in case you don't have AutoIt installed). Download "UncleanifyTMX" here: http://leuce.com/tempfile/omtautoit/. Let me know if it works for you.



Anthony Baldwin  Identity Verified
United States
Local time: 13:55
Member (2006)
Portuguese to English
+ ...
anaphraseus Jul 1, 2009

I primarily use OmegaT for my work, too, but when clients require uncleaned RTF or DOC files, I now use Anaphraseus ( http://anaphraseus.sourceforge.net ).
As I understand it, Anaphraseus works similarly to older versions of Wordfast® (I've never used WF), but as an extension to OpenOffice ( http://www.openoffice.org ), not MS Office®.
Sometimes I still translate files in OmegaT (large projects, various reference TM files, etc.) and then simply use Anaphraseus to "convert" the target files to uncleaned by importing the project's TMX file into Anaphraseus; other times I simply translate the file in Anaphraseus.
Here's a manual for use of the latest release of Anaphraseus:
http://www.linguasos.org/bsoft/AnaphraseusManual_1.23b.html
with screenshots, etc.

bonne chance

[Edited at 2009-07-01 13:31 GMT]


esperantisto  Identity Verified
Local time: 20:55
Member (2006)
English to Russian
+ ...
Anaphraseus works exactly as Wordfast Classic does Jul 1, 2009

The reference to older versions relates to its functionality, not to the workflow. Files produced by Anaphraseus can be cleaned up with Wordfast without any problem. Beware of the possibility of losing complex formatting, however.

Or you can use Wordfast without a license — it should produce an uncleaned document with an existing TM (an unlicensed copy won’t update a TM with new TUs, but all other features will be operable).

[Edited at 2009-07-01 14:59 GMT]

