Trados tags in translation memories
Thread poster: Roberto Silva

Roberto Silva  Identity Verified
United Kingdom
Local time: 03:51
English to Spanish
Nov 26, 2004

Hi,

Does anyone know where I can find a list of all tags present in translation memories?

When I export a translation memory, I have trouble establishing which tags I can safely delete (because they come from wrong symbols in Word, corrupted characters, conversion errors, etc.) and which ones I should not touch.

I would like to create a macro to clean errors out of translation memories.

Thanks in advance for your help



Roberto Silva  Identity Verified
United Kingdom
Local time: 03:51
English to Spanish
TOPIC STARTER
Any ideas about this question? Thanks Nov 29, 2004


Jerzy Czopik  Identity Verified
Germany
Local time: 04:51
Member (2003)
Polish to German
+ ...
Go to Trados user group on Yahoo Nov 29, 2004

and download the file TagRemover.exe (http://groups.yahoo.com/group/TW_users/files/)

With this small application (posted courtesy of Vickie Dimitriadou) you can safely remove formatting TAGs and the RTF preamble.

If you wish to do this manually, you can safely remove all formatting TAGs, but not the others.
Formatting TAGs look like this: {/f10 Text} or {/b Text}.
The first example sets the text in font 10 according to your Trados font table. This is what mostly causes garbled characters in segments. The second example is for bold text and does not need to be deleted if it appears alone. But very often you will see more formatting information than just bold. You can then delete everything except bold. To delete a formatting TAG you need to delete the braces and the format instruction with the following space (i.e. in the above example "{/f10 " AND "}"). Be sure to remove both the opening and the closing brace, as otherwise the segment will be corrupted (if you are editing with an external editor) or Trados won't let you save the segment (if you use the internal editor in Workbench).
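
For anyone who prefers a script to manual editing, here is a minimal sketch of the rule above in Python. It is only an illustration, not what TagRemover.exe does internally. The function name strip_font_tags is made up for this example, and it assumes a font TAG always opens with "{", a slash or backslash, "f", a number and one space, as in the example above. It removes the opening part and the matching closing brace together and leaves bold and any other TAGs untouched.

```python
import re

# Assumption: a font TAG opens with "{/f10 " (or "{\f10 "), i.e. a brace,
# a slash or backslash, "f", digits and exactly one space.
FONT_OPEN = re.compile(r"\{[\\/]f\d+ ")


def strip_font_tags(segment: str) -> str:
    """Remove font TAGs (opening part and matching brace), keep everything else."""
    out = []
    stack = []  # True means the open brace belongs to a font TAG being removed
    i = 0
    while i < len(segment):
        match = FONT_OPEN.match(segment, i)
        if match:
            stack.append(True)      # drop "{/fN " and remember to drop its "}"
            i = match.end()
        elif segment[i] == "{":
            stack.append(False)     # some other TAG, e.g. bold: keep it
            out.append("{")
            i += 1
        elif segment[i] == "}":
            if not stack:
                raise ValueError("closing brace without opening brace")
            if not stack.pop():     # keep the brace only for non-font TAGs
                out.append("}")
            i += 1
        else:
            out.append(segment[i])
            i += 1
    if stack:
        raise ValueError("opening brace without closing brace")
    return "".join(out)


print(strip_font_tags("{/f10 Hello} {/b world}"))
```

The sketch refuses segments with unbalanced braces instead of guessing, since removing only one brace of a pair is exactly what corrupts a segment. Try it on a copy of the exported TM first.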

Regards
Jerzy



Roberto Silva  Identity Verified
United Kingdom
Local time: 03:51
English to Spanish
TOPIC STARTER
TagRemover Nov 29, 2004

Dear Jerzy,

Thank you very much for the link and for the info about deleting the formatting info. I have downloaded and tried the application. It seems to work fine.

Two additional questions about the same issue:

1.- Does deleting this info manually or with this application mean that Trados won't be able to show / save the translated segment with the same formatting as the source segment?
2.- Much of the garbled formatting info in my TM comes from OCR text or PDF-extracted text. Does the application remove this garbage or just the kind coming from Trados?

Thanks again,

Carlos


Jerzy Czopik  Identity Verified
Germany
Local time: 04:51
Member (2003)
Polish to German
+ ...
Trados does not produce garbage Nov 29, 2004

Neither does Word...
So you ask who does? The users do.
This represents only my own opinion and is not in any way related to Trados. This is not an official statement.

The problems that occur always result (or at least this was the case for me) from bad formatting.
Word allows you to format text in many ways, i.e. the "sophisticated" way using styles and templates, which is certainly the better way, and the "usual" way, formatting pieces of text manually as needed. Trados, however, assumes that the text was at least partly formatted in the sophisticated way and tries to restore the formatting when closing the segment. So if, for example, your text is formatted Times New Roman 12 pt Chinese, but the document uses Arial 10 pt English, and you translate the whole into Polish, you will almost certainly get big problems. Things have improved since Trados 6.5.5, but even this version will garble characters. The only way to avoid this is to reformat the document.

To answer your questions:
1. Removing formatting TAGs only from the target text will mess up the segments. However, if you carefully remove formatting from both source AND target text, the segment will remain "neutral". For the above reasons Trados saves formatting only if necessary, which means whenever the formatting is irregular. For regularly formatted sentences Trados saves only the segment and omits the font information. This is the crucial point that decides whether the characters get messed up. If font information is saved, it is often incorrect, as Trados tries to use the font for the target language, which may not exist on your PC (e.g. SimSun) or may not exist at all (a good example is fonts without Polish characters, which Trados treated as if they had a CE companion, as was previously the case with Arial/Arial CE).

2. This garbage has the same causes as explained above. Before starting to translate a file coming from OCR/PDF, please adapt the formatting of your document so that the styles match the document or the document matches the styles. Most important are the font information for every style used and, ON TOP of that, the language setting! Your document MUST be set to the target language within the styles and locally. This means all styles used MUST be set to the target language, and then you should select the whole document (CTRL+A) and set the language manually once again (menu Tools - Language). Be sure to deselect "Automatic changing of language", as it will mess up the document further.
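
Regarding point 1, a rough way to spot segments that are no longer "neutral" is to look for units where only one side carries formatting TAGs. The following Python sketch is just an illustration under assumptions: the names has_formatting and lopsided_units are invented here, the units are assumed to be already available as (source, target) pairs of text, and a formatting TAG is assumed to open with "{", a slash or backslash, and a short code such as f10 or b followed by a space, like the examples earlier in this thread.

```python
import re

# Assumption: formatting TAGs open like "{/f10 " or "{/b " (slash or backslash).
TAG_OPEN = re.compile(r"\{[\\/][a-z]+\d* ")


def has_formatting(segment: str) -> bool:
    """True if the segment contains at least one formatting TAG opening."""
    return TAG_OPEN.search(segment) is not None


def lopsided_units(units):
    """Return the (source, target) pairs where only one side has formatting TAGs."""
    return [(src, tgt) for src, tgt in units if has_formatting(src) != has_formatting(tgt)]


# Hypothetical sample data, not taken from a real TM.
sample = [
    ("{/f10 Hello}", "Hola"),      # formatting only in the source
    ("{/b Hi}", "{/b Hola}"),      # formatting on both sides
    ("Plain", "Llano"),            # no formatting at all
]
for src, tgt in lopsided_units(sample):
    print(src, "|", tgt)
```

The units it reports are the ones to review by hand, removing the formatting from both sides or restoring it on both.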

Best regards
Jerzy



Roberto Silva  Identity Verified
United Kingdom
Local time: 03:51
English to Spanish
TOPIC STARTER
Users produce garbage... Nov 30, 2004

Dear Jerzy,

Thank you very much for a very comprehensive reply. It has given me useful clues for resolving several problems I am experiencing. I will do as you suggest for a while and observe the results, "pre-formatting" the document before starting the translation. It is a tedious task, but so is repairing a damaged document or translation memory.

By the way, most documents I translate are formatted Times New Roman X pt X language, the document uses Arial X pt X language, and the target language is not English. And yes, I agree that things have improved since version 6.5.x, though not by much (Unicode support is important). Before version 6, support for formatting, styles and symbols was, er, rudimentary...

Thanks again,

Carlos



