Can I ignore tags in Studio 2009 and save a target file?
Thread poster: Vincent Lemma

Vincent Lemma  Identity Verified
Local time: 05:45
Member (2008)
Italian to English
Jul 22, 2013

Hello, I have some medium-sized files converted from PDF via OCR, and the conversion has filled them with tags that seem useless. By this I mean that the PDF has no particular text formatting or style, just standard Arial at a single size.
Can I ignore the hundreds of tags to speed things up, or are there any shortcuts to streamline this work?
Sorry if this has been covered before, but I am unable to find the related posts.

Thanks so much,



Bernard Lieber  Identity Verified
Local time: 05:45
English to French
Codezapper Jul 22, 2013

Hi Vincent,

Use CodeZapper to clean the file before translating: it will make life a lot easier for you.



[Edited at 2013-07-22 10:40 GMT]


Local time: 05:45
English to Hungarian
Alternative to CodeZapper Jul 22, 2013

You can also remove some or all of the formatting manually quite easily.

If the text is all same-size Arial, you can select the entire document and set the font and font size. OCR software often inserts extra-wide and extra-narrow spaces in documents for some unfathomable reason; these can also be fixed. If all you need is uniform running text without any formatting whatsoever, you can copy-paste your text into a .txt file, open a new Word file and copy-paste the text back into the new file. This is the only way to remove all tags, but it also removes all formatting, including bold, italic, text alignment, etc.
Obviously, these things need to be done before processing the file with Trados.
Also, if you're the one doing the OCR, you can set the OCR software to produce cleaner, simpler text. Usually, there are several settings ranging from plain text (easy to process but looks nothing like the original visually) all the way to very faithfully rendered but horribly tag-ridden text.
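
For anyone who prefers to script the plain-text route, here is a minimal Python sketch of the space clean-up, assuming the text has already been saved as a .txt file as described above. The function name normalize_spaces is just an illustration, not part of any tool mentioned in this thread. It treats every Unicode "space separator" character (em space, thin space, no-break space and so on) as an ordinary space, which may not be what you want if the document relies on no-break spaces.

```python
import re
import unicodedata


def normalize_spaces(text: str) -> str:
    """Replace unusual space characters with plain spaces and tidy each line."""
    # Map every Unicode space separator (category "Zs": em space, thin space,
    # no-break space, ...) to an ordinary space. Assumption: none of these
    # special spaces are wanted in the cleaned text.
    cleaned = "".join(
        " " if unicodedata.category(ch) == "Zs" else ch for ch in text
    )
    # Collapse runs of spaces and tabs into a single space; line breaks are kept.
    cleaned = re.sub(r"[ \t]+", " ", cleaned)
    # Remove leading and trailing spaces on each line.
    return "\n".join(line.strip() for line in cleaned.split("\n"))


if __name__ == "__main__":
    sample = "Hello\u2003world\u2009 again \nnext\u00a0line"
    print(normalize_spaces(sample))
```

Paste the printed result into a new Word file and the document should reach Trados without the stray spacing, though of course also without any formatting.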

[Edited at 2013-07-22 13:05 GMT]

