Can I ignore tags in Studio 2009 and save a target file?
Thread poster: Vincent Lemma

Vincent Lemma  Identity Verified
Italy
Local time: 21:44
Member (2008)
Italian to English
Jul 22, 2013

Hello, I have some medium-sized files converted from PDF via OCR, and the conversion has filled my file with tags that seem "useless". By this I mean that the PDF file has no particular text formatting or style, just standard Arial text at a single size.
Can I ignore the hundreds of tags to speed things up, or are there any shortcuts to streamline this work?
Sorry if this has been covered before, but I am unable to find the related posts.

Thanks so much,

Vince


 

Bernard Lieber  Identity Verified
Local time: 21:44
English to French
Codezapper Jul 22, 2013

Hi Vincent,

Use CodeZapper (http://asap-traduction.com/CodeZapper) to clean the file before translating; it will make life a lot easier for you.

HTH,

Bernard

[Edited at 2013-07-22 10:40 GMT]


 

FarkasAndras
Local time: 21:44
English to Hungarian
Alternative to CodeZapper Jul 22, 2013

You can also remove some or all of the formatting manually quite easily.

If the text is all same-size Arial, you can select the entire document and set the font and font size. OCR software often inserts extra-wide and extra-narrow spaces in documents for some unfathomable reason; these can also be fixed. If all you need is uniform running text without any formatting whatsoever, you can copy-paste your text into a .txt file, open a new Word file and copy-paste the text back into the new file. This is the only way to remove all tags, but it also removes all formatting, including bold, italics, text alignment, etc.
Obviously, these things need to be done before processing the file with Trados.
Also, if you're the one doing the OCR, you can set the OCR software to produce cleaner, simpler text. Usually, there are several settings ranging from plain text (easy to process but looks nothing like the original visually) all the way to very faithfully rendered but horribly tag-ridden text.
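
As a rough illustration of the space clean-up mentioned above, here is a short Python sketch that could be run on text saved as a .txt file. It is not part of CodeZapper or Trados; the function name normalize_spaces and the list of zero-width characters are assumptions made for this example. Note that it also turns no-break spaces into ordinary spaces, which may not be what you want for every language.

```python
import re
import unicodedata

# Invisible characters some converters leave behind (assumed list; extend as needed).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}


def normalize_spaces(text: str) -> str:
    """Replace unusual Unicode spaces with plain spaces and collapse runs of spaces."""
    cleaned = []
    for ch in text:
        if ch in ZERO_WIDTH:
            continue  # drop invisible characters entirely
        if unicodedata.category(ch) == "Zs":
            # Any Unicode space separator (en space, thin space, no-break space, ...)
            cleaned.append(" ")
        else:
            cleaned.append(ch)
    # Collapse runs of spaces; line breaks and tabs are left untouched.
    result = re.sub(r" {2,}", " ", "".join(cleaned))
    # Strip spaces left dangling at the end of each line.
    return re.sub(r" +$", "", result, flags=re.MULTILINE)


if __name__ == "__main__":
    sample = "Hello\u2003world\u2009 again\u200b!  \nNext\u00a0line"
    print(normalize_spaces(sample))
```

The cleaned text can then be pasted into a new Word file before the document is processed with Trados.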

[Edited at 2013-07-22 13:05 GMT]


 

