Can I get rid of tags in a source (Word) file?
Thread poster: Lotus LS

Lotus LS
Israel
Local time: 21:58
Hebrew to English
+ ...
Feb 25, 2012

Hi all, sorry if this is incredibly basic, I'm still getting my bearings with Trados and haven't been able to find this information by searching.

I have a source text that is just a plain Word file. There's no special formatting and in any case it's not important - I just need to translate the text alone.

But when I open the document for translation in Trados Studio, I get tags before and after every single word. I tried going back to the Word doc to clear all formatting and then opened it in Trados again but got the same result.

Is there some way I can remove or ignore these tags entirely and just get the text to translate? Or is there something I can do to the source file to make it open without all the tags?

Thanks!!



HarryHedgehog
Germany
Local time: 20:58
German to English
I've had that, too Feb 25, 2012

I've had this in source files that were copied and pasted from PDFs. As far as I can tell, the problem is that Word keeps the spacing information the PDF embeds for each individual character, which makes translation in TagEditor (TE) impossible.

The only way I've found to solve it is to save the document as a .txt, close it, open the .txt file, and save that as a .doc(x) file. If you copy and paste the content to a new file, the embedded spacing information is copied as well.
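
The same round trip through plain text can also be scripted. What follows is only a minimal sketch, assuming the source is a .docx (not the older .doc) and that only the main body text matters; headers, footers and text boxes are not handled. The function names and file paths are made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements inside word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def paragraphs_from_document_xml(xml_text):
    """Return the text of each paragraph, ignoring all run formatting."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    for p in root.iter(W + "p"):
        parts = []
        for run in p.iter(W + "r"):
            # Only look at direct children of the run, so tab-stop
            # definitions in the paragraph properties are not mistaken
            # for real tab characters.
            for node in run:
                if node.tag == W + "t":
                    parts.append(node.text or "")
                elif node.tag == W + "tab":
                    parts.append("\t")
                elif node.tag in (W + "br", W + "cr"):
                    parts.append("\n")
        paragraphs.append("".join(parts))
    return paragraphs


def docx_to_txt(docx_path, txt_path):
    """Hypothetical helper: write the body text of a .docx to a UTF-8 .txt."""
    with zipfile.ZipFile(docx_path) as zf:
        xml_text = zf.read("word/document.xml")
    with open(txt_path, "w", encoding="utf-8") as out:
        out.write("\n".join(paragraphs_from_document_xml(xml_text)))
```

Calling something like docx_to_txt("source.docx", "source.txt") would then give a text file with one line per paragraph. As with saving as .txt from Word, all formatting is lost.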



Giles Watson  Identity Verified
Italy
Local time: 20:58
Italian to English
Check the four blue symbols Feb 25, 2012

Mouse-over the group of four blue angular symbols in the toolbar. Click the one on the left, which says "No Tag Text".

If there are still tags visible, it probably means your Word document is actually an export from a PDF or DTP program which has left extraneous tags. Try copying and special-pasting the source text into a new Word document as unformatted text.

HTH



Heinrich Pesch  Identity Verified
Finland
Local time: 21:58
Member (2003)
Finnish to German
+ ...
Get CodeZapper Feb 25, 2012

You must first clean all superfluous tags from the Word file.


Squi  Identity Verified
Denmark
Local time: 20:58
English to Russian
+ ...
Code Zapper is the best solution Feb 25, 2012

I have just bought this fine macro for Word, and it cleans tags from Word files beautifully - even from those made from PDFs! You have to donate a small amount, and the zip arrives without delay with instructions etc.
www.asap-traduction.com/CodeZapper
Best regards
Squi



Mark Nathan  Identity Verified
France
Local time: 20:58
Member (2002)
French to English
+ ...
Code zapper Feb 25, 2012

I did not know about Code Zapper. So, having cleaned a source word document of all its tags, what happens to the format of the target document when you generate the translation (in Trados)?


Lotus LS
Israel
Local time: 21:58
Hebrew to English
+ ...
TOPIC STARTER
Thanks! Saving the file as a .txt did the trick. Feb 26, 2012

I wasn't able to get rid of the formatting in Word with any kind of pasting maneuver, but switching to a .txt format worked well. I actually didn't even resave as .docx - the text file opened directly in Trados. Of course, this was only an option because I really, really didn't care about the formatting, so I will check out the Zapper as well. Thanks everyone!


Jerzy Czopik  Identity Verified
Germany
Local time: 20:58
Member (2003)
Polish to German
+ ...
CodeZapper should not remove formatting of the file Feb 26, 2012

but just the superfluous extras like spacing or kerning.
So nothing should happen to the formatting of the final file.

BTW, it is not bad to get to know what kind of formatting is possible, as bold & italic is not everything.
You say "having cleaned a source word document of all its tags" - have you thought about what a tag is?
If you remove all tags (which CodeZapper will most certainly not do), you have plain text there. So only the applied styles will be used, if that was a Word file.



Vladimir Vasek  Identity Verified
Czech Republic
Local time: 20:58
English to Czech
Reset font settings, do not buy anything Mar 1, 2012

Hi,
I had a similar problem last week and after some time of playing around with font settings in Word I found the following. Please note that my copy of MS Office is localized to Czech so some UI items I mention might be slightly different - it is a guess.

Select all text in a document - Ctrl+A
Press Ctrl+D - a Font dialog box should open
Click Advanced tab
Here you should reset the fields at the top of the dialog box to their default values (Scale, Spacing, Position) and uncheck Kerning for fonts.
Click OK.

The text formatting in the document changes slightly because spacing and font kerning were modified, but since the length of the translated text differs from the original and must be adjusted when finalizing the layout anyway, it does not actually matter.
It does not affect font formatting like colors, bold, italics and so on.

Now, when you open the document in Tag Editor or Studio, most of the useless tags should be gone. Sometimes you need to repeat the procedure separately for Header/Footer, tables and independent text boxes.
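
For anyone curious what this reset does inside the file, here is a rough sketch, assuming a .docx whose word/document.xml is available as a string. In that XML the scale, character spacing, kerning and position settings live in each run's properties as w:w, w:spacing, w:kern and w:position. The function name is made up, and this is an illustration rather than a tested replacement for the dialog box.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"

# Run properties behind the Font > Advanced fields:
# w:w = scale, w:spacing = character spacing, w:kern = kerning.
RESET_TAGS = ("w", "spacing", "kern")


def reset_character_spacing(xml_text, include_position=False):
    """Remove scale/spacing/kerning overrides from run properties.

    Returns (new_xml_text, number_of_removed_elements).
    Only children of w:rPr are touched, so paragraph spacing
    (w:spacing inside w:pPr) and bold, italic, color etc. stay as they are.
    """
    ET.register_namespace("w", W_NS)
    root = ET.fromstring(xml_text)
    names = RESET_TAGS + (("position",) if include_position else ())
    targets = {W + name for name in names}
    removed = 0
    for rpr in root.iter(W + "rPr"):
        # Copy the child list first: removing while iterating skips items.
        for child in list(rpr):
            if child.tag in targets:
                rpr.remove(child)
                removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

Position is left alone unless include_position=True is passed. Note that ElementTree may rename namespace prefixes it does not know when it writes the XML back, so using this on a real document (always on a copy) would need more care than this sketch takes.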

You do not need to buy anything:-)

Hope it helps
Cheers
Vladimir



Lotus LS
Israel
Local time: 21:58
Hebrew to English
+ ...
TOPIC STARTER
vladimirvv - how did you know what the default values were? Mar 3, 2012

In the Font > Advanced dialogue box those boxes are empty with a choice of options. I don't see any kind of "restore defaults" option.


Jerzy Czopik  Identity Verified
Germany
Local time: 20:58
Member (2003)
Polish to German
+ ...
Checking yourself? Mar 3, 2012

For example, by opening a new, clean Word document, where you would see the defaults?
100%, normal and no kerning. I would not touch position at all.

BTW, this is not a new method, as it has been presented here many times in the past, both in conjunction with tags and PDF conversion.



Lotus LS
Israel
Local time: 21:58
Hebrew to English
+ ...
TOPIC STARTER
thanks - i thought i might be doing it wrong Mar 4, 2012

because it still didn't get rid of the tags...
i'll go back to .txt i guess.

thanks again!



Vladimir Vasek  Identity Verified
Czech Republic
Local time: 20:58
English to Czech
It's weird Mar 4, 2012

Lillee, if you don't mind, you can send me the original Word document via e-mail. Then I'll hopefully be able to tell you where the problem is and why it didn't work for you. In my opinion it should work, but maybe there are other settings I'm not aware of.
Vladimir.vasek@privatebox.eu
Vladimir



Jerzy Czopik  Identity Verified
Germany
Local time: 20:58
Member (2003)
Polish to German
+ ...
Superfluous formatting tags must be gone after the mentioned operation Mar 4, 2012

So now look at the tags and tell us what kind of tags remain there.
I mean the EXACT text on the tag, not just a description like "rogue codes" or similar.
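
The same question can be approached from the Word side. The following is only a sketch, assuming the file is a .docx and that its word/document.xml has been read into a string, for example with zipfile.ZipFile(path).read("word/document.xml"). It counts which run-level properties occur in the document. The function name is made up, and how Studio turns these properties into tags is not something the sketch knows.

```python
from collections import Counter
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = "{" + W_NS + "}"


def count_run_properties(xml_text):
    """Count how often each property appears in the runs' w:rPr blocks."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for run in root.iter(W + "r"):
        rpr = run.find(W + "rPr")
        if rpr is None:
            continue  # run without any direct formatting
        for child in rpr:
            # Strip the "{namespace}" part so the key is just e.g. "spacing".
            counts[child.tag.split("}", 1)[-1]] += 1
    return counts
```

Properties that show up on almost every run, or that alternate from word to word, are plausible candidates for whatever still splits the text, but that is a guess to check against the actual tag text in Studio.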



Lotus LS
Israel
Local time: 21:58
Hebrew to English
+ ...
TOPIC STARTER
Most of the tags say: Mar 4, 2012

and the others say


When I originally posted the question, the document I was working on had these tags after every word. In my current (similar) document, only two paragraphs have this problem and the rest of the text is normal. It's likely that the text was pasted into the source document from various files, possibly also from Excel files. The source text is in Hebrew, if that makes any difference.

Vladimir, thanks for the offer but this is a client's material so I don't feel comfortable sending the actual file.

