tags between each and every character
Thread poster: Noha Kamal, PhD.

Noha Kamal, PhD.  Identity Verified
Local time: 05:22
Member (2007)
English to Arabic
May 17, 2012


I am getting started on a new project, opened the file in Studio 2011, and was taken aback by what I saw. There is a tag between every two characters in all words!! Why did this happen? And how can I fix it? The original doc is in Calibri, font size 10. Any ideas??



Hermann  Identity Verified
Local time: 04:22
English to German
Use CodeZapper May 17, 2012

Noha Kamal, PhD. wrote:

Why did this happen? And how can I fix it?

"CodeZapper" is a powerful, easy to use set of Word VBA macros designed to “clean up” Word files before being imported into a standalone translation environment (DVX, memoQ, SDL Studio, TagEditor, Swordfish, OmegaT, Wordfast Pro, etc.).
Word documents are often strewn with “rogue codes” or junk tags (so-called “smart tags”, language tags, track changes tags, spellchecker tags, soft hyphenations, scaling and spacing changes, redundant bookmarks, etc.).
This tagged information shows up in the translation grid as spurious codes{1}around{2}, or even in the mid{3}dle of, words, making sentences difficult to read and translate and generally negating many of the productivity benefits of the program.
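
For anyone curious what this kind of cleanup involves under the hood, here is a minimal Python sketch. It is not CodeZapper's actual code (CodeZapper is a set of Word VBA macros); it only illustrates the general idea on the WordprocessingML inside a .docx: strip character-level spacing/scaling properties, then merge neighbouring runs whose remaining formatting is identical. The `NOISE` set and all function names are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

# Assumption: run properties treated as junk here
# (character spacing, scaling, kerning, raised/lowered position).
NOISE = {"spacing", "w", "kern", "position"}


def q(name):
    """Qualify a local name with the WordprocessingML namespace."""
    return "{%s}%s" % (W, name)


def signature(elem):
    """Hashable description of an element: tag, attributes, children."""
    return (elem.tag,
            tuple(sorted(elem.attrib.items())),
            tuple(signature(c) for c in elem))


def format_key(run):
    """Formatting of a run; a missing or empty w:rPr both give ()."""
    rpr = run.find(q("rPr"))
    return () if rpr is None else tuple(signature(c) for c in rpr)


def is_text_only(run):
    """True if the run holds exactly one w:t besides its optional w:rPr."""
    content = [c for c in run if c.tag != q("rPr")]
    return len(content) == 1 and content[0].tag == q("t")


def clean_paragraph(par):
    """Strip noisy run properties, then merge adjacent identical runs."""
    noisy_tags = {q(n) for n in NOISE}
    for rpr in list(par.iter(q("rPr"))):
        for child in list(rpr):
            if child.tag in noisy_tags:
                rpr.remove(child)

    prev = None
    for node in list(par):
        if node.tag != q("r") or not is_text_only(node):
            prev = None  # anything else breaks the merge chain
            continue
        if prev is not None and format_key(prev) == format_key(node):
            target = prev.find(q("t"))
            target.text = (target.text or "") + (node.find(q("t")).text or "")
            target.set(XML_SPACE, "preserve")  # keep leading/trailing spaces
            par.remove(node)
        else:
            prev = node
    return par
```

A real cleaner also has to deal with bookmarks, language tags, tracked changes and so on, so treat this only as an illustration of why the tag count drops after cleaning, and always work on a copy of the source file.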




Melanie Meyer  Identity Verified
Local time: 23:22
Member (2010)
English to German
how to fix this tag problem in Studio May 17, 2012

I had the same problem once and this is what worked for me:

Open Studio 2011, go to Tools -> Options -> File Types -> Microsoft Word 2007-2010 -> Common, and enable the "Skip advanced font formatting" option.

Now open the corrected .docx and you will see that this gave you a much cleaner file in Studio.

I hope this helps.


Noha Kamal, PhD.  Identity Verified
Local time: 05:22
Member (2007)
English to Arabic
No change May 17, 2012

Melanie, I tried all the steps you just described. No change at all. The same characters are still displayed.


SDL Community  Identity Verified
United Kingdom
Local time: 05:22
The Tag problem is caused... May 17, 2012

... by the tags being in the source file. Studio and all other CAT tools are designed to show these tags so that you can make sure they end up in the target. Some tags are necessary for recreating the target file; others, such as formatting tags, are often not necessary but may be desirable depending on the file.

In your case, it may be that your source file is the result of a scanning process and contains lots of irrelevant tags put there by the scanning software. If so, you must remove them first, and Hermann's suggestion is an excellent one.

The options in Studio will remove and clean up some formatting but not all.
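
To check whether the tags really are in the source file, a rough Python sketch like the one below can count the text runs in the .docx. A .docx is a zip archive whose main text lives in `word/document.xml`; each `w:r` element is one run, and every formatting change between runs is a likely tag in the CAT grid. The function names are made up for this illustration, and the script only reads the file.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def run_stats(document_xml):
    """Return (number of runs, number of text characters) for document.xml content."""
    root = ET.fromstring(document_xml)
    runs = root.findall(".//w:r", NS)
    chars = sum(len(t.text or "")
                for r in runs
                for t in r.findall("w:t", NS))
    return len(runs), chars


def docx_run_stats(path):
    """Same statistics, read straight from a .docx file on disk."""
    with zipfile.ZipFile(path) as archive:
        return run_stats(archive.read("word/document.xml"))
```

If `docx_run_stats("source.docx")` (the file name is a placeholder) reports nearly as many runs as characters, the document has roughly one run per character, which matches the symptom described at the top of this thread.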




Mike Wilkinson  Identity Verified
Local time: 05:22
Member (2006)
Dutch to English
the magic paintbrush may also help... May 17, 2012

Agree that it's likely to be spurious codes from an overzealous OCR package or something like that (other CAT tools can suffer similar problems).

If you have got the Word source but haven't got the various tools suggested, another option that can work is the "magic paintbrush" tool in Word (i.e. the "copy formatting" feature). May be an option, depending on the size of your source file and the amount of formatting you need to retain. Use the magic paintbrush to make sure the formatting is consistent, and then recreate your bilingual file.


Ask your client May 17, 2012

I had this once with a file for which I had to use another CAT tool. I asked the client, who came back with another file; the cause was indeed an "overzealous OCR package" (I like that expression). The second file was not perfect (some characters were missing) but a lot better.

