tags between each and every character
Thread poster: Noha Kamal, PhD.

Noha Kamal, PhD.  Identity Verified
Local time: 03:23
English to Arabic
May 17, 2012


I am getting started on a new project. I opened the file in Studio 2011 and was taken aback by what I saw: there is a tag between every two characters in every word! Why did this happen, and how can I fix it? The original document is set in Calibri, 10 pt. Any ideas?



Norbert Hermann  Identity Verified
Local time: 01:23
English to German
Use CodeZapper May 17, 2012

Noha Kamal, PhD. wrote:

Why did this happen? And how can I fix it?

"CodeZapper" is a powerful, easy to use set of Word VBA macros designed to “clean up” Word files before being imported into a standalone translation environment (DVX, memoQ, SDL Studio, TagEditor, Swordfish, OmegaT, Wordfast Pro, etc.).
Word documents are often strewn with “rogue codes” or junk tags (so-called “smart tags”, language tags, track changes tags, spellchecker tags, soft hyphenations, scaling and spacing changes, redundant bookmarks, etc.).
This tagged information shows up in the translation grid as spurious codes{1}around{2}, or even in the mid{3}dle of, words, making sentences difficult to read and translate and generally negating many of the productivity benefits of the program.




Melanie Meyer  Identity Verified
Local time: 20:23
Member (2010)
English to German
how to fix this tag problem in Studio May 17, 2012

I had the same problem once and this is what worked for me:

Open Studio 2011 and go to Tools -> Options -> File Types -> Microsoft Word 2007-2010 -> Common -> Skip Advanced font formatting.

Now open the .docx again in Studio and you should see a much cleaner file.

I hope this helps.


Noha Kamal, PhD.  Identity Verified
Local time: 03:23
English to Arabic
No change May 17, 2012

Melanie, I tried all the steps you just described. No change at all. The same characters are still displayed.


SDL Community  Identity Verified
United Kingdom
Local time: 02:23
The Tag problem is caused... May 17, 2012

... by the tags being in the source file. Studio and all other CAT tools are designed to show these tags so that you can make sure they end up in the target. Some tags are necessary for recreating the target file; others, such as formatting tags, are often not necessary but may be desirable, depending on the file.

In your case it may be that your source file is the result of a scanning process and contains lots of irrelevant tags put there by the scanning software. If so, you must remove them first, and the suggestion from Hermann is an excellent one.

The options in Studio will remove and clean up some formatting but not all.
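
One rough way to check whether a source file is fragmented like this before importing it is to count how many characters each text run holds on average. The Python sketch below is only an illustration: the function names are made up, the threshold of 3 characters per run is an arbitrary assumption, and it looks only at `word/document.xml`, ignoring headers, footers and footnotes.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def run_stats(docx):
    """Return (text_runs, characters) for word/document.xml.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    runs = chars = 0
    for run in root.iter(W + "r"):
        texts = [t.text or "" for t in run.findall(W + "t")]
        if texts:
            runs += 1
            chars += sum(len(text) for text in texts)
    return runs, chars


def looks_fragmented(docx, min_chars_per_run=3.0):
    # min_chars_per_run is an arbitrary illustrative threshold.
    runs, chars = run_stats(docx)
    return runs > 0 and chars / runs < min_chars_per_run


def sample_docx(run_texts):
    """Build a tiny in-memory stand-in for a .docx (demo only, not a valid Word file)."""
    ns = W.strip("{}")
    body = "".join("<w:r><w:t>%s</w:t></w:r>" % text for text in run_texts)
    xml = (
        '<w:document xmlns:w="%s"><w:body><w:p>%s</w:p></w:body></w:document>'
        % (ns, body)
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


print(run_stats(sample_docx(list("tags"))))                 # one run per character
print(looks_fragmented(sample_docx(["tags everywhere"])))   # a single long run
```

On a real file you would call `looks_fragmented("source.docx")` with your own path. A result of `True` suggests cleaning the file, or asking for a better source, before creating the bilingual file.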




Mike Wilkinson  Identity Verified
Local time: 02:23
Member (2006)
Dutch to English
the magic paintbrush may also help... May 17, 2012

Agree that it's likely to be spurious codes from an overzealous OCR package or something like that (other CAT tools can suffer similar problems).

If you have got the Word source but haven't got the various tools suggested, another option that can work is the "magic paintbrush" tool in Word (i.e. the "copy formatting" feature). May be an option, depending on the size of your source file and the amount of formatting you need to retain. Use the magic paintbrush to make sure the formatting is consistent, and then recreate your bilingual file.


christela (X)
Ask your client May 17, 2012

I had this once with a file for which I had to use another CAT tool. I asked the client and he came back with another file; the cause was indeed an "overzealous OCR package" (I like that expression). The second file was not perfect (some characters were missing) but a lot better.

