Problem with foreign characters being treated as separate words.
Thread poster: ScanTran

Local time: 18:57
Danish to English
+ ...
Apr 27, 2017

I've been translating a Swedish book in Studio 2011 from a docx source. I first noticed that some lines would end in a Swedish character (ä, ö or å) where the character was actually in the middle of the word. So the word "fråga", for example, could be shown as "frå" at the end of a line, with the next line starting with "ga".

Then I noticed that the word count seemed to be wrong and realised that Studio was treating Swedish characters as separate words. Thus, the word "översättare", for example, would be treated as four words, ö-vers-ä-ttare.

When I look at my TM, there are tags all over the place, before and after these Swedish characters in every case, which is making it very difficult to use for concordance searches.

I have tried saving the file in different formats without any results.

Has anyone experienced anything similar?


Nina Esser
Local time: 19:57
Member (2017)
English to German
Converted file? Apr 27, 2017

Was your file converted from pdf by any chance? If so, you will have to tidy up the formatting of the Word file, I'm afraid. I found this thread, which might help:

In addition to what Jerzy suggests, you may also want to make sure the font colour is the same throughout the file/within each sentence. I've had a few conversions where special characters had a different colour (grey instead of black), for whatever reason.


Elif Baykara Narbay
Local time: 20:57
German to Turkish
+ ...
Turkish Apr 27, 2017

I use the program occasionally with Turkish source and target texts. In Turkish, the following letters differ from those in English: ç, ı, ğ, ö, ş, ü.

I have never experienced such a problem with doc files. On the other hand, similar problems occur with PowerPoint files both in Studio 2011 and in memoQ 2015. When applicable, I select the whole file, change the font to something else and then change it back to the original. This solves the problem. However, if you have many font types in your file, this would not be feasible.

Perhaps you can try with a short portion of the file and if it works, you can ask your customer to adjust the fonts.
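
To illustrate why resetting the font can help, here is a rough Python sketch. It assumes, and this is only an assumption, that the converted file stores each special character as its own formatted run and that the CAT tool places a tag at every change of formatting. The `Run` class and both functions are hypothetical stand-ins for illustration, not part of Studio, memoQ or Word.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Run:
    """Hypothetical stand-in for one formatted text run in a document."""
    text: str
    font: str
    colour: str


def merge_runs(runs):
    """Join adjacent runs that share the same font and colour."""
    merged = []
    for (font, colour), group in groupby(runs, key=lambda r: (r.font, r.colour)):
        merged.append(Run("".join(r.text for r in group), font, colour))
    return merged


def normalise(runs, font, colour):
    """Give every run the same font and colour, like 'select all, change font'."""
    return [Run(r.text, font, colour) for r in runs]


# "fråga" as a bad conversion might store it: the å sits in its own run.
word = [
    Run("fr", "Times New Roman", "black"),
    Run("å", "Arial", "grey"),
    Run("ga", "Times New Roman", "black"),
]

# With mixed formatting nothing can be merged, so three pieces remain.
print(len(merge_runs(word)))

# After normalising the formatting, the word collapses into a single run.
fixed = merge_runs(normalise(word, "Times New Roman", "black"))
print(len(fixed), fixed[0].text)
```

If the assumption holds, a word stored as a single run gives the tool no formatting boundary at which to insert a tag, which would explain why unifying the font or colour removes the tags.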


