Tag soup - accents
Thread poster: Thomas Carey

Thomas Carey  Identity Verified
Local time: 20:18
Member (2011)
French to English
Apr 23, 2014

Hi all,

I have recently been experiencing problems with tags in source when it comes to apostrophes and accents (source text in French). I use MemoQ 2013 R2 (6.8.56).

At first, I thought it was due to the document being a converted .pdf, but I've been told it is actually an original Word file. Now, nearly every time I receive a document from this client, I have these tag problems.

The original file is a .doc file (97-2003). The main text font is Arial but the accents are in Arial Unicode MS (and the style is slightly different).

I tried saving the .doc as .docx; this sometimes helps, but not perfectly.

I also selected all the text and changed the font to normal Arial. This removed the tags but messed up the text, dropping or repeating parts of sentences (I even ended up with what appeared to be some kind of Asian language!?!).

Has anyone else been experiencing such problems recently? Any suggestions?

Thanks,

Tom


 

esperantisto  Identity Verified
Local time: 21:18
Member (2006)
English to Russian
OCRd document? Apr 23, 2014

Is your document really in MS Word? Your description makes me think that it's a result of OCRing by ABBYY FineReader or a similar program. In such a case, it's not really DOC, but RTF. Try the following:
1. Re-save to make sure it's really DOC. Or use Apache OpenOffice / LibreOffice and save to ODF.
2. Make sure that the entire text is in the same (desired) language.
3. After verifying the language, apply uniform font attributes (font face, size, color) to the text.
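Before changing anything, it can help to see which fonts the runs in the file actually use. The following is only a minimal sketch, assuming the file has already been saved as .docx (a zip archive whose main text lives in word/document.xml). The names `fonts_in_runs` and `SAMPLE` are made up for illustration, and the sketch only looks at fonts set directly on a run, not fonts inherited from styles or themes.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace used by word/document.xml in a .docx
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical two-run fragment: base letters in Arial, the accent in another font
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:r><w:rPr><w:rFonts w:ascii="Arial" w:hAnsi="Arial"/></w:rPr>'
    '<w:t>caf</w:t></w:r>'
    '<w:r><w:rPr><w:rFonts w:ascii="Arial Unicode MS" w:hAnsi="Arial Unicode MS"/></w:rPr>'
    '<w:t>é</w:t></w:r>'
    '</w:p></w:body></w:document>'
)

def fonts_in_runs(document_xml):
    """Count characters of run text per font set directly on the run."""
    root = ET.fromstring(document_xml)
    counts = Counter()
    for run in root.iter(f"{{{W}}}r"):
        fonts = run.find(f"{{{W}}}rPr/{{{W}}}rFonts")
        name = fonts.get(f"{{{W}}}ascii") if fonts is not None else None
        text = "".join(t.text or "" for t in run.iter(f"{{{W}}}t"))
        # Runs without an explicit font inherit one from a style or theme
        counts[name or "(inherited)"] += len(text)
    return counts

print(fonts_in_runs(SAMPLE))
```

On a real file, the XML could be read with `zipfile.ZipFile("file.docx").read("word/document.xml")` (the file name here is a placeholder). A second font that covers only a handful of characters would be consistent with the accents having been set separately from the rest of the text.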


 

Thomas Carey  Identity Verified
Local time: 20:18
Member (2011)
French to English
TOPIC STARTER
Solved Apr 23, 2014

Hi, thanks, esperantisto

Yes, pretty sure it was Word. Anyway, problem solved by selecting everything again, changing the font to normal Arial and saving, and then saving once more as .docx. I don't know why that didn't work the first time I tried...

thanks again,

Tom


 

LEXpert  Identity Verified
United States
Local time: 13:18
Member (2008)
Croatian to English
Similar problem, same solution Apr 23, 2014

Thomas Carey wrote:

Hi, thanks, esperantisto

Yes, pretty sure it was Word. Anyway, problem solved by selecting everything again, changing the font to normal Arial and saving, and then saving once more as .docx. I don't know why that didn't work the first time I tried...

thanks again,

Tom


Glad that worked for you! I just saw your post, and I've had similar issues in MemoQ with umlauted characters in OCR'd German files - they often appear with a tag pair around every such letter.
Indeed, selecting all and changing the font to normal Arial usually resolves the problem. For good measure I also change the language to EN, but I'm not sure if that step makes any difference.
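For anyone curious what "select all and change the font" amounts to inside a .docx, here is a rough sketch, not a recommended replacement for doing it in Word. It assumes the per-letter tags come from runs that differ only in their `w:rFonts` setting, and that the CAT tool merges runs once their formatting matches; neither is confirmed in this thread. The names `unify_fonts` and `SAMPLE` are hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml in a .docx
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W)  # keep the usual "w:" prefix on output

# Hypothetical fragment: the umlaut sits in its own run with a different font
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:r><w:rPr><w:rFonts w:ascii="Arial" w:hAnsi="Arial"/></w:rPr>'
    '<w:t>f</w:t></w:r>'
    '<w:r><w:rPr><w:rFonts w:ascii="Arial Unicode MS" w:hAnsi="Arial Unicode MS"/></w:rPr>'
    '<w:t>ü</w:t></w:r>'
    '</w:p></w:body></w:document>'
)

def unify_fonts(document_xml, font="Arial"):
    """Point every explicit run font at one font name and return the new XML."""
    root = ET.fromstring(document_xml)
    for fonts in root.iter(f"{{{W}}}rFonts"):
        # The four plain font slots; theme font attributes are left untouched
        for slot in ("ascii", "hAnsi", "cs", "eastAsia"):
            fonts.set(f"{{{W}}}{slot}", font)
    return ET.tostring(root, encoding="unicode")

print(unify_fonts(SAMPLE))
```

This is meant for a small fragment like the one above. A real document.xml declares many more namespaces, and re-serializing it this way may rename their prefixes, so doing the font change in Word itself and re-saving remains the safer route.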


 

David Turner  Identity Verified
Local time: 20:18
French to English
As correctly surmised by esperantisto... Apr 29, 2014

Thomas Carey wrote:
At first, I thought it was due to the document being a converted .pdf, but I've been told it is actually an original Word file. Now, nearly every time I receive a document from this client, I have these tag problems.
Tom


... the document is almost certainly a converted PDF if Arial Unicode MS is used for accents. PDFs really get in the way of agencies applying "Trados discounts" so some of them aren't too keen to pass on this information to translators in case they baulk at the idea ("no discounts for PDFs"). They are desperate to convert them and as a result there will usually be a whole host of other formatting and layout problems to be fixed before you can start the translation.


 

Thomas Carey  Identity Verified
Local time: 20:18
Member (2011)
French to English
TOPIC STARTER
Definitely Word Apr 29, 2014

David Turner wrote:

... the document is almost certainly a converted PDF if Arial Unicode MS is used for accents.


Hi, thanks for the info.

Not this time. The sender confirmed the document was created in Word.


 

Sofia Costa  Identity Verified
Portugal
Local time: 19:18
English to Portuguese
Umlauted letters between tags in MemoQ May 29, 2015

Thank you all.

Your comments helped me a lot.


 

