Numbering in Word makes character encoding go nuts
Thread poster: Ksenia Sergeeva

Ksenia Sergeeva  Identity Verified
Russian Federation
Local time: 20:13
English to Russian
Mar 16, 2016

So, I've been trying to create a proper Deja Vu project (with Russian as source language) and start the translation, but some segments kept looking like the character encoding is wrong... You know, going all ãòēpù è ùòå and things like that. I've spent a lot of time trying to do something about it, and then some more time trying to see where it happened. Well, turns out that all the segments which go right after auto numbering numbers (e.g. 1.3. This and that) became unreadable in Deja Vu.
Any ideas about what I can do about it?
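
For readers wondering where garbage of this kind comes from: strings like ãò... è ùòå are what Cyrillic typically looks like when bytes written in a Cyrillic code page are read as a Western one. Below is a minimal Python sketch of that effect. It assumes Windows-1251 bytes misread as Windows-1252, which is only an illustration; nothing in the thread establishes what Deja Vu actually did with the file. The function names garble and repair are made up for this example.

```python
# Illustration only: Cyrillic text stored as Windows-1251 bytes,
# then wrongly interpreted as Windows-1252 (Western European).

def garble(text: str) -> str:
    """Encode as cp1251, then (wrongly) decode the same bytes as cp1252."""
    return text.encode("cp1251").decode("cp1252")


def repair(garbled: str) -> str:
    """Undo garble(): recover the original bytes, then decode them as cp1251."""
    return garbled.encode("cp1252").decode("cp1251")


sample = "текст"            # Russian for "text"
broken = garble(sample)     # broken == "òåêñò"
restored = repair(broken)   # restored == "текст"
```

If the unreadable segments can be turned back into Russian with a round trip like repair(), the text itself is intact and only its interpretation is wrong, which would point towards a font or code-page setting rather than a damaged file.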


VIP9N
Local time: 20:13
Russian to English
More info required Mar 17, 2016

Ksenia Sergeeva wrote:
So, I've been trying to create a proper Deja Vu project...


Which file format have you tried: *.doc or *.docx, *.ppt or *.pptx, etc.?

Ksenia Sergeeva wrote:
...some segments kept looking like the character encoding is wrong... You know, going all ãòēpù è ùòå and things like that


It would be plausible in the pre-Unicode era, but today it sounds weird. Since you didn't mention the format of the incoming files, one can only guess at the reasons: a Word file created in a Chinese/Japanese version of Office, or made in MS Office 97. Or maybe the font selected on your PC for displaying text in the DéjàVu panes simply does not support Cyrillic characters.

Ksenia Sergeeva wrote:
... I've spent a lot of time trying to do something about it...

Like what, for example?



Ksenia Sergeeva  Identity Verified
Russian Federation
Local time: 20:13
English to Russian
TOPIC STARTER
More info :) Mar 18, 2016

It's a docx Word file.
I'm sure it wasn't created in Office 97 or Japanese version of Word. The font is the same throughout the document, and the font in Deja Vu displays Cyrillic characters just fine... until they have numbering in front of them. Deja Vu also failed to import the table of contents from this file.

VIP9N wrote:
Like what, for example?

Yes, you are right, this is weird. So I took some weird actions. I'm new to Deja Vu X3 and never had any problems with X2, so my actions were quite erratic... I've changed filters a couple of times, that's all I can remember properly. Then I decided to use another CAT tool, and this solved my problem.


VIP9N
Local time: 20:13
Russian to English
File cleaning is required Mar 18, 2016

Ksenia Sergeeva wrote:
It's a docx Word file. I'm sure it wasn't created in Office 97 or Japanese version of Word. The font is the same throughout the document, and the font in Deja Vu displays Cyrillic characters just fine... until they have numbering in front of them. Deja Vu also failed to import the table of contents from this file.


Well, I would say something is wrong with the font in the original file. Probably the Word file was saved with embedded fonts for the digits or something similar. I would try the free TransTools: http://www.translatortools.net/about.html

Ksenia Sergeeva wrote:
I'm new to Deja Vu X3 and never had any problems with X2, so my actions were quite erratic...


I have been using this tool for about fifteen years, and I would say that all CAT tools have their advantages and disadvantages, but the only problem similar to yours that I've ever heard of in DéjàVu involved the Armenian language. Never Cyrillic characters.

Ksenia Sergeeva wrote:
Then I decided to use another CAT tool, and this solved my problem.


Well, if I were in your shoes, I would first try to clean the formatting of the original file. Then, if that was to no avail, I would certainly try another CAT tool and see how it behaves.


mikhailo
Local time: 20:13
English to Russian
Re Mar 22, 2016

Ksenia Sergeeva wrote:
So, I've been trying to create a proper Deja Vu project (with Russian as source language) and start the translation, but some segments kept looking like the character encoding is wrong... You know, going all ãòēpù è ùòå and things like that. I've spent a lot of time trying to do something about it, and then some more time trying to see where it happened. Well, turns out that all the segments which go right after auto numbering numbers (e.g. 1.3. This and that) became unreadable in Deja Vu.
Any ideas about what I can do about it?


It's the fonts in the original. In documents sent for translation, try to stick to the standard Windows fonts: Times New Roman, Arial, Courier New, plus the newer ones like Calibri if the old ones aren't enough.
Look at the styles in the original that are associated with these numbered paragraphs. Maybe the numbers there are produced with some tricky font, which Deja Vu then applies to the whole sentence.
After some hapless layout person has been at a file, you see stranger miracles than this.
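
One way to check whether the numbering uses a font of its own is to look inside the file, since a .docx is a ZIP archive and Word keeps numbering definitions in the part word/numbering.xml. The following is a rough Python sketch using only the standard library. The helper names are hypothetical, and it only inspects numbering levels; a font could equally be set in a paragraph style in word/styles.xml, which this sketch does not read.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by .docx parts.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
W = f"{{{W_NS}}}"


def numbering_fonts(numbering_xml: bytes):
    """Return (abstractNumId, level, {attribute: font}) for every
    numbering level that explicitly sets a font via w:rPr/w:rFonts."""
    root = ET.fromstring(numbering_xml)
    found = []
    for abstract in root.iter(f"{W}abstractNum"):
        abstract_id = abstract.get(f"{W}abstractNumId")
        for lvl in abstract.iter(f"{W}lvl"):
            fonts = lvl.find(f"{W}rPr/{W}rFonts")
            if fonts is None:
                continue  # this level inherits its font
            # Drop the namespace prefix from names such as w:ascii, w:hAnsi, w:cs.
            names = {key.split("}")[-1]: value for key, value in fonts.attrib.items()}
            found.append((abstract_id, lvl.get(f"{W}ilvl"), names))
    return found


def numbering_fonts_in_docx(path):
    """Read word/numbering.xml from a .docx file (path or file-like object).
    Returns an empty list if the document has no numbering part."""
    with zipfile.ZipFile(path) as docx:
        if "word/numbering.xml" not in docx.namelist():
            return []
        return numbering_fonts(docx.read("word/numbering.xml"))
```

Calling numbering_fonts_in_docx("source.docx") (a placeholder file name) lists the levels that carry an explicit font. A non-standard or symbol font showing up there would be consistent with the guess above, though it would not prove that Deja Vu applies that font to the rest of the segment.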

Ksenia Sergeeva wrote:
It's a docx Word file.
I'm sure it wasn't created in Office 97 or Japanese version of Word. The font is the same throughout the document, and the font in Deja Vu displays Cyrillic characters just fine... until they have numbering in front of them. Deja Vu also failed to import the table of contents from this file.


Convert it to DOCX. Deja Vu works better with that format.
And why do you need the TOC? Translate the document, update the field in the translated document, and you'll get a finished TOC.
That's better than struggling with indents set by tabs or spaces in tables of contents made by halfwits.
On jobs like that I even find it easier to apply heading styles following the original, so that I can then generate an automatic TOC.


Ksenia Sergeeva wrote:
Yes, you are right, this is weird. So I took some weird actions. I'm new to Deja Vu X3 and never had any problems with X2, so my actions were quite erratic... I've changed filters a couple of times, that's all I can remember properly. Then I decided to use another CAT tool, and this solved my problem.


God grant that you don't run into far more serious problems in other CAT tools.

