These embedded codes are driving me crazy.
Thread poster: Alex Boladeras
Alex Boladeras  Identity Verified
Indonesia
Local time: 07:46
English to Spanish
Mar 29, 2006

Hi,
I am now a full licensee of DVX Professional and these "embedded codes" are driving me crazy, as they sometimes split a w{23}ord in t{24}wo. Is there a way to avoid this? One poster said that codes may be moved to the end of a segment. Is this possible without altering the format of the original document?
Thanks in advance for your help!
Alex


Kevin Fulton
United States
Local time: 19:46
German to English
Check the original document Mar 29, 2006

Be careful of touching embedded codes at the beginning or the end of a segment, as they usually contain formatting information. This is also true of codes enclosing individual words (markers for bold, italics, etc.).

These codes frequently arise when the author (or editor) of the document changes formatting in the doc but leaves the initial (or closing) marker in place, which is not unusual since these codes can't be checked (only WordPerfect gives you the option of seeing all the codes). They also are often inserted when a file has been converted from another format, such as PDF.

Caution!!
Even if you move these codes to the end, make sure *all* the codes in the segment remain in the original order.
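
To make the ordering rule concrete, here is a minimal Python sketch. It assumes the codes appear as a number in curly braces, like {23} in the example from the first post; the function names are made up for illustration and are not part of Déjà Vu. Starting at the first code that sits inside a word, it moves the codes to the end of the segment, then lets you check that the sequence of codes is unchanged.

```python
import re

# Assumption: an embedded code is an integer in curly braces, e.g. {23}.
CODE = re.compile(r"\{\d+\}")
# A run of one or more codes with a word character on both sides, e.g. "w{23}ord".
MIDWORD = re.compile(r"(?<=\w)(?:\{\d+\})+(?=\w)")


def code_sequence(segment: str) -> list[str]:
    """Return the codes of a segment in the order they appear."""
    return CODE.findall(segment)


def move_codes_to_end(segment: str) -> str:
    """Move every code from the first mid-word code onward to the segment end.

    Codes before the first mid-word code are left where they are.
    The relative order of all codes is preserved.
    """
    match = MIDWORD.search(segment)
    if match is None:
        return segment  # no word is split, nothing to do
    head, tail = segment[:match.start()], segment[match.start():]
    return head + CODE.sub("", tail) + "".join(CODE.findall(tail))


original = "they split a w{23}ord in t{24}wo"
cleaned = move_codes_to_end(original)
print(cleaned)  # they split a word in two{23}{24}
print(code_sequence(original) == code_sequence(cleaned))  # True
```

Codes before the first mid-word code stay put, so formatting markers at the start of a segment are left alone. Any code after that point is pulled to the end as well, which is the price of keeping the order intact.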

[Edited at 2006-03-29 12:49]



Victor Dewsbery  Identity Verified
Germany
Local time: 01:46
German to English
Avoiding embedded codes Mar 29, 2006

There is a section on avoiding rogue codes at http://necco.ca/dv/word.htm#Rogue_codes

My favourite is saving the Word document "down" to Word 6.0/95, which gets rid of almost all rogue codes (although this is sometimes not advisable if it would spoil design features that are essential in the document, such as merged table cells, hyperlinks etc.). Some folks claim the same success by opening the Word document in OpenOffice, then saving it again as Word from inside OpenOffice. Again, this may change/lose some layout features, but it does wonders for the code.

The other general points (hyphenation, font changes etc.) are of course also important.



Piotr Sawiec  Identity Verified
Local time: 01:46
English to Polish
they drive everybody crazy Mar 29, 2006

If it is any consolation, you are not alone. But I don't think that will make you feel better.

As for leftover formatting marks, I have sometimes even tried removing all formatting, and it failed to remove the embedded codes; they stayed in exactly the same places. The Word 95 option has also failed many times. However, I have not tried OpenOffice. What is terrible is that the memory database records all entries together with their embedded codes, so in many cases you will not get a 100% match even when the words are identical, because the segments differ in their embedded codes.
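
As a rough illustration of the matching problem, here is a small Python sketch. It assumes codes look like {23} and uses invented function names; it is not how Déjà Vu actually computes its matches. It only shows why two segments with identical wording can still fail an exact comparison.

```python
import re

# Assumption: an embedded code is an integer in curly braces, e.g. {23}.
CODE = re.compile(r"\{\d+\}")


def strip_codes(segment: str) -> str:
    """Return the segment text with all embedded codes removed."""
    return CODE.sub("", segment)


def match_kind(new_segment: str, stored_segment: str) -> str:
    """Classify how a new segment compares with a stored one."""
    if new_segment == stored_segment:
        return "exact"
    if strip_codes(new_segment) == strip_codes(stored_segment):
        return "same words, different codes"
    return "different"


print(match_kind("Press the {1}Start{2} button", "Press the Start button"))
# same words, different codes
```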

I still use the previous version of Deja Vu, and I hoped (without having tried it) that DVX is free of these problems.

Another thing: Polish has special characters (as other languages do, of course). Sometimes they are displayed normally in the source sentences, and sometimes instead of ć, ś etc. I see c or s with embedded codes. Sometimes both versions occur in the same document. Why? I have no idea.

DVX is supposed to work with Unicode, so maybe the problems with non-English (or rather non-Latin) characters should cease to exist. Is that the case? What is also very annoying is that I cannot type the letter "ę" (e with a hook); it is the only Polish character I cannot type. Why? Do these problems disappear with DVX?

Good luck

Piotr



Klaus Herrmann  Identity Verified
Germany
Local time: 01:46
Member (2002)
English to German
Clean up the source doc Mar 29, 2006

The best advice has already been given: clean up the source document. This is not a DejaVu-specific problem; the text would look equally silly in Tag Editor.
How to treat codes depends heavily on the source format. While it may be risky to even look at codes from Word documents, it's quite safe to delete certain formatting codes from various DTP formats, e.g. from Quark files. Your examples suggest manual kerning, i.e. moving w and o closer together (probably Quark; there are not enough codes for InDesign). If this assumption is correct, the codes can be deleted safely, and should be, since they're useless in the target document anyway. F6 reveals the code, and kerning commands in Quark start with a k, followed by a numeric value (the kerning amount).


Alex Boladeras  Identity Verified
Indonesia
Local time: 07:46
English to Spanish
TOPIC STARTER
Thank you! Mar 29, 2006

Thank you very, very much. The original document was a PDF, but I had it converted into .doc with ABBYY. As Victor says, I will try to save the document in another format and see if I can get rid of all the rogue codes. I'll run the experiment and let you know the results.

Best,

Alex


Alex Boladeras  Identity Verified
Indonesia
Local time: 07:46
English to Spanish
TOPIC STARTER
I've tried it out and it works! Mar 29, 2006

Victor Dewsbery wrote:
My favourite is saving the Word document "down" to Word 6.0/95, which gets rid of almost all rogue codes.

Hi Victor,
I tried your tip and it worked! I've managed to get rid of most of these unwanted codes and that's made my life much, much easier. Thanks!

Alex



Yolanda Broad  Identity Verified
United States
Local time: 19:46
Member (2000)
French to English
Standardizing fonts in OCR'd docs Mar 29, 2006

Klaus has just reminded me of the "trick" I find most useful for preventing all those additional, stray codes. Before importing to DVX (I gave up on DV3 a good year ago), I highlight the whole text, then go to Format>Font>Character Spacing and set Spacing and Position to "Normal"; for good measure, I make sure the Kerning is set to the minimum (8). This gets rid of a whole lot of problems.


Victor Dewsbery  Identity Verified
Germany
Local time: 01:46
German to English
Font salad, anyone? Mar 29, 2006

Yolanda Broad wrote:
Standardizing fonts in OCR'd docs


Too true.
Can't remember which program I was using at the time, but I once had text in an OCR'd (or PDF-extracted) file which switched fonts several times in a paragraph. I think it was text in Times, spaces in Arial. You can imagine the fun I had. (But the remedy was simple: select all, and set just one single font.)




