https://www.proz.com/forum/d%C3%A9j%C3%A0_vu_support/44378-these_embedded_codes_are_driving_me_crazy.html

These embedded codes are driving me crazy.
Thread poster: Alex Boladeras
Alex Boladeras  Identity Verified
Indonesia
Local time: 11:24
English to Spanish
+ ...
Mar 29, 2006

Hi,
I am now a full licensee of DVX Professional, and these "embedded codes" are driving me crazy, as they sometimes split a w{23}ord in t{24}wo. Is there a way to avoid this? One poster said that codes may be moved to the end of a segment. Is this possible without altering the format of the original document?
Thanks in advance for your help!
Alex


 
Kevin Fulton  Identity Verified
United States
Local time: 23:24
German to English
Check the original document Mar 29, 2006

Be careful of touching embedded codes at the beginning or the end of a segment, as they usually contain formatting information. This is also true of codes enclosing individual words (markers for bold, italics, etc.).

These codes frequently arise when the author (or editor) of the document changes formatting in the doc but leaves the initial (or closing) marker in place, which is not unusual since these codes can't be checked (only WordPerfect gives you the option of seeing all the codes). They also are often inserted when a file has been converted from another format, such as PDF.

Caution!!
Even if you move these codes to the end, make sure *all* the codes in the segment remain in the original order.

[Edited at 2006-03-29 12:49]
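
A rough way to double-check the ordering advice above is to compare the sequence of codes in the source and target segments. This is only a sketch: it assumes the codes show up as numbers in curly braces, like {23} in Alex's example, and the function names are made up for illustration (they are not part of Déjà Vu).

```python
import re

# Assumption: embedded codes look like {23}, as in the example in this thread.
CODE_PATTERN = re.compile(r"\{\d+\}")


def extract_codes(segment):
    """Return the embedded codes in the order they appear in the segment."""
    return CODE_PATTERN.findall(segment)


def codes_in_same_order(source, target):
    """True if the target holds exactly the source's codes, in the same order."""
    return extract_codes(source) == extract_codes(target)


source = "a w{23}ord in t{24}wo"
good_target = "una palabra en dos{23}{24}"  # codes moved to the end, order kept
bad_target = "una palabra en dos{24}{23}"   # codes moved to the end, order swapped

print(codes_in_same_order(source, good_target))  # True
print(codes_in_same_order(source, bad_target))   # False
```

The check also fails if a code was dropped or duplicated, since the two lists would then differ.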


 
Victor Dewsbery  Identity Verified
Germany
Local time: 05:24
German to English
+ ...
Avoiding embedded codes Mar 29, 2006

There is a section on avoiding rogue codes at http://necco.ca/dv/word.htm#Rogue_codes

My favourite is saving the Word document "down" to Word 6.0/95, which gets rid of almost all rogue codes (although this is sometimes not advisable if it would spoil design features that are essential in the document, such as merged table cells, hyperlinks etc.). Some folks claim the same success by opening the Word document in OpenOffice, then saving it again as Word from inside OpenOffice. Again, this may change/lose some layout features, but it does wonders for the code.

The other general points (hyphenation, font changes etc.) are of course also important.


 
Piotr Sawiec  Identity Verified
Local time: 05:24
English to Polish
+ ...
they drive everybody crazy Mar 29, 2006

If it is any consolation, you are not alone. But I don't think that will make you feel better.

As for leftover formatting marks, I have sometimes even tried removing all formatting, and it failed to remove the embedded codes; they stayed in exactly the same places. The Word 95 option has also failed many times. However, I have not tried OpenOffice. What is terrible is that the memory database records all entries together with their embedded codes, so in many cases you will not get a 100% match even when the words are identical, because the segments differ in their embedded codes.
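
To illustrate why stored segments stop matching, here is a minimal sketch that compares two segments with and without their codes. It assumes codes of the form {12}; the example sentences and the helper name are invented, and this is not how Déjà Vu itself computes matches.

```python
import re

# Assumption: embedded codes look like {12}, as in the first post of this thread.
CODE_PATTERN = re.compile(r"\{\d+\}")


def strip_codes(segment):
    """Return the segment text with every embedded code removed."""
    return CODE_PATTERN.sub("", segment)


# Hypothetical segments: same words, but the stored one carries a stray code.
stored_segment = "Press the st{12}art button"
new_segment = "Press the start button"

# A plain comparison sees two different strings.
print(stored_segment == new_segment)  # False

# Once the codes are ignored, the words are identical.
print(strip_codes(stored_segment) == strip_codes(new_segment))  # True
```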

I still use the previous version of Deja Vu, and I hoped (but have not checked) that DVX is free of these problems.

Another thing. In Polish we have special characters (as in other languages, of course). Sometimes they are displayed normally in the source sentences, and sometimes instead of ć, ś etc. I see c or s with embedded codes; sometimes both versions occur in the same document. Why? I have no idea.

DVX is supposed to work with Unicode, so maybe the problems with non-English (or rather non-Latin) characters should cease to exist. Is that the case? What is also very annoying is that I cannot type the letter "ę" (e with a hook); it is the only Polish character that I cannot type. Why? Do these problems disappear with DVX?

Good luck

Piotr


 
Klaus Herrmann  Identity Verified
Germany
Local time: 05:24
Member (2002)
English to German
+ ...
Clean up the source doc Mar 29, 2006

The best advice has already been given - clean up the source document. That's not a DejaVu-specific problem; the text would look equally silly in Tag Editor.
How to treat codes depends heavily on the source format. While it may be risky to even look at codes from Word documents, it's quite safe to delete certain formatting codes from various DTP formats, e.g. from Quark files. Your examples suggest manual kerning, i.e. moving w and o closer together (probably Quark; there are not enough codes for InDesign). If this assumption is correct, the codes can be deleted safely, and should be, since they're useless in the target document anyway. F6 reveals the code, and kerning commands in Quark start with a k, followed by a numeric value (the kerning amount).
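
For anyone who wants to spot such codes in exported text, the sketch below flags codes that sit in the middle of a word and shows the text with them removed. It assumes the {23}-style notation from the first post and treats "a letter directly on both sides" as the definition of a word-splitting code; the function names are hypothetical. Whether a flagged code really is a kerning code that can go still has to be checked by revealing it, as described above.

```python
import re

# Assumption: a code "splits a word" when a letter sits directly on both sides.
# [^\W\d_] matches any Unicode letter (word characters minus digits/underscore).
SPLITTING_CODE = re.compile(r"(?<=[^\W\d_])\{\d+\}(?=[^\W\d_])")


def find_splitting_codes(segment):
    """List the codes that appear in the middle of a word."""
    return SPLITTING_CODE.findall(segment)


def remove_splitting_codes(segment):
    """Drop mid-word codes; codes at word or segment boundaries are kept."""
    return SPLITTING_CODE.sub("", segment)


segment = "{22}they split a w{23}ord in t{24}wo{25}"

print(find_splitting_codes(segment))    # ['{23}', '{24}']
print(remove_splitting_codes(segment))  # {22}they split a word in two{25}
```

The codes at the start and end of the segment are left alone, in line with the earlier warning that those usually carry formatting information.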


 
Alex Boladeras  Identity Verified
Indonesia
Local time: 11:24
English to Spanish
+ ...
TOPIC STARTER
Thank you! Mar 29, 2006

Thank you very, very much. The original document was a PDF, but I had it converted to .doc with ABBYY. As Victor suggests, I will try saving the document in another format and see if I can get rid of all the rogue codes. I'll run the experiment and let you know the results.

Best,

Alex


 
Alex Boladeras  Identity Verified
Indonesia
Local time: 11:24
English to Spanish
+ ...
TOPIC STARTER
I've tried it out and it works! Mar 29, 2006

Victor Dewsbery wrote:
My favourite is saving the Word document "down" to Word 6.0/95, which gets rid of almost all rogue codes.

Hi Victor,
I tried your tip and it worked! I've managed to get rid of most of these unwanted codes and that's made my life much, much easier. Thanks!

Alex


 
Yolanda Broad  Identity Verified
United States
Local time: 23:24
Member (2000)
French to English
+ ...

MODERATOR
Standardizing fonts in OCR'd docs Mar 29, 2006

Klaus has just reminded me of the "trick" I find most useful for preventing all those additional, stray codes. Before importing to DVX (I gave up on DV3 a good year ago), I highlight the whole text, go to Format>Font>Character Spacing and set Spacing and Position to "Normal"; for good measure, I make sure the Kerning is set to the minimum (8). This gets rid of a whole lot of problems.

 
Victor Dewsbery  Identity Verified
Germany
Local time: 05:24
German to English
+ ...
Font salad, anyone? Mar 29, 2006

Yolanda Broad wrote:
Standardizing fonts in OCR'd docs


Too true.
Can't remember which program I was using at the time, but I once had text in an OCR'd (or PDF-extracted) file which switched fonts several times in a paragraph. I think it was text in Times, spaces in Arial. You can imagine the fun I had. (But the remedy was simple: select all, and set just one single font.)


 

