Cannot open Deja Vu X Professional .docx file in Word 2007
Thread poster: Anne Spitzmueller

Anne Spitzmueller  Identity Verified
Germany
Local time: 13:46
English to German
Jan 5, 2009

Hello,

I just exported a translated .docx file from Deja Vu X Professional. I tried to open it in Word (2007) but received the following error message (in German):

Unable to open the Office Open XML file because its content is causing problems.
Details: The name in the end tag of the element must match the element type in the start tag.
Position: Component: word/document.xml, Line: 2, Column: 26372

Unfortunately, I have no idea what that means ...

Can anyone help, please?

Many thanks in advance!

Anne

[Subject edited by staff or moderator 2009-01-05 14:33 GMT]



Harry Bornemann  Identity Verified
Mexico
English to German
+ ...
Wrong tag order Jan 5, 2009

Anne Spitzmueller wrote:

The name in the end tag of the element must match the element type in the start tag.
Position: Component: word/document.xml, Line: 2, Column: 26372

Hi Anne,

I guess it means that a pair of tags is in the wrong order (closing before opening).
Solution: arrange all tags in DVX in the same order as in the source file.
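To see which element is actually mismatched, one option is to look inside the exported file directly: a .docx is a zip archive, and the main text normally lives in the part word/document.xml. The following is only a minimal Python sketch under that assumption. The function name find_xml_error and its parameters are made up for illustration, and the column reported by the parser may not line up exactly with the one Word shows.

```python
import zipfile
import xml.parsers.expat


def find_xml_error(docx_path, part="word/document.xml", context=60):
    """Return None if `part` is well-formed XML, else details of the first error.

    `part` defaults to the conventional name of the main document part;
    change it if the error message names a different component.
    """
    # A .docx file is a zip archive, so the XML part can be read directly.
    with zipfile.ZipFile(docx_path) as archive:
        data = archive.read(part)

    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)
    except xml.parsers.expat.ExpatError as err:
        pos = parser.ErrorByteIndex  # byte offset of the error within the part
        start = max(0, pos - context)
        snippet = data[start:pos + context].decode("utf-8", errors="replace")
        return {
            "message": xml.parsers.expat.ErrorString(err.code),
            "line": err.lineno,
            "column": err.offset,  # expat counts columns from 0
            "snippet": snippet,    # raw XML around the error position
        }
    return None
```

Calling find_xml_error on the exported file and reading the "snippet" entry should show the text surrounding the broken tag pair, which can then be searched for in the DVX project.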

HTH
Harry


[Edited at 2009-01-05 10:45 GMT]



Eutychus  Identity Verified
Local time: 13:46
Member (2006)
French to English
+ ...
Possible workaround Jan 5, 2009

If your document does not have any Word 2007-specific features, you can use the following workaround:

create a translation memory just for the file you have translated

save the original source docx as a Word 2003 document (*.doc)

import this *.doc document

pre-translate using your translation memory (it should match everything, or nearly everything, straight away)

export

reconvert to Word 2007 or return as a *.doc with an explanatory note.



Anne Spitzmueller  Identity Verified
Germany
Local time: 13:46
English to German
TOPIC STARTER
Thanks Jan 5, 2009

@ Harry:

I checked all the codes and they are ok so this does not seem to be the problem ...

@ Eutychus:

Yes, that's a good idea! It had crossed my mind but I was hoping to find another way to solve the problem since there are so many codes. (Before I started translating the file, I separated all the codes at the start and end of each segment to reduce the number of codes in the TM. Unfortunately, this means that there will be lots of fuzzy matches that will need adjusting when 're-translating' the file.)



Eutychus  Identity Verified
Local time: 13:46
Member (2006)
French to English
+ ...
so many codes? Jan 5, 2009

Do you know why you had so many codes? (eg round accents?). Was the document made using character recognition software or something like that?


Anne Spitzmueller  Identity Verified
Germany
Local time: 13:46
English to German
TOPIC STARTER
@ Eutychus Jan 5, 2009

I don't know why there are so many codes, sorry. What do you mean by "round accents"?

I am off to bed now, it's already late in New Zealand. Thank you for all the suggestions and ideas. I really hope I can solve this problem tomorrow. (I am supposed to deliver the translation to the client tomorrow.)



Eutychus  Identity Verified
Local time: 13:46
Member (2006)
French to English
+ ...
clarification Jan 5, 2009

Sorry, what I'm getting at is that one problem (in French to English at least) is that documents made using OCR software (for reading PDFs) often result in lots of 'unnecessary' codes in DVX, for instance one either side of a non-standard character such as an accent (so "régénérées" will come out something like "r{245}é{246}g{247}é{248}n{249}é{250}es") or for lots of formatting required to make a Word document look as similar as possible to the original PDF.

This is enough to put me off doing any such file in DVX at all, at least as it stands (there are ways round this too, but that's another topic).

This does not solve your current problem but it might be worth considering this for future projects. If it is the case that your document is a scan of a PDF, one consolation is that your client is almost sure not to require the document in docx format because that will not have been the original format of the document.
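To get a rough idea of how code-heavy a segment is, the {245}-style markers mentioned above can be counted and stripped with a regular expression. This is a minimal Python sketch, assuming the codes always look like a number in curly braces as in the example; the names CODE, strip_codes and count_codes are purely illustrative and not part of DVX.

```python
import re

# Assumption: an inline code is written as digits in curly braces, e.g. {245}.
CODE = re.compile(r"\{\d+\}")


def strip_codes(segment):
    """Return the segment text with all inline codes removed."""
    return CODE.sub("", segment)


def count_codes(segment):
    """Return how many inline codes the segment contains."""
    return len(CODE.findall(segment))


def codes_per_character(segment):
    """Return the ratio of codes to visible characters (0.0 for empty text)."""
    text = strip_codes(segment)
    return count_codes(segment) / len(text) if text else 0.0
```

A segment with a high codes_per_character value is a likely candidate for the OCR-related clutter described above.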



Rossana Triaca  Identity Verified
Uruguay
Local time: 08:46
Member (2002)
English to Spanish
A quick way to confirm it's a code issue... Jan 5, 2009

is to create a copy of the project, select all the rows, and copy the source text to the target rows (F5); if the file now exports OK, then you can be certain that some code got messed up along the way (BTW, this is a healthy check to run on all DVX projects before beginning work). If the file fails this export test, it means the original file has some problem, in which case I would save it as .RTF before importing it (if possible), check again whether it exports with the source text copied, and then pre-translate it with the previous TM as was already suggested.

If you have too much to tweak with a full pre-translation of the "new" RTF file, you can (again with a copy) try to see where the problem code is by cutting the file in half in Word, importing it and pre-translating it (this is half of the same file, so it should be 100% matches), and exporting it... rinse and repeat by halves until you can isolate where the wrong code is (this obviously works for short documents; otherwise it's probably faster to tweak the codes in the matches). Of course, you may have more than one code wrong...

Regardless of the solution for this case, too many codes in an otherwise simple .docx file does not bode well in general; there are many macros out there to pre-process these files (again, if possible) and get rid of rogue codes (e.g. www.necco.ca/dv/word_macros.htm).
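The "cut in half, re-import, export, repeat" procedure described above is a binary search. The following minimal Python sketch shows the logic only, assuming exactly one bad chunk and that a part exports cleanly whenever it does not contain that chunk. Here exports_ok is a hypothetical stand-in for the manual import, pre-translate and export test; it is not a DVX function.

```python
from typing import Callable, Optional, Sequence


def find_bad_chunk(
    chunks: Sequence[str],
    exports_ok: Callable[[Sequence[str]], bool],
) -> Optional[int]:
    """Return the index of the single chunk that breaks the export.

    Returns None if the whole document already exports without error.
    Assumes exactly one bad chunk; with several, it finds only one of them.
    """
    if not chunks or exports_ok(chunks):
        return None

    lo, hi = 0, len(chunks)  # invariant: the bad chunk lies in chunks[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if exports_ok(chunks[lo:mid]):
            lo = mid  # first half is clean, so the problem is in the second
        else:
            hi = mid  # first half fails, so the problem is in there
    return lo
```

For a document of n paragraphs this needs roughly log2(n) export tests, which is why it only pays off when each test is quick.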


David Turner  Identity Verified
Local time: 13:46
French to English
+ ...
Workaround for rogue codes Jan 5, 2009

Eutychus wrote:
Sorry, what I'm getting at is that one problem (in French to English at least) is that documents made using OCR software (for reading PDFs) often result in lots of 'unnecessary' codes in DVX, for instance one either side of a non-standard character such as an accent (so "régénérées" will come out something like "r{245}é{246}g{247}é{248}n{249}é{250}es") or for lots of formatting required to make a Word document look as similar as possible to the original PDF.


If you have codes like that around umlaut vowels or accented characters, it's worth making sure that this is not due to a change in font name (Arial to Times say) or size (10 to 11 say). I've seen this happen in some OCR'd files.
It would probably also be best to drop down to 2003, turn off smart tags and track changes, set the language to the same throughout, remove any soft hyphens and make sure that all paragraph marks are in fact real pilcrows (replace ^13 by ^p). In addition, cut and paste the whole document.
The resulting file should then be relatively free of rogue codes.



Anne Spitzmueller  Identity Verified
Germany
Local time: 13:46
English to German
TOPIC STARTER
@ Rossana Jan 5, 2009

Thanks, Rossana, this is exactly what I did first thing this morning (copying the original segments), and I could export the English text without a problem, so the source text cannot be the problem.

@ Eutychus and David, thank you for your input and advice, the information is very helpful for me.

[Edited at 2009-01-05 19:43 GMT]

