Problem with Unicode translation memories
Thread poster: Johnson Sumpio
Johnson Sumpio
Local time: 07:33
Chinese to English
Apr 2, 2005

I am new to Wordfast. When I try to use Wordfast with a Chinese source doc loaded in MS Word, it says "The system does not support double-byte (DBCS): please use Unicode translation memories. Refer to manual." I can't find any information on how to access/install/use the "Unicode translation memories" in the manual.

I am using Wordfast 4.2 Build 43d and MS Office Word 2003 SP1 on an English WinXP Pro SP2 system. My word processor can display Chinese characters.

How do I proceed? Please help. Thanks in advance.



Robert Tucker
United Kingdom
Local time: 23:33
German to English
DBCS Apr 2, 2005

I do not use a Microsoft O/S, Word or Wordfast myself, but have read some amount about Unicode formats elsewhere. In relation to your question, in the absence of more experienced advice, take a look at:

Q) What is a double byte character set (DBCS)?

at:
http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,3954,00.html#dbcs

16-bit languages (Chinese, Japanese, Korean)

at:

http://64.233.183.104/search?q=cache:9-fmw23nmWIJ:www.astti.ch/vault/wordfast/wordfast.doc%20double-byte%20wordfast&hl=en



[Edited at 2005-04-02 20:02]

[Edited at 2005-04-02 20:08]


Sonja Tomaskovic  Identity Verified
Germany
Local time: 00:33
English to German
Wordfast Unicode TM? Apr 2, 2005

Hi,

I'm not sure I understand your problem, so please bear with me if this is not what you are looking for.

Wordfast has two options for its internal TMs: save the TM as a normal txt or save it as Unicode txt. For double-byte characters the Unicode TM is mandatory, if I understand that one correctly.

When you create a new WF TM, save it as "Encoded txt". This is an option that can be chosen from the file type dropdown list in Word.

HTH.

Sonja


Johnson Sumpio
Local time: 07:33
Chinese to English
TOPIC STARTER
Where is Unicode Text format Apr 4, 2005

Thanks for the replies.

Sonja - if I got your idea correctly - yes, that's what I've been trying to do: create and save a new TM in Unicode, but I can't.

In MS Word (alone), the \File\Save As only offers the usual formats. I don't see any "Encoded txt." If I call out Wordfast and try to create a new TM (choosing TMX, TMW, or Unicode), Wordfast tells me to save it in "Unicode Text format" all right, but where do I specify that? The pull-down file save menu inside Wordfast offers the same formats as does Word; I don't see any Unicode-whatever format.



Piotr Bienkowski  Identity Verified
Poland
Local time: 00:33
Member (2005)
English to Polish
Choose "Plain Text" and a dialog should pop up Apr 4, 2005

mospeada wrote:

Thanks for the replies.

Sonja - if I got your idea correctly - yes, that's what I've been trying to do: create and save a new TM in Unicode, but I can't.

In MS Word (alone), the \File\Save As only offers the usual formats. I don't see any "Encoded txt." If I call out Wordfast and try to create a new TM (choosing TMX, TMW, or Unicode), Wordfast tells me to save it in "Unicode Text format" all right, but where do I specify that? The pull-down file save menu inside Wordfast offers the same formats as does Word; I don't see any Unicode-whatever format.



Hi,

If your Save As dialog in Word does not have Encoded Text or Unicode Text in the "Save file as type" list, then choose Plain Text and a dialog should pop up where you can choose the encoding. Choose Unicode (UTF-16) from the available list of encodings.

HTH

Piotr
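
For readers who want to see what the "Unicode (UTF-16)" choice amounts to outside Word, here is a minimal Python sketch. It is not Word's or Wordfast's own code, and the file name and sample text are made up for illustration. It writes a short Chinese string as UTF-16 plain text, checks that the file begins with a byte-order mark, and reads the text back.

```python
import codecs
import os
import tempfile

# Hypothetical sample text and file name, for illustration only.
sample_text = "翻译记忆\n"
path = os.path.join(tempfile.mkdtemp(), "sample_unicode.txt")

# Python's "utf-16" codec writes a byte-order mark (BOM) first,
# then the text in the platform's native byte order.
with open(path, "w", encoding="utf-16", newline="") as f:
    f.write(sample_text)

with open(path, "rb") as f:
    raw = f.read()

has_bom = raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
print("starts with UTF-16 BOM:", has_bom)

# Reading the file back with the same codec recovers the original characters.
with open(path, "r", encoding="utf-16", newline="") as f:
    round_trip = f.read()
print("round trip ok:", round_trip == sample_text)
```

A file saved this way opens correctly in Notepad, which uses the byte-order mark to recognise the encoding.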


Johnson Sumpio
Local time: 07:33
Chinese to English
TOPIC STARTER
Plain text Apr 4, 2005

I did. I - of course - tried saving in almost all the formats. None of them worked. The message about the Unicode translation memories kept coming up.

So, I downloaded and installed Wordfast ver. 5.0z. Created a new TM and saved it in Plain Text. No pop-up list for choosing encoding (both for versions 4 & 5) BUT no message about Unicode translation memories. Seems to be working now... ?


Johnson Sumpio
Local time: 07:33
Chinese to English
TOPIC STARTER
Saving in encoded Plain Text Apr 5, 2005

The message about Unicode translation memories didn't come up the last time, so I thought the problem was solved. Wrong.

Although WF Ver.5.0z no longer gives me the message (as did Ver.4.2), it is unable to create a TM in encoded plain text. I can see in TM Edit that the Chinese source lang in the plain-text TM is all in garbage characters. Naturally, WF cannot function as it should.

Based on the information I gathered, Word 2003 no longer offers to save encoded plain text in its "Save As Type" menu as did the earlier version.

I am still experimenting with Word 2003. For instance, I open a blank doc and want to save it. If the default choice in Save As Type is Word DOC and I use the pull-down menu to choose Plain Text and press Save, only then does Word pop up a small window asking whether I want to encode in Windows (Default), MS-DOS, or Other Encoding. Here I can choose Other Encoding -> Unicode. I can add Chinese characters to this encoded plain-text file later, and read them even with the simple Notepad.

On the other hand, if I create a new TM in WF and go through the process, the default Save As Type choice at the end is already Plain Text. If I press Save, WF just creates a TM in non-encoded plain text without complaint. Word does not pop up a window asking for the encoding type. I guess herein lies my problem: WF does not seem to make Word 2003 show the small window asking for the encoding method, or something along those lines.

The above are my observations. I wonder if other people with the Word 2003 and WF V5.0z combination have the same problem.
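
The garbage characters described above are what one would expect when Chinese text is forced into a single-byte code page. The sketch below is only an illustration in Python, not a description of what Word or Wordfast actually does internally. It assumes that "Windows (Default)" on an English system means code page 1252, and the sample segment is made up.

```python
source_segment = "翻译记忆"  # hypothetical source-language text

# Assumption: "Windows (Default)" on an English system means code page 1252.
try:
    source_segment.encode("cp1252")
    fits_in_cp1252 = True
except UnicodeEncodeError:
    fits_in_cp1252 = False
print("fits in cp1252:", fits_in_cp1252)

# Forcing the save substitutes a placeholder for every character,
# so the original text cannot be recovered from the file afterwards.
lossy = source_segment.encode("cp1252", errors="replace").decode("cp1252")
print("after a lossy save:", lossy)

# UTF-16 keeps every character.
kept = source_segment.encode("utf-16").decode("utf-16")
print("after a UTF-16 save:", kept)
```

The point of the comparison is that the damage happens at save time: once the file holds placeholders, no later change of viewer or font brings the Chinese text back.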



Robert Tucker
United Kingdom
Local time: 23:33
German to English
DBCS/Unicode Apr 5, 2005

I lost the ability to edit my first post and have notified “support”. My two references were:

http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,3954,00.html#dbcs

http://64.233.183.104/search?q=cache:9-fmw23nmWIJ:www.astti.ch/vault/wordfast/wordfast.doc%20double-byte%20wordfast&hl=en

My reason for re-posting is that it is not obvious to me from the above postings that the significance of double-byte is understood. It means that each CJK character is represented by two 8-bit bytes. Thus one would expect that saving as UTF-8 (made up of 8-bit units) would require additional information, while saving as UTF-16 (made up of 16-bit units, i.e. two bytes each) should be easier. (Piotr, of course, suggests saving as UTF-16.)

Since your system is not CJK and is post-Windows 95/98/NT4, double-byte mode is not available and Unicode must be used. Whether it saves as UTF-8 or UTF-16, it will need to know it is working with/saving double-byte characters.
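
The byte counts can be checked directly. The following Python sketch is just a quick measurement and says nothing about how Wordfast stores text internally. GBK is used here as one example of a legacy double-byte code page for Simplified Chinese; that choice is an assumption made for illustration.

```python
ch = "中"  # one common Chinese character, U+4E2D

sizes = {
    "gbk": len(ch.encode("gbk")),              # a legacy double-byte code page
    "utf-8": len(ch.encode("utf-8")),
    "utf-16-le": len(ch.encode("utf-16-le")),  # little-endian, no byte-order mark
}
for name, size in sizes.items():
    print(f"{name}: {size} byte(s)")
```

For this character the result is two bytes in GBK, three in UTF-8, and two in UTF-16.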

The paragraph in my second reference:

“In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. This mention appears if the source language ISO code begins with either ZH-, JA- or KO-. This mention is essential for Wordfast to switch to a mode compatible with Chinese, Japanese, or Korean. TMs and glossaries must be reorganised or indexed when Wordfast displays the (CJK) mention, not before.”

seems to me to be very pertinent.
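
The rule in the quoted paragraph (the CJK mention appears when the source language code begins with ZH-, JA- or KO-) can be written as a small check. This is a Python sketch of the rule as quoted, not Wordfast's code. The function name is hypothetical, and treating the code as case-insensitive is an assumption.

```python
CJK_PREFIXES = ("ZH-", "JA-", "KO-")

def is_cjk_source(lang_code: str) -> bool:
    """Return True if the language code starts with ZH-, JA- or KO-."""
    # Assumption: surrounding spaces and letter case do not matter.
    return lang_code.strip().upper().startswith(CJK_PREFIXES)

for code in ("ZH-CN", "JA-01", "EN-US"):
    print(code, is_cjk_source(code))
```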


Johnson Sumpio
Local time: 07:33
Chinese to English
TOPIC STARTER
CJK mention Apr 5, 2005

Robert Tucker wrote:

Since your system is not CJK and is post-Windows 95/98/NT4, double-byte mode is not available and Unicode must be used. Whether it saves as UTF-8 or UTF-16, it will need to know it is working with/saving double-byte characters.

The paragraph in my second reference:

“In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. This mention appears if the source language ISO code begins with either ZH-, JA- or KO-. This mention is essential for Wordfast to switch to a mode compatible with Chinese, Japanese, or Korean. TMs and glossaries must be reorganised or indexed when Wordfast displays the (CJK) mention, not before.”

seems to me to be very pertinent.


I tried to understand the references in your previous post, especially the part quoted here because it seems to hold the key.

It says "Wordfast's main window." Which main window? The one popping out AFTER I press the green "f" icon on the WF menu bar inside Word? If it is, then I can't find any "translation memory path and name" in all the tabs, much less a "CJK mention."

Okay, how about... am I supposed to see the "translation memory path and name" in the process of saving a new TM? When the new TM is going to be saved, WF or Word allows me to indicate the path where the memory file will be saved, but no CJK mention here either.

The reference also says "if the source language ISO code begins with ZH-". I indicate ZH-xx for the source language during creation of the new TM, but I don't see any "CJK mention" anywhere in the process.

Do I see the above EVEN BEFORE, DURING, or AFTER successfully creating my FIRST-NEW-and-WORKING TM anyway?

Again, I am using Word 2003 with Wordfast ver. 5.0z on an ENG WinXP Pro system, but my word processor can read CHI characters - and my Notepad can display CHI characters inside Unicode-encoded Plain Text files (saved/edited using Word alone).

I know Word 2003 has been around for years, and the WF page says WF works with Word 2003. Makes me wonder why I am having this problem even at the starting point. Don't tell me I need a native/pure CHI, JAP, or KOR OS to run WF with Word 2003?


Johnson Sumpio
Local time: 07:33
Chinese to English
TOPIC STARTER
Finally got it! Apr 5, 2005

Robert Tucker wrote:

“In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. This mention appears if the source language ISO code begins with either ZH-, JA- or KO-. This mention is essential for Wordfast to switch to a mode compatible with Chinese, Japanese, or Korean. TMs and glossaries must be reorganised or indexed when Wordfast displays the (CJK) mention, not before.”



I understand what "translation memory path and name" means now, and I have also found the solution to my problem.

After creating the new TM (not encoded yet), call up TM Editor from the menu bar. Press Tools, then choose "Rewrite TM as Unicode" for Special Filters. That's it. WF will rewrite the previously created TM in Unicode format. Now I can see that the source entries in TM Editor are in Chinese and no longer garbage, and WF is working fine.

I understand the solution is also mentioned in the WF ver. 5 doc but I got off track because it says "Glossary editor," and I searched for it literally (chuckle).
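
For comparison, re-saving a text file as Unicode outside Wordfast could look like the Python sketch below. This is not how TM Editor's "Rewrite TM as Unicode" is implemented; it only illustrates the general idea. The function name, file names and sample line are hypothetical, and the sketch assumes the file's current encoding is known (GBK in the demo) and that its text has not already been replaced by placeholders.

```python
import os
import tempfile

def rewrite_as_utf16(src_path: str, dst_path: str, src_encoding: str) -> None:
    """Re-save a text file as UTF-16, given its current encoding."""
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-16", newline="") as dst:
        dst.write(text)

# Demo with a hypothetical tab-separated line in a legacy-encoded file.
workdir = tempfile.mkdtemp()
legacy_path = os.path.join(workdir, "legacy.txt")
unicode_path = os.path.join(workdir, "unicode.txt")
line = "你好\tHello\r\n"
with open(legacy_path, "w", encoding="gbk", newline="") as f:
    f.write(line)

rewrite_as_utf16(legacy_path, unicode_path, "gbk")

with open(unicode_path, "r", encoding="utf-16", newline="") as f:
    converted = f.read()
print("content preserved:", converted == line)
```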

Thanks to all of you for your time and attention.




[Edited at 2005-04-05 17:34]



Robert Tucker
United Kingdom
Local time: 23:33
German to English
CJK Mention Apr 5, 2005

Robert Tucker wrote:

“In Wordfast's main window, next to the translation memory path and name, you should see the (CJK) mention. ...


There's:

"Current Translation Memory (Unicode) (CJK)"

followed by, presumably, a translation memory path at:

http://www.christophermayo.com/articles/2004/img/wordfast13.jpg



Christopher Mayo's Wordfast instructions:

http://www.christophermayo.com/articles/2004/wordfast.html





[Edited at 2005-04-05 16:02]


Johnson Sumpio
Local time: 07:33
Chinese to English
TOPIC STARTER
CJK Mention in Main Window Apr 5, 2005


Robert Tucker wrote:

"Current Translation Memory (Unicode) (CJK)"

followed by, presumably, a translation memory path at:

http://www.christophermayo.com/articles/2004/img/wordfast13.jpg



Before the TM was rewritten in Unicode, there was only the Current TM in my main window; no mention of CJK like the one in the picture. After the TM was rewritten into Unicode (refer to process above), it says Current TM (Unicode) but still no mention of CJK.




[Edited at 2005-04-05 14:58]

