Wordfast TMs in ANSI or Unicode, and desktop search
Thread poster: Oliver Walter

Oliver Walter  Identity Verified
United Kingdom
Local time: 03:44
Member (2005)
German to English
Apr 5, 2013

I've tried searching the internet but can't find an answer to this.

The problem, stated most briefly: How can I use a desktop search program to find text strings in Wordfast translation memories?

A bit more detail: When I use a desktop search program, it never finds my search word in my Wordfast translation memories, although it finds it in other text and Word files that contain it. (I'm using Google Desktop Search (GDS) version 5.9, which is no longer available from, or supported by, Google.)
By using a "hex view" program (HxD in my case) to see the exact contents of the Wordfast TM, I have found that this is because the TM is in a Unicode format that takes 2 bytes per character: usually one byte is the standard ANSI character code and the other is zero. (It's UTF-16, little-endian: the first 2 bytes of the file are (in hexadecimal) FF FE, which is the little-endian byte order mark. A big-endian file would start with FE FF.)
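
For reference, the byte order of such a file can be read off its first few bytes. The following is a minimal Python sketch, not anything that Wordfast or HxD provides; the function name `detect_bom` and the labels it returns are made up for illustration. It compares the start of the file against the standard byte order marks: FF FE for UTF-16 little-endian and FE FF for UTF-16 big-endian.

```python
import codecs

# Known byte order marks, checked in this order. UTF-32 LE (FF FE 00 00) is
# listed before UTF-16 LE (FF FE) because it begins with the same two bytes.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def detect_bom(first_bytes: bytes) -> str:
    """Return a label for the byte order mark at the start of the data.

    Returns "none" when no known mark is present, which is what a plain
    ANSI text file would normally look like.
    """
    for bom, label in _BOMS:
        if first_bytes.startswith(bom):
            return label
    return "none"
```

To check a real file, pass in its first four bytes, read in binary mode, for example `detect_bom(open(path, "rb").read(4))`. The zero bytes seen in the hex view come from the encoding itself: in Python, `"Hi".encode("utf-16-le")` gives the four bytes 48 00 69 00.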

The Wordfast userguide (wordfast.doc) states "Wordfast uses TMs in either plain text format (ANSI), or Unicode format (UTF-16 only, on Mac or PC)." but there is no information stating how to create or save a TM in the chosen one of those formats. I made a copy of one of my TMs, used a text editor (Notepad++) to convert it into ANSI (HxD confirmed that this was correctly done), then used Wordfast Classic (version 5.61k) to "reorganize" the TM. Then I looked at the TM again with HxD, and it was UTF-16 again!

Two possible solutions - can somebody inform me how to achieve either of these?
  1. (My preferred solution, if possible) Instruct Wordfast to save the TM as "plain text" (=ANSI, 1 byte per character)
  2. If that is not possible, my second-best would be a desktop-search program to use in place of GDS, that is equally happy searching for ANSI and UTF-16
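
To illustrate what the second option involves: a search tool only has to decode each file according to its byte order mark before looking for the search word. Below is a minimal Python sketch under two assumptions: that "ANSI" means Windows code page 1252, and that UTF-16 TMs start with a byte order mark. The function names `decode_tm`, `matching_lines` and `search_tm` are hypothetical and do not belong to any desktop search product.

```python
import codecs
from pathlib import Path


def decode_tm(raw: bytes) -> str:
    """Decode TM file contents as UTF-16 if a BOM is present, else as ANSI."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM, picks the byte order, and drops it.
        return raw.decode("utf-16")
    # Assumption: ANSI here means Windows-1252 (Western European Windows).
    return raw.decode("cp1252", errors="replace")


def matching_lines(text: str, needle: str):
    """Return (line number, line) pairs that contain needle, ignoring case."""
    needle = needle.lower()
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.lower()
    ]


def search_tm(path, needle: str):
    """Search one TM file, whichever of the two encodings it is stored in."""
    return matching_lines(decode_tm(Path(path).read_bytes()), needle)
```

Calling `search_tm` on each TM file in a folder would then give the same hits for ANSI and UTF-16 copies of the same memory.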


Can somebody help, please? (including, if necessary, telling me where I could have easily found the answer!)
Oliver

[Edited at 2013-04-06 12:15 GMT]



Samuel Murray  Identity Verified
Netherlands
Local time: 04:44
Member (2006)
English to Afrikaans
UTF16LE Apr 5, 2013

Oliver Walter wrote:
The Wordfast userguide (wordfast.doc) states "Wordfast uses TMs in either plain text format (ANSI), or Unicode format (UTF-16 only, on Mac or PC)." but there is no information stating how to create or save a TM in the chosen one of those formats.


I also used to have ANSI memories (because that enabled me to use programs that are not Unicode-aware) but WFC later switched them all to UTF16LE, so at this time I can only open WFC memories in programs that don't mangle them back into ANSI.

Oliver Walter wrote:
If that is not possible, my second-best would be a desktop-search program to use in place of GDS, that is equally happy searching for ANSI and UTF-16


See here for some options:
http://www.proz.com/forum/software_applications/244520-seeking_suggestions_on_free_open_source_terminology_management_software-page2.html



Oliver Walter  Identity Verified
United Kingdom
Local time: 03:44
Member (2005)
German to English
TOPIC STARTER
A solution Apr 6, 2013

I've found a solution, prompted by an email sent to me by David Daduc (Wordfast user support team).
He wrote:
"If you wish to keep your Wordfast Classic TM in ANSI, rather than Unicode, simply open your TM in a text editor and save it (preferably under a new name, for security reasons) as an ANSI text file. Once you select an existing ANSI TM in Classic, it will keep it in ANSI. (When you create a new TM, you can switch it to ANSI using a text editor right after you create it.)

For most users, my recommendation is to keep TMs in Unicode, as they are created by the program. Switching TMs to ANSI is good only if you have a special need to do so."


My immediate thought when I read David's email was "I've already tried that, and it doesn't work!"

However, I have now done a further test, with this result:

1. Wordfast Classic version 5.61k (which is the one I have most used until now) does not do that: I can use a text editor to save a TM in ANSI format, but then when I tell WFC (5.61k) to reorganise it, it is converted back to UTF-16.

2. However, after reading David's email I tried the same thing with version 5.92m, which I also have. When I reorganised a TM after converting it into ANSI, the result was still ANSI! Therefore, I shall use 5.92m now, not 5.61k.
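
The text-editor step that David describes (saving a copy of the TM as ANSI under a new name) could also be scripted. What follows is a minimal Python sketch, assuming that "ANSI" means Windows code page 1252; the function names `utf16_to_ansi_bytes` and `convert_tm_to_ansi` are hypothetical, and nothing here is part of Wordfast.

```python
from pathlib import Path


def utf16_to_ansi_bytes(raw: bytes, ansi_codec: str = "cp1252") -> bytes:
    """Re-encode UTF-16 file contents as single-byte ANSI text.

    The "utf-16" codec uses the byte order mark to pick the byte order and
    removes it. Encoding raises UnicodeEncodeError if the TM contains a
    character that the ANSI code page cannot represent.
    """
    return raw.decode("utf-16").encode(ansi_codec)


def convert_tm_to_ansi(src, dst, ansi_codec: str = "cp1252") -> None:
    """Write an ANSI copy of the TM at src to dst, leaving src untouched."""
    dst_path = Path(dst)
    if dst_path.exists():
        # Always save under a new name so that an existing file is never lost.
        raise FileExistsError(f"{dst_path} already exists")
    raw = Path(src).read_bytes()
    dst_path.write_bytes(utf16_to_ansi_bytes(raw, ansi_codec))
```

The files are read and written as bytes so that line endings are preserved exactly. The error on unrepresentable characters is deliberate: a character with no equivalent in the code page cannot be stored in an ANSI file at all, so it seems safer to stop than to replace it silently.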

My reply to David's remark about "a special need to do so" is: Yes, my special need is the fact that Google Desktop Search can find ANSI text but not UTF-16.

Thank you very much David, and I hope this is useful to somebody!

Oliver

[Edited at 2013-04-06 12:19 GMT]

