importing glossaries with Russian words into WF
Thread poster: Eva Leitner

Eva Leitner  Identity Verified
Germany
Local time: 21:45
Member (2011)
Russian to German
+ ...
Jun 8, 2012

Hello,

I suspect many of you have run into this when working with Russian and trying to import glossaries you have made. I really hope that somebody can help me find an answer.

I have been trying to convert Excel files (RU-GER) into .txt files in order to import them as WF glossaries. I use a Macintosh (Mac OS X 10.7.4) and tried to do it with TextEdit, but I got only lines instead of the Russian words.
I have tried the following already, without success:

- I changed the encoding list in the "Preferences" of TextEdit and added "Cyrillic (Mac OS)".
- When saving the document itself I selected either "Cyrillic (Mac OS)" or Unicode (UTF), but neither works.
- I downloaded a trial version of "Notepad" for Mac and copied my Excel data into it, and, aha, it shows the Russian terms correctly. But it creates two columns automatically. I chose "text" in Preferences and saved the document like that (there is no "tab delimited" option here at all). But WF will not open this file.

Now I recently learned in a webinar that there should be a way to make glossaries in Word as well (which is not my first choice, but better than nothing): just type the two terms (source and target) with "=" between them, then go to "Find and Replace" and replace all "=" with ^t.
Very easy, even for me to do...

But when I type = into the "Find" field, Word says the "Find" field can only contain letters! (I have Word for Mac 14.2.2.)

So I am stuck in a jungle, argh! Does somebody have advice for me?
Thanks in advance!



Yasmin Moslem  Identity Verified
Egypt
Local time: 22:45
English to Arabic
Excel > Save as > Unicode Jun 8, 2012

Dear Eva,

Eva Leitner wrote:

I have been trying to convert Excel files (RU-GER) into .txt files in order to import them as WF glossaries.


In Excel, instead of selecting: Save as > Tab-delimited Text, you should have selected: Save as > Unicode.

Try that with your original, intact Excel file, and let us know.

Kind regards,
Yasmin



Eva Leitner  Identity Verified
Germany
Local time: 21:45
Member (2011)
Russian to German
+ ...
TOPIC STARTER
Thanks, I tried this one, but it also doesn't work. Jun 13, 2012

Maybe you have another idea?
Thanks in advance!


esperantisto  Identity Verified
Local time: 23:45
Member (2006)
English to Russian
+ ...
Incomplete instruction Jun 13, 2012

Yasmin Moslem wrote:

In Excel, instead of selecting: Save as > Tab-delimited Text, you should have selected: Save as > Unicode.


This instruction is good but a bit incomplete, failing to mention that Excel saves to this format in the ANSI encoding that depends on the user locale. And unless the locale is Russian/Belarusian/Ukrainian, the Cyrillic text will be garbled or lost.

I’d advise Apache OpenOffice or LibreOffice for such a task, because their export options are more flexible as compared to Excel:
a) open your Excel file in AOO/LibO, select Save As, choose CSV as the format and tick Modify filter settings (or whatever it is called in your interface language);
b) select tab as the field delimiter and nothing (just hit delete to empty the respective field, do not leave ' or " there) as the text delimiter in the export settings dialog and make sure that all tickable options are not ticked;
c) select, yes, Unicode, but not Unicode UTF-8;
d) after saving the file, change its file name extension from .csv to .txt.
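
For anyone who would rather script steps b) to d) than click through the export dialog, here is a minimal Python sketch. It assumes you already have an ordinary comma-separated CSV export saved as UTF-8; the function name csv_to_tab_utf16 and the file names are made up for illustration and are not part of Wordfast or any office suite.

```python
import csv

def csv_to_tab_utf16(csv_path, txt_path, source_encoding="utf-8-sig"):
    """Rewrite a CSV file as tab-delimited, unquoted UTF-16 text.

    Returns the number of rows written. Hypothetical helper, not a
    Wordfast API. "utf-8-sig" reads UTF-8 with or without a BOM.
    """
    written = 0
    with open(csv_path, encoding=source_encoding, newline="") as src, \
         open(txt_path, "w", encoding="utf-16", newline="\n") as dst:
        for row in csv.reader(src):
            if not row:
                continue  # skip empty lines
            if any("\t" in cell for cell in row):
                # A tab inside a term would break the column layout.
                raise ValueError(f"tab character inside a cell: {row!r}")
            # Tab as field delimiter, no text delimiter around the cells.
            dst.write("\t".join(row) + "\n")
            written += 1
    return written
```

Calling csv_to_tab_utf16("glossary.csv", "glossary.txt") would then give a .txt file in the same shape as the export described above, under the stated assumptions.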

[Edited at 2012-06-13 09:13 GMT]


esperantisto  Identity Verified
Local time: 23:45
Member (2006)
English to Russian
+ ...
Try changing the encoding Jun 13, 2012

Eva Leitner wrote:

I have been trying to convert Excel files (RU-GER) into .txt files in order to import them as WF glossaries. I use a Macintosh (Mac OS X 10.7.4) and tried to do it with TextEdit, but I got only lines instead of the Russian words.



Try importing the text file(s) as MacCyrillic and as cp1251.
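
If it is unclear which encoding a given text file is in, one way to find out is to decode the first bytes under each candidate and look at the result. The sketch below does that in Python; preview_decodings is a hypothetical helper name, and the candidate list is only my guess at the likely encodings. Note that lowercase Cyrillic letters mostly share the same byte values in cp1251 and MacCyrillic, so look at capital letters when comparing those two.

```python
def preview_decodings(raw, candidates=("utf-16", "utf-8", "mac_cyrillic", "cp1251"),
                      limit=60):
    """Decode raw bytes under each candidate encoding.

    Returns a dict mapping encoding name to the first `limit` characters
    of the decoded text, or None if the bytes are not valid in that
    encoding. A human still has to judge which preview looks right.
    """
    previews = {}
    for name in candidates:
        try:
            previews[name] = raw.decode(name)[:limit]
        except UnicodeError:
            previews[name] = None
    return previews


if __name__ == "__main__":
    # Example bytes standing in for the content of an exported file.
    sample = "Слово\tWort\n".encode("cp1251")
    for encoding, text in preview_decodings(sample).items():
        print(encoding, repr(text))
```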

glossaries also in word …, by just typing the two terms (source and target) with "=" between them and then go to "find and replace" and replace all "=" by ^t


This is quite stupid. Just type the term, press Tab, type the translation, and save as plain text in UTF-16. Or try copying your Excel table to the clipboard and pasting it as unformatted text (if TextEdit has this option; MS Word and AOO Writer do), then save as text in UTF-16.
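
If a list typed as "source=target" already exists, the replacement and the UTF-16 save can also be done outside Word. A minimal Python sketch follows; equals_to_tab and save_utf16 are made-up names, and it assumes the first "=" on each line is the separator.

```python
def equals_to_tab(lines):
    """Turn 'source = target' lines into 'source<TAB>target' lines."""
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        source, separator, target = line.partition("=")
        if not separator:
            raise ValueError(f"no '=' found in line: {line!r}")
        result.append(f"{source.strip()}\t{target.strip()}")
    return result


def save_utf16(lines, path):
    """Save the lines as plain text in UTF-16 (Python adds a BOM)."""
    with open(path, "w", encoding="utf-16", newline="\n") as f:
        f.write("\n".join(lines) + "\n")
```

For example, save_utf16(equals_to_tab(["словарь = Wörterbuch"]), "glossary.txt") writes a one-entry file; the file name is only a placeholder.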



Dominique Pivard  Identity Verified
Local time: 22:45
Finnish to French
Yasmin's instructions do work for me Jun 13, 2012

Eva Leitner wrote:
Maybe you have another idea?

Using the same setup as you (Excel 2011 and Wordfast Pro 3.0 on a Mac) and following Yasmin's instructions to the letter, I was able to import a sample Russian-German glossary without any problem, as shown here:

http://youtu.be/6G6mGO3nfY8?hd=1



Dominique Pivard  Identity Verified
Local time: 22:45
Finnish to French
Saving as Unicode in Excel should be sufficient Jun 13, 2012

esperantisto wrote:
Yasmin Moslem wrote:
In Excel, instead of selecting: Save as > Tab-delimited Text, you should have selected: Save as > Unicode.

This instruction is good but a bit incomplete, failing to mention that Excel saves to this format in the ANSI encoding that depends on the user locale. And unless the locale is Russian/Belarusian/Ukrainian, the Cyrillic text will be garbled or lost.

The locale on my Mac is definitely not Russian/Belarusian/Ukrainian, yet I had no problem saving from Excel 2011 as Unicode (UTF-16) and importing directly into Wordfast Pro, as shown in the small, unedited video clip I recorded:

http://youtu.be/6G6mGO3nfY8?hd=1


esperantisto  Identity Verified
Local time: 23:45
Member (2006)
English to Russian
+ ...
Specific to Mac or Excel 2011 Jun 13, 2012

Dominique Pivard wrote:

The locale on my Mac is definitely not Russian/Belarusian/Ukrainian, yet I had no problem saving from Excel 2011 as Unicode (UTF-16)


Do you mean the tab-delimited text? Then, perhaps, it’s specific to Mac or to Excel 2011. Excel 2007 in Windows 7 Russian exports it as cp1251 without any option to change the encoding.
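
A tab-delimited file that was exported as cp1251 can be converted to UTF-16 afterwards. Here is a small Python sketch of that re-encoding step; reencode is a hypothetical helper, and it assumes you know the source encoding (cp1251 in the case described above).

```python
def reencode(src_path, dst_path, src_encoding="cp1251", dst_encoding="utf-16"):
    """Read a text file in one encoding and write it out in another.

    newline="" keeps the original line endings untouched in both
    directions, so only the character encoding changes.
    """
    with open(src_path, encoding=src_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding=dst_encoding, newline="") as dst:
        dst.write(text)
```

If the source encoding is guessed wrong, the result will be readable text in the wrong letters, so it is worth opening the output and checking a few entries.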



Dominique Pivard  Identity Verified
Local time: 22:45
Finnish to French
tab delimited text vs. Unicode text Jun 13, 2012

esperantisto wrote:
Do you mean the tab-delimited text? Then, perhaps, it’s specific to Mac or to Excel 2011. Excel 2007 in Windows 7 Russian exports it as cp1251 without any option to change the encoding.

There are (at least) two ways of saving an Excel spreadsheet as a text file: one option is called Text (tab delimited) (*.txt) in Excel 2010 and another one is called Unicode Text (*.txt). In Excel 2011 (Mac), these options are labelled Tab Delimited Text (*.txt) and UTF-16 Unicode Text (*.txt). Yasmin specifically instructed to use the Unicode option, which is what I did. And when you do this, you end up with a text file that has the correct encoding for Cyrillic characters. Moreover, that file is in fact tab delimited, so it is just fine as an import file for WFP (or directly as a glossary for WFC).

In a nutshell: if you want to transform an Excel file into a glossary for WFP, WFC or WFA, do not use the tab delimited option, use the Unicode option. You will then get a text file that will work fine in Wordfast, with any language (including Russian).
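
The difference between the two options can be illustrated with a few lines of Python. This is only a sketch of the general principle: it assumes the plain tab-delimited option writes a single-byte legacy encoding (Mac Roman and cp1252 are used here as examples of non-Cyrillic ones, which is my assumption), while the Unicode option writes UTF-16. The helper name survives is made up.

```python
def survives(text, encoding):
    """Return True if text comes back unchanged after encode/decode."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeError:
        # The encoding has no byte for at least one character.
        return False


if __name__ == "__main__":
    term = "словарь"
    for encoding in ("utf-16", "cp1251", "mac_cyrillic", "mac_roman", "cp1252"):
        print(encoding, survives(term, encoding))
```

UTF-16 and the two Cyrillic code pages can represent the Russian term, while the Western single-byte encodings cannot, which is consistent with the garbled output reported at the top of the thread.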

My guess is that Eva did not save her Excel file as Unicode text, even though she was told to do so.

[Edited at 2012-06-13 18:45 GMT]


esperantisto  Identity Verified
Local time: 23:45
Member (2006)
English to Russian
+ ...
Indeed Jun 14, 2012

Dominique Pivard wrote:

There are (at least) two ways of saving an Excel spreadsheet as a text file: one option is called Text (tab delimited) (*.txt) in Excel 2010 and another one is called Unicode Text (*.txt). In Excel 2011 (Mac), these options are labelled Tab Delimited Text (*.txt) and UTF-16 Unicode Text (*.txt). Yasmin specifically instructed to use the Unicode option, which is what I did.


Indeed, there are two filters that are essentially the same. Another example of typical microsoftish stupidity. Anyway, I confirm that selecting the Unicode text filter does produce a file importable as a Wordfast glossary.
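
To check whether a saved file really came out as Unicode text before importing it, a quick sanity test along these lines may help. It is a rough sketch in Python: looks_like_unicode_glossary is a hypothetical name, and it assumes the file starts with a UTF-16 byte order mark and that every non-empty line should contain at least one tab.

```python
import codecs

def looks_like_unicode_glossary(path):
    """Rough check: UTF-16 BOM present and a tab on every non-empty line."""
    with open(path, "rb") as f:
        raw = f.read()
    # A UTF-16 text file is assumed to start with one of these two BOMs.
    if not raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return False
    try:
        text = raw.decode("utf-16")
    except UnicodeError:
        return False
    lines = [line for line in text.splitlines() if line.strip()]
    return bool(lines) and all("\t" in line for line in lines)
```

A False result does not prove the file is unusable; it only means the file does not match the assumptions above.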

