ProZ.com global directory of translation services
Thread poster: Susan Welsh
How to change default character encoding for Cyrillic

Susan Welsh  Identity Verified
United States
Local time: 20:00
Member (2008)
Russian to English
+ ...
Jul 3, 2010

I have Windows XP, and need to have a default Unicode UTF8 character encoding, instead of whatever the current default encoding is (I don't know how to tell). When I try to use text glossaries created on another machine (Linux) with a .utf8 extension, the text comes out garbled. In German, the accented characters are garbled. In Russian (Cyrillic), the whole text is garbled.

I recently had to reinstall the operating system, so whatever I had before, which was working, has been wiped out. I recall that there was a cmd command that fixed it, but I can't find where I stored that information. I am not experienced at using the command line in Windows, to put it mildly.

Sorry if this message is confused--I am not an expert!

Thanks for your help,
Susan



Natalie  Identity Verified
Poland
Local time: 02:00
Member (2002)
English to Russian
+ ...

Moderator of this forum
Hi Susan Jul 3, 2010

Could you please specify WHERE exactly you need the Cyrillic encoding? In your browser(s)? Or in Win XP in general?

Natalia



Susan Welsh  Identity Verified
United States
Local time: 20:00
Member (2008)
Russian to English
+ ...
TOPIC STARTER
I'm not quite sure... Jul 3, 2010

Hi Natalia,

Mainly I have noticed the problem in these text files, so I guess that's the answer to your question. I need it in WordPad or Notepad. I think with the browser (Firefox) it's okay.
(Except that I used to have a phonetic Russian keyboard that matches the Latin QWERTY keyboard, and now I can't figure out how to get it back. But that's an unrelated problem!)

Susan



RieM  Identity Verified
United States
Local time: 20:00
English to Japanese
+ ...
Check the Regional and Language Options settings Jul 3, 2010

For the keyboard question:

Go to Start > Control Panel > Regional and Language Options. Click the Languages tab, then click Details. Under "Installed services", do you see both Russian and English, or English only?

If you need to type in Russian (that is, you need an input method), click "Add" and select Russian from the Input language list. You will probably be prompted to insert the XP installation media into the CD/DVD drive, and you will need to reboot the computer.

This alone might also fix the conversion problem (reading the UTF-8 encoded file). If the keyboard is already installed, there is another step to take under the Advanced tab, but I'd like you to check these settings first.

Let us know what you find out!

Rie

PS: To change these settings, you need administrator privileges.



Susan Welsh  Identity Verified
United States
Local time: 20:00
Member (2008)
Russian to English
+ ...
TOPIC STARTER
To Rie Jul 3, 2010

I already have the languages installed in the way you described. I can type in the two languages (Russian and German) correctly, using the alternative keyboard layouts. The problem is when I try to read something in utf8 that was created on Linux (it may have nothing to do with where it was created, but more with the encoding itself).

I looked on the "Advanced" tab but could see nothing relevant to encodings.

Susan



RieM  Identity Verified
United States
Local time: 20:00
English to Japanese
+ ...
If so.. Jul 3, 2010

That means you should be able to type in Russian or German in Notepad, right? Then it might have to do with the text file you are trying to open in Notepad. Were you able to open the same file without any problem before reinstalling XP?

Also, on the Advanced tab, is 28595 checked in the code page conversion tables list?

Rie

(edited)

Oh, I now understand the keyboard issue: it's a mapping/layout issue, isn't it? I'm afraid I cannot help you with this, but from the same place (Installed services > language > Keyboard > Russian), can you click the Properties button on the right and take a look? Non-English languages provide options to change the keyboard layout, but I have never gone this deep with Russian.

Rie


[Edited at 2010-07-03 22:13 GMT]


Daniel García
English to Spanish
+ ...
Windows XP supports Unicode (and UTF-8) by default Jul 3, 2010

Windows XP is Unicode compliant by default. You don't need to install or enable anything to make it work with UTF-8 files.

Did you try using the File-Open command from Notepad? (rather than double clicking).

There you can choose UTF-8 as the encoding, which should open the files correctly.

Here's a detailed step-by-step procedure:

http://www.herongyang.com/Unicode/Notepad-Open-UTF-8-Text-File.html

Microsoft Word can do the same. You only need to activate the "Confirm conversions at Open" option in Word and then open the file as Encoded Text. The MS Word Help should provide detailed steps.
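As an aside, the garbling described at the top of the thread can be reproduced with a minimal Python sketch. It assumes the editor guessed Windows-1252 (cp1252) when opening the file; the thread does not say what the actual default on Susan's machine was, and the sample string and the function name `decode_as` are made up for illustration.

```python
def decode_as(raw: bytes, encoding: str) -> str:
    """Decode raw bytes with the given encoding.

    Bytes that are invalid in that encoding become U+FFFD instead of
    raising an error, which is roughly what a forgiving editor does.
    """
    return raw.decode(encoding, errors="replace")


# Hypothetical glossary line, saved as UTF-8 the way a Linux editor usually does.
sample = "Grüße, Привет"
raw = sample.encode("utf-8")

# Wrong guess (assumed cp1252): accented letters turn into two-character
# sequences and every Cyrillic letter becomes a pair of unrelated symbols.
wrong = decode_as(raw, "cp1252")

# Right guess: the text comes back unchanged.
right = decode_as(raw, "utf-8")
```

The file itself is not damaged in either case. Only the interpretation of its bytes differs, which is why choosing UTF-8 in the File > Open dialog is enough.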


Daniel

PS. If you need to edit these UTF-8 files and return them to your Linux colleagues, be careful. Your saved files might not be totally compatible with Linux.



Jack Doughty  Identity Verified
United Kingdom
Local time: 01:00
Member (2000)
Russian to English
+ ...
Russian QWERTY-type keyboard Jul 3, 2010

Windows only includes two Russian Cyrillic keyboards, neither of which is based on the QWERTY layout. You can download instructions and files for one that is from Paul Gorodyansky's site - http://winrus.com/kbd_e.htm


Susan Welsh  Identity Verified
United States
Local time: 20:00
Member (2008)
Russian to English
+ ...
TOPIC STARTER
Some progress Jul 4, 2010

Thanks to all, and to Didier who responded off-list. I have now established that:

1. Notepad is compatible with utf8, and I can open it the way Daniel suggested, BUT, it is not recognizing line endings, so my glossary is all jammed together, with little vertical boxes sprinkled all over the place. I.e., it is unusable. There may be something I'm missing, but as of now, it's unusable. (By the way, my "Linux colleagues" are myself. I normally work in Linux, but am being forced to use Windows in order to use certain programs that only work with Micro$oft products. So I am passing my own files back and forth between operating systems.)

2. Wordpad is NOT compatible with utf8.

3. I used the text editor jEdit instead. It needs some special tweaking to make it read utf8, but it tells you what it wants, and when I did it, it worked fine.

4. HOWEVER, there is another problem I had forgotten, when Natalia asked me exactly where I was having problems. I have a Russian-English dictionary called Interpretatio, and it is coming up nothing but ????? marks, where there should be Cyrillic. Now I recall that it was for that dictionary that I had been instructed to do something from the command line in Windows, to get the encoding right. But I can't remember what it was!

5. Jack, on the question of the QWERTY keyboard, thanks for the clarification re Windows. I have used the Gorodyansky application (probably that's what I was using before I had a hard disk crash and had to reinstall everything), but I tried about a dozen times yesterday to install it, and for some reason I could never get the program to download properly. It would appear in the "download" box with its icon, but then I couldn't find it on the computer. It was not in the "downloads" folder; so I did a search of the entire computer, and it came up with nothing by that name.

I think maybe it was just a bad day, and I will try again when I regain the fortitude.

Thanks again all for your help. (As a friend of mine says, computers work great until you put programs on them.)

Susan



Jack Doughty  Identity Verified
United Kingdom
Local time: 01:00
Member (2000)
Russian to English
+ ...
My own YaWERTY keyboard Jul 4, 2010

I had no trouble installing Paul Gorodyansky's keyboard on my desktop (XP) but couldn't do it on my laptop (Vista). In the end I created a new keyboard almost identical to Paul's using MS Keyboard Layout Manager, and installed that instead. I now have it on my desktop too. If you email me via ProZ, I could send you my keyboard and instructions for installing it.

[Edited at 2010-07-04 11:13 GMT]



Susan Welsh  Identity Verified
United States
Local time: 20:00
Member (2008)
Russian to English
+ ...
TOPIC STARTER
QWERTY keyboard Jul 4, 2010

Thanks, Jack. I managed to install the Gorodyansky files properly on one of my two newly revamped machines (not the one I was having problems with before, which I will revisit on Tuesday, after the holiday). The main "trick" here was to RTFM more carefully. (However, on this home machine, the download file appeared in the "Downloads" folder where it was supposed to. That was not the case on the office machine. So if I still have trouble with that, I may get back to you on the other option you have up your sleeve.)

Susan




[Edited at 2010-07-04 15:50 GMT]


Daniel García
English to Spanish
+ ...
I'm glad it worked! Jul 4, 2010


Susan Welsh wrote:

1. Notepad is compatible with utf8, and I can open it the way Daniel suggested, BUT, it is not recognizing line endings, so my glossary is all jammed together, with little vertical boxes sprinkled all over the place. I.e., it is unusable.


Ah, OK. Linux and Windows use different characters for line endings: Linux uses a single line feed (LF), while Windows uses a carriage return followed by a line feed (CR+LF). It seems that Notepad on XP cannot recognise the Linux line endings automatically. It should be possible to search and replace, though.

Otherwise, using a text editor with more options, like jEdit, is a better way...
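If a script is preferred over an editor, the line-ending replacement could be sketched in Python roughly as follows. This is only an illustration: the function names and any file names passed to them are hypothetical, and it assumes the glossary is UTF-8, where the LF and CR bytes never occur inside a multi-byte character, so working on raw bytes is safe.

```python
def to_windows_newlines(data: bytes) -> bytes:
    """Convert LF line endings to CR+LF.

    Existing CR+LF pairs are normalised to LF first so that they are
    not doubled into CR+CR+LF.
    """
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write a CR+LF copy to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_windows_newlines(data))
```

Writing to a separate output file keeps the original Linux copy untouched, which is convenient when the same glossary travels back and forth between the two systems.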




There may be something I'm missing, but as of now, it's unusable. (By the way, my "Linux colleagues" are myself. I normally work in Linux, but am being forced to use Windows in order to use certain programs that only work with Micro$oft products. So I am passing my own files back and forth between operating systems.)


If these files are in UTF-8, be aware that Linux applications often do not include the BOM in UTF-8 files.

Windows applications do often include the BOM in UTF-8 files when you save them.




2. Wordpad is NOT compatible with utf8.


WordPad can open UTF-8 files if they have a BOM but it does not recognise them as UTF-8 if they don't have it.

Here's more info about the BOM
http://en.wikipedia.org/wiki/Byte-order_mark
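For anyone who wants to check or change this outside an editor, here is a small Python sketch. The helper names are invented for the example; `codecs.BOM_UTF8` is the standard library constant for the three bytes EF BB BF that make up the UTF-8 BOM.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def add_utf8_bom(data: bytes) -> bytes:
    """Prepend the BOM unless it is already there (for BOM-dependent Windows tools)."""
    return data if has_utf8_bom(data) else codecs.BOM_UTF8 + data


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading BOM if present (for tools that do not expect one)."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data
```

When reading text rather than bytes, Python's "utf-8-sig" codec does the stripping on its own: `open(path, encoding="utf-8-sig")` accepts files with or without a BOM.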




4. HOWEVER, there is another problem I had forgotten, when Natalia asked me exactly where I was having problems. I have a Russian-English dictionary called Interpretatio, and it is coming up nothing but ????? marks, where there should be Cyrillic. Now I recall that it was for that dictionary that I had been instructed to do something from the command line in Windows, to get the encoding right. But I can't remember what it was!


Here you probably need to set the language for non-Unicode programs to Russian:

Control Panel -> Regional and Language Options -> Advanced -> Language for non-Unicode programs (change it to Russian).
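A rough model of why the question marks appear, as a Python sketch. It assumes the program converts its text to the system's legacy code page, that the previous setting corresponded to cp1252 (Western European), and that the Russian setting corresponds to cp1251 (Cyrillic). How Interpretatio actually handles text internally is not known from this thread, and the function name is made up.

```python
def render_in_code_page(text: str, code_page: str) -> str:
    """Simulate a non-Unicode program showing text via one legacy code page.

    Characters the code page cannot represent are replaced with '?',
    similar to what happens in a lossy conversion.
    """
    return text.encode(code_page, errors="replace").decode(code_page)


word = "слово"  # sample Cyrillic dictionary entry

# Western code page (assumed old setting): no Cyrillic letters available.
western = render_in_code_page(word, "cp1252")

# Cyrillic code page (assumed new setting): every letter survives.
cyrillic = render_in_code_page(word, "cp1251")
```

Under these assumptions `western` is a run of five question marks and `cyrillic` equals the original word, which matches the before and after behaviour described in the thread.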

Daniel



Susan Welsh  Identity Verified
United States
Local time: 20:00
Member (2008)
Russian to English
+ ...
TOPIC STARTER
Hooray, it worked! Jul 4, 2010


Daniel García wrote:



4. HOWEVER, there is another problem I had forgotten, when Natalia asked me exactly where I was having problems. I have a Russian-English dictionary called Interpretatio, and it is coming up nothing but ????? marks, where there should be Cyrillic. Now I recall that it was for that dictionary that I had been instructed to do something from the command line in Windows, to get the encoding right. But I can't remember what it was!


Here you probably need to set the language for non-Unicode programs to Russian:

Control Panel -> Regional and Language Options -> Advanced -> Language for non-Unicode programs (change it to Russian).


Worked like a charm! And thanks for the info on what a BOM is, about which I was clueless. I thought there was a B missing at the end.

Susan



Susan Welsh  Identity Verified
United States
Local time: 20:00
Member (2008)
Russian to English
+ ...
TOPIC STARTER
QWERTY keyboard problem solved Jul 10, 2010


Susan Welsh wrote:

Thanks, Jack. I managed to install the Gorodyansky files properly on one of my two newly revamped machines (not the one I was having problems with before, which I will revisit on Tuesday, after the holiday). The main "trick" here was to RTFM more carefully. (However, on this home machine, the download file appeared in the "Downloads" folder where it was supposed to. That was not the case on the office machine.



Oddly enough, I still could not get it to download on my office PC. I tried a different download altogether, and that didn't work either. I got a message saying downloads completed, but there was nothing there. I switched from Firefox to Internet Explorer and ... it worked! Go figure.



