Unicode status report - please post encoding issues here
Thread poster: Jason Grimes
Jason Grimes
Local time: 02:53
SITE STAFF
Mar 3, 2006

This is a summary of the status of the ongoing project to convert the site to Unicode, first announced here.

To learn more about what Unicode is and how it affects you at ProZ.com, please see these FAQs.

Unicode headers are now turned on for the entire site, which means that browsers should default to viewing every site page in Unicode. All new data posted to the site should now be in Unicode.

Most site data is not yet encoded in Unicode and would therefore appear garbled if displayed as-is. In much of the site, we are attempting to automatically convert non-Unicode data into Unicode as it is displayed. Logged-in users are then given the opportunity to confirm or correct the automated conversion. For more details about how to help the automated conversion system, please see this FAQ: What are these checkmark and arrow icons appearing by text throughout the site?

This automatic conversion is currently happening in the most active areas of the site.
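
The convert-on-display idea described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual code: the candidate list and the names `CANDIDATE_ENCODINGS`, `candidate_conversions`, and `to_unicode` are assumptions made for the example. It tries a few likely encodings in order, uses the first one that decodes cleanly as the automatic guess, and keeps the rest as alternatives a logged-in user could pick from.

```python
# Hypothetical sketch of "guess, then let the user confirm or correct".
# None of these names come from ProZ.com; they are illustrative only.

CANDIDATE_ENCODINGS = ["utf-8", "windows-1252", "iso-8859-1"]


def candidate_conversions(raw: bytes, encodings=CANDIDATE_ENCODINGS):
    """Return (encoding, text) pairs for every encoding that decodes raw without error."""
    results = []
    for enc in encodings:
        try:
            results.append((enc, raw.decode(enc)))
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
    return results


def to_unicode(raw: bytes, encodings=CANDIDATE_ENCODINGS):
    """Return (automatic_guess, alternatives); the guess is None if nothing decodes."""
    candidates = candidate_conversions(raw, encodings)
    if not candidates:
        return None, []
    return candidates[0], candidates[1:]


# Legacy bytes stored before the switch to Unicode (assumed to be Windows-1252 here).
legacy = "Kundengeschäft".encode("windows-1252")
guess, alternatives = to_unicode(legacy)
```

Trying UTF-8 first is a deliberate choice in this sketch: legacy single-byte text containing accented letters is rarely valid UTF-8, so a clean UTF-8 decode is a strong hint that the data has already been converted.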

There will still be some data on some pages that does not display correctly. When you find encoding-related problems, please post them in this thread and we will try to resolve them quickly.

Thanks,

Jason


 
Jack Doughty  Identity Verified
United Kingdom
Local time: 06:53
Russian to English
In memoriam
Checkmark and arrow icons do not always appear where they are needed Mar 3, 2006

Would it be possible for users to set checkmarks against words or passages that are wrongly encoded, and then go through the usual procedure with them?

 
KathyT  Identity Verified
Australia
Local time: 17:53
Japanese to English
Thanks for all the hard work, Jason.... Mar 4, 2006

The improvement has been amazing in most cases, even in pages that use a combination of Asian and European fonts.

I have run into one problem, however, at the following:

http://www.proz.com/kudoz/1269980

None of the couple of dozen alternative encodings is suitable, and there's no option within the page to send a message saying so (as there has been when other encoding problems have arisen).
- So just letting you know about it here.....hope that's OK!
Thanks again!


 
Walter Landesman  Identity Verified
Uruguay
Local time: 03:53
English to Spanish
Profile Mar 4, 2006

Some items in my profile are not displaying properly. The Personal Data information is completely garbled. Spanish accents are not recognized, and neither is the Spanish letter ñ.

 
Özden Arıkan  Identity Verified
Germany
Local time: 07:53
Member
English to Turkish
Profiles II Mar 4, 2006

Hi Jason,

Garbled characters in profiles can be corrected manually by re-entering them with the browser set to UTF-8. However, in the case of the Portfolio and especially personal glossaries, this would be so time-consuming that manual correction is a near-impossible task. Considering that these are two very important sections for a translator, I would appreciate it if they were given some priority in automatic conversion.

Regards,
Özden

[Edited at 2006-03-04 16:40]


 
Jane Luther
Germany
Local time: 07:53
German to English
strange search results Mar 6, 2006

I posted this elsewhere (ProZ staff in the process of migrating the forums to Unicode) on 1 March, i.e. before this forum was opened, but I think it belongs here, as it must have something to do with the encoding:

When doing a German - English ProZ.com term search involving special characters (e.g. ü, ä, ß), I've started getting totally unrelated search results.

An example: A search for "Kundengeschäft" ('whole words only' not ticked) brings a whole range of answers, none of which has anything to do with "Kundengeschäft", although one or two do contain the word "geschäft" as a compound:
http://www.proz.com/?sp=ksearch

Is there anything I can do to get around the problem until the migration has been completed?

By the way, I think you're all doing a great job!

Thanks,
Jane


 
Victor Dewsbery  Identity Verified
Germany
Local time: 07:53
German to English
Mixed results Mar 6, 2006

http://www.proz.com/powwow/698
The encoding problems in the messages from potential powwow participants are now fixed (on my system), but the general text at the top of the page is still garbled (i.e. special German characters do not display properly). Imaginative phrases:
besonders f? langes Berlin-Wochenende
Das ist zwar zugegebenermaߥn eher f?tangestellte relevant
Gelegenheit f?e Nicht-Berliner
eine zweit䧩ge Konferenz


Another point I've found under Community>Forums>recent posts:
The list of forum names in the left column includes [D骠Vu support].
On my system, the only thing between the D and the V is a symbol that looks (to my uneducated eye) like a Chinese character.
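
As a rough illustration of how this kind of garbling arises in general (it does not reproduce the exact strings quoted above), the sketch below stores a German word in a legacy single-byte encoding and then reads the bytes back as UTF-8, which is what a browser does once the page declares Unicode. ISO-8859-1 is assumed as the legacy encoding purely for the example.

```python
# Illustrative only: shows the general failure mode, not the exact garbling above.
original = "für"

# Assumption: the text was stored in ISO-8859-1 before the Unicode switch.
legacy_bytes = original.encode("iso-8859-1")  # the ü becomes the single byte 0xFC

# With a UTF-8 header, the same bytes are interpreted as UTF-8.
# 0xFC is not valid there, so it is replaced by U+FFFD (often drawn as "?" or a box).
shown = legacy_bytes.decode("utf-8", errors="replace")

# Decoding with the encoding the text was really written in recovers it.
repaired = legacy_bytes.decode("iso-8859-1")
```

Browsers and fonts differ in how they render invalid sequences, which may be why the same bytes can show up as a question mark on one system and as an unrelated character on another.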


 
Jane Luther
Germany
Local time: 07:53
German to English
a line in the search results that can't be corrected Mar 6, 2006

http://www.proz.com/?sp=ksearch

Second to last personal glossary result:
-> "&Tile Full Pages - Flä&che besteht aus ganzen Seiten"
The German side should read "Fläche besteht aus ganzen Seiten". None of the suggested conversions removes the & from Fläche.


 
Jason Grimes
Local time: 02:53
SITE STAFF
TOPIC STARTER
Where specifically should checkmark/arrow buttons be added? Mar 7, 2006

Jack Doughty wrote:

Would it be possible for users to set checkmarks against words or passages that are wrongly encoded, and then go through the usual procedure with them?


Hi Jack,

We're adding more of these Unicode conversion buttons all the time. Where specifically would you like to see them added?

Thanks,

Jason


 
Jason Grimes
Local time: 02:53
SITE STAFF
TOPIC STARTER
Improved conversion routines in KudoZ Mar 7, 2006

Hi Kathy,

We have improved the automatic Unicode conversion routines in KudoZ. Are you still seeing the problem you reported?

Thanks,

Jason


 
Jason Grimes
Local time: 02:53
SITE STAFF
TOPIC STARTER
Added more auto-conversion to profiles, portfolios, personal glossaries Mar 7, 2006

Hi Walter and Özden,

We have improved the automatic Unicode conversion in profiles, portfolios, and personal glossaries. Please let me know if you still see problems in these areas.

Thanks,

Jason


 
Jason Grimes
Local time: 02:53
SITE STAFF
TOPIC STARTER
Search results improved (but still not perfect) Mar 7, 2006

Hi Jane,

Thanks for the report.

We have improved search results for non-ASCII characters. Searches should no longer return unrelated results. Unfortunately, to accomplish this it was necessary to force the "whole words only" option to be selected when searching for non-ASCII characters. After more terms in the database have been converted to Unicode, we will convert our database indexes to Unicode so that you can again search for partial words containing non-ASCII characters.

An option has also been added to "Search for all likely character encodings". Selecting this option will cause your search to automatically be performed multiple times using the several most common character encodings for the source and target language you specify. For example, when searching for a Japanese to English term, the search will be performed using the UTF-8, Shift_JIS, ISO-8859-1, EUC-JP, and Windows-1252 encodings.
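
One way to picture the "all likely character encodings" option is sketched below. This is an assumption about the general approach, not a description of the site's implementation: the table `LIKELY_ENCODINGS` and the function `search_variants` are hypothetical names, and only the Japanese to English encoding list is taken from the example above. The search term is encoded once per candidate encoding, and each distinct byte sequence becomes one search against the stored data.

```python
# Hypothetical sketch; LIKELY_ENCODINGS and search_variants are illustrative names.
LIKELY_ENCODINGS = {
    ("Japanese", "English"): [
        "utf-8", "shift_jis", "iso-8859-1", "euc_jp", "windows-1252",
    ],
}


def search_variants(term: str, source_lang: str, target_lang: str):
    """Map each distinct byte form of term to the first encoding that produced it."""
    variants = {}
    encodings = LIKELY_ENCODINGS.get((source_lang, target_lang), ["utf-8"])
    for enc in encodings:
        try:
            encoded = term.encode(enc)
        except UnicodeEncodeError:
            continue  # the term cannot be written in this encoding at all
        variants.setdefault(encoded, enc)  # identical bytes need only one search
    return variants


# A Japanese term can be written in three of the five encodings listed above.
japanese = search_variants("翻訳", "Japanese", "English")

# A pure ASCII term has the same bytes everywhere, so one search is enough.
ascii_only = search_variants("test", "Japanese", "English")
```

De-duplicating on the encoded bytes keeps the number of repeated searches down, since many encodings agree on the ASCII range.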

Thanks,

Jason


 
KathyT  Identity Verified
Australia
Local time: 17:53
Japanese to English
Update from Kathy Mar 7, 2006

Jason Grimes wrote:
Hi Kathy,
We have improved the automatic Unicode conversion routines in KudoZ. Are you still seeing the problem you reported?


Hi Jason,
Yes, I'm afraid so. I just took a look and the specific example I posted before is exactly as it was then - none of the suggested alternative encodings is correct...

On the upside, it's the only time I've encountered that problem. In every other case, the correct encoding has been among the alternatives proposed instantly.

Thanks again, Kathy.


 
Jason Grimes
Local time: 02:53
SITE STAFF
TOPIC STARTER
Added attempts to convert powwow data to Unicode Mar 7, 2006

Hi Victor,

I have added auto-conversion routines to the powwow pages. Please let me know if you continue to see problems. Thanks for the report.

Thanks,

Jason


 
Walter Landesman  Identity Verified
Uruguay
Local time: 03:53
English to Spanish
better but.... Mar 7, 2006

Jason Grimes wrote:

Hi Walter and Özden,

We have improved the automatic Unicode conversion in profiles, portfolios, and personal glossaries. Please let me know if you still see problems in these areas.

Thanks,

Jason


Hi Jason,

Profile and portfolio are OK. Personal data (street in my case) is not right yet.

Walter


 