Garbled accented characters in source
Thread poster: Samuel Keating

Samuel Keating  Identity Verified
Canada
Local time: 14:40
English to French
Jun 4, 2012

When opening htm files written in French (all properly spelled, with no weird fonts or formatting), this is how the text in the "source" column of WF Pro 3 looks:

Cr{ut1}ation de soci{ut2}t{ut3} en Belgique

La Belgique est situ{ut1}e au c{ut2}ur politique de l'Union Europ{ut3}enne et se trouve {ut4}tre le pays le plus cosmopolite.

Etc.

This of course impacts the MT in the target column. For instance "Cr {ut1} ation of soci {ut2} t {ut3} in Belgium"

Why is this? And how can I get it to read text properly? It doesn't seem like that should be difficult, and none of the other users I know have this problem.

Meanwhile, when I try to upload the .htm file to WFA, I get a long error string ending with: "The system cannot find the file specified." This happens even though .htm files are listed as an allowed file format! (I have already retried many times, signed in again, etc.)

I'm able to upload the htm.txml file that WFP created of the htm file, but that doesn't solve the issue (in WFA, source reads: "Cration de socit en Belgique")

This is all not very cool...


[Edited at 2012-06-04 21:54 GMT]



Yasmin Moslem  Identity Verified
Egypt
Local time: 23:40
English to Arabic
No garbled Jun 5, 2012

Dear Samuel,

As far as I can see, it is not garbled, but rather coded. To double-check, open the HTML file in NotePad and let us know how the same sentences are written.

Kind regards,
Yasmin



Dominique Pivard  Identity Verified
Local time: 23:40
Finnish to French
Source HTML Jun 5, 2012

Yasmin Moslem wrote:
As far as I can see, it is not garbled, but rather coded. To double-check, open the HTML file in NotePad and let us know how the same sentences are written.

The page in question appears to be this one:

http://www.france-offshore.fr/societe-belgique

If you look at the source code, you will see this:



The website is probably CMS-based. Translatable text from it should be supplied in some format other than HTML.



Samuel Keating  Identity Verified
Canada
Local time: 14:40
English to French
TOPIC STARTER
TTX? Jun 5, 2012

Thanks.

I did get .ttx files as well, but I get the "this file contains no translatable segments" error.

I had some success with copying and pasting into Word and working from that, but surely that's a clunky way to do things?

I also got a TM with .iix, .mdf, .mtf, .mwf, and .tmw files. However, when trying to add it, WF doesn't see any of the files in the folder.

[Edited at 2012-06-05 07:04 GMT]


esperantisto  Identity Verified
Local time: 00:40
Member (2006)
English to Russian
Obsolete thing Jun 5, 2012

What you see is WFP’s way of presenting HTML entities, which were widely used to encode accented Latin letters, non-Latin letters and many other characters in the pre-Unicode era. Today those entities are unnecessary for such characters, as all common browsers and operating systems support Unicode, but some web designers still stick to them.

To rectify the issue, open the file(s) you need to translate in any text editor capable of properly rendering the entities and re-save them as UTF-8 without entities (don’t forget to update the charset declaration in the HTML head to match). I’d recommend UniRed for such a task.
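If no suitable editor is at hand, the same conversion can be scripted. The following is only a minimal sketch in Python, assuming the file uses ordinary HTML character references (named ones such as `&eacute;` or numeric ones such as `&#233;`); the thread does not show the actual markup. The names `decode_entities` and `convert_file` are made up for this illustration, and the source encoding passed to `convert_file` is an assumption you would need to check against your own file. References that stand for markup characters (`<`, `>`, `&`, quotes) are deliberately left alone, because decoding them would change the HTML itself.

```python
import re
from html import unescape
from html.entities import html5  # maps names like "eacute;" to characters
from pathlib import Path

# Matches named (&eacute;), decimal (&#233;) and hexadecimal (&#xE9;) references.
ENTITY = re.compile(r"&(?:#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")

# Characters that must stay escaped so the markup is not altered.
MARKUP_CHARS = set("<>&\"'")


def decode_entities(html_text: str) -> str:
    """Replace character references with literal characters, except markup-significant ones."""

    def replace(match: re.Match) -> str:
        entity = match.group(0)
        body = entity[1:]  # e.g. "eacute;" or "#233;"
        if body.startswith("#"):
            decoded = unescape(entity)
        else:
            decoded = html5.get(body)
            if decoded is None:  # unknown name: leave it untouched
                return entity
        if decoded in MARKUP_CHARS:
            return entity
        return decoded

    return ENTITY.sub(replace, html_text)


def convert_file(src: str, dst: str, src_encoding: str = "iso-8859-1") -> None:
    """Hypothetical helper: read src, decode entities, write dst as UTF-8.

    src_encoding is an assumption; use whatever the original file declares.
    """
    text = Path(src).read_text(encoding=src_encoding)
    Path(dst).write_text(decode_entities(text), encoding="utf-8")


print(decode_entities("Cr&eacute;ation de soci&eacute;t&eacute; en Belgique"))
# prints: Création de société en Belgique
```

The script does not touch the charset declaration in the file's head, so that still has to be changed to UTF-8 by hand before the converted file is opened in WFP.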

[Edited at 2012-06-05 06:54 GMT]



Dominique Pivard  Identity Verified
Local time: 23:40
Finnish to French
TTX, Trados TM Jun 5, 2012

Samuel Keating wrote:
I did get .ttx files as well, but I get the "this file contains no translatable segments" error.

If you wish to translate TTX files in Wordfast Pro, you need to follow this procedure:
http://www.wordfast.net/?kb=138-35

Samuel Keating wrote:
I also got a TM with .iix, .mdf, .mtf, .mwf, and .tmw files.

Together, these files make up a TM in the native (and proprietary) Trados Workbench format. You can't do anything with them unless you have a copy of Trados, so you need to ask for the TM in TMX format instead.



Samuel Keating  Identity Verified
Canada
Local time: 14:40
English to French
TOPIC STARTER
Thanks Jun 5, 2012

It seems to be all sorted out
