Trados Professional 2011 SP2 HTML 5 import / accent character problem
Thread poster: eTouristGuides
eTouristGuides
Spain
Local time: 12:07
Member (2010)
English to German
Aug 31, 2012

We have Trados Professional 2011 SP2 and are having problems preparing a translation project file. Our files are HTML 5.0 files, UTF-8 encoded, with all characters in their native form, i.e. not as entities. We are importing these files into our project, but accented characters in our source files are not displayed properly in the Trados project file. Could someone please give us some guidance on what we need to do to correct this?

These are our tests:
The following characters are not processed correctly when HTML5 files containing them are imported into a Trados
project.

#ÁáÉéÍíÑñÓóÚúÜü«»¿¡€₧
#ÄäÉéÖöÜüß«»°€£
#ÀàÂâÆæÇçÈèÉéÊêËëÎîÏïÔôŒœÙùÛûÜü«»€₣
#ÀàÁáÈèÉéÌìÍíÒòÓóÙùÚú«»€£

When the above characters are added in an HTML5 file and then the page
is imported to a Trados project and we prepare the Trados project, the
texts above are being converted to these:

#ÁáÉéÍíÑñÓóÚúÜü«»¿¡€₧
#ÄäÉéÖöÜüß«»°€£
#Àà ÂâÆæÇçÈèÉéÊêËëÎîÏïÔôŒœÙùÛûÜü«»€₣
#Àà ÁáÈèÉéÌìÍíÒòÓóÙùÚú«»€£
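
One common way accented characters get damaged on import is that the UTF-8 bytes are decoded with a single-byte code page. Whether Studio does this here is an assumption, not something confirmed in this thread. The sketch below only illustrates that general effect in Python; the function names are made up for the example and Latin-1 is chosen as the "wrong" codec purely because it maps every byte.

```python
def misdecode_utf8(text: str, wrong_codec: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


def repair_mojibake(garbled: str, wrong_codec: str = "latin-1") -> str:
    """Undo misdecode_utf8 by reversing the two steps."""
    return garbled.encode(wrong_codec).decode("utf-8")


sample = "éüñ"
garbled = misdecode_utf8(sample)
print(garbled)                             # Ã©Ã¼Ã±
print(repair_mojibake(garbled) == sample)  # True
```

If the text shown in the editor resembles the first printed line, the file was probably read with the wrong encoding rather than altered by the entity settings.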

Here are the details on the software.

SOFTWARE VERSION: SDL Trados Studio 2011 Professional SP2 - 10.2.3001.0

SOFTWARE SETTINGS:

(OPTION 1) We tried it using the default settings for Trados. No changes
were made.

(OPTION 2) Then we created a new HTML5 file type setting. This is the
information in the HTML5 settings.

File type name: HTML5
File Type identifier: HTML
Name of individual document: HTML5 Template Document
Name of document category: HTML5 Template Documents
File dialog wildcard expression:
*.htm;*.html;*.jsp;*.asp;*.aspx;*.ascx;*.inc;*.php;*.hhk;*.hhc

Elements and Attributes:

Entities: Convert Entities is ticked/checked. For the Convert Entities
settings, the "HTML Special" is ticked/checked. For the Entity Mapping
for "HTML Special", we ticked/checked the "amp", "gt", "lt", "quot"

Writers: tags is set to "Do not change charset value",
and the Unicode UTF-8 byte order mark (BOM) is set to "Remove if
present"

ACTIONS WHEN CREATING THE TRADOS PROJECT:
1. In the Project Type, we select SDL Trados
2. In the next window we enter the information needed such as the
project name, location and customer.
3. In the language selection, just select English (UK) as the source
language and "French (France)" as the target language.
4. In the Project Files selection, select the file that is attached in
this support question (sample-html5-page.html). You can take a look at
the page so you will know all settings e.g. the charset is UTF8. The
accented characters are in Native format and not in entity format.
5. Just click next until you reach the Project Summary screen.
6. In the Project Summary screen, there is a "Project Settings" at the
bottom right side of the screen. So click that.
7. For the file types, make sure that only the SDL XLIFF and HTML5
settings are selected then press ok.
8. Click finish to create the Project file.
9. Then open the file (sample-html5-page.html) in the project and you will
see that the accented texts

#ÁáÉéÍíÑñÓóÚúÜü«»¿¡€₧
#ÄäÉéÖöÜüß«»°€£
#ÀàÂâÆæÇçÈèÉéÊêËëÎîÏïÔôŒœÙùÛûÜü«»€₣
#ÀàÁáÈèÉéÌìÍíÒòÓóÙùÚú«»€£

were changed to

#ÁáÉéÍíÑñÓóÚúÜü«»¿¡€₧
#ÄäÉéÖöÜüß«»°€£
#Àà ÂâÆæÇçÈèÉéÊêËëÎîÏïÔôŒœÙùÛûÜü«»€₣
#Àà ÁáÈèÉéÌìÍíÒòÓóÙùÚú«»€£

We tested the same characters in an XHTML file and imported it using the
default XHTML File Type settings in Trados. The accented characters
appeared correctly when the XHTML file was added to a Trados
project. Therefore the issue may be:
1. The settings that we have for HTML5 are incorrect.
2. Trados is not fully HTML5 compatible.


Does anyone have any advice on how we can fix this issue, please?


esperantisto  Identity Verified
Local time: 13:07
Member (2006)
English to Russian
+ ...
Encoding declaration? Aug 31, 2012

Is the UTF-8 encoding declared properly in the HTML header?
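
A quick way to answer this outside Trados is to inspect the raw bytes of the file. The following is a minimal sketch, assuming the file is available as bytes; `inspect_encoding` and the regular expression are illustrative helpers written for this example and have nothing to do with Studio's own parser. It reports whether a BOM is present, which charset a `meta` tag declares within the first 1024 bytes (where the HTML specification expects the declaration), and whether the bytes are valid UTF-8.

```python
import codecs
import re

# Matches <meta charset="utf-8"> as well as the older
# <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> form.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def inspect_encoding(data: bytes) -> dict:
    """Report BOM presence, declared charset and UTF-8 validity."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    match = META_CHARSET.search(data[:1024])
    declared = match.group(1).decode("ascii").lower() if match else None
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {"bom": has_bom, "declared": declared, "valid_utf8": valid_utf8}


page = (
    '<!DOCTYPE html><html><head><meta charset="UTF-8">'
    "<title>t</title></head><body>ÁáÉé</body></html>"
).encode("utf-8")
print(inspect_encoding(page))
```

For a real file, pass `open(path, "rb").read()` instead of the inline sample. A result with `declared` equal to `utf-8` and `valid_utf8` true means the file itself is consistent, so any damage would be happening on the import side.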

eTouristGuides
Spain
Local time: 12:07
Member (2010)
English to German
TOPIC STARTER
Yes UTF8 declaration is correct: Aug 31, 2012

Hello, yes, we have declared UTF-8 correctly in our HTML 5 file. Here is the declaration:

pdstubbs
United Kingdom
Trados Studio 2011 doesn't seem to handle the charset correctly Nov 7, 2014

I've been having a similar problem. I have found that adding a byte order mark (BOM) to the UTF-8 encoded files allows them to be handled correctly. My attempts at adding the HTML 4 declaration made no difference.
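
For anyone who wants to try this workaround on a batch of files, here is a minimal Python sketch of prepending a BOM. The function names are hypothetical, and it assumes the files are already valid UTF-8; it raises an error instead of writing a BOM onto a file in some other encoding.

```python
import codecs
from pathlib import Path


def add_utf8_bom(data: bytes) -> bytes:
    """Return data with a UTF-8 BOM prepended, unless one is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    data.decode("utf-8")  # raises UnicodeDecodeError if not valid UTF-8
    return codecs.BOM_UTF8 + data


def add_bom_to_file(path: str) -> None:
    """Rewrite the file at path in place with a leading UTF-8 BOM."""
    p = Path(path)
    p.write_bytes(add_utf8_bom(p.read_bytes()))
```

Calling `add_bom_to_file("sample-html5-page.html")` before adding the file to the project would apply the workaround to the sample page named earlier in the thread. Keep a copy of the originals, since the files are rewritten in place.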
