Accents not displayed when cleaning xml in TagEditor
Thread poster: iamsara

iamsara
Local time: 05:58
English to Spanish
Jan 10, 2008

Hi, I have an XML file and the client didn't provide the DTD file.

My problem is that when I clean the document, accents are shown as ´. I want them to remain as accents (as they would if I edited the document in Dreamweaver), because the character is not recognized when I try to open the translated XML file. I've been trying different options in the default DTD (under the Entities tab), but no luck.

Could anyone help? Thanks in advance.



ViktoriaG  Identity Verified
Canada
Local time: 23:58
English to French
Off the top of my head Jan 11, 2008

I may be completely off the track here, but let's try this anyway.

Maybe it isn't the DTD that's causing you trouble. It may simply be that the encoding of your document is not the correct one. If you change the encoding of the XML document and translate it again using your existing TM, that may fix it.
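
As a rough illustration of what "changing the encoding" means at the byte level, here is a minimal Python sketch. It assumes the file is currently stored as windows-1252 (cp1252) and should become UTF-8; the function name `reencode` is made up for this example and has nothing to do with TagEditor itself.

```python
def reencode(data: bytes, source: str = "cp1252", target: str = "utf-8") -> bytes:
    """Decode bytes from the assumed source encoding and re-encode them in the target."""
    return data.decode(source).encode(target)


# "ó" is the single byte 0xF3 in cp1252 but the two bytes 0xC3 0xB3 in UTF-8.
original = "canción".encode("cp1252")
converted = reencode(original)
print(original, converted)
```

If the assumed source encoding is wrong, the accented characters will come out garbled or the decode step will raise an error, so it is worth checking the result on one known word first.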

All the best!



iamsara
Local time: 05:58
English to Spanish
TOPIC STARTER
Not working :( Jan 11, 2008

Thanks Viktoria, but I've tried that. There's no heading in the XML, nor anything referring to the encoding. I tried opening the XML in Notepad and saving it in each of the available encodings, but without any luck.


Marek Buchtel  Identity Verified
Czech Republic
Local time: 05:58
Member (2005)
English to Czech
Fiddle with DTD Jan 11, 2008

Hello,

You have probably already tried to change the DTD, but maybe you skipped a step in the procedure.

So here is what I would do:

When the file is opened in TE, go to Tools->Tag Settings and check which setting is ticked (has a red mark on it). That's the one used for the file.
Then CLOSE all files in TE, go to Tag Settings, select the respective file, go to Properties->Entities, and:
a) uncheck the box "Convert entities" - no entities will be converted (not even non-breaking space). If you find that you need some entities to be converted, check "Convert entities" and:
b) in "Entity sets", go to Added Latin 1 and in the right-hand panel, uncheck all entities that you don't want to be converted.

Press OK, then again OK in the next window.

Now open the XML FILE - i.e. NOT the ttx file you have created before, but the original XML.
Translate a few segments, then save as target to see if it works.
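
To make the effect of the "Convert entities" setting for the Latin 1 set more concrete, here is a small Python model of it. This is only a sketch of the behaviour described above, not TagEditor's actual code; `convert_entities` is a hypothetical name, and the assumption is that the Latin 1 set corresponds to the standard named entities for code points 160 to 255.

```python
from html.entities import codepoint2name


def convert_entities(text: str) -> str:
    """Replace Latin-1 characters (code points 160-255) with named entities."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if 160 <= ord(ch) <= 255 and name is not None:
            out.append(f"&{name};")  # e.g. "ó" becomes "&oacute;"
        else:
            out.append(ch)  # ASCII and everything else is left untouched
    return "".join(out)


print(convert_entities("canción"))
```

With conversion switched off for these characters, the text would pass through unchanged, which is the outcome wanted here.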

HTH



iamsara
Local time: 05:58
English to Spanish
TOPIC STARTER
UTF-8 Jan 13, 2008

Hi, Marek, thanks for your help. I'm trying that but still no luck.

The thing is, the original document's encoding is UTF-8. Therefore, if I edit it in Dreamweaver or Notepad and enter special characters, there's no problem.

The problem arises when I edit it in TagEditor: it saves the cleaned document in Windows Western European (Windows). If I then open the document in Dreamweaver and try to change the encoding back to UTF-8, it does not work, since TagEditor has changed the accents to "acute;" and they stay like that.
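
For a file that has already been cleaned with the accents turned into entities, the damage can in principle be undone outside TagEditor. The following Python sketch assumes the entities are the standard named ones (such as `&aacute;` or `&oacute;`); `restore_accents` and `XML_PREDEFINED` are names invented for this example. It leaves the five XML-specific entities alone so the file stays well-formed.

```python
import re
from html.entities import name2codepoint

# Entities that XML itself needs; these must stay escaped.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def restore_accents(text: str) -> str:
    """Turn named entities back into characters, except the XML-specific ones."""
    def repl(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep unknown or XML-specific entities as-is
        return chr(name2codepoint[name])

    return _ENTITY.sub(repl, text)


print(restore_accents("canci&oacute;n &amp; m&aacute;s"))
```

The result would then need to be saved as UTF-8 so the restored characters survive.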


BTW, I've checked in Dreamweaver and there's no DTD in the document.



Wojciech Froelich  Identity Verified
Poland
Local time: 05:58
English to Polish
It's not a change of encoding Jan 14, 2008

Dubloc wrote:

The problem arises when I edit it in TagEditor: it saves the cleaned document in Windows Western European (Windows). If I then open the document in Dreamweaver and try to change the encoding back to UTF-8, it does not work, since TagEditor has changed the accents to "acute;" and they stay like that.


TagEditor is not that stupid.
It checks the encoding (it should be defined in the header of the file) and it will keep it or adjust it in the target file (it will surely keep utf-8).

I guess the problem is the conversion of the entities. TagEditor will strictly follow the settings from the INI file (you can access these settings from TagEditor; try modifying the INI file with no document open). All you have to do is check the entity conversion settings and leave conversion on only for the XML-specific characters (it's usually also switched on for Latin-1 accented characters). Then you simply save the target or clean the document.



Wojciech Froelich  Identity Verified
Poland
Local time: 05:58
English to Polish
No heading? Jan 14, 2008

Dubloc wrote:

There's no heading in the XML, nor anything referring to the encoding.


To get the target file in the same encoding (utf-8 in this case), you have to define the encoding in the source file and then open it again in TE. Of course you also have to take care of the appropriate entities conversion (INI file settings).
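
Defining the encoding in the source file amounts to adding an XML declaration as its first line. Here is a minimal Python sketch of that step, assuming the file has no declaration yet and really is UTF-8 on disk; `ensure_xml_declaration` is a hypothetical helper name.

```python
def ensure_xml_declaration(text: str, encoding: str = "UTF-8") -> str:
    """Prepend an XML declaration naming the encoding, unless one is already there."""
    body = text.lstrip("\ufeff")  # ignore a leading byte order mark, if any
    if body.startswith("<?xml"):
        return text  # already declared; leave the file alone
    return f'<?xml version="1.0" encoding="{encoding}"?>\n{body}'


print(ensure_xml_declaration("<root>canción</root>"))
```

The declaration only labels the file: the bytes must actually be written in the encoding it names, otherwise the accented characters will still be misread.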



iamsara
Local time: 05:58
English to Spanish
TOPIC STARTER
got it Jan 14, 2008

Hi Wojciech



Wojciech Froelich wrote:

TagEditor is not that stupid.
It checks the encoding (it should be defined in the header of the file) and it will keep it or adjust it in the target file (it will surely keep utf-8).



The file I'm translating does not have a heading of the type <?xml version="1.0"?>.

If I open it with Notepad to Save As and change the encoding to UTF-8, it says the document is in ANSI. However, if I open it in DW, it says the encoding is UTF-8. But if I open it in TagEditor, translate it and then check the document properties, it shows windows-1252 as the original encoding.

What I've done is change the encoding by opening the document in Notepad and saving it as UTF-8, so that once it is translated TE reports the original encoding as UTF-8 rather than windows-1252, and then uncheck entity conversion. It seems to work OK.
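
A likely reason the Notepad step helps is that Notepad of that era, as far as is known, writes a byte order mark (BOM) when saving as UTF-8, which gives a tool something to detect even without an XML declaration. Whether TagEditor actually relies on the BOM is an assumption here, not something confirmed in this thread. The Python sketch below shows the same kind of save; both function names are made up for the example.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def save_as_utf8_with_bom(text: str) -> bytes:
    """Encode text as UTF-8 with a leading BOM (Python's 'utf-8-sig' codec)."""
    return text.encode("utf-8-sig")


data = save_as_utf8_with_bom("<root>á</root>")
print(has_utf8_bom(data), has_utf8_bom("á".encode("utf-8")))
```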

Thanks all!

