Can we convert a web page from ISO-8859-1 to UTF-8? Thread poster: Medworks
| Medworks United States Local time: 14:24 Italian to English + ...
Hello! An agency that gives me a large volume of localization work is no longer accepting RTF format and is asking me to use Trados, so I am. I actually like Trados better now, and have been working fine with their TM and DTD in HTML format, and am able to upload the completed translation files in .ttx format. However, they also asked if I know how to convert a web page from ISO-8859-1 to UTF-8. So far, the pages I've received have said UTF-8 in Trados TagEditor. Can I convert the encoding with Trados TagEditor? | | |
Martha wrote: Hello! An agency that gives me a large volume of localization work is no longer accepting RTF format and is asking me to use Trados, so I am. I actually like Trados better now, and have been working fine with their TM and DTD in HTML format, and am able to upload the completed translation files in .ttx format. However, they also asked if I know how to convert a web page from ISO-8859-1 to UTF-8. So far, the pages I've received have said UTF-8 in Trados TagEditor. Can I convert the encoding with Trados TagEditor? Open the web page in UltraEdit and save it as UTF-8. You will also have to change the meta tag from <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2"> (or whichever charset the page currently declares, iso-8859-1 in your case) to <meta http-equiv="Content-Type" content="text/html; charset=utf-8">. This tag can be found near the beginning of the file. For sure there are automatic converters for html files but I can't think of any off the top of my head right now. HTH Piotr | | | Microsoft Word should also do the trick | Sep 27, 2007 |
Piotr wrote: For sure there are automatic converters for html files but I can't think of any off the top of my head right now. Although using UltraEdit is an excellent tip, I would do it with MS Word, for the very simple reason that I feel more comfortable with Word's macro language than with UltraEdit's macros. Because the job is likely to involve a large number of files, I would try to use some automated solution. Daniel | | | Medworks United States Local time: 14:24 Italian to English + ... TOPIC STARTER
Thank you Daniel!! Your response was brilliant in its simplicity! I opened the web files with Word and followed these steps: ----------------------------------- Tools> Options> General> Web Options> Encoding ------------------------------------ It asked for the target encoding to reload and save! That's it! Very simple.. To t... | |
esperantisto Local time: 00:24 Member (2006) English to Russian + ... SITE LOCALIZER Do not do that! | Sep 27, 2007 |
Daniel wrote: "I would do it with MS Word" For the very simple reason that Microsoft Word notoriously inserts lots of shit into HTML code, and your client may not be happy with it. Besides UltraEdit, there are plenty of text/HTML editors capable of performing the task (jEdit and UniRed are my favorites) without harming any tag. | | | Medworks United States Local time: 14:24 Italian to English + ... TOPIC STARTER Even better!! :) | Sep 27, 2007 |
esperantisto wrote: "I would do it with MS Word" For the very simple reason that Microsoft Word notoriously inserts lots of shit into HTML code, and your client may not be happy with it. Besides UltraEdit, there are plenty of text/HTML editors capable of performing the task (jEdit and UniRed are my favorites) without harming any tag. Hehe... you're right! When I opened the file with Notepad earlier, I immediately saw the extra code. It even included my name, the time and date of last modification, etc. No, that's not good at all! Too bad, it seemed so nice and simple! I looked at www.jedit.org and saw the program. It's even a free open-source program! Thanks! I'll download it and give it a try (and compare the code). All my Italian files have had the right UTF-8 tag, but it's good to know what to do, just in case. Thanks again!... | | | Medworks United States Local time: 14:24 Italian to English + ... TOPIC STARTER It couldn't open Trados file.. | Sep 27, 2007 |
I downloaded and ran jEdit, and it looks like a good program. I went to Utilities> Global Options> Encoding. Though it opened my sample HTML files, it could not open the files I receive from the agency, which are formatted for Trados TagEditor... In Trados TagEditor, I noticed that at the very top of the file when I'm translating it says: ---------------------------------------------------- meta...content= text/html;charset=utf-8 ----------------------------------------------------- If a file declared some other character encoding, couldn't I just write utf-8 there when needed and everything would be all right? Of course, I would preview to double-check that the text appears correctly. | | | Jaroslaw Michalak Poland Local time: 23:24 Member (2004) English to Polish SITE LOCALIZER
This program, Unifier, seems to do exactly what you want: http://www.melody-soft.com/html/unifier.html It converts batches, adds the appropriate tags, etc. Note that I haven't tried it, so I cannot tell how good it is (better back up your files first...). Also, there might be some freeware tools, but I haven't found any yet... Edit: I have just noticed that you are trying to convert the ttx files themselves instead of the html files. I am not sure that is a good idea, as the encoding of the ttx itself (i.e. xml) might be different from that of the input/output html files.
[Edited at 2007-09-27 11:38] | |
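For anyone who would rather script this than try an untested tool, here is a rough Python sketch of the same batch idea: back up each file, re-encode it, and update the charset declaration. Nothing in it comes from Unifier or from the posts above; the function name `convert_folder`, the `.bak` naming, and the regular expression are my own assumptions, and it assumes the files really are ISO-8859-1 to begin with. It is meant for the plain html files, not the ttx files.

```python
import re
import shutil
from pathlib import Path

# Hypothetical helper pattern: matches the charset value in a declaration such as
# <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
CHARSET_RE = re.compile(r"(charset\s*=\s*[\"']?)[\w.:-]+", re.IGNORECASE)


def convert_folder(folder, source_encoding="iso-8859-1"):
    """Convert every .html/.htm file in `folder` to UTF-8, keeping .bak copies.

    Assumes each file is currently stored in `source_encoding`.
    Returns the names of the converted files.
    """
    converted = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() not in {".html", ".htm"}:
            continue
        # Back up the original bytes before touching the file.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        # Decode with the old encoding; working on bytes keeps line endings as they are.
        text = path.read_bytes().decode(source_encoding)
        # Point the charset declaration at the new encoding.
        text = CHARSET_RE.sub(r"\g<1>utf-8", text)
        path.write_bytes(text.encode("utf-8"))
        converted.append(path.name)
    return converted
```

Run it once per folder only. Decoding as ISO-8859-1 never fails, so a second pass over files that are already UTF-8 would silently garble the accented characters and overwrite the backups.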
Sorry! I meant "open as encoded text" with MS-Word! :-( | Sep 27, 2007 |
Martha wrote: esperantisto wrote: "I would do it with MS Word" For the very simple reason that Microsoft Word notoriously inserts lots of shit into HTML code, and your client may not be happy with it. Besides UltraEdit, there are plenty of text/HTML editors capable of performing the task (jEdit and UniRed are my favorites) without harming any tag. Hehe... you're right! When I opened the file with Notepad earlier, I immediately saw the extra code. It even included my name, the time and date of last modification, etc. No, that's not good at all! Too bad, it seemed so nice and simple! I looked at www.jedit.org and saw the program. It's even a free open-source program! Thanks! I'll download it and give it a try (and compare the code). All my Italian files have had the right UTF-8 tag, but it's good to know what to do, just in case. Thanks again!... Sorry! I meant open the HTML files from MS Word as "encoded text" to do the conversion, not as HTML. I should have explained more clearly... Of course, esperantisto is right, and you have seen it yourself. Opening an HTML file as HTML will insert a lot of code... Apologies again for the confusion... Daniel | | | esperantisto Local time: 00:24 Member (2006) English to Russian + ... SITE LOCALIZER Glad that you like jEdit | Sep 27, 2007 |
Forgot to say, download and use the latest 4.3 pre10 version. Although it's said to be unstable, that seems to be just developers' caution: I find it rock stable and fairly improved compared to 4.2. Note also that there are lots of plugins that can make your life a bit more comfortable. Just explore the respective section of their site. | | | Robert Tucker (X) United Kingdom Local time: 22:24 German to English + ... | No macros needed for that in UltraEdit | Sep 27, 2007 |
dgmaga wrote: For sure there are automatic converters for html files but I can't think of any off the top of my head right now. Although using UltraEdit is an excellent tip, I would do it with MS Word, for the very simple reason that I feel more comfortable with Word's macro language than with UltraEdit's macros. Because the job is likely to involve a large number of files, I would try to use some automated solution. Daniel If Word does that without creating any additional problems, then it's all right with me, but Word is known for causing problems with HTML files and introducing its own peculiar markup. In UltraEdit it's a simple open and Save As (F12) operation, plus changing a few characters in the meta tag. No macros/scripts required. Of course, if you want to convert a lot of files, you need a batch converter. Regards, Piotr
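To illustrate why the Save As step and the meta tag change go together (and why just typing utf-8 into the tag is not enough), here is a small Python sketch. The function names and the sample page are made up for the example; the only assumption is that the original page is stored as ISO-8859-1.

```python
def relabel_only(raw: bytes) -> bytes:
    """Change the declared charset but leave the stored bytes untouched."""
    return raw.replace(b"charset=iso-8859-1", b"charset=utf-8")


def save_as_utf8(raw: bytes) -> bytes:
    """Decode as ISO-8859-1, update the declaration, and re-encode as UTF-8."""
    text = raw.decode("iso-8859-1")
    text = text.replace("charset=iso-8859-1", "charset=utf-8")
    return text.encode("utf-8")


def is_valid_utf8(raw: bytes) -> bool:
    """Report whether the bytes can be read back as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample page, stored as ISO-8859-1.
page = (
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    "<p>città</p>"
).encode("iso-8859-1")

print(is_valid_utf8(relabel_only(page)))  # False: the accented letter is still one ISO-8859-1 byte
print(is_valid_utf8(save_as_utf8(page)))  # True: bytes and declaration now agree
```

Pages that contain only unaccented ASCII text are the exception: those bytes are identical in both encodings, so changing the tag alone happens to work for them.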
[Edited at 2007-09-27 19:57]