Filtering out Chinese/English only
Thread poster: Krzysztof Pawliszak

Krzysztof Pawliszak
Poland
Local time: 02:35
English to Polish
+ ...
Aug 11, 2014

Hello all,
I received a fairly badly formatted Word document with text in two languages, namely Chinese and English, where the English translation follows the Chinese source paragraph by paragraph.
I would like to translate the document from English (or rather, use English as the source) in Trados, which means I have to get rid of the Chinese. I tried "Select text with similar formatting", but as I said, the document is badly formatted, so selecting one part of it may pick up English bits as well. I tried changing the formatting for the entire file, but that doesn't work either. Doing it manually, paragraph by paragraph, would take a lot of time, as the doc is pretty long.

I was wondering whether there is any filter (or macro) that would help me do it in MS Word or perhaps a Trados filter that would skip Chinese (or English)?



Tony M  Identity Verified
France
Local time: 02:35
Member
French to English
+ ...
Formatting as table? Aug 11, 2014

Is the separation between the paragraphs reliable enough to be able to 'convert text to table', splitting it at paragraph breaks, so that you end up with an EN column and a Chinese one, and then simply delete the Chinese column?
If it would work at least MOST of the time, it might save you a bit of time...
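
The two-column idea can be tried outside Word first to see how reliable the alternation really is. Below is a minimal Python sketch, assuming the paragraphs have already been extracted into a list of strings and that a paragraph counts as Chinese if it contains at least one character from the basic CJK ideograph block. The function name `split_pairs` is hypothetical.

```python
import re

# Rough test for Chinese text: any character in the CJK Unified Ideographs block.
# (Assumption: this block is enough to recognise the Chinese paragraphs.)
CJK = re.compile(r"[\u4e00-\u9fff]")

def split_pairs(paragraphs):
    """Pair paragraphs as (Chinese, English) rows, like a two-column table.

    Returns the English column plus the indexes of rows that do not look
    like a clean Chinese/English pair and need a manual check.
    """
    rows = [tuple(paragraphs[i:i + 2]) for i in range(0, len(paragraphs), 2)]
    english, suspect = [], []
    for n, row in enumerate(rows):
        clean = (
            len(row) == 2
            and CJK.search(row[0]) is not None   # left cell should be Chinese
            and CJK.search(row[1]) is None       # right cell should be English
        )
        if clean:
            english.append(row[1])
        else:
            suspect.append(n)
    return english, suspect
```

If `suspect` comes back long, the alternation is probably broken somewhere (a missing or extra paragraph shifts every row after it), and the table approach would need manual fixing at those rows.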



jyuan_us  Identity Verified
United States
Local time: 20:35
Member (2005)
English to Chinese
+ ...
Why do you need to translate it? Aug 11, 2014

Krzysztof Pawliszak wrote:

namely Chinese and English, where English translation follows Chinese source paragraph by paragraph.
I would like to translate the document from English (or rather use English as the source) in Trados.


You mentioned "where English translation follows Chinese source paragraph by paragraph", why do you need to translate it?



Phil Hand  Identity Verified
China
Local time: 08:35
Chinese to English
There's a great filter in your eyes... Aug 12, 2014

Sorry to be snarky, but sometimes it's quicker just to do something than worry about the technology. Unless the document is more than about 30 pages long, going through deleting the Chinese paragraphs would take all of about two minutes.

Or you can just put the whole thing into Trados and just translate the English parts.



Orrin Cummins  Identity Verified
Japan
Local time: 09:35
Japanese to English
+ ...
regex Aug 12, 2014

jyuan_us wrote:

Krzysztof Pawliszak wrote:

namely Chinese and English, where English translation follows Chinese source paragraph by paragraph.
I would like to translate the document from English (or rather use English as the source) in Trados.


You mentioned "where English translation follows Chinese source paragraph by paragraph", why do you need to translate it?


My guess is that he wants to translate it into Polish.

I'm not that knowledgeable about regular expressions but I think you start with something like this:

[!a-z A-Z 0-9]


With "Use wildcards" ticked in Word's Find/Replace, this matches any single character that is not a Latin letter, a digit, or a space. If you leave the Replace field blank, it should delete every such character when you click "Replace All." Note that the class is negated, so it also matches punctuation, English as well as Chinese; you may need to add the English punctuation you want to keep inside the brackets.
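
To see what a class like this keeps and removes before running it on the real document, here is a rough Python counterpart. It is only a sketch: Python's `re` syntax negates with `^` where Word uses `!`, and the names `NOT_LATIN`, `NOT_LATIN_OR_PUNCT` and `strip_chars` are made up for the illustration.

```python
import re

# Python counterpart of the Word wildcard class [!a-z A-Z 0-9]:
# any single character that is NOT a Latin letter, a digit, or a space.
NOT_LATIN = re.compile(r"[^a-zA-Z0-9 ]")

# Same idea with some English punctuation added to the class so it is kept.
NOT_LATIN_OR_PUNCT = re.compile(r"[^a-zA-Z0-9 .,;:!?()-]")

def strip_chars(text, pattern):
    """Delete every character matched by the pattern."""
    return pattern.sub("", text)

sample = "你好。Hello, world!"
print(strip_chars(sample, NOT_LATIN))           # Hello world
print(strip_chars(sample, NOT_LATIN_OR_PUNCT))  # Hello, world!
```

The first pattern removes the English comma and exclamation mark along with the Chinese, which is why the punctuation to keep has to be listed inside the brackets.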



Krzysztof Pawliszak
Poland
Local time: 02:35
English to Polish
+ ...
TOPIC STARTER
Manually Aug 12, 2014

Orrin Cummins wrote:

jyuan_us wrote:

Krzysztof Pawliszak wrote:

I would like to translate the document from English (or rather use English as the source) in Trados.

why do you need to translate it?


My guess is that he wants to translate it into Polish.

That's correct. Either from Chinese (original) or English into Polish.
Orrin Cummins wrote:
I'm not that knowledgeable about regular expressions but I think you start with something like this:

[!a-z A-Z 0-9]

I was considering regex. I'm not too familiar with it either. But the problem with regex was the issue with names written in Chinese and left in English (or the other way round).

Phil wrote: There's a great filter in your eyes...

Eventually I did as Phil suggested. Although the document was over 30 pages long and it took me some time, I managed to delete the Chinese, and the bits I missed I just left untranslated in Trados.

Thank you all!



Mikhail Zavidin
Ukraine
Local time: 03:35
English to Russian
Find/Replace Aug 12, 2014

Hi.
In the Find/Replace dialogue, click in the Find field, click the Format button, choose Language and select Chinese. Leave the Replace With field empty so that the matches are replaced with an empty string.
Then press the Replace All button and that's it.

P.S. Make sure you have a copy of the original file.

[Edited at 2014-08-12 16:19 GMT]



Krzysztof Pawliszak
Poland
Local time: 02:35
English to Polish
+ ...
TOPIC STARTER
Clears the entire document Aug 12, 2014

Mikhail Zavidin wrote:

Hi.
In the Find/Replace dialogue in the Find field you should click the Format button and choose Chinese language to replace with an empty string which you will place in the Replace With field.
Then you press Replace All button and that's it.

P.S. Make sure you have a copy of the original file.

[Edited at 2014-08-12 16:19 GMT]


For some reason it clears the entire document. When I put the cursor in a Chinese paragraph, Word recognises it as Chinese in the language bar (at the bottom of the screen). English text appears as English (USA). When I do as you suggested, Mikhail, then after accepting the changes the entire doc is replaced with whatever I put in Replace With.
As I said earlier, perhaps it is due to the formatting of the document.



Orrin Cummins  Identity Verified
Japan
Local time: 09:35
Japanese to English
+ ...
This doesn't delete everything, unfortunately Aug 12, 2014

Mikhail Zavidin wrote:

Hi.
In the Find/Replace dialogue in the Find field you should click the Format button and choose Chinese language to replace with an empty string which you will place in the Replace With field.
Then you press Replace All button and that's it.

P.S. Make sure you have a copy of the original file.

[Edited at 2014-08-12 16:19 GMT]


This catches a lot of it, but it doesn't get everything for some reason (at least not with Japanese). Unfortunately, neither does regex.

One thing that did work for me on a test document I just tried was performing the Find using the font of the Japanese text rather than trying to select it by language. That got all of it, even the stubborn places that the regex couldn't erase. If all of the foreign text has the same font (and that font is different from the English one, as is usually the case with Asian scripts), this might be a viable option.


Direct link Reply with quote
 

Mikhail Zavidin
Ukraine
Local time: 03:35
English to Russian
Well, then Find/Replace again Aug 14, 2014

Then, if it's not too late, try the following in the Find/Replace dialogue:

1. Find: ^0013[!^0013]@([!^0013-^0255]@)[!^0013]@^0013
2. Replace With: ^p
3. Tick the Wildcards checkbox
4. Press Replace All

This may help.
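
As far as the pattern above can be read, it works on whole paragraphs: it looks for a paragraph that contains a run of characters outside the range 13 to 255 and replaces the whole paragraph with a single paragraph mark. A comparable paragraph-level filter can be sketched in Python. This is not a translation of the Word pattern. It assumes the paragraphs are already available as a list of strings, tests for three common CJK Unicode blocks instead of "anything above 255" (so that curly quotes in English text do not count), and uses a hypothetical 50% threshold so that an English paragraph with a Chinese name left in it is still kept.

```python
import re

# CJK punctuation, CJK ideographs, and full-width forms.
# (Assumption: these three blocks cover the Chinese text in the document.)
CJK = re.compile(r"[\u3000-\u303f\u4e00-\u9fff\uff00-\uffef]")

def cjk_ratio(paragraph):
    """Share of non-space characters in the paragraph that are CJK."""
    chars = [c for c in paragraph if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if CJK.match(c)) / len(chars)

def keep_english(paragraphs, threshold=0.5):
    """Drop paragraphs that are mostly CJK; keep everything else."""
    return [p for p in paragraphs if cjk_ratio(p) < threshold]

doc = [
    "本协议由张伟签署。",
    "This agreement is signed by Zhang Wei (张伟).",
]
print(keep_english(doc))
```

Short mixed paragraphs, such as a Chinese label followed by a Latin-script company name, can land on either side of the threshold, so the result still needs a read-through.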



