Extracting the source text
Thread poster: Noha Kamal, PhD.

Noha Kamal, PhD.  Identity Verified
Local time: 20:43
English to Arabic
Aug 30, 2007


I am currently working on a regular English-Arabic text. When I am done, I will clean up the file, which will of course delete the English source text and leave me with only the Arabic target. Now my question is: is there any way I could extract the English source text into a separate file? The client wants a clean version with a column on the left containing the English and another on the right containing the Arabic. So, how do I get the English (all clean and tidy) after the clean-up?



Shouguang Cao
Local time: 02:43
English to Chinese
A common problem. Aug 30, 2007

Noha, here's what I do.

1. Draw a table with two columns.
2. Copy the English source text into both columns (that is, two identical copies).
3. Translate the left column.
4. Clean up the file.


Noha Kamal, PhD.  Identity Verified
Local time: 20:43
English to Arabic
I am actually doing proofreading, Dallas Aug 30, 2007

Hi Dallas,

Thanks for your reply. Well, I guess it is a bit more complicated than that, because I am actually proofreading a text that has already been translated. So the problem is that I do not have the source text on its own. I only have the bilingual file with both languages intermingled. Any hope?

Thanks again,


Narcis Lozano Drago  Identity Verified
Local time: 20:43
Member (2007)
English to Spanish
Try this, please Aug 30, 2007

Have you tried opening the file (I assume it's a .doc with all the tags from Workbench) with SDLX? In the test I ran, it correctly separated the text into two columns: source and translation. You will only have to copy the source text.

Hope it also works for you,


Noha Kamal, PhD.  Identity Verified
Local time: 20:43
English to Arabic
Do not have SDLX at the moment Aug 30, 2007

Unfortunately, I do not have SDLX at the moment. Is there any way I could do that using Trados?


Eugene Gulak  Identity Verified
Local time: 21:43
Member (2007)
English to Russian
Use "Replace" command Aug 30, 2007

Hi, Noha!

If you are working with a .doc file, try using the following procedure.

The trick is to eliminate the target text, which is always enclosed in Trados tags and is not hidden.

1. Save a copy of the file.
2. Select Edit - Replace (the names of the menu options could be a little different, as I use the Russian version of Word) or press Ctrl+H.
3. Click More to expand the Find/Replace dialog.
4. Be sure to check the box Use wildcards.
5. Enter the following string into the "Find" field, without the commas and spaces (I can't put the string directly here because it gets confused with HTML tags, and I am not a great specialist in HTML):

back slash, left curly bracket, back slash, greater-than sign, asterisk, back slash, less-than sign, 0, back slash, right curly bracket

Leave the "Replace with" field empty.

The trick here is to replace the target text (matched by the asterisk) between the opening and closing Trados tags with nothing. A back slash is needed before certain characters so that Word doesn't confuse them with wildcards; the key for back slash is usually to the left of the Backspace key on the keyboard.

6. Click Replace all.

This should delete your target text and some of the Trados tags. Next you'll have to eliminate the remaining Trados tags. They are always purple, so do the following.

7. Clear "Find" string.
8. Clear the check box "Use wildcards".
9. Click in the "Find" field.
10. In the lower part of the Find/Replace dialog, click Format - Character (or Font - I am not sure which it is in the English version).
11. In the Text color field (or Character color or Font color - not sure again), select the color of the Trados tags. It should be in the third row, seventh column (to be sure of the exact tag color, just highlight any tag in the text and select Format - Character (Font) in the main menu: you'll see it in the color matrix).
12. Click OK. You return to the Find/Replace dialog, where you are now going to replace text of purple (or whatever it is) color with nothing.
13. Click Replace All.

Now you have got rid of all the tags and the target text. The last step is to make the text non-hidden.

14. Click Edit - Select All or press Ctrl+A.
15. Select Format - Character (Font).
16. Click the check box "hidden" twice so that it is empty (neither green nor checked).
17. Click OK.
18. Save the file.

Now you should have a nice, clean source text. The procedure is actually fairly simple. I hope it helps. Just be sure to enter the exact string at step 5.
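
For anyone who has the bilingual text available as plain text rather than inside Word, the same idea can be sketched in Python. This is only an illustration under assumptions, not part of the Word procedure: it assumes the usual Workbench segment layout `{0>source<}NN{>target<0}` (NN being the match value), and the names `extract_source` and `segment_pairs` are made up for this example. The first pattern is the regex equivalent of the step 5 search, which written out reads `\{\>*\<0\}` in Word's wildcard syntax. Hidden-text formatting and tag colors do not exist in plain text, so steps 7 to 17 collapse into a second, purely textual replace.

```python
import re

# Assumed Workbench segment layout: {0>source<}NN{>target<0}
# Same span as the Word wildcard search in step 5: "{>" ... "<0}".
TARGET = re.compile(r"\{>.*?<0\}", re.DOTALL)
# Tags left over after the first pass: the opening "{0>" and the "<}NN" marker.
LEFTOVER_TAGS = re.compile(r"\{0>|<\}\d+")
# One whole segment, capturing source and target separately.
SEGMENT = re.compile(r"\{0>(.*?)<\}\d+\{>(.*?)<0\}", re.DOTALL)


def extract_source(bilingual: str) -> str:
    """Drop the target text and all segment tags, keeping only the source."""
    without_target = TARGET.sub("", bilingual)
    return LEFTOVER_TAGS.sub("", without_target)


def segment_pairs(bilingual: str) -> list[tuple[str, str]]:
    """Return (source, target) pairs, e.g. to fill a two-column table."""
    return SEGMENT.findall(bilingual)


# Hypothetical sample text, two segments separated by a space.
sample = "{0>Hello world.<}0{>مرحبا بالعالم.<0} {0>Good luck!<}100{>حظا سعيدا!<0}"

print(extract_source(sample))  # Hello world. Good luck!
for source, target in segment_pairs(sample):
    print(source, target, sep="\t")
```

Note that `extract_source` keeps any text that was never segmented, while `segment_pairs` only returns text that sits inside segment tags. As with the Word procedure, work on a copy of the file.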

Good luck!


Noha Kamal, PhD.  Identity Verified
Local time: 20:43
English to Arabic
You are incredible!!! Aug 30, 2007

Hi Eugene,

You are a life-saver! Has anyone ever told you that? Yes, the procedure, though long, makes perfect sense. I will follow it to the letter. Thanks a zillion times, friend :)

