Cleaning up a file
Thread poster: A_Fangrath

A_Fangrath  Identity Verified
Local time: 07:38
English to German
+ ...
Sep 8, 2010

Hello,

sorry if I sound like an amateur, I'm not using Trados yet.
I have a .txt file in English, and the translation has been written just below the
English text.

Is there a way to "clean up" this file and remove the original text?


 

freddy7  Identity Verified
France
Local time: 07:38
English to French
+ ...
python Sep 8, 2010

I don't know if there is an automated tool to do that.

If the text file alternates strictly, one line English, one line other language, and it's too long to be processed by hand, maybe a Python script could do the job quickly.


 

A_Fangrath  Identity Verified
Local time: 07:38
English to German
+ ...
TOPIC STARTER
that's what I need Sep 8, 2010

The text is too long for manual processing. I'm not sure what a Python script is... I'd appreciate it if you could give me a short explanation.
Otherwise I'll be stuck deleting lines for a couple of days....


 

Jaroslaw Michalak  Identity Verified
Poland
Local time: 07:38
Member (2004)
English to Polish
More information Sep 8, 2010

Is the file you need to clean up a pure txt file? Are the languages marked somehow?

 

A_Fangrath  Identity Verified
Local time: 07:38
English to German
+ ...
TOPIC STARTER
pure txt Sep 8, 2010

it's a pure txt,

the segments are not marked,

one line English

one line German

what a joy...


 

Jaroslaw Michalak  Identity Verified
Poland
Local time: 07:38
Member (2004)
English to Polish
Word Sep 8, 2010

In this case I think your best bet would be copying the contents to Word with the option "Detect language" turned on. It does not always work perfectly, but still it is better than manual deletion.

After that just search and replace the given language with e.g. a space. You will get extra paragraphs, naturally, but this can be dealt with, too.

[Edited at 2010-09-08 18:58 GMT]


 

A_Fangrath  Identity Verified
Local time: 07:38
English to German
+ ...
TOPIC STARTER
good idea Sep 8, 2010

I'll try that, thanks.

 

Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 13:38
Member (2004)
English to Thai
+ ...
MS Excel sorting macro Sep 9, 2010

Copy and paste the entire text into MS Excel cells, one line per row. By sorting the data on every other line, you get the source paragraphs and the target paragraphs in two successive groups, which you can then delete or save to another file (Word, txt, Unicode document, etc.). An Excel sorting macro can be found in Excel Help or in the Microsoft knowledge base.

Soonthon Lupkitaro


[Edited at 2010-09-09 01:17 GMT]
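
The same "every other line" grouping can also be sketched in Python with list slicing. This is only a rough illustration, not the Excel macro itself: the function name split_alternating and the sample lines are made up for the example, and it assumes the lines strictly alternate between the two languages.

```python
def split_alternating(lines):
    """Return two lists: lines 1, 3, 5, ... and lines 2, 4, 6, ..."""
    # lines[0::2] takes every second item starting at the first line,
    # lines[1::2] takes every second item starting at the second line.
    return lines[0::2], lines[1::2]


# Hypothetical sample: English and German lines alternating.
sample = ["Hello", "Hallo", "Thank you", "Danke"]
english, german = split_alternating(sample)
print(english)
print(german)
```

If the two languages ever get out of step (a missing or extra line), everything after that point lands in the wrong group, exactly as it would with the sorting approach.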


 

Jaroslaw Michalak  Identity Verified
Poland
Local time: 07:38
Member (2004)
English to Polish
Yet another way... Sep 9, 2010

Yet another way is to open the file in Word, select all the content and use the option "Convert text to table". Select paragraph mark as the separator and specify the number of columns as "2".

This, however, requires that the text structure is very strict, i.e. every second paragraph (or "line" in text file) is in another language (i.e. no line breaks within the same language).
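
Whether a file meets this strict structure can be checked roughly before converting it. The following Python sketch is only an illustration under my own assumptions: the function name find_structure_problems is made up, and it can only spot blank lines and an odd line count, not a paragraph that was broken across two lines in the same language.

```python
def find_structure_problems(lines):
    """Return a list of problems; an empty list means no obvious problem was found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        # A blank line shifts every following line into the wrong column.
        if line.strip() == "":
            problems.append("line %d is blank" % number)
    # Strict pairs require an even number of lines.
    if len(lines) % 2 != 0:
        problems.append("odd number of lines (%d)" % len(lines))
    return problems


# Hypothetical sample with a stray blank line in the middle.
sample = ["Hello", "Hallo", "", "Thank you", "Danke"]
for problem in find_structure_problems(sample):
    print(problem)
```

An empty result does not prove the file is clean, so a quick look at the resulting table is still worthwhile.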


 

freddy7  Identity Verified
France
Local time: 07:38
English to French
+ ...
python Sep 9, 2010

Install Python:
http://www.python.org/download/releases/2.6.6/

Copy the following into a file called txtclean.py:
# txtclean.py
#
# text cleaning
# fc 09/09/10

# open the original file for reading, and 2 files to save the cleaned versions
orig = open('toto.txt')
lang1 = open('toto1.txt', 'w')
lang2 = open('toto2.txt', 'w')

while True:
    # read 2 successive lines
    line1 = orig.readline()
    if line1 == "":
        break  # stop at end of file
    line2 = orig.readline()
    if line2 == "":
        break  # stop at end of file (an unpaired last line is dropped)
    lang1.write(line1)
    lang2.write(line2)

# close all files
orig.close()
lang1.close()
lang2.close()


and replace toto.txt with your file name.
Keep the lines under the "while" statement indented as shown: Python uses the indentation to tell which lines belong to the loop, and a forum can strip leading spaces when the program is pasted.
Save everything in the same directory and double-click on the .py file.

That's all!

Hope that helps.

Fred.
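
A variant of the same idea, in case the German umlauts come out garbled: the sketch below states the file encoding explicitly. It assumes Python 3 and a UTF-8 file (neither is confirmed in this thread; change the encoding argument if the file was saved differently), and the function name split_file is made up for the example. Unlike the script above, it keeps an unpaired last line in the first output file.

```python
def split_file(source_path, first_path, second_path, encoding="utf-8"):
    """Write lines 1, 3, 5, ... to first_path and lines 2, 4, 6, ... to second_path."""
    # "with" closes all three files automatically, even if an error occurs.
    with open(source_path, encoding=encoding) as source, \
         open(first_path, "w", encoding=encoding) as first, \
         open(second_path, "w", encoding=encoding) as second:
        for index, line in enumerate(source):
            # index starts at 0, so even indexes are the 1st, 3rd, 5th... lines
            target = first if index % 2 == 0 else second
            target.write(line)


# Example call, using the same file names as the script above:
# split_file("toto.txt", "toto1.txt", "toto2.txt")
```

Uncomment the last line and adjust the file names to run it on a real file.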


 

