Cleaning up a file
Thread poster: Anna Fangrath

Anna Fangrath  Identity Verified
Local time: 14:31
Sep 8, 2010

Hello,

Sorry if I sound like an amateur, I'm not using Trados yet.
I have a .txt file in English, and the translation has been written just below the
English text.

Is there a way to "clean up" this file and remove the original text?


 

freddy7  Identity Verified
France
Local time: 14:31
English to French
python Sep 8, 2010

I don't know if there is an automated tool to do that.

If the text file alternates strictly, one line of English followed by one line of the other language, and it is too long to process by hand, a Python script could probably do the job quickly.


 

Anna Fangrath  Identity Verified
Local time: 14:31
TOPIC STARTER
that's what I need Sep 8, 2010

The text is too long for manual processing. I'm not sure what a Python script is... I'd appreciate it if you could give me a short explanation.
Otherwise I'll be stuck deleting lines for a couple of days...


 

Jaroslaw Michalak  Identity Verified
Poland
Local time: 14:31
Member (2004)
English to Polish
More information Sep 8, 2010

Is the file you need to clean up a pure txt file? Are the languages marked somehow?

 

Anna Fangrath  Identity Verified
Local time: 14:31
TOPIC STARTER
pure txt Sep 8, 2010

it's a pure txt,

the segments are not marked,

one line English

one line German

what a joy...


 

Jaroslaw Michalak  Identity Verified
Poland
Local time: 14:31
Member (2004)
English to Polish
Word Sep 8, 2010

In this case I think your best bet would be copying the contents to Word with the option "Detect language" turned on. It does not always work perfectly, but still it is better than manual deletion.

After that just search and replace the given language with e.g. a space. You will get extra paragraphs, naturally, but this can be dealt with, too.
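For anyone who would rather script the language-detection idea than rely on Word, here is a rough Python sketch. It guesses whether a line is English or German by counting a few common function words. The word lists and the names `guess_language` and `keep_language` are made up for illustration, and like Word's detection this will misjudge short or ambiguous lines, so treat it as a starting point only.

```python
import re

# Tiny hand-picked function-word lists (an assumption, for illustration only).
ENGLISH_WORDS = {"the", "and", "is", "of", "to", "in", "that", "it", "with", "for", "this", "are"}
GERMAN_WORDS = {"der", "die", "das", "und", "ist", "nicht", "ein", "eine", "mit", "für", "auf", "sind"}


def guess_language(line):
    """Return 'en', 'de' or 'unknown' based on function-word counts."""
    words = re.findall(r"\w+", line.lower())
    english_hits = sum(1 for word in words if word in ENGLISH_WORDS)
    german_hits = sum(1 for word in words if word in GERMAN_WORDS)
    if english_hits > german_hits:
        return "en"
    if german_hits > english_hits:
        return "de"
    return "unknown"


def keep_language(lines, wanted):
    """Keep lines guessed as `wanted`; unclassified lines are kept too,
    so that nothing is lost silently and they can be checked by hand."""
    return [line for line in lines if guess_language(line) in (wanted, "unknown")]


sample = ["The file is in the folder.", "Die Datei ist in dem Ordner."]
print(keep_language(sample, "de"))
```

Keeping the "unknown" lines is a deliberate choice here: it is safer to review a few leftovers than to lose part of the translation.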

[Edited at 2010-09-08 18:58 GMT]


 

Anna Fangrath  Identity Verified
Local time: 14:31
TOPIC STARTER
good idea Sep 8, 2010

I'll try that, thanks.

 

Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 19:31
Member (2004)
English to Thai
MS Excel sorting macro Sep 9, 2010

Copy and paste the entire text into MS Excel cells. By sorting the data so that every other line is grouped together, you get the source paragraphs and the target paragraphs in two successive groups. You can then delete one group, or save it in another file as a Word, txt or Unicode document. A sorting macro can be found in Excel Help or in the Microsoft Excel knowledge base.
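The same every-other-line split can be sketched in a few lines of Python, for anyone without Excel at hand. This is only an illustration of the idea, not the Excel macro itself; the function name `split_alternating` and the sample lines are invented, and it assumes the lines alternate strictly between the two languages.

```python
def split_alternating(lines):
    """Return two lists: lines 1, 3, 5, ... and lines 2, 4, 6, ..."""
    return lines[0::2], lines[1::2]


sample = ["Hello", "Hallo", "Thank you", "Danke"]
source, target = split_alternating(sample)
print(source)
print(target)
```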

Soonthon Lupkitaro


[Edited at 2010-09-09 01:17 GMT]


 

Jaroslaw Michalak  Identity Verified
Poland
Local time: 14:31
Member (2004)
English to Polish
Yet another way... Sep 9, 2010

Yet another way is to open the file in Word, select all the content and use the option "Convert text to table". Select paragraph mark as the separator and specify the number of columns as "2".

This, however, requires a very strict text structure: every second paragraph (or "line" in the text file) must be in the other language, with no line breaks within a single segment.
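To show why the strict structure matters, here is a small Python sketch of the same two-column idea. It pairs each line with the one after it and refuses to continue if a line is left without a partner. The names `to_rows` and `to_tsv` are hypothetical, and the tab-separated output is just one possible way to represent the table.

```python
def to_rows(lines):
    """Pair line 1 with line 2, line 3 with line 4, and so on."""
    if len(lines) % 2 != 0:
        # An odd count means the one-to-one structure is broken somewhere.
        raise ValueError("odd number of lines: source and target do not pair up")
    return list(zip(lines[0::2], lines[1::2]))


def to_tsv(rows):
    """Render the pairs as tab-separated text, one pair per row."""
    return "\n".join(source + "\t" + target for source, target in rows)


rows = to_rows(["Hello", "Hallo", "Thank you", "Danke"])
print(to_tsv(rows))
```

An even line count does not prove the pairs are correct (two stray line breaks would cancel out), so a quick look at the resulting table is still worthwhile.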


 

freddy7  Identity Verified
France
Local time: 14:31
English to French
python Sep 9, 2010

Install Python:
http://www.python.org/download/releases/2.6.6/

Copy the following into a file called txtclean.py:

# txtclean.py
#
# text cleaning
# fc 09/09/10

# Open the original file for reading, and two files to save the cleaned versions.
orig = open('toto.txt')
lang1 = open('toto1.txt', 'w')
lang2 = open('toto2.txt', 'w')

while True:
    line1 = orig.readline()  # read two successive lines
    if line1 == "":
        break  # stop at end of file
    line2 = orig.readline()
    if line2 == "":
        break  # odd number of lines: the last line has no partner and is not written
    lang1.write(line1)
    lang2.write(line2)

# Close all three files.
orig.close()
lang1.close()
lang2.close()

Replace toto.txt with your file name.
The lines under the "while" statement must stay indented as shown (four spaces, and eight under each "if"). If the indentation is lost when you copy the program from this forum, add the spaces back by hand.
Save everything in the same directory and double-click on the .py file.
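If the text file also contains empty lines between the segments, the script above would pair them up as if they were text. Below is a hedged variant that skips blank lines first. It is a sketch under several assumptions: it needs Python 3 (not the 2.6 release linked above), the file is assumed to be UTF-8, blank lines are assumed to carry no meaning, and the function name `split_file` is invented for this example. The demo at the bottom works in a temporary folder so it does not touch any real files.

```python
import os
import tempfile


def split_file(source_path, first_path, second_path, encoding="utf-8"):
    """Write non-blank lines alternately to two files.

    Returns True if every line found a partner, False if one was left over.
    """
    with open(source_path, encoding=encoding) as source:
        # Drop blank lines and the trailing newline of each remaining line.
        lines = [line.rstrip("\n") for line in source if line.strip()]
    with open(first_path, "w", encoding=encoding) as first, \
            open(second_path, "w", encoding=encoding) as second:
        for index, line in enumerate(lines):
            target = first if index % 2 == 0 else second
            target.write(line + "\n")
    return len(lines) % 2 == 0


# Small demo in a throwaway folder; the file names mirror the ones used above.
with tempfile.TemporaryDirectory() as folder:
    source_path = os.path.join(folder, "toto.txt")
    first_path = os.path.join(folder, "toto1.txt")
    second_path = os.path.join(folder, "toto2.txt")
    with open(source_path, "w", encoding="utf-8") as handle:
        handle.write("Hello\n\nHallo\n\nThank you\n\nDanke\n")
    balanced = split_file(source_path, first_path, second_path)
    with open(first_path, encoding="utf-8") as handle:
        first_text = handle.read()
    with open(second_path, encoding="utf-8") as handle:
        second_text = handle.read()

print(balanced)
print(first_text)
print(second_text)
```

Unlike the original script, this version still writes a leftover last line to the first file and reports the mismatch through the return value, so the problem is visible rather than silently dropped.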

That's all!

Hope that helps.

Fred.


 

