TMX import takes 10 hours?
Thread poster: Gyula Erdész

Gyula Erdész
Hungary
Local time: 07:44
Member (2005)
English to Hungarian
+ ...
Nov 7, 2011

Dear colleagues,

Yesterday I tried to import a medium-sized TMX file with 220,000 segments into an internal memory of Swordfish 3.0.6. As I did not know the capabilities of the software, I intentionally chose the faster import option and left my computer alone all day. To my astonishment, Swordfish was not able to import the memory in 12 hours, at which point I interrupted the import process.

Is there any way to make the import faster or more efficient? As this translation memory size is quite average in my practice, and I receive memory updates for almost every translation project, I really need to find a solution for a faster and more efficient import.

Should I use a third-party database engine (e.g. Oracle or SQL) instead of the software's internal memory? Would it make the import process faster? If the import itself is so slow, how fast can Swordfish handle huge memories during translation?

Thank you in advance for sharing your ideas, opinions and experience.

Best regards,

Gyula Erdész


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 07:44
Member (2005)
English to Spanish
+ ...
Just one idea Nov 7, 2011

I am NOT a user of Swordfish, but to me, this looks more like a setup/performance problem in your computer than a problem with Swordfish handling large memories. With any reasonably fast CAT system, and I assume Swordfish is one of them, importing 220K segments should not take more than 10-15 minutes.

Just for the sake of the experiment, how about installing a more professional-level database management system, like MySQL (the Community Server is free), on your machine and using Swordfish with it to see whether this improves the situation?

According to Swordfish's User Guide, you can use MySQL with the software.


 

Wolfgang Schoene  Identity Verified
France
Local time: 07:44
English to German
+ ...
TMX import takes 10 hours Nov 7, 2011

Hi
I am an occasional user of Swordfish. The other day I imported a very large TMX (257 MB) into a new empty TM. It "only" took 3 hours ...
iMac 2,7 GHz,
Lion OSX
8 GB of RAM

I would really like to install MySQL but don't know how to do it, so step-by-step instructions would be welcome.

Regards
Wolfgang


 

Gyula Erdész
Hungary
Local time: 07:44
Member (2005)
English to Hungarian
+ ...
TOPIC STARTER
hardly a setup question Nov 7, 2011

Tomás Cano Binder, CT wrote:

With any reasonably fast CAT system, and I assume Swordfish is one of them, importing 220K segments should not take more than 10-15 minutes.



Dear Tomás,

After the unsuccessful import with Swordfish, I imported the TMX file into memoQ within 6 minutes. Based on this, I do not think that a setup/performance problem slows down Swordfish.

Thanks for the hint about the MySQL database. It is definitely worth a try.

Regards,

Gyula


 

Gyula Erdész
Hungary
Local time: 07:44
Member (2005)
English to Hungarian
+ ...
TOPIC STARTER
And then? Nov 7, 2011

Wolfgang Schoene wrote:

I am an occasional user of Swordfish. The other day I imported a very large TMX (257 MB) into a new empty TM. It "only" took 3 hours ...
iMac 2,7 GHz,
Lion OSX
8 GB of RAM


Thank you for your feedback, Wolfgang.

My computer has the same specifications as yours, but I use Linux.

How does Swordfish behave when you use the huge imported TM? Do you need to wait several seconds when Swordfish does a TM lookup?

Regards,

Gyula


 

Wolfgang Schoene  Identity Verified
France
Local time: 07:44
English to German
+ ...
And then ... Nov 7, 2011

Hi Gyula,
Once the TM has been imported, lookup during the translation process is fast, and fast means immediate. What is very slow, though, is Concordance search.

Wolfgang


 

Rodolfo Raya  Identity Verified
Local time: 02:44
English to Spanish
Split the file Nov 7, 2011

Hi,

Split the TMX file into several smaller pieces and import the pieces into the database.

Regards,
Rodolfo
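
For anyone who wants to try the splitting suggestion without a dedicated tool, here is a minimal sketch in Python. It assumes a plain TMX file whose root `<tmx>` element holds one `<header>` and one `<body>` of `<tu>` elements, and that the whole file fits in memory. The function name `split_tmx` and the output naming scheme are my own inventions for illustration, not anything Swordfish provides. Note that ElementTree drops any DOCTYPE line when it rewrites the file.

```python
import copy
import xml.etree.ElementTree as ET


def split_tmx(src_path, chunk_size, out_prefix):
    """Split the <tu> elements of a TMX file into files of at most chunk_size units.

    Hypothetical helper: returns the list of file paths it wrote.
    """
    root = ET.parse(src_path).getroot()      # the <tmx> element
    header = root.find("header")
    units = root.find("body").findall("tu")  # all translation units, in order

    out_paths = []
    for start in range(0, len(units), chunk_size):
        # Each piece gets the same root attributes and a copy of the header.
        new_root = ET.Element(root.tag, root.attrib)
        if header is not None:
            new_root.append(copy.deepcopy(header))
        new_body = ET.SubElement(new_root, "body")
        new_body.extend(units[start:start + chunk_size])

        out_path = f"{out_prefix}_{start // chunk_size + 1:03d}.tmx"
        ET.ElementTree(new_root).write(
            out_path, encoding="utf-8", xml_declaration=True
        )
        out_paths.append(out_path)
    return out_paths
```

A call such as `split_tmx("big.tmx", 20000, "big_part")` would write `big_part_001.tmx`, `big_part_002.tmx` and so on, each of which can then be imported separately. The chunk size of 20,000 is only a guess at a reasonable value; nothing in this thread says which size works best for Swordfish.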


 

Piotr Bienkowski  Identity Verified
Poland
Local time: 07:44
Member (2005)
English to Polish
+ ...
Some optimization? Nov 7, 2011

Rodolfo Raya wrote:

Hi,

Split the TMX file into several smaller pieces and import the pieces into the database.

Regards,
Rodolfo


Maybe the TMX could use some optimization as well, like removing duplicates, "noise" segments that do not contain anything useful, etc?

PB


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 06:44
Member (2009)
Dutch to English
+ ...
Hi Piotr Jul 5, 2013

Piotr Bienkowski wrote:

Rodolfo Raya wrote:

Hi,

Split the TMX file into several smaller pieces and import the pieces into the database.

Regards,
Rodolfo


Maybe the TMX could use some optimization as well, like removing duplicates, "noise" segments that do not contain anything useful, etc?

PB


You wouldn't happen to have any tips on how to go about this, would you?

I am trying to move a large number of TMXs generated by memoQ to CafeTran and have managed to remove all of the useless codes (at least I think so; I did it with Olifant). However, I notice that my TMXs contain quite a few duplicates and empty segments.

Michael
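
One possible way to remove duplicates and empty segments is a short Python script. The sketch below is only an illustration under some assumptions: the TMX has a single `<body>` of `<tu>` elements, a unit counts as "empty" when it has fewer than two `<tuv>` variants or any variant whose `<seg>` text is blank, and two units count as duplicates when every language/text pair matches exactly. The function name `clean_tmx` is hypothetical, and any inline tags are compared by their text content only.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def clean_tmx(src_path, dst_path):
    """Drop empty and exactly duplicated <tu> elements; return (kept, removed)."""
    tree = ET.parse(src_path)
    body = tree.getroot().find("body")
    units = body.findall("tu")

    seen = set()
    kept_units = []
    for tu in units:
        variants = []
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            text = "".join(seg.itertext()).strip() if seg is not None else ""
            # Older TMX versions use a plain "lang" attribute instead of xml:lang.
            lang = tuv.get(XML_LANG) or tuv.get("lang", "")
            variants.append((lang.lower(), text))

        is_empty = len(variants) < 2 or any(not text for _, text in variants)
        key = tuple(sorted(variants))
        if is_empty or key in seen:
            continue  # skip noise and repeats
        seen.add(key)
        kept_units.append(tu)

    body[:] = kept_units  # replace the body's children in one step
    tree.write(dst_path, encoding="utf-8", xml_declaration=True)
    return len(kept_units), len(units) - len(kept_units)
```

Calling `clean_tmx("memoq_export.tmx", "memoq_export_clean.tmx")` would write a cleaned copy and leave the original file untouched, so the result can be checked before importing it anywhere. The file names here are placeholders.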


 

