Length of segments entered into a translation memory
Thread poster: Dénia A. Amon (X)

Dénia A. Amon (X)
Local time: 00:38
English to German
Aug 16, 2010

Dear all,
I am just starting to work with CAT tools and I was wondering about the best way to create a translation memory. I have thousands of words ready which I could enter into the database.
However, before I start, I would like to ask some experienced CAT tool users the following:

1)
How long should segments entered into the TM be? Does it make sense to enter very long sentences? I used to do legal translations, and sometimes sentences are much longer than 40 words.
Or is it better to split the sentence into much smaller, reasonable segments?

Example: Where interim payments had been made subsequent to judgement being entered as to liability, but before the court's decision as to quantification of damages, then any interest accrued by the claimant on the money paid was not to be taken account of through subtraction from the sum allowed for interest in the final determination of quantum.

- How would you save this sentence into a TM?
- And if I save long sentences: I assume that slows down the matching process and lowers the hit rate?

2)
How do you back up your TM? It is quite a precious treasure, so I would like to make sure that I back it up correctly. Any hints on that?

Thanks for your advice.

[Edited at 2010-08-16 13:12 GMT]


 

wsetters
Local time: 00:38
French to English
As your CAT does it Aug 16, 2010

For me, you need to look at how your CAT tool will segment any source documents. I've never manually entered source words (it sounds like a very long exercise, and I think it's a better use of time just to work in the tool and let the memory build up "naturally", within reason). If your CAT tool breaks segments at full stops, then that seems to be how you need to do it!

[Edited at 2010-08-16 13:44 GMT]


 

Adam Łobatiuk  Identity Verified
Poland
Local time: 00:38
Member (2009)
English to Polish
Alignment Aug 16, 2010

The process of creating TMs from existing translations is called "alignment", although some tools might use other terms. You usually need sets of the same files in language A and B. Your tool segments them as it usually segments text and matches up the segments in both languages. Normally, you need to correct the automatic matching manually, and it is tedious work.

In Trados 2007, the tool is called WinAlign. Wordfast has an alignment tool in Plus Tools. Most other tools have such features as well.
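
To illustrate the idea, here is a minimal sketch in Python of what an alignment step does. It is not how WinAlign or any other real tool works internally: the sentence-splitting rule and the pairing by position are simplifying assumptions, and the function names `segment` and `align` are made up for this example.

```python
import re

def segment(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def align(source_text, target_text):
    """Pair source and target segments by position.

    Returns the pairs plus a flag that is True when the two texts
    produced a different number of segments and need manual review.
    """
    src = segment(source_text)
    tgt = segment(target_text)
    needs_review = len(src) != len(tgt)
    return list(zip(src, tgt)), needs_review

english = "The contract ends in May. Either party may terminate it."
german = "Der Vertrag endet im Mai. Jede Partei kann ihn kündigen."
pairs, needs_review = align(english, german)
for s, t in pairs:
    print(s, "=>", t)
print("Manual review needed:", needs_review)
```

Pairing by position breaks as soon as one sentence was translated as two (or two as one), which is exactly why the automatic result usually has to be corrected by hand.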

The length of strings doesn't matter - in fact, longer strings tend to be more useful. If you find a sentence like the one in your example but 1-2 words are different, you'll get a high match. If you have a 2-3 word sentence or heading and 1-2 words change, you won't get a match at all. If you split your sentence into smaller segments, when you translate a new file, you are not likely to find such subsegments on their own, but rather as part of other sentences. What you might want to do is enter common phrases in the glossary instead.
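
A rough way to see why this happens, sketched in Python: the score below is a simple word-level edit distance turned into a percentage. Real CAT tools use their own scoring formulas, so the numbers are only illustrative; the function names and the example sentences are assumptions made for this sketch.

```python
def word_edit_distance(a, b):
    """Levenshtein distance between two word lists, counted in whole words."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete a word
                            curr[j - 1] + 1,      # insert a word
                            prev[j - 1] + cost))  # keep or replace a word
        prev = curr
    return prev[-1]

def match_score(stored, new):
    """Similarity between 0.0 and 1.0 (an illustrative formula, not a tool's real one)."""
    a, b = stored.lower().split(), new.lower().split()
    if not a and not b:
        return 1.0
    return 1.0 - word_edit_distance(a, b) / max(len(a), len(b))

long_stored = ("Any interest accrued by the claimant on the money paid was not to be "
               "taken into account in the final determination of quantum")
long_new = ("Any interest accrued by the defendant on the money paid was not to be "
            "taken into account in the interim determination of quantum")
short_stored = "Final payment terms"
short_new = "Interim payment terms"

long_score = match_score(long_stored, long_new)
short_score = match_score(short_stored, short_new)
print(f"long sentence, 2 words changed: {long_score:.0%}")
print(f"short heading, 1 word changed:  {short_score:.0%}")
```

With two of 23 words changed, the long sentence stays above 90%, while the three-word heading drops to about 67% after a single change, which would fall below a (hypothetical) 75% match threshold.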

Backing up TMs: they are just files, so you can use any approach that you take with other files. I don't think there is anything specific about them. I like to burn them onto CDs from time to time, for example.


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 00:38
Member (2005)
English to Spanish
Sentences to a memory, phrases to a termbase Aug 16, 2010

Each tool has its ways of doing this, but what I think would work for you would be:

A) Put all full sentences in the translation memory. Your current CAT tool may not do this, but there are tools out there (like MemoQ) that will automatically search the memory and propose reusable chunks and phrases in a flash.

B) Put frequent phrases and expressions in a termbase, so that you can easily reuse them as part of your terminology resources.


 

Dénia A. Amon (X)
Local time: 00:38
English to German
TOPIC STARTER
Thank you for your comments, very helpful! Aug 16, 2010


 

FarkasAndras
Local time: 00:38
English to Hungarian
Sentences in full, backups in cloud Aug 16, 2010

Dénia A. Amon wrote:


1)
How long should segments entered into the TM be? Does it make sense to enter very long sentences? I used to do legal translations, and sometimes sentences are much longer than 40 words.
Or is it better to split the sentence into much smaller, reasonable segments?


Splitting sentences would be hugely time-consuming and not very useful. Just leave them as they are and put terms and phrases in a termbase. This is what everyone tends to do.

Dénia A. Amon wrote:
2)
How do you back up your TM? It is quite a precious treasure, so I would like to make sure that I back it up correctly. Any hints on that?

Thanks for your advice.

In my personal opinion, the only good solution is at least one cloud-based backup and at least one local backup. If your computer gets stolen along with your backup HDD or there's a fire at your house, all your data will still be safe somewhere in a secure data centre and you can download it wherever in the world you are.
Say, put all your work in one folder and set up Mozy or Dropbox or any other cloud-backup service to upload it to web-based storage.
Then take the same folder and copy it to a hard drive, flash drive or DVD. An external HDD is nice, especially if it comes with software that does automatic incremental backups.
Then, every couple of months, you could make a backup image of your whole system drive.
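
For the local copy, something as simple as the following Python sketch would do. It is only one possible approach, not a feature of any CAT tool: the function name `backup_folder` and the example paths are hypothetical, and the cloud upload is left to whichever service you choose.

```python
import shutil
from datetime import datetime
from pathlib import Path

def backup_folder(work_dir, backup_dir):
    """Zip work_dir into backup_dir under a timestamped name and return the archive path.

    backup_dir should not be located inside work_dir, otherwise the
    archive would try to include itself.
    """
    work_dir = Path(work_dir)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base = backup_dir / f"{work_dir.name}-{stamp}"
    # make_archive appends ".zip" to the base name and returns the full path
    return Path(shutil.make_archive(str(base), "zip", root_dir=str(work_dir)))

# Example call with hypothetical paths (an external drive mounted as E:):
# backup_folder("C:/Translations", "E:/Backups")
```

Because each archive gets its own timestamp, older backups are never overwritten, so you can go back to an earlier version of a TM if a recent one turns out to be damaged.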


 

