Importing DGT memory to Trados
Thread poster: James Greenfield

James Greenfield  Identity Verified
United Kingdom
Local time: 15:37
French to English
+ ...
Sep 22, 2013

Hi, I need help downloading and importing the DGT memory into Trados. I have found this website, http://ipsc.jrc.ec.europa.eu/index.php?id=197, but am confused by the instructions. I'm not sure how to create an AutoSuggest dictionary or a TM, or even which would be better. It would be for the French to English language pair. Also, would the result be too big a file for Trados to cope with? I mainly work with private legal texts, so I thought it would be useful. Any advice would be appreciated, thanks.

[Edited at 2013-09-22 10:26 GMT]


 

Emma Goldsmith  Identity Verified
Spain
Local time: 16:37
Member (2010)
Spanish to English
A couple of links Sep 22, 2013

You could start here:
http://multifarious.filkin.com/2012/08/10/making-the-most-of-your-resources/
or here:
http://www.youtube.com/watch?v=GNj07W2ZqhQ

I personally find it too big (i.e. too slow) to use, but maybe you've got a super-fast machine and will have more success.


 

James Greenfield  Identity Verified
United Kingdom
Local time: 15:37
French to English
+ ...
TOPIC STARTER
Thanks Sep 22, 2013

Thanks for that information. That's what I feared: my machine isn't super fast, so I think it would be too slow to use. I just found an interesting site, though, called www.mymemorytranslated.net, which offers free translation memory files. They also have a free search engine for their memory database which, from my initial testing, is pretty good. I'll see what translation memory files they have.

 

Lorenzo Bermejo
Local time: 16:37
English to Spanish
+ ...
interesting! Sep 22, 2013

nice discovery James!
the instructions to build a TM with one language pair seem clear, but to create an Autosuggest dictionary, you would have to follow SDL rules.

I'll try myself to create an En>Es TM this afternoon with the materials from the website and will let you know about the results.

regards.


 

Lorenzo Bermejo
Local time: 16:37
English to Spanish
+ ...
FYI Sep 22, 2013

I've created a .tmx from the .zip files for the 2011, 2012 and 2013 releases of the DGT documentation, and it contains 2.6 million TUs in total! The .tmx is over 1.8 GB. It would have been more reasonable to create one .tmx for each yearly release.

this .tmx imports nicely into a Studio TM with en-GB as source language and es-ES as target, which is my usual language pair.

I will also try to import it to an en-US source language TM.

regards. hope the time spent on this proves useful soon.
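
For anyone who wants to check a TMX before a multi-hour import, here is a minimal Python sketch (not the DGT extraction tool, and not anything Studio does internally) that streams through a TMX and yields only the segment pairs for one language pair. The function names, the tiny inline sample, and the handling of both `xml:lang` and plain `lang` attributes are assumptions for illustration; real DGT files may use different language codes.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree expands the xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tuv_lang(tuv):
    """Return the lower-cased language code of a <tuv> element.

    Both xml:lang and a plain lang attribute are checked; which one a
    given file uses is an assumption to verify against the real data.
    """
    return (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()


def iter_pairs(tmx_source, src_lang, tgt_lang):
    """Yield (source_text, target_text) for TUs containing both languages."""
    src_lang, tgt_lang = src_lang.lower(), tgt_lang.lower()
    # iterparse reads the file incrementally instead of loading it whole.
    for _event, elem in ET.iterparse(tmx_source, events=("end",)):
        if elem.tag != "tu":
            continue
        texts = {}
        for tuv in elem.iter("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                texts[tuv_lang(tuv)] = "".join(seg.itertext()).strip()
        if texts.get(src_lang) and texts.get(tgt_lang):
            yield texts[src_lang], texts[tgt_lang]
        elem.clear()  # drop the children of the processed <tu>


# Hypothetical two-TU sample; only the first TU has both en-GB and es-ES.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="EN-GB"><seg>Regulation</seg></tuv>
    <tuv xml:lang="ES-ES"><seg>Reglamento</seg></tuv></tu>
<tu><tuv xml:lang="EN-GB"><seg>Annex</seg></tuv>
    <tuv xml:lang="FR-FR"><seg>Annexe</seg></tuv></tu>
</body></tmx>"""

pairs = list(iter_pairs(io.BytesIO(SAMPLE), "en-GB", "es-ES"))
print(len(pairs), pairs)
```

Passing a real file path to `iter_pairs` instead of the inline sample, and counting the results, would give the number of usable TUs in each yearly release without reading the whole file into memory at once.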


 

Lorenzo Bermejo
Local time: 16:37
English to Spanish
+ ...
good news Sep 23, 2013

Studio 2011 has taken around 4 hours to import the 1.8GB .tmx file with the 2.6 million TUs.

I've made several term searches with the TM open in the Translation Memory pane and the good news is that each search only took around 2-3 minutes.

the TM seems really useful for legal and economic terminology.

I'm using Windows 7 Pro 64-bit with an i5 processor and 8 GB of RAM.

I'd say the effort of making up your own TM should prove useful!


 

FarkasAndras
Local time: 16:37
English to Hungarian
+ ...
Minutes? Sep 23, 2013

Lorenzo Bermejo wrote:

Studio 2011 has taken around 4 hours to import the 1.8GB .tmx file with the 2.6 million TUs.

I've made several term searches with the TM open in the Translation Memory pane and the good news is that each search only took around 2-3 minutes.

the TM seems really useful for legal and economy terminology.

I'm using Windows 7 Pro 64bits with a i5 processor and 8GB of RAM.

I'd say the effort of making up your own TM should prove useful!

2-3 minutes or 2-3 seconds? I would hope it's seconds.
Studio uses SQLite databases for its TMs so it's not unreasonable to expect lookup times of under a second on a single-word lookup on 2.6 million TUs.
It's good to hear that the import was done in a somewhat reasonable time frame. Ideally, I would expect a modern CAT tool to import TMX files at more than 1 million TU/hour, but it seems that none of them can do that... as things stand, we have to be happy that Studio can handle a TMX with over 2 million segments at all.
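
As a rough check on those figures, here is a small Python calculation that uses only the numbers quoted in this thread (2.6 million TUs, about 4 hours) and the 1 million TU/hour target mentioned above. The variable names are made up for the example.

```python
tus_imported = 2_600_000      # TUs in the combined .tmx, as reported above
import_hours = 4              # approximate import time reported for Studio 2011
target_rate = 1_000_000       # the hoped-for import rate, in TU/hour

tus_per_hour = tus_imported / import_hours
hours_at_target = tus_imported / target_rate

print(f"observed rate: {tus_per_hour:,.0f} TU/hour")         # observed rate: 650,000 TU/hour
print(f"time at target rate: {hours_at_target:.1f} hours")   # time at target rate: 2.6 hours
```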

[Edited at 2013-09-23 08:25 GMT]


 

Lorenzo Bermejo
Local time: 16:37
English to Spanish
+ ...
minutes, not seconds! Sep 23, 2013

wow, I couldn't imagine that a search in such a vast TM could take only seconds!
anyway, I haven't tried to do searches with the TM connected to a project. it might be quicker.

regards.


 

James Greenfield  Identity Verified
United Kingdom
Local time: 15:37
French to English
+ ...
TOPIC STARTER
attempt to create an autosuggest dictionary from DGT Sep 23, 2013

OK, thanks all. I'm going to have a go today at creating an AutoSuggest dictionary from the resulting TMX.

 

Lorenzo Bermejo
Local time: 16:37
English to Spanish
+ ...
great! Sep 23, 2013

James Greenfield wrote:

Ok thanks all, I'm going to have a go today at creating an autosuggest dictionary from the resulting TMX.


good idea! please post your results. the process for creating the .tmx is very well explained on the DGT website.
I'm going to have a great resource for legal and financial terminology with the TM I made, although I don't usually do that type of job. who knows!

cheers


 

FarkasAndras
Local time: 16:37
English to Hungarian
+ ...
db performance Sep 23, 2013

Lorenzo Bermejo wrote:

wow, I couldn't imagine how the search in such a vast TM would last only seconds!
anyway, I haven't tried to do searches with the TM connected to a project. it might be quicker.

regards.

Yes, that's definitely possible and Studio should be able to do it. I get hits from my DGT-TM in a second or two (for single-word searches). The first search is slow (some kind of index table gets loaded into memory I'm sure) but subsequent searches are quick. If Studio consistently takes minutes on single-word concordance searches for you, something is not right. What's your hardware (what cpu, what kind of hard drive, how much RAM)?

BTW I have written software that uses the same basic database type as Studio. I have benchmarks from that software: single-word searches on a 5-million TU TM generally take 0.1 second to 1 second. Of course my searches are more primitive (and thus faster) than what Studio does, but then again I'm just a guy with some free time on his hands while SDL has vast resources to spend on optimization.
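
To give a feel for this kind of "primitive" lookup, here is a minimal Python sketch of a substring search over an SQLite table, with timing. It is not the software mentioned above and it does not reflect Studio's actual TM schema; the table name `tu`, its columns, and the synthetic data are all hypothetical. Timings will depend entirely on the machine and the table size.

```python
import sqlite3
import time


def build_demo_tm(rows):
    """Create an in-memory SQLite table of (source, target) segments."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tu (id INTEGER PRIMARY KEY, source TEXT, target TEXT)"
    )
    conn.executemany("INSERT INTO tu (source, target) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def concordance(conn, word, limit=20):
    """Return up to `limit` rows whose source contains `word`.

    This is a plain LIKE substring scan, so % and _ in `word` act as
    wildcards; a real tool would need something more careful.
    """
    cur = conn.execute(
        "SELECT source, target FROM tu WHERE source LIKE ? LIMIT ?",
        (f"%{word}%", limit),
    )
    return cur.fetchall()


# Synthetic data: every 1000th segment mentions "regulation".
rows = [
    (
        f"Segment {i} about regulation" if i % 1000 == 0 else f"Segment {i}",
        f"Segmento {i}",
    )
    for i in range(200_000)
]
conn = build_demo_tm(rows)

start = time.perf_counter()
hits = concordance(conn, "regulation")
elapsed = time.perf_counter() - start
print(f"{len(hits)} hits in {elapsed:.3f} s")
```

Running the search several times, as suggested in this thread, is a fairer test than a single lookup, since the first query often pays a one-off loading cost.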


 

Lorenzo Bermejo
Local time: 16:37
English to Spanish
+ ...
Specs Sep 23, 2013

750 GB HDD
i5 CPU
8GB RAM

it's not impressive but everything works nicely.
I'll see about the search speed in a few days.
I'm stuck with a job of many thousands of words due on Thursday.

best regards!


 

James Greenfield  Identity Verified
United Kingdom
Local time: 15:37
French to English
+ ...
TOPIC STARTER
Error message using DGT memory Sep 24, 2013

I have added the DGT memory file but when it comes to using it I get the following error message:
Could not load file or assembly Oracle.DataAccess. The assembly is built by a runtime newer than the currently loaded runtime and cannot be loaded.
Does anyone know what this means and how to fix it? Thanks


 

FarkasAndras
Local time: 16:37
English to Hungarian
+ ...
try it Sep 24, 2013

Lorenzo Bermejo wrote:

750 GB HDD
i5 CPU
8GB RAM

A high-quality SSD would bring a marked improvement over the HDD, but otherwise you're fine.
Maybe you only did one test and that's why you got such a slow response time? Do several tests next time you play with the TM. At 3M TUs, you shouldn't notice any major slowdowns when doing simple concordance searches.


 

Lorenzo Bermejo
Local time: 16:37
English to Spanish
+ ...
errors in my case also Sep 24, 2013

James Greenfield wrote:

I have added the DGT memory file but when it comes to using it I get the following error message:
Could not load file or assembly Oracle.DataAccess. The assembly is built by a runtime newer than the currently loaded runtime and cannot be loaded.
Does anyone know what this means and how to fix it? Thanks


I haven't encountered the same error as you, but a different one.
I use project packages I get sent for translation, and I've tried adding the DGT TM to one project setup using both ways you can add a TM.
in one case I get an error message box in Spanish saying that the database file is locked and that the TM cannot be opened, and in the other case I get an error message in the information pane of the Editor saying there has been an error using the TM.
I really don't know how to solve this and I don't have time to research the problem till Thursday.


 