JRC-Acquis, how to convert to tmx
Thread poster: Magdalena Kowalska

Magdalena Kowalska  Identity Verified
United Kingdom
Local time: 01:15
Polish to English
Dec 13, 2015

Hi,

I've downloaded the already aligned files from https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis. Now I need to convert those XML files to anything my memoQ can process, like CSV, if not directly to TMX.

How do I go about it? I've tried all online xml-csv converters I could find, but the files are too big for them to work.

Has anyone succeeded in using the JRC texts with their cat tool?
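
For files that are too big for online converters, a local script that streams the XML is one option. Below is a minimal sketch, not a tested recipe for the JRC-Acquis files: it assumes each alignment unit is a `link` element containing an `s1` (source) and an `s2` (target) child, which should be checked against the first lines of the downloaded file. The function name `alignment_to_csv` and the tag-name parameters are hypothetical and only for illustration.

```python
import csv
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def alignment_to_csv(xml_path, csv_path,
                     link_tag="link", src_tag="s1", tgt_tag="s2"):
    """Stream an aligned XML file; write one source/target pair per row.

    The default tag names are ASSUMPTIONS about the file layout.
    Returns the number of rows written.
    """
    rows = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out, delimiter="\t")
        # iterparse hands over each element as soon as it is complete,
        # so the file is processed piece by piece.
        for _event, elem in ET.iterparse(xml_path, events=("end",)):
            if local_name(elem.tag) != link_tag:
                continue
            src = tgt = None
            for child in elem:
                name = local_name(child.tag)
                if name == src_tag:
                    src = "".join(child.itertext()).strip()
                elif name == tgt_tag:
                    tgt = "".join(child.itertext()).strip()
            # Skip units where one side is missing or empty.
            if src and tgt:
                writer.writerow([src, tgt])
                rows += 1
            elem.clear()  # drop the text of this unit once it is written
    return rows
```

The result is a tab-delimited, UTF-8 text file with the source in the first column and the target in the second, which can then be tried with a CAT tool's delimited-text TM import.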



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 00:15
Member (2009)
Dutch to English
two tips Dec 13, 2015

Magdalena Kowalska wrote:

Hi,

I've downloaded the already aligned files from https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis. Now I need to convert those XML files to anything my memoQ can process, like CSV, if not directly to TMX.

How do I go about it? I've tried all online xml-csv converters I could find, but the files are too big for them to work.

Has anyone succeeded in using the JRC texts with their cat tool?


I suggest getting Andras Farkas’s collection. For a small fee, he will supply his comprehensive collection of EU translation memories as TMX files, or in any other format you might want: http://www.farkastranslations.com/eu_translation_memories.php

The best place to start if you want to get the DGT/JRC stuff directly from the EU is here:

https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory

[Edited at 2015-12-13 17:36 GMT]

[Edited at 2015-12-13 17:37 GMT]



Emma Goldsmith  Identity Verified
Spain
Local time: 01:15
Member (2010)
Spanish to English
And a third tip Dec 13, 2015

Dominique Pivard posted a useful video on the DGT TM here:

https://www.youtube.com/watch?v=GNj07W2ZqhQ



Blaž Košir
Slovenia
Local time: 01:15
English to Slovenian
Try here Dec 13, 2015

Try here: http://www.ttmem.com/terminology/download-translation-memory/european-commission-translation-memory/


Magdalena Kowalska  Identity Verified
United Kingdom
Local time: 01:15
Polish to English
TOPIC STARTER
Thanks Dec 16, 2015

I actually did that already a few years ago: downloading, aligning with that tool, etc. I just wasn't sure it was still the same TM. It is worth adding the 2015 additions though, which I'm doing right now.


Milan Condak  Identity Verified
Local time: 01:15
English to Czech
Extract and split Dec 16, 2015

Magdalena Kowalska wrote:

How do I go about it? I've tried all online xml-csv converters I could find, but the files are too big for them to work.


Ready-made TMXs are available in the multilingual DGT Translation Memory.

Since November 2007, the European Commission's Directorate-General for Translation has made its multilingual Translation Memory publicly available:

https://ec.europa.eu/jrc/en/language-technologies/dgt-translation-memory

From that page, "How to produce bilingual extractions":

The multilingual extraction has English as the source language. Users can extract any language pair as follows, using the extraction tool TMXtract:
For the Windows Operating System:
Download the TMXtract.jar file;
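
For anyone who prefers a script, the same kind of bilingual extraction can be sketched in Python. This is not how TMXtract works internally, only a rough illustration under stated assumptions: the input is a multilingual TMX whose `tuv` elements carry either an `xml:lang` or a plain `lang` attribute, and inline markup inside `seg` can be flattened to plain text. The function name `extract_pair` and the default language codes are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tuv_lang(tuv):
    """Return the upper-cased language code of a <tuv> ('' if missing)."""
    # TMX 1.4 uses xml:lang; older TMX versions use a plain lang attribute.
    return (tuv.get(XML_LANG) or tuv.get("lang") or "").upper()


def extract_pair(tmx_in, tmx_out, src="EN", tgt="PL"):
    """Stream a multilingual TMX; write only units that have src and tgt.

    Returns the number of translation units written.
    """
    src, tgt = src.upper(), tgt.upper()
    count = 0
    with open(tmx_out, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write('<tmx version="1.4">\n')
        out.write('<header creationtool="sketch" creationtoolversion="0" '
                  'segtype="sentence" o-tmf="unknown" adminlang="en" '
                  f'srclang="{src}" datatype="plaintext"/>\n<body>\n')
        for _event, elem in ET.iterparse(tmx_in, events=("end",)):
            if elem.tag != "tu":
                continue
            segs = {}
            for tuv in elem.iter("tuv"):
                seg = tuv.find("seg")
                if seg is not None:
                    # Key on the prefix so "EN" also matches "EN-GB".
                    prefix = tuv_lang(tuv).split("-")[0]
                    segs[prefix] = "".join(seg.itertext()).strip()
            if segs.get(src) and segs.get(tgt):
                out.write("<tu>")
                for lang in (src, tgt):
                    out.write(f'<tuv xml:lang="{lang}"><seg>'
                              f"{escape(segs[lang])}</seg></tuv>")
                out.write("</tu>\n")
                count += 1
            elem.clear()  # release this unit before reading the next one
        out.write("</body>\n</tmx>\n")
    return count
```

Units that lack either language are skipped, and the output uses the short codes passed in (for example EN and PL) rather than the full codes found in the input.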

After extraction I use Heartsome TMX Editor for merging and splitting of TMXs.

http://www.condak.cz/nove/2015-12/08/cs/04.html

Another solution: use a CAT tool with a TM server. Felix-cat is now open source, and a server is included.

Milan


CafeTran Training
Netherlands
Local time: 01:15
DGT-Translation Memory: different generations Jun 7

Different generations of the DGT-Translation Memory can be downloaded (2007, 2011, etc.). Do these generations overlap, i.e. do they contain identical TUs?
