Comprehensive list of translation memory (TM) file formats
Thread poster: Kevin Dias

Kevin Dias
Local time: 22:30
SITE STAFF
Oct 6, 2015

Hi all,

I was recently reading through this thread and there were mentions of "millions" and "hundreds" of formats for translation memory files. While those might have been exaggerations, I became curious to learn more about some of the more obscure formats I don't know (or possibly even mainstream formats that I am not aware of). So I decided to start this thread and see if we can come up with a comprehensive list from the community (please keep it to translation memory files only; I'll start a separate thread for terminology/termbase file formats). I'll start it off. If we get a lot of good responses from this thread, I'll make a resource page with all of the formats.

Please try to match the following format:

Name:
File extension:
Type: (open or proprietary)
Link(s):

NB: for purposes of this thread I define 'open' to mean that the specifications are published on the Internet and 'proprietary' if the specifications are not published.

--------------

Name: Translation Memory eXchange
File extension: .tmx
Type: open
Link(s): Wikipedia | TMX 1.4b Specification

Name: XLIFF (XML Localisation Interchange File Format)
File extension: .xlf (.xliff is also found in the wild but is not compliant with the spec)
Type: open
Link(s): Wikipedia | XLIFF Version 1.2 Specification | XLIFF Version 2.0

Name: SDLXLIFF
File extension: .sdlxliff
Type: proprietary
Link(s): SDL Product Help Description

Name: SDLTM
File extension: .sdltm
Type: proprietary
Link(s): Wikipedia

Name: Wordfast Translation Memory
File extension: .txt
Type: proprietary. Note: Wordfast claims in their documentation that the format is open. It is a tab-delimited text file, but I have yet to find a specification with any more detail than the link below. If someone can point me to a true specification for the format, I will change this to open.
Link(s): Wordfast Support Specifications

Name: Wordfast TXML
File extension: .txml
Type: proprietary
Link(s): OmegaT Documentation Description

Name: Trados TTX
File extension: .ttx
Type: proprietary
Link(s): What's a TTX file? (ProZ forums)

Name: Trados TMW
File extension: .tmw (also includes .mdf, .mtf, .mwf, .iix as neural network files)
Type: proprietary
Link(s): Wikipedia
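
To make the TMX entry above a bit more concrete, here is a minimal sketch of pulling aligned segments out of a TMX document with Python's standard library. The sample document and the function name `read_tmx_units` are made up for illustration, and the sketch assumes plain-text segments; real TMX files exported by CAT tools usually carry inline tags and extra attributes that it does not handle specially.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Made-up sample in the general shape of a TMX 1.4 document.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Hello world</seg></tuv>
      <tuv xml:lang="fr"><seg>Bonjour le monde</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_tmx_units(tmx_text):
    """Return one dict per <tu>, mapping language code to segment text."""
    root = ET.fromstring(tmx_text)
    units = []
    for tu in root.iter("tu"):
        unit = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            lang = tuv.get(XML_LANG)
            if seg is not None and lang:
                # itertext() flattens any inline markup inside <seg> to text.
                unit[lang] = "".join(seg.itertext())
        units.append(unit)
    return units


if __name__ == "__main__":
    for unit in read_tmx_units(SAMPLE_TMX):
        print(unit)
```

Each translation unit comes back as a dictionary keyed by language code, so a unit with more than two languages needs no special handling.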


EDIT: Updated XLIFF file extension

[Edited at 2015-10-07 10:57 GMT]


 

Meta Arkadia
Local time: 20:30
English to Indonesian
Bilingual files, project files, translation memories, termbases... Oct 7, 2015

...it seems you want to include them all. Not that I have any problems with it, but you will indeed end up with "hundreds" of file formats.

DejaVu uses Access databases for project files, translation memories (segments), and termbases. Very consistent (though it also uses a project-specific .txt file for terms). I still think that's the way to go, although MS Access may not be the best choice. On the other hand, how many viable choices did they have, late last century?

Cheers,

Hans


 

Jorge Payan  Identity Verified
Colombia
Local time: 08:30
Member (2002)
German to Spanish
Not all of them are translation memory file formats ... Oct 7, 2015

Kevin Dias wrote:

... (please keep it to translation memory files only, I'll start a separate thread for terminology/termbase file formats)...



As far as I know, TTX, XLIFF, SDLXLIFF, and TXML are formats intended for file interchange and not for translation memory (TM).

Maybe you would like to open a different thread for file interchange formats ...


 

Kevin Dias
Local time: 22:30
SITE STAFF
TOPIC STARTER
I intend to include those here Oct 7, 2015

For all intents and purposes, what I am referring to when I say "translation memory file formats" are bilingual file formats that contain both the source document text and the translation text (i.e. translation memory data).

In other words, practically speaking: which file formats do CAT tools use to store and pass translation memory data (aligned source and target text)? Whether intended or not, I think TTX, XLIFF, SDLXLIFF, and TXML all fit this category.
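
As a rough illustration of what "aligned source and target text" looks like in practice, here is a minimal sketch that reads source/target pairs from an XLIFF 1.2 document using Python's standard library. The sample file and the function name `read_xliff_pairs` are made up for this post, and the sketch only covers plain XLIFF 1.2; vendor variants such as SDLXLIFF add their own elements that it does not interpret.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 documents.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Made-up sample: one translated unit and one that is still untranslated.
SAMPLE_XLF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="hello.txt" source-language="en" target-language="es"
        datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Good morning</source>
        <target>Buenos días</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Thank you</source>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def read_xliff_pairs(xlf_text):
    """Return (source, target) tuples; target is None when missing."""
    root = ET.fromstring(xlf_text)
    pairs = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.find("x:source", NS)
        target = unit.find("x:target", NS)
        if source is None:
            continue
        source_text = "".join(source.itertext())
        target_text = "".join(target.itertext()) if target is not None else None
        pairs.append((source_text, target_text))
    return pairs


if __name__ == "__main__":
    for pair in read_xliff_pairs(SAMPLE_XLF):
        print(pair)
```

Units without a target come back with `None`, which makes it easy to separate translated pairs (usable as TM data) from segments that are still pending.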


 

Stepan Konev  Identity Verified
Russian Federation
Local time: 16:30
English to Russian
For the beginning... Oct 7, 2015

1. Across & Across personal edition (freeware)
2. Alchemy Catalyst
3. Anaphraseus (open source - based on OpenOffice macro set, so you require OpenOffice)
4. AnyMem
5. Cafetran
6. CatsCradle (for web pages)
7. Deja Vu
8. Ecco
9. Fluency Translation Suite
10. Fortis Translation Suite
11. GlobalSight
12. Glossy
13. Google Translator Toolkit (freeware)
14. gtranslator
15. Heartsome Translation Studio
16. IBM Translation Manager
17. Idiom
18. Logoport
19. Lokalize
20. MateCat
21. memoQ (free & pro versions)
22. Memsource
23. MetaTexis
24. MultiTrans
25. Oddjobs
26. OmegaT (freeware)
27. SDL Trados Passolo
28. SDL Trados Studio
29. SDLX
30. Similis (freeware)
31. Smartcat
32. Snowball
33. Swordfish Translation Editor
34. Trados Workbench
35. Transit
36. WebBudget
37. Wordbee translator
38. Wordfast (free and paid versions)
39. Wordfisher (freeware)
40. Xliff editor
41. XTM


 

Kevin Dias
Local time: 22:30
SITE STAFF
TOPIC STARTER
File formats - not CAT tools Oct 7, 2015

Hi Stepan,

Thanks for your reply. That seems to be more a list of CAT tools than of file formats, though. Yes, some CAT tools will have their own proprietary format(s), but I don't think all of them in that list do. For example, does MateCat have a unique proprietary file format for translation memories?


 

Samuel Murray  Identity Verified
Netherlands
Local time: 15:30
Member (2006)
English to Afrikaans
@Kevin Oct 7, 2015

Kevin Dias wrote:
Type: proprietary. Note: Wordfast claims in their documentation that the format is open. It is a tab-delimited text file, but I have yet to find a specification with any more detail than the link below. If someone can point me to a true specification for the format, I will change this to open.


The guys from Virtaal also ran into this problem when they created a WF2PO filter: what's written in the "specifications" is not the whole story. Perhaps you can get hold of their filter (it's Python) to see what adjustments they made. You may have to look at old repositories, or use e-mail.

Name: Wordfast TXML


Don't forget Wordfast TXLF.

Oh, and then there are the various dialects of PO, and the variations of LNG/INI type files (key=value files). Gettext PO does have an official specification, but various programs that work with PO/POT files deviate from it.
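
For anyone who hasn't looked inside a PO file, here is a minimal sketch of reading msgid/msgstr pairs in Python. It is deliberately naive: the function names `read_po_pairs` and `_unquote` and the sample text are made up for illustration, and it ignores plurals, msgctxt, obsolete entries and most escape sequences, so treat it as a picture of the basic layout rather than a parser for any particular dialect.

```python
def _unquote(token):
    """Strip the surrounding quotes and decode a few common escapes."""
    body = token.strip()[1:-1]
    escapes = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append(escapes.get(nxt, nxt))
        else:
            out.append(ch)
    return "".join(out)


def read_po_pairs(po_text):
    """Return (msgid, msgstr) tuples, skipping the header entry."""
    pairs, entry, field = [], {}, None

    def flush():
        # The header entry has an empty msgid, so it is dropped here.
        if entry.get("msgid"):
            pairs.append((entry["msgid"], entry.get("msgstr", "")))
        entry.clear()

    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            flush()
            entry["msgid"] = _unquote(line[len("msgid "):])
            field = "msgid"
        elif line.startswith("msgstr "):
            entry["msgstr"] = _unquote(line[len("msgstr "):])
            field = "msgstr"
        elif line.startswith('"') and field:
            # Continuation line: append to the field being read.
            entry[field] += _unquote(line)
        else:
            # Comments, blank lines and keywords this sketch ignores.
            field = None
    flush()
    return pairs


# Made-up sample with a header entry and a multi-line entry.
SAMPLE_PO = r'''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

#: hello.c:10
msgid "Open file"
msgstr "Ouvrir le fichier"

msgid "Line one\n"
"line two"
msgstr ""
"Ligne un\n"
"ligne deux"
'''

if __name__ == "__main__":
    for pair in read_po_pairs(SAMPLE_PO):
        print(pair)
```

Tools that deviate from the Gettext specification would need extra handling on top of this, which is exactly where the dialect problem shows up.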


==

Kevin Dias wrote:
For all intents and purposes, what I am referring to when I say "translation memory file formats" are bilingual file formats that contain both the source document text and the translation text (i.e. translation memory data).


In that case, the dialects of "uncleaned RTF" should also be included, right? I know of the Trados 2007 dialect, the Wordfast dialect and the Anaphraseus dialect (they all share nearly identical marking delimiters). Then some CAT tools have similar-looking uncleaned formats that aren't really dialects of the original uncleaned RTF format, e.g. MetaTexis, I think (it uses different marking delimiters).


 

Rodolfo Raya  Identity Verified
Local time: 10:30
English to Spanish
Wrong extension for XLIFF Oct 7, 2015

Kevin Dias wrote:

Name: XLIFF (XML Localisation Interchange File Format)
File extension: .xliff or .xlif



The only extension you can use with XLIFF files is ".xlf"

Anything else, including ".xliff", makes the file not compliant with the XLIFF standard.

Regards,
Rodolfo


 

Kevin Dias
Local time: 22:30
SITE STAFF
TOPIC STARTER
XLIFF Oct 7, 2015

Hi Rodolfo,

Thanks for the info. ".xlif" was a typo on my part, I'll fix that. ".xliff" does exist out in the wild though.

I see you are one of the editors of the XLIFF specification. Any reason that naming guideline has been removed from the XLIFF 2.0 Specification? In the 1.2 Specification it is clear:


D.4. XLIFF File Extension

XLIFF documents use the .xlf extension. No other extension is recommended by the specification.


However, I can't find that in the 2.0 Specification.


 

Kevin Dias
Local time: 22:30
SITE STAFF
TOPIC STARTER
@Samuel Oct 7, 2015

Thanks! Great stuff!

Samuel Murray wrote:
In that case, the dialects of "uncleaned RTF" should also be included, right?


Yes, I think so. They seem to fit in the "bilingual file that holds aligned source and target data" category.


 

xxxDorothyX
France
Local time: 15:30
Don't forget Oct 7, 2015

.xlz, which is a bilingual file in XLIFF format specific to Idiom and the Translation Workspace XLIFF editor.
I don't know if it is directly compatible with .xlf or can be opened by Trados Studio, but in any case it can easily be converted into other formats.
As for Stepan's list: Logoport has not existed for years now, but the software can still be used in some cases by those who have access to Translation Workspace (for Word files only).

As for the Wordfast TMs, they are real tab-delimited .txt files that can be read by any office application, especially Excel and Word.
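
Because such a file is plain tab-delimited text, it can also be read with a few lines of Python. This is only a sketch under stated assumptions: the function name `read_tab_delimited_tm`, the sample rows and the column positions are all made up for illustration, since the thread itself notes that no complete specification of the Wordfast layout has been found. The caller passes in which columns hold the source and target segments.

```python
import csv
import io


def read_tab_delimited_tm(text, source_col, target_col, skip_header=True):
    """Return (source, target) tuples from tab-delimited TM text.

    source_col and target_col are zero-based column indexes supplied by
    the caller; this sketch makes no claim about the real file layout.
    """
    # QUOTE_NONE keeps double quotes inside segments as literal characters.
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    rows = list(reader)
    if skip_header and rows:
        rows = rows[1:]
    pairs = []
    for row in rows:
        # Skip short or empty lines that lack the requested columns.
        if len(row) > max(source_col, target_col):
            pairs.append((row[source_col], row[target_col]))
    return pairs


# Made-up rows in a hypothetical seven-column layout:
# date, user, usage count, source language, source, target language, target.
SAMPLE_TM = (
    "header\tline\tignored\n"
    "20150101~120000\tKD\t0\tEN-US\tHello world\tFR-FR\tBonjour le monde\n"
    "20150102~093000\tKD\t2\tEN-US\tSave \"as\"\tFR-FR\tEnregistrer \"sous\"\n"
)

if __name__ == "__main__":
    for pair in read_tab_delimited_tm(SAMPLE_TM, source_col=4, target_col=6):
        print(pair)
```

Keeping the column indexes as parameters means the same function can be pointed at whatever layout a given export turns out to use.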

.po files are not always compatible across applications.
There are clearly several flavours.
I managed to open several Poedit files only with Pootling.



[Edited at 2015-10-07 11:48 GMT]


 

