Is it possible to import a Déjà Vu terminology database (TDB) into a MemoQ term base? Thread poster: Olaf Reibedanz
Olaf Reibedanz Colombia Local time: 01:43 Member (2003) English to German + ...
Dear colleagues, Does anybody know how I can import a Déjà Vu terminology database (TDB) into a MemoQ term base? Thanks in advance for your help! Olaf
Export as CSV | May 21, 2010 |
Olaf Reibedanz wrote: "Does anybody know how I can import a Déjà Vu terminology database (TDB) into a MemoQ term base?"

Yes. In DVX, export as CSV using Unicode or UTF-8 for all languages (don't use the code pages DVX proposes by default). Then import the CSV into MemoQ. For advanced fuzzy terminology recognition, you may use pipes (|) and asterisks (*); e.g. you can insert the pipes using regular expressions.

Cheers
GG
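As an illustration of inserting pipes with a regular expression, here is a minimal Python sketch. It assumes that the pipe marks the point where the invariable stem of a term ends, and it uses a small, hypothetical list of German noun suffixes; neither the suffix list nor the function name comes from DVX or MemoQ, so adapt them to your own data and check the result in a test term base first.

```python
import re

# Hypothetical suffix list (an assumption for this sketch, not a MemoQ setting).
SUFFIXES = ("ung", "heit", "keit")

# A stem of at least four word characters followed by one of the suffixes.
STEM_SUFFIX = re.compile(r"(\w{4,})(" + "|".join(SUFFIXES) + r")")


def insert_pipe(term: str) -> str:
    """Return the term with a pipe between stem and suffix, or unchanged."""
    match = STEM_SUFFIX.fullmatch(term)
    if match is None:
        # Multi-word terms and terms without a listed suffix are left alone.
        return term
    return f"{match.group(1)}|{match.group(2)}"


if __name__ == "__main__":
    for term in ("Übersetzung", "Freiheit", "Haus", "technische Zeichnung"):
        print(term, "->", insert_pipe(term))
```

The function only touches single-word terms, so entries that contain spaces pass through untouched and can be handled by hand.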
Olaf Reibedanz (TOPIC STARTER) Great, thanks! | May 21, 2010 |
I followed your advice and exported the TDB into a .txt file using commas as separators and it worked perfectly! Only problem: In the MemoQ term base, the German special characters (ä, ö, ü, ß) have not come out correctly. Is there anything I can do about it?
Encoding... column names... | May 21, 2010 |
Olaf Reibedanz wrote: "I followed your advice and exported the TDB into a .txt file using commas as separators and it worked perfectly!"

I prefer tabs because my entries may contain commas. As DVX puts the term entries in quotes, this should have no impact on the data import, but it makes the file easier to manipulate, e.g. in MS Excel (the tabs will be recognized independently of the Windows locale).

"Only problem: In the MemoQ term base, the German special characters (ä, ö, ü, ß) have not come out correctly. Is there anything I can do about it?"

You have an encoding problem. Normally MQ recognizes the encoding, but sometimes it defaults to UTF-8 (AFAIK) when you change some settings in the TB import window. If you selected Unicode in DVX, you should probably force Unicode Little Endian in MQ. It works well here.

One more thing: in DVX, name your columns English, German, French, Spanish, Polish, etc. (the names are case-sensitive). It will speed up the import definition in MQ.

Cheers
GG
[Edited at 2010-05-21 13:11 GMT] | |
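If the export and import encodings cannot be made to agree inside the two tools, the exported file can also be converted outside them. The following Python sketch is only an illustration: the function names are made up for this example, and it assumes that the "Unicode" export is UTF-16 with a byte order mark. It checks the byte order mark and re-encodes the data as UTF-8.

```python
import codecs


def detect_bom(data: bytes) -> str:
    """Guess the encoding from the byte order mark, if there is one."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return "unknown"


def reencode(data: bytes, source: str = "utf-16", target: str = "utf-8-sig") -> bytes:
    """Decode with the encoding chosen at export time, then encode for import."""
    return data.decode(source).encode(target)


if __name__ == "__main__":
    # Stand-in for the contents of an exported file (assumed sample data).
    exported = '"German"\t"English"\n"Größe"\t"size"\n'.encode("utf-16")
    print(detect_bom(exported))
    converted = reencode(exported)
    print(detect_bom(converted))
    print(converted.decode("utf-8-sig"))
```

For a real file, read it with `open(path, "rb")` and write the converted bytes with `open(new_path, "wb")`, keeping the original export untouched.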
Olaf Reibedanz (TOPIC STARTER)
Thanks Grzegorz, I had forgotten to choose "UTF-8" when exporting the text file from the TDB. Now it worked.
Selcuk Akyuz Türkiye Local time: 09:43 English to Turkish + ... CSV files and Excel | May 21, 2010 |
It is easy if you have only the term and translation fields. But importing other fields is not so easy, and there are some limitations.

In DVX, the Part of Speech field has the following values: noun, verb, adjective, adverb, article, preposition, pronoun, conjunction, interjection.

However, in MemoQ you have: Noun, Adjective, Adverb, Verb, Other.

So you have to replace article, preposition, pronoun, conjunction and interjection with Other.

You cannot add new fields to MemoQ termbases, but it is possible in both DVX and MultiTerm. For instance, in my DVX termbase I have created a new text field (other than the Context) to add dictionary definitions. It is possible to import this field into the Definition field of MemoQ, but if you have additional fields in your DVX termbase you will not be able to import them.

CSV files can be edited with Excel, but there are problems related to the number of characters in a single cell. So, if the context field has long sentences, you will have to use another CSV editor, e.g. CSVed or uniCSVed (for Unicode).

In any case, conversion of a DVX terminology database will result in some data loss.
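The Part of Speech replacement described above can be done in one pass over the exported file. Here is a minimal Python sketch, assuming a tab-separated export with a header row and a column called "Part of Speech"; the column name, the delimiter and the function names are assumptions for this example, so check them against your own export.

```python
import csv
import io

# DVX values that have a direct MemoQ counterpart; everything else becomes "Other".
DIRECT = {"noun": "Noun", "verb": "Verb", "adjective": "Adjective", "adverb": "Adverb"}


def map_pos(dvx_value: str) -> str:
    """Map a DVX part-of-speech value to one of the five MemoQ values."""
    value = dvx_value.strip().lower()
    if not value:
        return ""  # keep empty cells empty
    return DIRECT.get(value, "Other")


def remap_pos_column(text: str, column: str = "Part of Speech", delimiter: str = "\t") -> str:
    """Rewrite the given column of a delimited table and return the new text."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames or [],
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = map_pos(row.get(column) or "")
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "German\tEnglish\tPart of Speech\nund\tand\tconjunction\nHaus\thouse\tnoun\n"
    print(remap_pos_column(sample))
```

Because the file is processed with a script and never opened in Excel, long context cells are not affected by any per-cell character limit.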
I was wondering, how does the export/import to CSV work with synonyms and acronyms? How about other text fields like Contexts, Definitions or Comments? How are they handled if each term may have a different number of text fields? I have been wanting to try this with DVX and MemoQ but I never have the time. Daniel
Synonyms and table structure... MQ vs Multiterm... | May 22, 2010 |
Daniel García wrote: "I was wondering, how does the export/import work with synonyms and acronyms to CSV?"

Basically, it's the same approach Multiterm Convert uses for XLS or CSV files. The difference is that the export also works.

"How about other text fields like Contexts,"
One field per synonym.
"Definitions"
One field per language.
"or Comments?"
One field at entry level.

"How are they handled if each term may have a different number of text fields?"

The MQ approach is very simple, so you can't add or multiply fields, and the problem doesn't exist at this level. But when you add a synonym, a set of three fields (e.g. Polish, Term_Info, Term_Example) is added internally, so the exported table structure is always valid, unlike in Multiterm.

"I have been wanting to try this with DVX and MemoQ but I never have the time."

In fact, for MQ you now only need to understand the wildcard functions (pipes and asterisks) and the matching parameters (coded in the Term_Info field), and that's all. As you see, it's very simple but terribly effective and bulletproof at the entry level, i.e. for at least 90% of translators. If you export a TDB once, look at the column names and respect the naming conventions, the import is done automatically; otherwise, when you make a mistake, you are always able to map the names correctly, unlike in Multiterm Convert. A more sophisticated solution is planned but no timeframe has been given.

Cheers
GG
[Edited at 2010-05-22 08:55 GMT] | |
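To picture why a table built from such three-field sets stays rectangular even when entries have different numbers of synonyms, here is a small Python sketch. It is not MemoQ's actual export code: the data layout and the function name are assumptions, and only the column names Term_Info and Term_Example are taken from the post above. Each language gets as many column triples as its largest synonym count, and shorter entries are padded with empty cells.

```python
from typing import Dict, List, Tuple

# One term: (text, Term_Info, Term_Example). Layout assumed for this sketch.
Term = Tuple[str, str, str]
Entry = Dict[str, List[Term]]


def flatten_entries(entries: List[Entry], languages: List[str]) -> List[List[str]]:
    """Turn entries with varying synonym counts into a fixed-width table."""
    # Widest synonym count found for each language.
    width = {
        lang: max((len(entry.get(lang, [])) for entry in entries), default=0)
        for lang in languages
    }
    header: List[str] = []
    for lang in languages:
        for _ in range(width[lang]):
            header += [lang, "Term_Info", "Term_Example"]

    table = [header]
    for entry in entries:
        row: List[str] = []
        for lang in languages:
            terms = entry.get(lang, [])
            for i in range(width[lang]):
                # Pad with an empty triple when this entry has fewer synonyms.
                row += list(terms[i]) if i < len(terms) else ["", "", ""]
        table.append(row)
    return table


if __name__ == "__main__":
    entries = [
        {"English": [("house", "", "")],
         "Polish": [("dom", "", ""), ("budynek", "", "")]},
        {"English": [("car", "", "")], "Polish": [("samochód", "", "")]},
    ]
    for row in flatten_entries(entries, ["English", "Polish"]):
        print("\t".join(row))
```

Every row has the same number of cells as the header, which is what makes the table safe to re-import.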
Lost in conversion... | May 22, 2010 |
Selcuk Akyuz wrote: "It is easy if you have only the term and translation fields. But importing other fields is not so easy, and there are some limitations."

Well, you're absolutely right: the conversion to MQ may be lossy because the number of fields MQ handles is limited and rigid.

"In DVX, the Part of Speech field has the following (...) However, in MemoQ you have (...) So you have to replace article, preposition, pronoun, conjunction and interjection with Other."

True. But in fact, during translation I almost never add these attributes. It's too slow and DVX doesn't use them in AutoAssemble. So when I need a dictionary whose only purpose is to speed me up, I don't need a lot of the info DVX/Multiterm TBs may contain. IMO that's exactly the purpose of the MQ TB. As the Kilgray guys wrote in the MQ help: memoQ offers simple and very down-to-Earth terminology management: its term base support is tailored to the needs of translators and reviewers. In terms of the features a terminologist needs, it can't be compared to the heavy Multiterm monster.

"You cannot add new fields to MemoQ termbases, but it is possible in both DVX and MultiTerm."

True.

"For instance, in my DVX termbase I have created a new text field (other than the Context) to add dictionary definitions."

This field exists in MQ by default. The possible problem is that it exists at the language level and not at the synonym level.

"CSV files can be edited with Excel, but there are problems related to the number of characters in a single cell. So, if the context field has long sentences then you will have to use another CSV editor, e.g. CSVed or uniCSVed (for unicode)."

Yes, it's a possible problem, but I doubt it affects more than 5% of translators. I think 80% of the Trados users I know don't use Multiterm at all. Some of them don't know Multiterm exists. Selcuk, you're an advanced user and you have advanced problems.

"In any case, conversion of a DVX terminology database will result in some data loss."

In fact, data losses due to different data structures are more or less normal. E.g. you can't transpose DVX guaranteed matches to MQ 101% matches without very heavy programming work (if at all). Nor is DVX able to leverage MQ 101% matches. That's why one must be careful and conscious of the choices one makes. IMO it's not frequent...

Cheers
GG
[Edited at 2010-05-22 09:31 GMT] | | |
Thanks Grzegorz and Selcuk for your insights on DV and MemoQ! I see they are quite a different type of fish altogether in comparison to MultiTerm. Daniel
MQ vs Multiterm... revisited... | May 24, 2010 |
Daniel García wrote: "Thanks Grzegorz and Selcuk for your insights on DV and MemoQ! I see they are quite a different type of fish altogether in comparison to MultiTerm."

Let's say DVX is indeed completely different (in fact, it's a network of relations, but few people use it this way), but when you take a closer look, MQ is a kind of Multiterm Light: you have a lot of similar concepts, e.g. a flat entry structure with an ID (including synchronization using the ID). It is not as flexible, but you have all the basic functions, far better automation in the import features (e.g. when your import file contains more languages than your termbase definition, you need only one click) and more effective sublanguage leverage (e.g. when you have EN-GB and EN-US in your termbase, both are suggested, unlike in Multiterm, which is always a 1:1 tool). Generally, I think it's the approach Multiterm should have adopted to be user-friendly and effective in the standard use cases...

If you compare the homogeneous programming environment in MQ (.Net and SQL) with .Net, MS Jet (which is slowly getting outdated) and Java, which the Multiterm guys have been unable to use in a stable way (it's been 10 years now...), I think the basic terminology handling in MQ will improve in the future, but the Multiterm programmers will never be able to fix their eternal bugs and the flaws in the concept. My favourite example is the CSV export, which never worked in complex scenarios. Multiterm is collapsing under its own weight.

BTW, I missed one thing: you can also add images to MQ termbases, whereas the DVX TB is a text-only tool.

Cheers
GG
[Edited at 2010-05-24 08:42 GMT] | | |