how to convert a plain text glossary into a TermBase recognized by memoQ
Thread poster: Angel Llacuna

Angel Llacuna  Identity Verified
Spain
Local time: 00:04
English to Spanish
May 11

I created a glossary of terms using a simple plain text editor such as TextPad.

There is the English term followed by an equal sign, and then the Spanish translation.
For some words, several translations are given, separated by asterisks.

An extract of this glossary looks like this:



How can I convert this plain text glossary into a TermBase that memoQ can recognize while I translate with that tool?


 

Anthony Green  Identity Verified
Italy
Local time: 00:04
Italian to English
often quite a lot of steps May 11

Angel, if I were you I would upload the first, say, 100 items, and then we could see how to deal with all those synonyms and tags you have in there.
Off the top of my head I wouldn't like to say "just do X, Y and Z", but there is no doubt that it can be done.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 00:04
Member (2006)
English to Afrikaans
@Angel May 11

Angel Llacuna wrote:
How can I convert this plain text glossary into a TermBase that memoQ can recognize while I translate with that tool?


I'm not a MemoQ user but I checked the MemoQ help file:
http://kilgray.com/memoq/2015-100/help-en/index.html?term_base_csv_import_settings_.html

It would seem that you need to import it as "CSV". I think you should first replace all " = " with a tab, and then replace all " * " with e.g. a semicolon. When importing, select Tab as the delimiter. Tick the option "Split alternatives in field by" and put a semicolon in the box. I'm not sure if renaming your file to something.csv is required.
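If doing the two replacements by hand in a text editor is awkward, they can also be scripted. Here is a rough Python sketch, assuming every glossary line has the shape "term = translation * translation" (the function names are made up for illustration, and this only prepares the file; it does not touch MemoQ itself):

```python
def to_tab_delimited(line):
    """Turn 'term = a * b' into 'term<TAB>a;b'."""
    # Split on the first "=" only, in case a translation contains one too.
    source, sep, targets = line.partition("=")
    if not sep:
        raise ValueError(f"no '=' found in line: {line!r}")
    # Separate the alternatives on "*" and drop surrounding spaces.
    alternatives = [t.strip() for t in targets.split("*") if t.strip()]
    return source.strip() + "\t" + ";".join(alternatives)


def convert_glossary(lines):
    """Convert all non-empty lines; blank lines are skipped."""
    return [to_tab_delimited(line) for line in lines if line.strip()]


if __name__ == "__main__":
    sample = [
        "acquire = adquirir * obtener * conseguir * recoger",
        "",
    ]
    for row in convert_glossary(sample):
        print(row)
```

The output would then be saved as a UTF-8 text file and imported with Tab as the delimiter.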

I'm not 100% sure how "Split alternatives in field by" works, but it looks as if it would do what you're looking for. If it turns out that we misunderstood what that feature does, then I'm afraid you're going to have to turn those multi-translation items into separate items. In other words, change this:

acquire[tab]adquirir;obtener;conseguir;recoger

into this:

acquire[tab]adquirir
acquire[tab]obtener
acquire[tab]conseguir
acquire[tab]recoger

for each term. Can you figure out how to do this? I'm not 100% sure, but from videos it looks to me as if MemoQ would be okay with term bases that contain multiple entries for one source term.
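For what it's worth, the expansion into separate items could be scripted along these lines. This is only a sketch in Python, assuming the file is already in the "source[tab]alt;alt;alt" shape shown above; the function name is hypothetical:

```python
def split_alternatives(row, alt_sep=";"):
    """Expand 'source<TAB>a;b;c' into one 'source<TAB>x' row per alternative."""
    source, _, targets = row.partition("\t")
    return [
        f"{source}\t{target.strip()}"
        for target in targets.split(alt_sep)
        if target.strip()  # ignore empty pieces such as a trailing ";"
    ]


if __name__ == "__main__":
    for line in split_alternatives("acquire\tadquirir;obtener;conseguir;recoger"):
        print(line)
```

A row with only one translation comes back unchanged as a single-item list, so the same function can be run over the whole file.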

By the way, it looks as if your glossary needs a bit of additional tweaking, e.g. you have "-tech" and "(data)" which I believe would have to go into a third column, if you want MemoQ to realise that they are not part of the target text.


 

Angel Llacuna  Identity Verified
Spain
Local time: 00:04
English to Spanish
TOPIC STARTER
Thank you very much, Samuel and also Anthony for replying ... May 11

That bit of info, in the form of -tech, in my glossary excerpt, is context information.
Can I define a third column for it in my CSV file?

accoustic noise declaration = declaración de ruidos -tech


 

Samuel Murray  Identity Verified
Netherlands
Local time: 00:04
Member (2006)
English to Afrikaans
@Angel May 11

Angel Llacuna wrote:
That bit of info, in the form of -tech, in my glossary excerpt, is context information.
Can I define a third column for it in my CSV file?


Yes, it is very common in plaintext CAT tool glossary formats that the third column is for extra information (the "comment" column). This means that you should add a tab (or whatever your separator character is) between the target term and the context information.
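Adding that extra tab could be automated too. The Python sketch below assumes the context information is always the last thing on the line and is either a token starting with a hyphen (like "-tech") or a bracketed note (like "(data)"), preceded by a space. That pattern is my guess from the examples in this thread, and the function names are made up:

```python
import re

# A trailing "-tag" or "(note)" preceded by whitespace. Hyphens inside
# words are not affected because the tag must follow a space.
TAG = re.compile(r"\s+(-\w+|\([^()]*\))\s*$")


def split_comment(target):
    """Return (target_term, comment); comment is '' when no tag is found."""
    match = TAG.search(target)
    if match is None:
        return target.strip(), ""
    return target[:match.start()].strip(), match.group(1)


def to_three_columns(row):
    """Turn 'source<TAB>target -tag' into 'source<TAB>target<TAB>-tag'."""
    source, _, target = row.partition("\t")
    term, comment = split_comment(target)
    return "\t".join([source, term, comment])


if __name__ == "__main__":
    print(to_three_columns("accoustic noise declaration\tdeclaración de ruidos -tech"))
```

Rows without a tag simply get an empty third column, so every line ends up with the same number of fields.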

In my own CAT tool, WFC, I sometimes have separate entries for each translation, but sometimes I have just one entry, and then put the other translations in the comment column, depending on how important it is for the CAT tool to be able to automatically recognise each translation.

My comments also sometimes contain information about where the term came from, etc. In a proper term base, different types of contextual information would be in separate fields (e.g. parts of speech, origin, antonyms, synonyms, definitions, etc.), but in CAT tools with a simpler glossary display, all of that can be written in just one column. I also add information in the comment field if a term was changed from a previous version of the term. I know that some CAT tools have more advanced version tracking capabilities (I would not be surprised if MemoQ can do this), but maintaining all this information takes time, and sometimes you just need a quick-and-dirty glossary import.


 

Stepan Konev  Identity Verified
Russian Federation
Local time: 01:04
English to Russian
Just a minor addition May 11

The file may also be *.txt (it does not have to be *.csv).
All other steps are the same as described by Samuel.

acquire[tab1]adquirir;obtener;conseguir;recoger[tab2]-tech

Assign 'Import as term' both to F0 (Source language) and F1 (Target language).
Assign 'Import as definition' to F2 (Target language).

F0=text before the first tab
F1=text between the first and the second tab
F2=text after the second tab

'Import as definition' is the simplest way, but you can fiddle with 'Import as other field' and select anything else if you like...
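To check beforehand which text will land in F0, F1 and F2, the file can be previewed outside memoQ. A small Python sketch using the standard csv module follows; the F0/F1/F2 labels just mirror the field names above, the function name is hypothetical, and this is not memoQ's own importer:

```python
import csv
import io


def preview_fields(text):
    """Split tab-delimited text into rows of {'F0': ..., 'F1': ..., ...}."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    # Number the fields of each non-empty row from F0 upwards.
    return [
        {f"F{index}": value for index, value in enumerate(row)}
        for row in reader
        if row
    ]


if __name__ == "__main__":
    sample = "acquire\tadquirir;obtener;conseguir;recoger\t-tech\n"
    for fields in preview_fields(sample):
        print(fields)
```

Any row that shows fewer or more fields than expected has a missing or stray tab and is worth fixing before the import.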




[Edited at 2018-05-11 10:32 GMT]


 

Anthony Rudd

Local time: 00:04
Import glossary May 11

Import terminology as CSV TB, UTF-8
specify = as delimiter
specify F0 and F1 as "term"
voilà
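Using = directly as the delimiter only works cleanly if every line contains exactly one equal sign. A quick Python sketch to list the lines that break that assumption (the function name is made up, and whether memoQ trims the spaces around the = is something I have not checked):

```python
def lines_unsafe_for_equals_delimiter(lines):
    """Return (line_number, line) for non-blank lines without exactly one '='."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if line.strip() and line.count("=") != 1
    ]


if __name__ == "__main__":
    sample = ["acquire = adquirir", "x = y = z", "", "no delimiter here"]
    for number, line in lines_unsafe_for_equals_delimiter(sample):
        print(number, line)
```

An empty result means the file can be split on = without any line producing extra or missing fields.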


 

