Looking to exchange Dutch-English glossaries.
Thread poster: Michael Beijer

Michael Beijer  Identity Verified
United Kingdom
Local time: 03:37
Member (2009)
Dutch to English
+ ...
Aug 22, 2010

I am presently working on a Dutch-English Translator's Glossary and am in the process of collecting data. If you happen to have any Dutch-English term lists, glossaries, dictionaries, or term bases that you would be willing to share, please let me know.

I have approximately 50,000 (rough) entries worth of glossaries I could offer you in exchange. Even if you would like me to send you what I have already collected and do not want to offer anything in return, that would be OK too.

Regards,

Michael


 

Frank van Thienen  Identity Verified
Canada
Local time: 19:37
Dutch to English
term bases available Aug 23, 2010

Hello Michael,

That's a very generous offer! I'll gladly share my term bases with you. As I've been translating part-time until recently, I don't have anywhere near the 50,000 terms you're talking about, but you can take a look at them and do as you please. I have a legalese TB (very minimal), an academic TB (I've translated several theses and scientific articles) and a "standard" TB that I use for everything else.

What's a good way to get you the files? And which format would you like?
I can send them as memoQ term bases (1.2MB zipped), or export to csv - whatever you like.

Frank


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 03:37
Member (2009)
Dutch to English
+ ...
TOPIC STARTER
Thanks Frank! Aug 23, 2010

I have just sent you an email via the Proz system.

Michael


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 04:37
Member (2005)
English to Spanish
+ ...
How did you get to that amount? Aug 23, 2010

Michael J.W. Beijer wrote:
I have approximately 50,000 (rough) entries worth of glossaries I could offer you in exchange. Even if you would like me to send you what I have already collected and do not want to offer anything in return, that would be OK too.

Out of sheer curiosity, may I ask how you came to have 50,000 entries? Did you research, classify, and define each term individually, or did you just make a collection of glossaries you found around?

I find it hard to believe that someone would give away the results of hard work, and also a definite edge if the material is good, which makes me immediately think that maybe you are not 100% sure of the quality of the entries you plan to give.


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 03:37
Member (2009)
Dutch to English
+ ...
TOPIC STARTER
By being a good googler;) Aug 23, 2010

Tomás,

What I have is (the beginning of a) very large collection of glossaries from all kinds of different fields. At present, I am collecting data, and of course trying to arrive at a good way of classifying and organising the entire mess.

The idea is to create a vast database consisting of Dutch and English terms in order to share them with fellow translators and/or eventually make them available via a website where people can upload/download, edit and maintain them collaboratively. The concept is still very much in development however, and I would be interested to hear if anyone would like to help or form a team of some sort. There are of course already many places on the internet with great Dutch-English terminology resources, but there must be a better way to organise all of this information. I think there are a few very interesting projects online, and I am talking to some of them about ways to improve shared language data.

The problem of quality is of course a very real issue, which I hope can be solved by means of a clever (online) collaborative editing framework, combined with a small team of people willing to help. Basically, my aim is to create a centralised platform where all of the Dutch-English terminology resources that are available can be shared, improved and standardised.

On a more practical level however, and in spite of still containing a veritable host of errors, duplicates, etc, ... these glossaries are already very useful if used (with care;) in memoQ as individual termbases (and maintained in tab-delimited UTF-8 text files).
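
A minimal sketch of how duplicates could be cleaned out of such a file, assuming a hypothetical two-column layout (Dutch term, tab, English term). The function name and the sample entries are illustrative only and are not taken from the actual glossaries:

```python
import csv
import io

def dedupe_glossary(lines):
    """Return unique (dutch, english) pairs in first-seen order.

    Pairs are compared ignoring case and surrounding whitespace.
    Rows with fewer than two columns, or with an empty term, are skipped.
    """
    seen = set()
    unique = []
    # QUOTE_NONE: treat quote characters as ordinary text, not as CSV quoting.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue
        source, target = row[0].strip(), row[1].strip()
        if not source or not target:
            continue
        key = (source.casefold(), target.casefold())
        if key not in seen:
            seen.add(key)
            unique.append((source, target))
    return unique

# Illustrative sample data (assumed layout: Dutch<TAB>English).
sample = io.StringIO(
    "aansprakelijkheid\tliability\n"
    "Aansprakelijkheid \tliability\n"
    "vennootschap\tcompany\n"
    "vennootschap\tpartnership\n"
    "onvolledig\n"
)
for source, target in dedupe_glossary(sample):
    print(f"{source}\t{target}")
```

Note that a term with two different translations is kept twice; only exact repeats of the same pair are dropped, so genuine synonyms survive.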


Michael


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 04:37
Member (2005)
English to Spanish
+ ...
A word of caution Aug 23, 2010

Michael J.W. Beijer wrote:
The problem of quality is of course a very real issue, which I hope can be solved by means of a clever (online) collaborative editing framework, combined with a small team of people willing to help. Basically, my aim is to create a centralised platform where all of the Dutch-English terminology resources that are available can be shared, improved and standardised.

Hm... Well, quality with glossaries is THE issue. A vast majority of glossaries freely available out there are not good enough to spend time converting them into a termbase. They hardly give an idea of where to start researching a new term.

Personally, I dislike the idea of "collaboratively editing", because:
A) You can never be sure that someone editing an entry does know what he/she's doing.
B) I would not want to share my knowledge that freely, since it is one of my competitive edges. Why should I fix other people's mistakes for free?
C) Editing other people's entries will lead to neverending discussions about what term is best, and nobody has the time for that.

I would be really careful with such glossaries grabbed from the web. Good glossaries are sitting in the computers of individual translators who can be considered experts in their field. Information really worth using is certainly not on the web.


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 03:37
Member (2009)
Dutch to English
+ ...
TOPIC STARTER
Information really worth using is certainly not on the web? Aug 23, 2010

Tomás,

"A vast majority of glossaries freely available out there are not good enough to spend time converting them into a termbase."


I beg to differ. It all depends on what you are intending to use it for.

"A) You can never be sure that someone editing an entry does know what he/she's doing."


I agree, the original author might not have known what they were doing, but in your own small team of fellow translators and linguists, there would of course be a certain amount of control over this.

"B) I would not want to share my knowledge that freely, since it is one of my competitive edges. Why should I fix other people's mistakes for free?"


I believe that this is based on the mistaken idea that sharing is not economically viable. It also misses what I believe is a very important point about the possibility of the creation of standards with regard to common business/industrial/government, etc terms in the various European languages and English.

"C) Editing other people's entries will lead to neverending discussions about what term is best, and nobody has the time for that."


Isn't this exactly what is already happening, quite successfully I might add, in the Proz KudoZ term questions?

"Good glossaries are sitting in the computers of individual translators who can be considered experts in their field. Information really worth using is certainly not on the web."


Then why are we all here, now, reading this?
Have you had a look here lately: http://eurovoc.europa.eu/drupal/?q=download/list_pt&cl=en ?
Or here: https://www.tausdata.org/index.php/language-search-engine ? There are some very good sources starting to appear throughout the internet.

Michael


 

FarkasAndras
Local time: 04:37
English to Hungarian
+ ...
Good info on the web Aug 23, 2010

Tomás Cano Binder, CT wrote:


I would be really careful with such glossaries grabbed from the web. Good glossaries are sitting in the computers of individual translators who can be considered experts in their field. Information really worth using is certainly not on the web.


I agree with your point in general, but I definitely do not agree with the last sentence. A lot of good info is out there: glossaries, term lists, dictionaries, parallel texts, ready-made TMs etc. Publicly funded entities (universities, ministries, EU institutions, international organizations of all kinds) tend to publish their resources as they have no financial interest in keeping them for themselves, and some of these are really good.
Of course you have to choose carefully and take every source with a grain of salt.

Apart from the quality issue, there are a couple of potential problems with collecting glossaries from the web. One is the issue of IP rights (can you grab and republish anything that's on a public website?), another is keeping up with the updates of your source glossaries.

Personally, I'd probably drop the collaborative editing idea completely, and concentrate on finding high-quality glossaries and sharing their URLs as well as the glossaries themselves, converted to a unified format everyone can easily import to whatever tools they use. IMO UTF-8 encoded tab-delimited files are the only really good format, with XLS a reasonable alternative for smaller glossaries.
Make sure it's easy to download glossaries of one specific area of specialization, and also that all entries are properly formatted (synonyms!) and documented (source site, date of collection).
Things like the ISI statistical glossary, Electropedia, Eurovoc and the CPV glossary are probably worth the effort, and I'm sure there are more than a few high quality Dutch-specific glossaries as well.
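
As a rough sketch of what "properly formatted and documented" could look like in a tab-delimited file, assume a hypothetical five-column layout: Dutch term, English term(s) separated by `|`, field, source URL, and collection date. The column order, the `|` separator, and the example URL are all assumptions made for illustration:

```python
from dataclasses import dataclass
from datetime import date
from typing import List

@dataclass
class GlossaryEntry:
    dutch: str
    english: List[str]   # one or more synonyms
    field: str           # area of specialization
    source_url: str      # where the entry was collected
    collected: date      # when it was collected

def parse_entry(line: str) -> GlossaryEntry:
    """Parse one line of the assumed five-column, tab-delimited layout."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated fields, got {len(fields)}")
    dutch, english, field, source_url, collected = (f.strip() for f in fields)
    synonyms = [s.strip() for s in english.split("|") if s.strip()]
    return GlossaryEntry(dutch, synonyms, field, source_url,
                         date.fromisoformat(collected))

# Hypothetical entry; the URL is a placeholder, not a real glossary.
entry = parse_entry(
    "rechtspersoon\tlegal entity|legal person\tlegal\t"
    "http://example.org/glossary\t2010-08-23\n"
)
print(entry.dutch, entry.english, entry.collected.isoformat())
```

Keeping the field in its own column makes it easy to filter out one area of specialization, and the source and date columns let users judge each entry's origin for themselves.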

IMO the whole "ultimate collaborative, do-anything, go-anywhere doomsday glossary for all fields" is a pipe dream not worth chasing. The best goal to aim for is probably a good resource that makes other resources easier to find and easier to use.

Hit me up if you need pointers on how to convert HTML and other formats to tab delimited.
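
For the simple case where a glossary is published as a plain HTML table, a conversion could look roughly like the sketch below. It uses only Python's standard `html.parser` module; the class and function names and the sample table are made up for illustration, and real pages will usually need extra cleanup:

```python
from html.parser import HTMLParser

class TableToRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, one list per <tr>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # &ouml; arrives as "ö"
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse whitespace so no tab or newline ends up inside a cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None

def html_table_to_tsv(html_text):
    """Return the table cells as tab-delimited lines."""
    parser = TableToRows()
    parser.feed(html_text)
    parser.close()
    return "\n".join("\t".join(row) for row in parser.rows)

sample = """
<table>
<tr><th>Nederlands</th><th>English</th></tr>
<tr><td>belasting</td><td>tax</td></tr>
<tr><td>co&ouml;peratie</td><td>cooperative  <i>(society)</i></td></tr>
</table>
"""
print(html_table_to_tsv(sample))
```

Inline markup inside a cell (such as the `<i>` above) is dropped and only its text is kept, which is usually what a term list needs.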


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 04:37
Member (2005)
English to Spanish
+ ...
I meant glossaries of course! Aug 23, 2010

FarkasAndras wrote:
Tomás Cano Binder, CT wrote:
I would be really careful with such glossaries grabbed from the web. Good glossaries are sitting in the computers of individual translators who can be considered experts in their field. Information really worth using is certainly not on the web.

I agree with your point in general, but I definitely do not agree with the last sentence. A lot of good info is out there: glossaries, term lists, dictionaries, parallel texts, ready-made TMs etc. Publicly funded entities (universities, ministries, EU institutions, international organizations of all kinds) tend to publish their resources as they have no financial interest in keeping them for themselves, and some of these are really good.

OK, OK! I meant to say glossaries and somehow wrote "information". Sorry! I stand by the rest.


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 04:37
Member (2005)
English to Spanish
+ ...
Some answers Aug 23, 2010

Michael J.W. Beijer wrote:
"A vast majority of glossaries freely available out there are not good enough to spend time converting them into a termbase."

I beg to differ. It all depends on what you are intending to use it for.

Being a translator I can only think of using glossaries for translation. What other lines of business were you thinking of? Selling them to translators? That brings us again to translation. Only good glossaries (made by bilingual experts) are worth using.

Michael J.W. Beijer wrote:
"A) You can never be sure that someone editing an entry does know what he/she's doing."

I agree, the original author might not have known what they were doing, but in your own small team of fellow translators and linguists, there would of course be a certain amount of control over this.

Exactly. It's the "small team of fellow translators" stage I was thinking of in fact!

Michael J.W. Beijer wrote:
"B) I would not want to share my knowledge that freely, since it is one of my competitive edges. Why should I fix other people's mistakes for free?"

I believe that this is based on the mistaken idea that sharing is not economically viable.

I never said that it was not viable. It is very viable... for those who receive!

Michael J.W. Beijer wrote:
"C) Editing other people's entries will lead to neverending discussions about what term is best, and nobody has the time for that."

Isn't this exactly what is already happening, quite successfully I might add, in the Proz KudoZ term questions?

Yes, and what do you think about the results? Would you use KudoZ as a solid source of terminology information? It's just an orientation help really, exactly because there is collaboration...

Michael J.W. Beijer wrote:
"Good glossaries are sitting in the computers of individual translators who can be considered experts in their field. Information really worth using is certainly not on the web."

Then why are we all here, now, reading this?

Do you mean that we are all interested in sharing glossaries and collaborating in terminology? I am here because I don't agree with it.

Michael J.W. Beijer wrote:
Have you had a look here lately: http://eurovoc.europa.eu/drupal/?q=download/list_pt&cl=en ?

An institution. Not collaboration.


Michael J.W. Beijer wrote:
Or here: https://www.tausdata.org/index.php/language-search-engine ?

An association of companies with a business interest. Not collaboration.


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 03:37
Member (2009)
Dutch to English
+ ...
TOPIC STARTER
To share or not to share... Aug 25, 2010

... so that seems to be the question.

Hmm, I started this post as a way of asking around here in the forums if anyone would like to help out creating a Dutch-English / English-Dutch glossary, or collection of glossaries. Not to debate whether we as translators should or shouldn't do so. I obviously have my beliefs, and others have theirs. I am a firm believer in sharing, and I believe that it is actually possible to share and collaborate while at the same time doing good business. TAUS is a very good example of this. I would urge everyone to take a few moments to go over and take a look at what they have to say on their websites*. Tomás calls them "An association of companies with a business interest. Not collaboration." Hmm. I agree, they are an association of companies with a business interest. But then, aren't we translators, whether gathered together here on Proz or other online places to get some help with a difficult term, or perhaps just to socialise, ... aren't we exactly the same?

I would prefer comments about ways of helping out, rather than criticism, albeit on the surface constructive criticism.

So, once more:

I am presently working on a Dutch-English Translator's Glossary and am in the process of collecting data. If you happen to have any Dutch-English term lists, glossaries, dictionaries, or term bases that you would be willing to share, please let me know.

I have approximately 50,000 (rough) entries worth of glossaries I could offer you in exchange. Even if you would like me to send you what I have already collected and do not want to offer anything in return, that would be OK too.

Michael

p.s. Thanks to Frank for sharing, and FarkasAndras: I might just take you up on that HTML -> tab delimited offer sometime! I am currently in the process of learning all of the strange and wonderful things about Column Editing (in UltraEdit + MadEdit) and what that can help you do to tab delimited text files with glossaries.


* http://www.translationautomation.com/
http://www.tausdata.org/


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 04:37
Member (2005)
English to Spanish
+ ...
And in the depth! Aug 26, 2010

Michael J.W. Beijer wrote:
I would prefer comments about ways of helping out, rather than criticism, albeit on the surface constructive criticism.

Not only on the surface, Michael! I mean it for the good of translators. I sincerely think we trust shared glossaries too much, and that they can lead us to tremendous mistakes (thus discrediting not just ourselves, but the profession) if we are not very careful and do not do our research. It is not easy to fight the temptation to use a glossary instead of researching, and where there is weakness, there is danger...


 

FarkasAndras
Local time: 04:37
English to Hungarian
+ ...
Go right ahead Aug 26, 2010

Michael J.W. Beijer wrote:

p.s. Thanks to Frank for sharing, and FarkasAndras: I might just take you up on that HTML -> tab delimited offer sometime! I am currently in the process of learning all of the strange and wonderful things about Column Editing (in UltraEdit + MadEdit) and what that can help you do to tab delimited text files with glossaries.

Any time.
I don't know much about how the two tools you mention work/can be used for such purposes, though.
I use tools that basically do really advanced search and replace with regular expressions (regex): sed and Perl.
Sed is a little bit less hassle, Perl is a little bit more advanced.

While some text editors are almost as powerful in terms of features as sed and Perl, they all fail above a certain file size*. It's surprisingly easy to reach the file size where these editors cease to be a feasible option.
My text editor of choice is Notepad++, which has solid regex features, but it's still not quite as advanced as sed and Perl, it can't handle as much of a workload, and it's nowhere near as fast.
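
To illustrate why stream-based tools cope with large files, here is a small sketch of the same idea in Python rather than sed or Perl: the input is read and rewritten one line at a time, so memory use does not grow with file size. The input convention ("dutch term = english term", one pair per line) is a made-up example, not a format mentioned in this thread:

```python
import io
import re

# Assumed input convention: "dutch term = english term", one pair per line.
SEPARATOR = re.compile(r"\s*=\s*")

def convert_line(line):
    """Replace the first '=' (and any spaces around it) with a single tab."""
    return SEPARATOR.sub("\t", line.rstrip("\r\n"), count=1)

def convert_stream(infile, outfile):
    """Rewrite line by line; only one line is held in memory at a time."""
    for line in infile:
        outfile.write(convert_line(line) + "\n")

# Demo on in-memory text; real use would pass files opened with
# open(path, encoding="utf-8") instead of StringIO objects.
source = io.StringIO(
    "handelsregister = commercial register\n"
    "jaarrekening=annual accounts\n"
)
target = io.StringIO()
convert_stream(source, target)
print(target.getvalue(), end="")
```

Lines without a separator pass through unchanged, so they can be found and fixed by hand afterwards.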

BTW, these tools can also help you automatically convert tab delimited glossaries to TBX, other sorts of XML (such as the Multiterm import format), HTML or even xls.
So you can just store your 1000 glossaries in txt and generate 1000 Excel spreadsheets with a couple of clicks if needed.
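
As one hedged example of such a conversion, the sketch below turns tab-delimited text into an HTML table using only the Python standard library. The function name is invented for illustration; producing TBX or a spreadsheet would presumably follow the same read-and-rewrite loop with a different output writer, which is not shown here:

```python
import html

def tsv_to_html_table(tsv_text):
    """Turn tab-delimited lines into a simple HTML table, one row per line."""
    rows = [line.split("\t") for line in tsv_text.splitlines() if line.strip()]
    out = ["<table>"]
    for row in rows:
        # html.escape protects characters such as & and < inside terms.
        cells = "".join(f"<td>{html.escape(cell.strip())}</td>" for cell in row)
        out.append(f"  <tr>{cells}</tr>")
    out.append("</table>")
    return "\n".join(out)

print(tsv_to_html_table("O&O\tR&D\nbelasting\ttax\n"))
```

Escaping matters more than it may seem: terms containing `&` or `<` would otherwise produce broken markup.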


* except for vim, which isn't what you would normally think of as a text editor

[Edited at 2010-08-26 06:52 GMT]


 

