ProZ.com global directory of translation services
Thread poster: Henry Dotterer
Clients / large translation companies now talking about pooling linguistic data. Should we be there?

Tomás Cano Binder, CT  Identity Verified
Spain
Local time: 04:48
Member (2005)
English to Spanish
+ ...
We don't own the IP Aug 1, 2008


AFLSSInc wrote:
The TAUS model specifically distinguishes "content contributors" from "content users" and "end-customers". Of course only content owners are able to contribute content.

Nobody would expect a user to contribute intellectual property that is not his own of course.


Which brings us back to the fact that this statement might be completely out of the question in a TAUS or similar situation:


Henry D wrote:
The current survey shows that over 20% of respondents are already sharing TMs privately and getting good results (like >20% increase in throughput). Nothing is stopping anyone here from doing that, and if the data is any indication, experimenting with it may not be a bad idea.


I.e. we as translators cannot contribute to the memories as the stuff we translate does not belong to us...


RoyMarie
United States
Local time: 19:48
Why TAUS does not make sense and why data sharing does? Aug 2, 2008

While I have moved away from actual translation work into technology and yes even translation automation, I would like to offer a perspective of somebody who is involved with language technology in various ways, and as one who can see the rationale for translation automation at a high level.

TAUS is an organization made up of localization people involved in "large" corporate translation projects of mostly very static information, i.e. manuals, documentation, web pages about products, software interfaces, etc. All important to do, but not really high-profile in terms of the overall corporation: when was the last time you read the manual of a car you bought, or of a mixer or a blender? Business line managers who run international business divisions in sales/marketing/production are the real power players here, and people involved with "localization" are generally much lower in stature and influence. The TAUS approach and modus operandi is very much localization-culture based.

To my observation, the forces really driving these sharing and efficiency initiatives are the core business imperatives that arise from the increasingly flat world, from a global marketplace enabled by Web 2.0+ technologies. For example, I have children who routinely buy designer clothes from China at a fraction of the cost at a local store. We all know of many examples that show the world is opening up in this way, though this may be a silly example.

This direct contact among globally scattered buyers and sellers means that a lot more information needs to be translated to enable and facilitate trade. As countries across the world grow online populations, the thirst for information beyond shopping also grows. Why is the Indonesian Wikipedia only 20,000 pages when the English Wikipedia is 3M+ pages? 260M people live in Indonesia.

There are many such discrepancies across the world. Knowledge is concentrated in G7 languages or maybe just English, German, French & Japanese, where the bulk (90%+) of the world's patents come from.

It is not possible to convert the knowledge of the world with just a human translation effort. Automation is necessary, but automation alone cannot succeed; it needs competent human guidance for this to work. Global corporations now understand that they really need to make all kinds of information available to build loyal and satisfied global customer bases. Sharing general linguistic assets and intelligent man-machine collaboration can enable global enterprises to convert huge masses of knowledge content to many languages, and truly make knowledge a universal resource.

So back to TAUS -- the approach is perhaps outdated. It is a Web 1.0 approach.

Google informed us that while there are 700,000 human translators who might be considered professionals across the globe there are actually 600M+ competent bilingual people on the Internet who are capable of doing some translation or help clean up automated translation. Is it useful to get these people involved? How could this happen?

The TAUS Data Association (TDA) does not make sense for the following reasons:

The TAUS approach has a predominantly localization focus and so will not draw many professional translators, who are perhaps afraid of being marginalized, and it is too far off the beaten path to draw the 600M+ that could help with massive translation projects.

Even more specifically:

1) The technology platform for the pooled data is undefined and so it is very unclear what the benefit will be based on the very meager definitions that have been presented on the platform to date. There is also no clear definition or even discussion of standards that could be the foundation that would drive intelligent data collection and aggregation.

2) Anybody who has pooled TM from disparate sources or played with Open Source SMT (Moses) is aware that just pooling data does not automatically lead to benefits. A standardization and normalization process needs to take place to make the data equivalent and compatible. Initial TAUS tests have shown that there was very little benefit from just throwing data into a common bucket.

Again this is not recognized as an issue by TDA and this should give ProZ reason to hesitate. I have worked with many differing pools of data with Moses and understand that a significant amount of work needs to be invested in data cleaning, normalization and preparation for the leverage to be meaningful. There is very little awareness of this process at this point among the people who have joined who all assume that more is better.

3) The costs are somewhat high for the relative value and will discourage the many smaller players who could also benefit and contribute. The business model encourages high volume contribution but has no mention of quality as a consideration for benefit. So it is possible that large amounts of crap will be collected. Old TM that has been lying around is "donated" to make a muddy soup.

4) Much of the data that will be made available can be downloaded without trouble from the websites anyway and could easily be aligned for a lower cost than that incurred by joining the association and buying the data access. Remember that most of the data they plan to put into the consortium is already on the website of the contributor. In the case of the EU, all the data can simply be downloaded as TMX files with no problem at all. So why would anybody want to go there to get it?

5) There is no outreach to the little man, the freelance translator, the 600M+ who under the right conditions could be encouraged to contribute 5 sentences each. The TDA is basically an old boys' network. Web 2.0 is all about empowerment of the masses, engaging hundreds of thousands to change the world. Yes, only a few really contribute, but Wikipedia is a good example of open collaboration where the process really does produce usable quality. Tens of thousands do a little and maybe an elite 1000 is responsible for the huge bulk of the work. This is true for many social-network-based collaborations.
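
To make point 2 a little more concrete, here is a minimal sketch of the kind of cleaning that pooled TM data needs before it is of much use. It is only an illustration under simple assumptions: each TM is taken to be a plain list of (source, target) pairs, and the function names `normalize` and `pool_tms` are made up for this example and are not part of any TAUS, Moses or ProZ tool. Real pipelines would also have to deal with tags, tokenization, segmentation differences and so on.

```python
import unicodedata
from collections import defaultdict


def normalize(segment: str) -> str:
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    text = unicodedata.normalize("NFC", segment)
    return " ".join(text.split())


def pool_tms(*tms):
    """Merge several TMs (lists of (source, target) pairs) into one.

    Hypothetical helper: returns (pooled, conflicts). `pooled` is a
    deduplicated list of normalized pairs; `conflicts` maps a case-folded
    source segment to the set of differing targets found for it.
    """
    targets_by_source = defaultdict(set)
    pooled = []
    seen = set()
    for tm in tms:
        for source, target in tm:
            src, tgt = normalize(source), normalize(target)
            if not src or not tgt:
                continue  # drop empty or whitespace-only units
            key = (src.casefold(), tgt.casefold())
            if key in seen:
                continue  # same pair already pooled (ignoring case/spacing)
            seen.add(key)
            pooled.append((src, tgt))
            targets_by_source[src.casefold()].add(tgt)
    # Sources with more than one distinct target need a human decision.
    conflicts = {s: t for s, t in targets_by_source.items() if len(t) > 1}
    return pooled, conflicts
```

Even this toy version shows why "more is better" does not hold automatically: duplicates add nothing, and the conflicting targets it reports are exactly the units somebody competent has to review.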

So if data sharing makes so much sense, why am I beating on the TDA?

Actually I think it is an admirable first effort that at least raises the possibility of shared action. The committee should be commended for coming up with the idea of sharing. However, I think it is flawed enough in its current form to be a truly bad investment of ProZ time and money.

I think it is worth raising these issues as they may get attention and perhaps a few of these issues can even be addressed.

So what could ProZ do?

ProZ could be a real force in an initiative that was a collaboration of globally scattered experts who guided and managed intelligent data pooling. The KudoZ system is an example of how they could lead on building broadly leverageable linguistic assets. Maybe make this more open, package it and find new ways to monetize this effort by selling subscriptions to corporations and LSPs. The ProZ Living Dictionary.

ProZ could work with a standards focused organization to develop a more useful data aggregation strategy that considers how the data could be normalized and standardized. In exchange for this, maybe contributing members get access to the super data at no cost and also can get consulting contracts with corporations who want to do this behind firewalls. I am sure that LISA and OSCAR would welcome a collaboration and ongoing dialogue.

ProZ could help members develop new kinds of professional services that focus on translation-related but not purely translation work, e.g. Translation Corpus development, Linguistic Consulting services, Data Normalization strategies, etc., and services from virtual expert panels that could e.g. advise global enterprises on the best way to convert a 100,000-page knowledge base into 10 languages, pulling in ProZ membership to help with the post-editing (for a fee of course).

ProZ could save the money they would most likely waste going to TAUS meetings and develop educational programs that help members understand basic NLP concepts, SMT, RbMT, Advanced Leveraging, Corpus Preparation, and efficient MT post-editing. These are all technologies that are gaining momentum and that companies will use in future. If the members understand these technologies and start getting involved and develop some expertise, they may find that there is plenty of work available as the world tries to move to a model where anything that exists in a source language should also exist in 30 other languages. I believe that world is coming soon.

ProZ could help members devise time and skill contribution based service business models for services rendered rather than the hated (3) cents per word model.

ProZ could make technology investments that would facilitate member engagement with these next-generation technologies. Build a collaboration platform so that members can engage and experiment and develop expertise in a ProZ environment.

ProZ could form some technology partnerships with Web 2.0 companies so that their expert members could draw on the 600M+ online bilingual population as a resource. Managed crowdsourcing: find ways to engage people to build up some linguistic assets that become critical for people to consider.

Having said all this, I am not a translator and so I cannot say I feel your pain. I see translation as a fundamentally human activity, one that will never be completely replaced by computers, but one where humans can use machines to leverage themselves. Like singers with microphones.

The coming wave of needed translation of enterprise and world knowledge is truly going to be too much for humans to handle alone, and ProZ should lead in a community collaboration approach to help change the world of translation and the role of translators, rather than follow this unclear TDA mission.



Siegfried Armbruster
Germany
Local time: 04:48
Member (2004)
English to German
+ ...
Let's stop talking and get some work done. Aug 2, 2008

I am fed up with the discussion on how the use of TMs is going to interfere with the "pure" quality of translations or who owns the IP on what.

Fact is, a lot of the material I'm translating is standardized. What should be wrong in producing a TM from the EMEA guidelines for PILs and SPCs or the "R & S sentences" and sharing it? What should be wrong in producing lists of standardized translations for "adverse effects" or "limited warranty" sections? All these text segments exist several hundred or thousand times in the WWW and are reused by many translators after researching the internet.

I am not breaking anybody's IP when using the standard translation of the first 10 sentences of a PIL. In fact these translations have to comply with the EMEA standards. I can imagine that there are many areas where such relatively small, specialized TMs could be created that would be helpful to others, and I would be the first person to use them, if the source of the translation could be tracked. The quality issue would play no role if the translations are standardized and approved by an organization or relevant body.

Sure, this would give "translators" who are not fit or willing to search the internet a competitive advantage and close part of the gap between this group and the group that does the research and produces the TMs, but this could be addressed by a business scheme where the person/body producing the TM would benefit (in the form of money) from the download by a third party.

All in all, the time has come where we (as translators) have to realize that our industry is changing, it is no longer paper and pencil, it is no longer a PC and just Word, we are on the edge of automation and industrialization of large parts of the translation industry.

Huge amounts of source texts are perfect for an "industrialized" translation approach. It will happen (it is already happening), and it is up to us to be part of it. "Industrialization" not only means that we use tools such as Word, Wordfast, Trados, Alchemy or whatever, it also includes a market for "semi-finished goods" and "components" that are used to produce the end product (in our case the translation).

The TAUS TDA initiative is interesting. To me, the most interesting part is which company (or companies) are not part of it. As an example, as far as I know, Google is not part of it. Why not? Because, in my opinion, Google has developed a technology based on the contents of the internet (who owns the IP on the contents Google is harvesting?) and uses a statistical translation approach that is better than all other solutions we have seen up to now. This was, in my opinion, also one of the main reasons why TAUS TDA was initiated: the members of TAUS are motivated to produce their own solution to protect their interests.

The other companies that are not involved are highly successful translation agencies (I'm not naming any of them here), which in my opinion are already using a collaborative, highly industrialized translation approach and their customers. They are not interested, since at least some of them realized 140% growth in 2007. Why should they give away their competitive advantage?

TAUS TDA is an initiative we should closely monitor, but we should get our act together and react to the "industrialization" issues we are facing as freelancers. Proz could be the place to implement a working TM marketplace, but I'm not convinced that this will happen, since in my opinion Proz already missed several opportunities in the past (e.g. KOG) and is getting more and more outsourcer-centric.

Ok, this is my view on things. Now it is back to business.
Siegfried



Henry Dotterer
United States
Local time: 22:48
Member

SITE FOUNDER
TOPIC STARTER
I agree, Siegfried Aug 2, 2008

Good post. Very insightful and fair (except the outsourcer-centric part; that is a misconception). I also agree with you about Google, and about the possibility of having a TM solution of our own here at ProZ.com.

Thanks for answering the surveys, everyone. This will be a topic for the upcoming conferences.


Max2Zam
German to French
+ ...
There's no future in translation Aug 2, 2008

I am a student specialised in translation and this is exactly the type of evolution in the translation's sphere that pushes me not to keep on getting specialised in that profession.
I see that prices are low and that the biggest translation agencies now gather to have even more power concerning prices over translators.
What can we do against such things? An easy answer: nothing!
Translation was something I dreamed about when I was younger. Next week I'll turn 25 and I know that my professional future does not belong to translation anymore.



Gerard de Noord  Identity Verified
France
Local time: 04:48
Member (2003)
German to Dutch
+ ...
Blame your professors Aug 2, 2008


Max2Zam wrote:

I am a student specialised in translation and this is exactly the type of evolution in the translation's sphere that pushes me not to keep on getting specialised in that profession.
I see that prices are low and that the biggest translation agencies now gather to have even more power concerning prices over translators.
What can we do against such things? An easy answer: nothing!
Translation was something I dreamed about when I was younger. Next week I'll turn 25 and I know that my professional future does not belong to translation anymore.


By all means, call your professors to account.


As Siegfried wrote:

All in all, the time has come where we (as translators) have to realize that our industry is changing, it is no longer paper and pencil, it is no longer a PC and just Word, we are on the edge of automation and industrialization of large parts of the translation industry.


You should have been informed about this when you started your linguistic studies, but the people who taught you were probably my age. They recorded or pasted their first Word macro in 2004 and were proud they finally could paint Word styles. Many of those paper-and-pencil professors overlooked the importance of Word, PCs and CAT, and I'm sure they'll miss out on the next step too.

Max2Zam, stay pragmatic. Your language pairs will be among the first to suffer from 'industrialization', so keep away from the specialisations that will most likely be affected first, like - let me guess - Automotive, EU, IT, Legal, Medicine. Specialise in subjects that have to stay fresh to sell.

ProZ.com is a global venue and has to address global challenges, but to put it all into perspective: I translate into Dutch, a language too small to put any effort into developing serious MT applications and probably too small to 'leverage pooled data'.

ProZ.com is a community of individuals facing dissimilar challenges. I feel a bond with all my colleagues over here but when I read that the profession or the industry is going in a certain direction, my first thought is: poor Spanish/French/Russian/Hindi into English etc. translators, they already earn half of what I do.

Regards,
Gerard



Janet Rubin  Identity Verified
Australia
Member (2008)
German to English
An aside on legal Aug 3, 2008


Gerard de Noord wrote:

so keep away from the specialisations that will most likely be affected first, like - let me guess - Automotive, EU, IT, Legal, Medicine. Specialise in subjects that have to stay fresh to sell.



I wouldn't necessarily lump legal in there just yet.

It's true that a lot of contracts can be (and have been) standardized, but I do a lot of litigation translation, and I don't believe that litigation will be easily automated any time in the near future! Case in point, it's pretty hard to get decent "TM matches" out of the argumentation used by lawyers (especially the ones that prefer sentences with 100+ words).



Tomás Cano Binder, CT  Identity Verified
Spain
Local time: 04:48
Member (2005)
English to Spanish
+ ...
How much is standardized contents? Aug 3, 2008


Siegfried Armbruster wrote:

I am fed up with the discussion on how the use of TMs is going to interfere with the "pure" quality of translations or who owns the IP on what.

Fact is, a lot of the material I'm translating is standardized. What should be wrong in producing a TM from the EMEA guidelines for PILs and SPCs or the "R & S sentences" and sharing it? What should be wrong in producing lists of standardized translations for "adverse effects" or "limited warranty" sections? All these text segments exist several hundred or thousand times in the WWW and are reused by many translators after researching the internet.

I am not breaking anybody's IP when using the standard translation of the first 10 sentences of a PIL.


Yes, I completely agree: many standard sentences and statements exist in our translations. Ok. Let's put them together in a TM to be shared among anybody interested. Now:

1. Does a TM with standardized sentences and expressions make a difference to the world of translation? Not really: It will only mean something in the range of 5-10% of a new PIL translation, and a nominal fraction of the new translation of a patent. The rest will have to be translated.

2. Does such a TM reduce the required level of expertise of the next translator having to work on a new PIL or patent? Not really: The same expertise is needed for the non-standard part. If you are not an expert, you should not translate the next PIL or patent, and if you are an expert, you already know the standard sentences and expressions by heart.

So... in my opinion, privacy is still a problem here, because what we are discussing here is not standard lines and headings: we are talking about full translation memories. Please do not minimise that fact by talking about standard sentences and terminology anybody can already find in Eurlex and other sources.

The promotion in this forum of the violation of our customers' rights and expectations about us as a profession really strikes me. Also, my questions to Henry about "privately" sharing TMs are still unanswered.



Henry Dotterer
United States
Local time: 22:48
Member

SITE FOUNDER
TOPIC STARTER
Answer to Tomás Aug 3, 2008


Tomás Cano Binder wrote:
The promotion in this forum of the violation of our customers' rights and expectations about us as a profession really strikes me. Also, my questions to Henry about "privately" sharing TMs are still unanswered.

RobinB was among those who answered you on that. Obviously, we are limiting ourselves to instances where sharing would be possible and legitimate.



Tomás Cano Binder, CT  Identity Verified
Spain
Local time: 04:48
Member (2005)
English to Spanish
+ ...
Asking the customer? Aug 3, 2008


Henry D wrote:

Tomás Cano Binder wrote:
The promotion in this forum of the violation of our customers' rights and expectations about us as a profession really strikes me. Also, my questions to Henry about "privately" sharing TMs are still unanswered.

RobinB was among those who answered you on that. Obviously, we are limiting ourselves to instances where sharing would be possible and legitimate.


You mean asking the customers who ─except for the standard sentences and headings to be found in international standards─ are the owners of what we translate?



Tomás Cano Binder, CT  Identity Verified
Spain
Local time: 04:48
Member (2005)
English to Spanish
+ ...
"having a TM solution of our own here at ProZ.com" Aug 3, 2008


Henry D wrote:
Good post. Very insightful and fair (except the outsourcer-centric part; that is a misconception). I also agree with you about Google, and about the possibility of having a TM solution of our own here at ProZ.com.


I think this, and no other reason, is why we are discussing TAUS' initiatives for so many days. Henry, if your intention is to create a TM or TU sharing tool in Proz.com, why don't you plainly explain your motives and goals, so that we can tell whether we can ─or feel allowed and entitled to─ use it? We are grown-ups, as somebody said here. We can take it.

In my opinion, Proz.com is doing a great job at commoditizing translation, with no lower limit to the rates in pairs where setting a lower limit would not be a problem, and a job bank that is no longer a job bank but an eBay auction where our rates go lower and lower.

Creating a TM sharing method here in Proz will probably take us to the next step: by allowing translators (or those translators who are not too worried about privacy of their work) to put their TUs in Proz.com, you will bring commoditization of translation a step further, as customers will have the feeling that anybody can translate a patent, a PIL, a manual, or a contract. This will again lower our rates in the near future.

Cross your heart Henry: Are you sure that your interest is to promote our well-being as translators? Or are you trying to promote Proz.com, your income and traffic? In a sense, these two goals look somewhat incompatible to me if the only way you see of promoting our well-being is commoditizing translation further and further.



Henry Dotterer
United States
Local time: 22:48
Member

SITE FOUNDER
TOPIC STARTER
Doubting Tomás Aug 3, 2008


Tomás Cano Binder wrote:

Henry D wrote:
Good post. Very insightful and fair (except the outsourcer-centric part; that is a misconception). I also agree with you about Google, and about the possibility of having a TM solution of our own here at ProZ.com.

I think this, and no other reason, is why we are discussing TAUS' initiatives for so many days.

You had a different theory, quite the opposite of this one, just yesterday. You perhaps think that everyone but you, having taken this topic at face value, has been fooled?

Cross your heart Henry: Are you sure that your interest is to promote our well-being as translators?

To know what motivates me, and my team, you can just look around this site. I hope you find things that are valuable for your business.


Bucherre  Identity Verified
United States
Local time: 19:48
English to French
+ ...
good form, Henry Aug 3, 2008


Henry D wrote:

Hi all,


TAUS Data Association Incorporated by Group of 40 Founding Members

Amsterdam, June 30, 2008: Forty organizations active in buying and supplying translation services and technologies have jointly established a new industry association aimed at sharing parallel language data with the objective to stimulate innovation and automation of translation activities. The TAUS Data Association (TDA)...


For the whole thing, see: http://www.translationautomation.com/joomla/index.php?option=com_content&view=article&catid=45:news_archive&id=173:press-release&Itemid=46

It strikes me that translators, and to some extent small companies, are not yet represented, even though we might benefit from, and also contribute to, efforts such as these.

Two questions:

1. Does this topic matter to your business? How?
2. Should ProZ.com join as a company, maybe with a designated attendee or two (translator and small company) from among our members?



Madeleine MacRae Klintebo  Identity Verified
United Kingdom
Local time: 03:48
Swedish to English
+ ...
Name calling is too childish for a professional (?) site Aug 3, 2008


Henry D wrote: Your mother named you well!

Whether or not you agree with Tomas, references to his name are a bit childish. My parents (father's choice actually) named me Madeleine. Check out the etymology of that name and decide what career he might have envisaged for me...

While I'm at it, can I direct everyone to an interesting posting by RoyMarie on page 7 of this thread. As his postings (like mine) have to be vetted, but then get posted chronologically, I think quite a few of our colleagues might have missed it.



Henry Dotterer
United States
Local time: 22:48
Member

SITE FOUNDER
TOPIC STARTER
No offense, Tomás Aug 3, 2008


Madeleine MacRae Klintebo wrote:

Henry D wrote: Your mother named you well!

Name calling is too childish for a professional (?) site.
Whether or not you agree with Tomas, references to his name are a bit childish.

Thanks for letting me know that my post must have come off harsher than I meant it. No offense was intended, Tomás. In fact, your name runs in my family. It is one of my favorites.

