Thread poster: Henry D | Clients / large translation companies now talking about pooling linguistic data. Should we be there? | Tomás Cano Binder, CT Spain Local time: 07:42
 Member (2005) English to Spanish + ... |
Henry D wrote:
The current survey shows that over 20% of respondents are already sharing TMs privately and getting good results (like >20% increase in throughput). Nothing is stopping anyone here from doing that, and if the data is any indication, experimenting with it may not be a bad idea.
|
|
But Henry, aren't all or most people here bound by non-disclosure agreements with the customers? How could we be sharing memories? No way we can do that. Privacy and confidentiality of our customers sounds like something to me...  | | | | Tomás Cano Binder, CT Spain Local time: 07:42
 Member (2005) English to Spanish + ... | | Sharing TMs "privately"? | Aug 1, 2008 |
Henry D wrote:
The current survey shows that over 20% of respondents are already sharing TMs privately and getting good results (like >20% increase in throughput). Nothing is stopping anyone here from doing that, and if the data is any indication, experimenting with it may not be a bad idea.
|
|
I beg your pardon? Are you saying that 20% of translators (or Proz users, or freelancers, or whatever, for that matter) are sharing the text their customers have entrusted to them with an explicit or customarily assumed requirement for confidentiality? I hope I misunderstood. If I was a customer here, the thought of this would make me shiver!
What does privately mean here, if the materials we are translating do not belong to us, but to our customers? Does privately mean that it is OK to give away our customers' belongings and IP to other people, as long as we don't get caught? | | | | RobinB Germany Local time: 07:42 German to English | | Don't overdo the privacy issue... | Aug 1, 2008 |
| Tomás Cano Binder wrote: I beg your pardon? Are you saying that 20% of translators (or Proz users, or freelancers, or whatever, for that matter) are sharing the text their customers have entrusted to them with an explicit or customarily assumed requirement for confidentiality? I hope I misunderstood. If I was a customer here, the thought of this would make me shiver! |
|
As we don't "share" TMs with anybody, I have no proprietary interest in this issue. But I would point out that:
a) TMs for texts that are in the public domain following translation can't be classed as confidential. The shelf-life of most translations is very limited (which may in itself have a negative impact on the cost/benefit of any TM sharing).
b) I doubt very much that even the most soft-headed translator is going to pass on TMs containing segments relating to texts that are in fact deemed to be confidential. I understand "privately" to mean by way of a private arrangement, not an open market transaction.
Confidentiality is an overarching issue that affects the sharing of any resources, so I wouldn't overstate its specific importance to the topic at hand.
Robin | | | | Tomás Cano Binder, CT Spain Local time: 07:42
 Member (2005) English to Spanish + ... | | Shouldn't we ask the customers/end customers about "private" sharing | Aug 1, 2008 |
RobinB wrote:
a) TMs for texts that are in the public domain following translation can't be classed as confidential. The shelf-life of most translations is very limited (which may in itself have a negative impact on the cost/benefit of any TM sharing).
|
|
Yes, but who is going to check whether something we are translating is already in the public domain or not? I foresee some huge work maintaining just that piece of information in the memories.
b) I doubt very much that even the most soft-headed translator is going to pass on TMs containing segments relating to texts that are in fact deemed to be confidential. I understand "privately" to mean by way of a private arrangement, not an open market transaction.
|
|
Yes, but even if it is a "private arrangement" as you say, shouldn't our customers know about it? And how about their end customers if we work for an agency? I know their answer already: "No way, José! No sharing of our materials with people outside our control, no use of our memories in jobs for competing companies, or any other companies whatsoever, without our express approval. And thanks for the hint; we will look for some other translator who is serious about our NDAs!" | | | | Oleg Prots Ukraine Local time: 08:42
Member (2003) English to Ukrainian + ... | | Inspired by the project at hand... )) | Aug 1, 2008 |
Just some thoughts on another aspect of the problem:
In order to gain the most out of "pooled" TM resources, big companies will also need to standardize their authoring processes. This, along with some benefit, will also bring more and more poorly digestible "most context neutral" constructions that translators will be required to translate in an equally "most context neutral" way so that they can be reused across different products...
Just now I am editing a manual where the client intentionally avoids calling a PC monitor a "monitor" - it is referred to as "product" instead. And although a lot of "product", "product", "product" looks ugly in the translated text, I can't use the word "monitor", as "it will corrupt the client's TM".
I would not want to see this happening on a greater scale, to be honest.  | | | | James O'Reilly Germany
Member (2007) German to English + ... | | Power Shifts in Web-Based Translation Memory | Aug 1, 2008 |
This paper addresses some issues about TM ownership:
"Web-based translation memory (TM) is a recent and little-studied development that is changing the way localisation projects are conducted. This article looks at the technology that allows for the sharing of TM databases over the internet to find out how it shapes the translator’s working environment."
"The study finds that, while the interests of most stakeholders in the localisation process are well served by this web-based arrangement, it can involve drawbacks for freelancers. Once an added value, technical expertise becomes less of a determining factor in employability, while translators lose autonomy through an inability to retain the linguistic assets they generate. Web-based TM is, therefore, seen to risk disempowering and de-skilling freelancers, relegating them from valued localisation partners to mere servants of the new technology."
http://www.springerlink.com/content/5618063651089kj8
This means Terms of Trade should expressly state that only the core translation is within the delivery scope, and that the TM, as a derived support entity, is owned by the freelancer as a separate good to be purchased.
The ownership issue is clearly put forward in this TAUS paper:
The End of Old School Localization Thinking, Jaap van der Meer, TAUS, 2008
http://www.scribd.com/doc/4392144/The-End-of-Old-School-Localization-Thinking
Thus leading to this...
2008-02-20 Lionbridge's Logoport Web-Based Translation Memory Used by More Than 700 Clients and Over 14,000 Translators Around the Globe
http://www.finanznachrichten.de/nachrichten-2008-02/artikel-10157833.asp
[Edited at 2008-08-01 18:26] | | | | RobinB Germany Local time: 07:42 German to English | | NDAs, take two | Aug 1, 2008 |
| Tomás Cano Binder wrote: Yes, but who is going to check whether something we are translating is already in the public domain or not? |
|
You mean you don't do that already? That should be on every translator's checklist. It forms part of the translator's responsibility to ascertain the nature and purpose of the text they're going to translate.
| Yes, but even if it is a "private arrangement" as you say, shouldn't our customers know about it? And how about their end customers if we work for an agency? I know their answer already: "No way, José! No sharing of our materials with people outside our control, no use of our memories in jobs for competing companies, or any other companies whatsoever, without our express approval. And thanks for the hint; we will look for some other translator who is serious about our NDAs!" |
|
I don't know about your NDAs, but our NDAs always contain a clause about materials in the public domain, i.e. that confidentiality doesn't apply. We don't work for agencies, BTW, but we trust the few solo freelances who work with us to respect confidentiality where it's appropriate, and to use the resources we provide to them judiciously. We take the view that we're grown-ups, dealing with other grown-ups, and if we provide them with something that they might use for other clients, then that's our risk. And we accept it, possibly because that's the most sensible business approach to take.
Robin | | | | Tomás Cano Binder, CT Spain Local time: 07:42
 Member (2005) English to Spanish + ... | | Possible use for other clients | Aug 1, 2008 |
RobinB wrote:
We take the view that we're grown-ups, dealing with other grown-ups, and if we provide them with something that they might use for other clients, then that's our risk. And we accept it, possibly because that's the most sensible business approach to take.
|
|
Yes, you might give freelancers a memory they might use for other customers, but what we are discussing here is whether you think your freelancers are entitled to give away your memory to other freelancers out there without you or your customer knowing. Is that still "judicious"?
[Edited at 2008-08-01 14:45] | | | | Jeff Whittaker United States Local time: 01:42
 Member (2002) German to English + ... | | Sharing TMs already exists | Aug 1, 2008 |
I wonder how many people already contribute their TMs without realizing that they are breaching their (specified or implied) obligation of confidentiality, not to mention copyright (the original source text belongs to the client).
http://tse.elanex.com/ | | | | Paul Greer United States Local time: 22:42 English to Arabic + ... | | No privacy issues | Aug 1, 2008 |
Tomás Cano Binder wrote:
Yes, you might give freelancers a memory they might use for other customers, but what we are discussing here is whether you think your freelancers are entitled to giving away your memory to other freelancers out there without you or your customer knowing. Is that still "judicious"? |
|
Tomás:
The TAUS model specifically distinguishes "content contributors" from "content users" and "end-customers". Of course only content owners are able to contribute content.
Nobody would expect a user to contribute intellectual property that is not his own of course.
Best regards
Paul | | | | Madeleine MacRae Klintebo United Kingdom Local time: 06:42
Member Swedish to English + ... | | You seem to know a lot about what TAUS will/intends to do | Aug 1, 2008 |
AFLSSInc wrote:
Nobody would expect a user to contribute intellectual property that is not his own of course.
|
|
You seem to have an inside track when it comes to TAUS's intentions (or at least assume you have such knowledge). Could you please enlighten the rest of us as to TAUS's ultimate aims? | | | | Elizabeth Lyons United States Local time: 22:42
Member (2004) French to English + ... | | TAUS looks unwieldy to me | Aug 1, 2008 |
Aside from the liabilities that sharing proprietary TMs may bring, and the fact that the TMs would likely have uneven quality surveillance, the project is too ambitious.
Whenever a large group of organizations comes together to launch a new project, either they set up committees with no authority or influence, which bog down in squabbles about minutiae, or a power broker seizes control and the other players are either reduced to minority status or walk away completely.
I don't think the translation community has anything to fear and certainly not in the immediate future. That said, if there were to be TM database access (in some legally sanitized form) that could be added to our annual fee here as an option, I would consider sampling it. Why not? It could be just one more resource, another arrow in the quiver no different materially from glossaries, dictionaries and the like that everyone shares right now. Far from hindering our processes, it can only enhance them, if properly done -- about which prospects I am highly skeptical. | | | | Tomás Cano Binder, CT Spain Local time: 07:42
 Member (2005) English to Spanish + ... | | We don't own the IP | Aug 1, 2008 |
AFLSSInc wrote:
The TAUS model specifically distinguishes "content contributors" from "content users" and "end-customers". Of course only content owners are able to contribute content.
Nobody would expect a user to contribute intellectual property that is not his own of course.
|
|
Which brings us back to the fact that this statement might be completely out of the question in a TAUS or similar situation:
Henry D wrote:
The current survey shows that over 20% of respondents are already sharing TMs privately and getting good results (like >20% increase in throughput). Nothing is stopping anyone here from doing that, and if the data is any indication, experimenting with it may not be a bad idea. |
|
I.e. we as translators cannot contribute to the memories as the stuff we translate does not belong to us... | | | | RoyMarie United States Local time: 22:42 | | Why TAUS does not make sense and why data sharing does? | Aug 2, 2008 |
While I have moved away from actual translation work into technology and yes even translation automation, I would like to offer a perspective of somebody who is involved with language technology in various ways, and as one who can see the rationale for translation automation at a high level.
TAUS is an organization made up of localization people involved in "large" corporate translation projects of mostly very static information, i.e. manuals, documentation, web pages about products, software interfaces etc. All of this is important to do, but it is not really high profile in terms of the overall corporation: when was the last time you read the manual of a car, a mixer or a blender you bought? Business line managers who run international business divisions in sales/marketing/production are the real power players here, and people involved with "localization" are generally much lower in stature and influence. The TAUS approach and modus operandi is very much localization culture based.
As I see it, the forces really driving these sharing and efficiency initiatives are the core business imperatives that arise from the increasingly flat world, from a global marketplace enabled by Web 2.0+ technologies. For example, I have children who routinely buy designer clothes from China at a fraction of what they cost at a local store. We all know of many examples that show the world is opening up in this way, though this may be a silly one.
This direct contact among globally scattered buyers and sellers means that a lot more information needs to be translated to enable and facilitate trade. As countries across the world grow online populations, the thirst for information beyond shopping also grows. Why is the Indonesian Wikipedia only 20,000 pages when the English Wikipedia is 3M+ pages? 260M people live in Indonesia.
There are many such discrepancies across the world. Knowledge is concentrated in G7 languages, or maybe just English, German, French & Japanese, where the bulk (90%+) of the world's patents come from.
It is not possible to convert the knowledge of the world with just a human translation effort. Automation is necessary, but automation alone cannot succeed; it needs competent human guidance for this to work. Global corporations now understand that they really need to make all kinds of information available to build loyal and satisfied global customer bases. Sharing general linguistic assets and intelligent man-machine collaboration can enable global enterprises to convert huge masses of knowledge content to many languages, and truly make knowledge a universal resource.
So back to TAUS -- the approach is perhaps outdated. It is a Web 1.0 approach.
Google informed us that while there are 700,000 human translators who might be considered professionals across the globe there are actually 600M+ competent bilingual people on the Internet who are capable of doing some translation or help clean up automated translation. Is it useful to get these people involved? How could this happen?
The TAUS Data Association (TDA) does not make sense for the following reasons:
The TAUS approach has a predominantly localization focus, so it will not draw many professional translators, who are perhaps afraid of being marginalized, and it is too far off the beaten path to draw the 600M+ who could help with massive translation projects.
Even more specifically:
1) The technology platform for the pooled data is undefined and so it is very unclear what the benefit will be based on the very meager definitions that have been presented on the platform to date. There is also no clear definition or even discussion of standards that could be the foundation that would drive intelligent data collection and aggregation.
2) Anybody who has pooled TM from disparate sources or played with Open Source SMT (Moses) is aware that just pooling data does not automatically lead to benefits. A standardization and normalization process needs to take place to make the data equivalent and compatible. Initial TAUS tests have shown that there was very little benefit from just throwing data into a common bucket.
Again this is not recognized as an issue by TDA and this should give ProZ reason to hesitate. I have worked with many differing pools of data with Moses and understand that a significant amount of work needs to be invested in data cleaning, normalization and preparation for the leverage to be meaningful. There is very little awareness of this process at this point among the people who have joined who all assume that more is better.
3) The costs are somewhat high for the relative value and will discourage the many smaller players who could also benefit and contribute. The business model encourages high volume contribution but has no mention of quality as a consideration for benefit. So it is possible that large amounts of crap will be collected. Old TM that has been lying around is "donated" to make a muddy soup.
4) Much of the data that will be made available can be downloaded without trouble from the contributors' websites anyway and could easily be aligned at a lower cost than that incurred by joining the association and buying the data access. Remember that most of the data they plan to put into the consortium is already on the website of the contributor. In the case of the EU, all the data can simply be downloaded as TMX files with no problem at all. So why would anybody want to go there to get it?
5) There is no outreach to the little man, the freelance translator, the 600M+ who under the right conditions could be encouraged to contribute 5 sentences each. The TDA is basically an old boys' network. Web 2.0 is all about empowerment of the masses, engaging hundreds of thousands to change the world. Yes, only a few really contribute, but Wikipedia is a good example of open collaboration where the process really does produce usable quality. Tens of thousands do a little and maybe an elite 1000 is responsible for the huge bulk of the work. This is true of many social network based collaborations.
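To make point 2 above a little more concrete, here is a minimal sketch of what "normalizing" pooled TM data could look like before any leverage is measured. Everything in it is an assumption for illustration: the `(source, target, origin)` triple layout, the function names `normalize`, `pool` and `conflicts`, and the choice of Unicode NFC plus whitespace collapsing plus case-folding as the comparison key. Real TM clean-up (inline tags, placeholders, tokenization, language-specific rules) is considerably more involved.

```python
import unicodedata
from collections import defaultdict


def normalize(segment: str) -> str:
    """Hypothetical comparison key: NFC form, collapsed whitespace, case-folded."""
    text = unicodedata.normalize("NFC", segment)
    text = " ".join(text.split())  # collapse runs of whitespace, trim the ends
    return text.casefold()


def pool(units):
    """Merge (source, target, origin) triples coming from several TMs.

    Returns {normalized source: {normalized target: set of origins}}.
    """
    pooled = defaultdict(lambda: defaultdict(set))
    for source, target, origin in units:
        if not source.strip() or not target.strip():
            continue  # skip units with an empty source or target
        pooled[normalize(source)][normalize(target)].add(origin)
    return pooled


def conflicts(pooled):
    """Sources that still have more than one distinct target after normalization."""
    return {src: dict(tgts) for src, tgts in pooled.items() if len(tgts) > 1}


# Made-up sample units from three imaginary TMs (tm_a, tm_b, tm_c).
units = [
    ("Limited  Warranty", "Garantía limitada", "tm_a"),
    ("limited warranty", "Garantía limitada ", "tm_b"),
    ("Limited warranty", "Garantía restringida", "tm_c"),
    ("Adverse effects", "Efectos adversos", "tm_a"),
    ("", "sin origen", "tm_b"),
]
pooled = pool(units)
```

In this toy sample the three spellings of "Limited warranty" collapse into one source entry, and `conflicts(pooled)` flags it because two different translations survive. That is the kind of inconsistency that throwing everything into a common bucket would otherwise hide. Keeping the set of origins per translation also leaves a trail back to where each unit came from.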
So if data sharing makes so much sense, why am I beating on the TDA?
Actually I think it is an admirable first effort that at least raises the possibility of shared action. The committee should be commended for coming up with the idea of sharing. However, I think it is flawed enough in its current form to be a truly bad investment of ProZ time and money.
I think it is worth raising these issues as they may get attention and perhaps a few of these issues can even be addressed.
So what could ProZ do?
ProZ could be a real force in an initiative in which globally scattered experts collaborate to guide and manage intelligent data pooling. The KudoZ system is an example of how they could lead on building broadly leverageable linguistic assets. Maybe make this more open, package it and find new ways to monetize this effort by selling subscriptions to corporations and LSPs. The ProZ Living Dictionary.
ProZ could work with a standards focused organization to develop a more useful data aggregation strategy that considers how the data could be normalized and standardized. In exchange for this, maybe contributing members get access to the super data at no cost and also can get consulting contracts with corporations who want to do this behind firewalls. I am sure that LISA and OSCAR would welcome a collaboration and ongoing dialogue.
ProZ could help members develop new kinds of professional services that focus on translation-related but not purely translation work, e.g. Translation Corpus development, Linguistic Consulting services, Data Normalization strategies etc., and services from virtual expert panels that could, for example, advise global enterprises on the best way to convert a 100,000 page knowledge base into 10 languages, pulling in ProZ membership to help with the post-editing (for a fee of course).
ProZ could save the money they would most likely waste going to TAUS meetings and develop educational programs that help members understand basic NLP concepts, SMT, RbMT, Advanced Leveraging, Corpus Preparation and efficient MT Post-Editing. These are all technologies that are gaining momentum and that companies will use in future. If the members understand these technologies and start getting involved and develop some expertise, they may find that there is plenty of work available as the world tries to move to a model where anything that exists in a source language should also exist in 30 other languages. I believe that world is coming soon.
ProZ could help members devise time and skill contribution based service business models for services rendered rather than the hated (3) cents per word model.
ProZ could make technology investments that would facilitate member engagement with these next generation technologies. Build a collaboration platform so that members can engage, experiment and develop expertise in a ProZ environment.
ProZ could form some technology partnerships with Web 2.0 companies so that their expert members could draw on the 600M+ online bilingual population as a resource. Managed crowdsourcing -- find ways to engage people to build up some linguistic assets that become critical for people to consider.
Having said all this, I am not a translator and so I cannot say I feel your pain. I see translation as a fundamentally human activity, one that will never be completely replaced by computers, but one where humans can use machines to leverage themselves. Like singers with microphones.
The coming wave of needed translation of enterprise and world knowledge is truly going to be too much for humans to handle alone, and ProZ should lead in a community collaboration approach to help change the world of translation and the role of translators, rather than follow this unclear TDA mission. | | | | Siegfried Armbruster Germany Local time: 07:42
Member (2004) English to German + ... | | Let's stop talking and get some work done. | Aug 2, 2008 |
I am fed up with the discussion on how the use of TMs is going to interfere with the "pure" quality of translations or who owns the IP on what.
Fact is, a lot of the material I'm translating is standardized. What should be wrong in producing a TM from the EMEA guidelines for PILs and SPCs or the "R & S sentences" and sharing it? What should be wrong in producing lists of standardized translations for "adverse effects" or "limited warranty" sections? All these text segments exist several hundred or thousand times on the WWW and are reused by many translators after researching the internet.
I am not breaching anybody's IP when using the standard translation of the first 10 sentences of a PIL. In fact these translations have to comply with the EMEA standards. I can imagine that there are many areas where such relatively small, specialized TMs could be created that would be helpful to others, and I would be the first person to use them, if the source of the translation could be tracked. The quality issue would play no role if the translations are standardized and approved by an organization or relevant body.
Sure, this would give "translators" who are not fit or willing to search the internet a competitive advantage and close part of the gap between this group and the group that does the research and produces the TMs, but this could be addressed by a business scheme where the person/body producing the TM would benefit (in the form of money) from the download by a third party.
All in all, the time has come where we (as translators) have to realize that our industry is changing, it is no longer paper and pencil, it is no longer a PC and just Word, we are on the edge of automation and industrialization of large parts of the translation industry.
Huge amounts of source texts are perfect for an "industrialized" translation approach. It will happen (it is already happening), and it is up to us to be part of it. "Industrialization" not only means that we use tools such as Word, Wordfast, Trados, Alchemy or whatever, it also includes a market for "semi-finished goods" and "components" that are used to produce the end product (in our case the translation).
The TAUS TDA initiative is interesting. To me, the most interesting part is the information about which company (or companies) are not part of it. As an example, as far as I know, Google is not part of it. Why not? Because in my opinion Google has developed a technology based on the contents of the internet (who owns the IP on the contents Google is harvesting?) and uses a statistical translation approach that is better than all other solutions we have seen up to now. This was in my opinion also one of the main reasons why TAUS TDA was initiated: the members of TAUS are motivated to produce their own solution to protect their interests.
The other companies that are not involved are highly successful translation agencies (I'm not naming any of them here), which in my opinion are already using a collaborative, highly industrialized translation approach, and their customers. They are not interested, since at least some of them realized 140% growth in 2007. Why should they give away their competitive advantage?
TAUS TDA is an initiative we should closely monitor, but we should get our act together and react to the "industrialization" issues we are facing as freelancers. Proz could be the place to implement a working TM marketplace, but I'm not convinced that this will happen, since in my opinion Proz already missed several opportunities in the past (e.g. KOG) and is getting more and more outsourcer-centric.
Ok, this is my view on things. Now it is back to business.
Siegfried