TM anonymization (GDPR) Thread poster: Production SA
| Production SA Belgium Local time: 11:59
 Member (2015) English to French
Hi all,
With GDPR being enforced in Europe from May 25th, does anyone have insights into how to anonymize client TM content, essentially removing personal data?
Thx! | | | Michael Beijer United Kingdom Local time: 10:59
 Member (2009) Dutch to English + ... We are living in interesting times. | Apr 12 |
Production SA wrote:
Hi all,
With GDPR being enforced in Europe from May 25th, does anyone have insights into how to anonymize client TM content, essentially removing personal data?
Thx!
Hmm, I'm very curious as to whether this will actually become necessary, and also, whether anyone will actually do it if it is. However, assuming you do need to do it, there are a few options. I would first of all recommend contacting Kevin Dias (the guy behind TM-Town), since he's quite knowledgeable re automated data anonymisation stuff. I think he has some kind of automatic system in one of TM-Town's tools. Who knows, he may even be working on something relating to the upcoming GDPR changes. Actually, I suspect other people are already working on solutions for translation data anonymisation.
Another avenue to investigate is CAT tools, and specifically, "untranslatables", "non-translatables", "place-holders", "tokens" (or whatever term each particular tool has chosen). Some CAT tools (such as CafeTran) already have ways of automatically filtering out certain terms from data sent to online machine translation systems, with or without regular expressions. I can imagine someone clever coming up with something that might work without all too much trouble, preferably automated of course. That is, a system which scans your document looking for potential candidates, using regular expressions and/or customer-defined lists, and then replaces them with codes.
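To make the "scan for candidates, then replace them with codes" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function name `anonymize`, the two sample patterns (e-mail addresses and phone-like digit runs) and the `{1}`, `{2}` code format are not taken from any CAT tool, and real personal data would need far broader patterns.

```python
import re

# Illustrative patterns only (assumed); real personal data needs broader coverage.
CANDIDATE_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),  # e-mail addresses
    re.compile(r"\+?\d[\d ()/-]{7,}\d"),          # phone-like digit runs
]


def anonymize(text, customer_terms=()):
    """Replace listed terms and regex matches with numbered codes.

    Returns the masked text plus a code -> original mapping.
    """
    mapping = {}  # code -> original text
    seen = {}     # original text -> code, so repeats reuse the same code

    def to_code(match):
        original = match.group(0)
        if original not in seen:
            code = "{%d}" % (len(seen) + 1)
            seen[original] = code
            mapping[code] = original
        return seen[original]

    # Customer-defined list first, longest terms first so that a full
    # name wins over a shorter term contained in it.
    for term in sorted(customer_terms, key=len, reverse=True):
        text = re.sub(re.escape(term), to_code, text)
    # Then the generic regular-expression candidates.
    for pattern in CANDIDATE_PATTERNS:
        text = pattern.sub(to_code, text)
    return text, mapping


masked, mapping = anonymize(
    "Contact Jan Peeters at jan.peeters@example.com or +32 2 555 12 34.",
    customer_terms=["Jan Peeters"],
)
print(masked)
print(mapping)
```

The mapping is what would let you restore the original text later, so it would have to be stored (or discarded) with the same care as the personal data itself.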
All very interesting stuff, and I am extremely curious to see how it all pans out! If I hear anything interesting I will report back here in this thread.
Michael | | | Michael Beijer United Kingdom Local time: 10:59
 Member (2009) Dutch to English + ... encryption sufficient to ensure compliance? | Apr 12 |
Having just read the Wikipedia article, and specifically the bit about encryption:
"The GDPR requires for the additional information (such as the decryption key) to be kept separately from the pseudonymised data."
I suppose one way to comply might be to encrypt everything, and store the decryption key somewhere else. Will have to look into this. | |
Hans Lenting Netherlands
 Member (2006) German to Dutch + ... Regular expressions and macros | Apr 13 |
Michael Beijer wrote:
Some CAT tools (such as CafeTran) already have ways of automatically filtering out certain terms from data sent to online machine translation systems, with or without regular expressions. I can imagine someone clever coming up with something that might work without all too much trouble, preferably automated of course. That is, a system which scans your document looking for potential candidates, using regular expressions and/or customer-defined lists, and then replaces them with codes.
I'm using CafeTran Espresso 2018 as my CAT tool and per client I maintain a dedicated glossary for non-translatables. (I only translate machine manuals.)
When I start working on a new job, I have a look at the first pages and the list of spare parts and technical specifications at the end of the PDF that is part of the job. Here I find most (not all) brand names, product names, street names etc., data that is sensitive and that I don't want to send out to any MT system.
Then I quickly add these data to my dedicated glossary for non-translatables for the particular client (of course I can attach several glossaries for non-translatables, created for different clients, to one job). By means of a simple macro I tag the lines in the glossary for non-translatables to become regular expressions.
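The macro itself is not shown in this thread, and CafeTran's glossary format is not described here, so the following is only a generic Python sketch of the same idea: turning plain glossary lines into regular expressions that match the terms literally. The function name `glossary_to_patterns` and the "#" comment convention are assumptions.

```python
import re


def glossary_to_patterns(lines):
    """Turn plain glossary lines into compiled, literal-matching patterns."""
    patterns = []
    for line in lines:
        term = line.strip()
        if not term or term.startswith("#"):
            continue  # skip blank lines and comment lines (assumed convention)
        # re.escape keeps characters such as "." or "+" literal; the
        # lookarounds stop "Acme" from matching inside "Acmeology".
        patterns.append(re.compile(r"(?<!\w)" + re.escape(term) + r"(?!\w)"))
    # Longer patterns first, so a full product name is tried before
    # a shorter brand name contained in it.
    patterns.sort(key=lambda p: len(p.pattern), reverse=True)
    return patterns


patterns = glossary_to_patterns(["Acme", "Acme TurboPress 3000", "", "# client X"])
for pattern in patterns:
    print(pattern.pattern)
```

Escaping matters because product names often contain characters that mean something else in a regular expression, and the longest-first ordering is one simple way to keep overlapping names from being masked only in part.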
When I meet new non-translatables during the actual translation stage, I quickly add them to my glossary for non-translatables. Since I've linked the four NMT systems that I'm currently using via their APIs, these missed non-translatables will be sent to the NMT system once.
I see three ways to prevent this:
Besides the improved data security, I also benefit from better legibility of the suggested translations, where (long) company and product names are masked and replaced with a short token. I can concentrate better on the grammar and style.
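For readers who want to see the round trip spelled out, here is a rough Python sketch of masking terms with short tokens before a segment goes to MT and restoring them afterwards. This is not how CafeTran implements it; `mask`, `unmask` and `fake_mt` are hypothetical names, and `fake_mt` merely stands in for a real MT API call.

```python
import re


def mask(segment, non_translatables):
    """Swap each listed term for a short token; return text and lookup table."""
    table = {}
    # Longest terms first, so a product name is masked before the brand in it.
    ordered = sorted(non_translatables, key=len, reverse=True)
    for index, term in enumerate(ordered, 1):
        token = "{%d}" % index
        masked = segment.replace(term, token)
        if masked != segment:  # only record terms that actually occurred
            table[token] = term
            segment = masked
    return segment, table


def unmask(segment, table):
    """Put the original terms back into a (translated) segment."""
    return re.sub(r"\{\d+\}", lambda m: table.get(m.group(0), m.group(0)), segment)


def fake_mt(segment):
    """Hypothetical stand-in for an MT API call; it only ever sees tokens."""
    return segment.replace("Replace the filter of the", "Vervang het filter van de")


source = "Replace the filter of the Acme TurboPress 3000 by Acme."
masked, table = mask(source, ["Acme", "Acme TurboPress 3000"])
print(masked)
print(unmask(fake_mt(masked), table))
```

The lookup table stays on your own machine, so the MT system only receives the tokens.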
[Edited at 2018-04-13 19:52 GMT] | | | Igor Kmitowski Poland Local time: 11:59
Member (2016) English to Polish + ... Masking non-translatables | Apr 13 |
> Then I quickly add these data to my dedicated glossary for non-translatables
You can let the program mask them automatically by turning on the Edit > Preferences > Mask non-translatable fragments option. | | | Hans Lenting Netherlands
 Member (2006) German to Dutch + ...
Igor Kmitowski wrote:
> Then I quickly add these data to my dedicated glossary for non-translatables
You can let the program mask them automatically by turning on the Edit > Preferences > Mask non-translatable fragments option.
Yes. That is how I do it. Actually, this setting is always on, on my system.