A useful online proofreading tool
Thread poster: phrasin
Aug 24, 2010

Hello everyone,

I'm an English student from Italy, with a passion for web development.

During the last week I made a simple (free) tool to proofread two phrases or expressions against web results. You can check it out at Phras.in.

I thought it might be useful even for seasoned professional translators like you.

You can also check phrases from the address bar of your browser, like this:

phras.in/having few drinks/having a few drinks
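
For anyone who wants to generate such links from a script, here is a minimal sketch. It assumes only the two-phrase path pattern shown above; the `http://` scheme and the helper name `phrasin_url` are illustrative assumptions, not something the tool documents.

```python
from urllib.parse import quote


def phrasin_url(first: str, second: str, base: str = "http://phras.in") -> str:
    """Build a comparison URL of the form base/first/second.

    The base URL (including the http scheme) is an assumption.
    """
    # Percent-encode each phrase so spaces (and any slashes inside a phrase)
    # cannot be mistaken for path separators.
    parts = [quote(phrase.strip(), safe="") for phrase in (first, second)]
    return f"{base}/{parts[0]}/{parts[1]}"


print(phrasin_url("having few drinks", "having a few drinks"))
# prints http://phras.in/having%20few%20drinks/having%20a%20few%20drinks
```

Browsers usually do this encoding for you when you type the phrases into the address bar, so the helper only matters for scripted use.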

I'd be happy to answer any questions and, hopefully, read some feedback.

Bye!



Luca Tutino  Identity Verified
Italy
Local time: 08:30
Member (2002)
English to Italian
+ ...
Nice and possibly useful... but not for proofreading Aug 29, 2010

Nice idea - thanks! Calling it a "proofreading tool" might be misleading, though.


Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 20:30
Member (2004)
English to Thai
+ ...
A good tool to support Internet search Aug 30, 2010

Thanks. I tried it and found it interesting. This computerized tool will make translation better both quantitatively and qualitatively.

Soonthon Lupkitaro



Samuel Murray  Identity Verified
Netherlands
Local time: 14:30
Member (2006)
English to Afrikaans
+ ...
Too bad only two phrases Aug 30, 2010

phrasin wrote:
You can also check phrases from the address bar of your browser, like this:
phras.in/having few drinks/having a few drinks


Too bad it only works on two phrases, and not on three or four at the same time.



Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 14:30
Member (2005)
English to Spanish
+ ...
I agree Aug 30, 2010

Luca Tutino wrote:
Nice idea - thanks! Calling it a "proofreading tool" might be misleading, though.

Tools of this kind are a real danger to the profession, since many people (especially those who don't believe in the value of dictionaries and quality reference material) might be tempted to decide their grammar statistically instead of using solid references.

I strongly recommend not trusting the number of hits for an expression as part of a translation decision.


FarkasAndras
Local time: 14:30
English to Hungarian
+ ...
Not bad Aug 30, 2010

This is not a proofreading tool.
It's more like a minor improvement on the once popular, now forgotten novelty service called googlefight:
http://www.googlefight.com/index.php?lang=en_GB&word1="having%20a%20few%20drinks"&word2="having%20few%20drinks"

The speed and the table with context make it a lot more useful for us than googlefight, so props for that and thanks for sharing.


FarkasAndras
Local time: 14:30
English to Hungarian
+ ...
Good one Aug 30, 2010

Tomás Cano Binder, CT wrote:
people ... might feel tempted to decide their grammar statistically instead of using solid references.


Technically speaking, deciding on your grammar statistically IS the correct way to do it. There is no ultimate, perfect reference book for a living language, and there never will be. There never will be correct answers to all possible questions about what's correct and what's not, either. Languages have a habit of setting up nice and tidy rules and then breaking them. There is no standard by which to decide what constitutes a widespread error and what's a legitimate exception to the main linguistic rule. Even if there were, there would be exceptions :)
Any linguist will tell you that a language is what people speak, not what some bespectacled man in an ivory tower tells them to speak. You can choose to try and follow what the RAE says, but theirs is just one opinion on what's correct, not the ultimate truth.
Arguably, as soon as an error becomes significantly more widespread than the correct form, it ceases to be an error and by definition becomes correct.
It's annoying if you grew up with the old "proper" norm and you start to find yourself in a minority, surrounded by people who just don't know better, but that's just the way it works.

It really is just a matter of timing. The "official grammar" is always lagging behind real language use, codifying changes only when it can't resist the tide anymore. You can choose to follow what it says, or "go quicker" and speak and write the way everybody else does. Usually, any given "official grammar" will have more than a few outdated, prescriptivist rules that are just out of touch with real language use, and are therefore simply wrong. It's up to you whether you override them.
As translators, we usually do best to stick to the party line in formal texts. However, if I'm not translating something officiously official, I aim for a happy medium: I mostly stick to what the academy of sciences would call correct, but I disregard its most blatantly unrealistic rules.

[Edited at 2010-08-30 12:05 GMT]



Mette Melchior  Identity Verified
Sweden
Local time: 14:30
English to Danish
+ ...
Thanks for sharing Aug 30, 2010

The idea is good but I agree that a decision about whether or not to use a certain term or phrase in a translation should not depend (only) on the number of hits in a given search engine.

I do take that into account, though, as an indication of how "used" something is or to check whether a term is used in a specific context. But general searches can be very misleading, and Google sometimes states that there are many thousands of hits, only to narrow them down to fewer than a hundred when you click through the result pages. So in my experience the number of hits is far from reliable either.

I give more value to examples of use on specific websites which are relevant in relation to the subject matter and whose content is not likely to be based on translations. And of course to reference works, dictionaries, online glossaries, etc. when the search is related to specific industry terminology.

Dictionaries are not always reliable either, even if they are specialist dictionaries - so as a professional translator you always have to check and double-check and try to be an "expert" in your fields... Or at least know where you can find the necessary information to provide a good and accurate translation.

In any case, I think your tool is a nice idea since it allows you to search and compare two terms or phrases at the same time. I would appreciate the option to specify a domain though, similar to using "site:" when searching directly from the search engines.



Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 14:30
Member (2005)
English to Spanish
+ ...
An important question Aug 30, 2010

FarkasAndras wrote:
Any linguist will tell you that a language is what people speak, not what some bespectacled man in an ivory tower tells them to speak. You can choose to try and follow what the RAE says, but theirs is just one opinion on what's correct, not the ultimate truth.
Arguably, as soon as an error becomes significantly more widespread than the correct form, it ceases to be an error and by definition becomes correct.

I entirely agree with everything you say. And exactly because I know this, it is my duty to check more than one source when I am in doubt. As these sources, I must use reputable, consistent materials published by people who are widely considered as leading experts in the field.

Instead of the web, which can contain all sorts of things, including an increasing share of machine translation, I trust dictionaries of use and style guides which are the result of intense expert work.

And an important question to you all: Would you include machine translation in the "what people speak" category, as is presumably the case in Phras.in?


FarkasAndras
Local time: 14:30
English to Hungarian
+ ...
Numbers Aug 30, 2010

Tomás Cano Binder, CT wrote:

FarkasAndras wrote:
Any linguist will tell you that a language is what people speak, not what some bespectacled man in an ivory tower tells them to speak. You can choose to try and follow what the RAE says, but theirs is just one opinion on what's correct, not the ultimate truth.
Arguably, as soon as an error becomes significantly more widespread than the correct form, it ceases to be an error and by definition becomes correct.

I entirely agree with everything you say. And exactly because I know this, it is my duty to check more than one source when I am in doubt. As these sources, I must use reputable, consistent materials published by people who are widely considered as leading experts in the field.

Instead of the web, which can contain all sorts of things, including an increasing share of machine translation, I trust dictionaries of use and style guides which are the result of intense expert work.

And an important question to you all: Would you include machine translation in the "what people speak" category, as is presumably the case in Phras.in?

Academies tend to be prescriptive instead of descriptive, i.e. sometimes they make up rules instead of describing what rules they observe in real life, so their mode of operation is quite different from, say, corpus analysis. Which is why I sometimes tend to go off on my own instead of following them, or rather I stick with everyone else instead of them:)

And yes, there is a significant amount of rubbish on the web. All one can do is set smart search parameters and skim the first couple of dozen results to try and filter out the crap. The table phras.in generates looks like it will make this fast and convenient to do. The rubbish usually doesn't outweigh good info, but of course you can't ever rely on occurrence numbers alone. The googlefight numbers are just one data point.


phrasin
TOPIC STARTER
Thanks Aug 30, 2010

Thanks, everyone, for the kind words about this little project. I'm glad somebody found it useful.
Just to clarify a few points, though: obviously there is no such thing as an automated proofreading tool, and the choice of a sentence always comes down to personal judgment and lexical context. This tool is intended to be a help for proofreading, and should be used as such.

In my opinion, it's pretty obvious that machine translation shouldn't be considered a current linguistic reference at all, but I don't see it being an issue with Phras.in: it's such a tiny fraction of the total text that can be found on the web that, especially with big numbers, the probability of picking up machine-generated content is near zero.

I like Mette's suggestion about site-specific results and Samuel's advice about allowing more than two searches at a time. I'll think about them.



Alexandre Chetrite
France
Local time: 14:30
English to French
Useful tool Aug 30, 2010

Hello,

I think that this is a good effort to produce a useful tool, even if one has to be careful with statistics. I think this tool could be a complementary tool for a translator, but it is not enough by itself to make a decision about which expression to use.

Somehow I wonder if translators should also be sociologists and study the use of expressions and words in local cultures?

It is very true that translation is not a science like mathematics. The choice of words is very complex and it varies from one translator to another, and the cultural factor is important.

So this tool helps to show how expressions are used in today's society, but I wonder: are the hits returned by Phras.in more significant for English expressions? (Can we also use it efficiently for French expressions, for example?)


----UPDATE: I checked with French expressions, and this tool is not suited to expressions in languages other than English. Maybe the author should also consider local Google pages like Google.fr, Google.es, etc. and implement this functionality? For example, with the French expressions "de suite" and "tout de suite" I got 20.90 million hits and 12.70 million hits respectively.

But the correct expression is "tout de suite" (immediately) not "de suite" (which is wrongly used by many people in French).

[Edited at 2010-08-30 22:27 GMT]



Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 14:30
Member (2005)
English to Spanish
+ ...
Much higher chances than zero Aug 31, 2010

phrasin wrote:
In my opinion, it's pretty obvious that machine translation shouldn't be considered a current linguistic reference at all, but I don't see it being an issue with Phras.in: it's such a tiny fraction of the total text that can be found on the web that, especially with big numbers, the probability of picking up machine-generated content is near zero.

I sincerely think you did not examine Google results closely enough, or not with the eyes of a translator. You would be surprised to see how many web pages contain automatic translation, including the majority of the web pages of some companies, for instance Microsoft (who have a vast number of machine-translated pages). Maybe we translators have an eye for these things.

Whether they come from machine translation or human mistranslation, let me give you some examples where your tool creates a risk:

- "Patriot Act" (An Act in the US)
Wrong: "acta patriótica" -> 4,700 hits
Right: "ley patriótica" -> 13,100 hits

- "suspect was arrested"
Wrong: "el sospechoso fue arrestado" -> 1,180 hits
Right: "el sospechoso fue detenido" -> 3,370 hits

- "worker's compensation"
Wrong: "compensación del trabajador" -> 5,990 hits
Right: "remuneración del trabajador" -> 8,490 hits

- "eviction of the occupant"
Wrong: "evicción del inquilino" -> 3 hits
Right: "deshaucio del inquilino" -> 17 hits

- "legislature of New York"
Wrong: "la legislatura de Nueva York" -> 832 hits
Right: "el poder legislativo de Nueva York" -> 8 hits

I cannot tell how many of these mistranslations were automatic translations (it would take me too long), but these quick examples are proof enough that one should use Phras.in with utmost care. It may be of help to check what others used, but this should only be one factor in a professional translation decision.


FarkasAndras
Local time: 14:30
English to Hungarian
+ ...
Doing it all wrong Aug 31, 2010

Alexandre Chetrite wrote:

are the hits returned by Phras.in more significant for English expressions? (Can we also use it efficiently for French expressions, for example?)

There is lots of French material on the net, so of course yes. Add a site:fr restriction to both search terms to get hits only from the .fr domain.

Alexandre Chetrite wrote:
----UPDATE: I checked with French expressions, and this tool is not suited to expressions in languages other than English. Maybe the author should also consider local Google pages like Google.fr, Google.es, etc. and implement this functionality? For example, with the French expressions "de suite" and "tout de suite" I got 20.90 million hits and 12.70 million hits respectively.

But the correct expression is "tout de suite" (immediately) not "de suite" (which is wrongly used by many people in French).

Well, I'm pointing out the obvious here, but "tout de suite" *contains* "de suite". I.e. each and every hit of "tout de suite" will also be a hit of "de suite", so it is absolutely impossible for the longer term to get more hits than the shorter one.
On top of that, of course the search tool reflects what is on the net. If half of French people use an expression wrong, the tool will give you a lot of hits of the wrong usage.
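
The containment point can be illustrated with a small sketch. It assumes exact-phrase matching over a tiny made-up corpus (the four sentences and the helper `phrase_hits` are invented for illustration). Real search engines report estimated counts, so the relationship holds strictly only for exact counts.

```python
from typing import List


def phrase_hits(pages: List[str], phrase: str) -> int:
    """Count the pages that contain the exact phrase (case-insensitive)."""
    needle = phrase.lower()
    return sum(1 for page in pages if needle in page.lower())


# Hypothetical mini-corpus, invented purely for illustration.
pages = [
    "J'arrive tout de suite.",
    "Il a plu trois jours de suite.",
    "Reviens tout de suite !",
    "Bonjour tout le monde.",
]

longer = phrase_hits(pages, "tout de suite")   # pages with the longer phrase
shorter = phrase_hits(pages, "de suite")       # includes every page above
standalone = shorter - longer                  # "de suite" without "tout"

print(longer, shorter, standalone)  # prints 2 3 1
```

So the number worth comparing against the "tout de suite" count is the standalone count, not the raw "de suite" count.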



Samuel Murray  Identity Verified
Netherlands
Local time: 14:30
Member (2006)
English to Afrikaans
+ ...
I don't know your language, but... Aug 31, 2010

Tomás Cano Binder, CT wrote:
Wrong: "acta patriótica" -> 4,700 hits
Wrong: "compensación del trabajador" -> 5,990 hits
Wrong: "el sospechoso fue arrestado" -> 1,180 hits
Wrong: "evicción del inquilino" -> 3 hits
Wrong: "la legislatura de Nueva York" -> 832 hits


Would a professional translator whose native language is this language commit these errors?

Look, a frequency search is very interesting for finding out how many people use a phrase incorrectly, but it can also be helpful for determining which of two correct forms is the more common. And if phras.in can add domain-specific filtering (e.g. site:.br), that can only make it more useful.

Frequency searches on internet sources should not be used by non-native or third-language speakers to determine which form is correct. For such people, frequency searches should be limited to mainstream printed media, such as newspapers (although in my own language even the newspapers don't use reliable language any more, especially for press releases and news reports translated from English at break-neck speed by inexperienced translators working through the night to make a morning deadline for something that will not appear in print but only online).

Even a good style guide or grammar can be dangerous in the hands of a translator for whom the language is not his native or second language.

