How to classify translator web sites
Thread poster: Samuel Murray

Samuel Murray  Identity Verified
Netherlands
Local time: 23:00
Member (2006)
English to Afrikaans
+ ...
Nov 15, 2008

G'day everyone

If, for argument's sake, you were in control of an online directory of translators' web sites, and you had to decide how best to classify them in a way which is most useful to users of the directory, how would you sort the sites?

Just to whet your thoughts, there are currently two directories with translator web sites in them, namely Yahoo and the Open Directory Project.

Yahoo (directory no longer updated)
http://dir.yahoo.com/Business_and_Economy/Business_to_Business/Translation_Services/
A hodge-podge of categories. Some language categories (not exclusively enforced) along with some subject field categories, and a number of sites in the top-level category too.

Open Directory Project (ODP)
http://www.dmoz.org/Business/Business_Services/Communications/Translation/
Sites are classified into two groups, namely those businesses offering a single language combination and those offering more than one language combination. The "single language" sites are subcategorised by the language combination. The "multiple language" sites are subcategorised by continent and country.

The advantage of the Yahoo approach is that it seems to make more sense initially. I mean, a user is likely to search for translators based on language or based on specialised field, right? The problem is that you can't categorise translator web sites like that unless you allow sites to be in multiple categories. A translator may offer both "technical" and "medical" translations. One translator may offer English/Spanish/German, another Spanish/German/French and another German/French/English. So it would be impossible to sort sites purely by language or subject field.

The advantage of the ODP approach is that a translator's site can logically be in only one category, and it is fairly easy to determine in which category it should be. But the ODP system does not make "logical sense", if I can put it that way.

The question is how translators (and translation vendors) would classify themselves. I'm talking about translators only -- not editors, reviewers, back-translators, interpreters, project managers etc. Well, here are my thoughts, and I'd like your comments on them, please:

I believe the best way to categorise translator web sites is by the type of business. And in my mind there are four types of translator businesses, namely freelancers, cooperatives, agencies and companies. Freelancers are those who work alone in a small set of languages. Cooperatives is a small category but it includes groups of freelancers working together semi-informally for mutual benefit. Translation agencies do mostly outsourcing, and translation companies do most work in-house.

So my way of classifying translator web sites in a directory would be:

Translators > freelancers > by language combination
Translators > cooperatives > by language combination
Translators > agencies > by location of head office
Translators > companies > by location of head office
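
A minimal sketch of the scheme above, assuming each site record carries a business type, its language combinations and a head-office country. The `Site` class, its field names and the `category_path` helper are hypothetical, invented only for illustration. For a freelancer or cooperative with several combinations the sketch simply takes the first one listed, which is an assumption and not part of the proposal.

```python
from dataclasses import dataclass, field

# Plural labels used in the directory path.
PLURALS = {
    "freelancer": "freelancers",
    "cooperative": "cooperatives",
    "agency": "agencies",
    "company": "companies",
}


@dataclass
class Site:
    """Hypothetical record for one translator web site."""
    url: str
    business_type: str                                  # one of the keys of PLURALS
    language_pairs: list = field(default_factory=list)  # e.g. ["English-Afrikaans"]
    head_office: str = ""                               # country of the head office


def category_path(site: Site) -> str:
    """Return the single directory category for a site."""
    if site.business_type not in PLURALS:
        raise ValueError(f"unknown business type: {site.business_type}")
    if site.business_type in ("freelancer", "cooperative"):
        # Assumption: the first listed pair is the site's main combination.
        leaf = site.language_pairs[0]
    else:
        # Agencies and companies are filed by head-office location.
        leaf = site.head_office
    return f"Translators > {PLURALS[site.business_type]} > {leaf}"


freelancer = Site("http://example.com", "freelancer", ["English-Afrikaans"], "Netherlands")
print(category_path(freelancer))  # Translators > freelancers > English-Afrikaans
```

Each site ends up in exactly one category, which is the property the "one site, one listing" rule requires.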

Your thoughts? To repeat: If you were in charge of such a directory, and you could list all sites only once, and you had to make the classification simple and easy to use for both the directory editor and the user of the directory, how would you classify the sites?



Harry Bornemann  Identity Verified
Mexico
English to German
+ ...
Good question Nov 15, 2008

I find neither the Yahoo nor the ODP nor your proposed classification satisfactory, but I have no better idea either. However, this does not stop me from adding another approach from http://www.springerlink.com/content/k673255122810110/

Automatic Classification of Websites based on Keyword Extraction of Nouns

Abstract

In this paper, an automatic collection system is proposed that can extract unique keywords appearing in websites belonging to a specific category and that can use these keywords to classify websites into tourism-related categories to establish a dynamic tourism-related Internet directory.

First, the keyword extraction algorithm is explained and many tourism-related websites are gathered from the directory-based search engine "Yahoo! Japan". Then these sites are classified into categories by applying the proposed algorithm.

The experimental results show that the proposed method can classify websites into proper categories with a high degree of precision, and that by setting a threshold evaluation value it can detect unrelated websites not classified in any category.

Introduction

Recently, the number of tourists booking package tours offered by travel agents has decreased, while the number of independent tourists is increasing (Yamamoto et al. 2004). Therefore, there is an increasing demand for tourist information that is useful for independent travellers.

At the same time, the number of Web pages on the World Wide Web (WWW) is rising significantly. Included on the WWW is a large amount of useful information for individual and independent travellers.

Due to the huge amount of information on the WWW, search engines such as Google and Yahoo! are used to find appropriate information. There are generally two types of search engines: directory-based and term-based.

A directory-based search engine offers tourist information from websites belonging to a specific category; however, the number of these websites is small. This is because websites in a directory-based search engine are registered, edited and classified with human input, which is labor intensive. If a directory-based search engine can be established using automated ...


Charlie Bavington  Identity Verified
Local time: 22:00
French to English
Possibly not answering the actual question Nov 16, 2008

Samuel Murray wrote:
Your thoughts? To repeat: If you were in charge of such a directory, and you could list all sites only once,

Stop there.
This is not 1908. You are not talking about some filing system where everyone gets one file and one file only (no photocopying!) and you have to find the best place for each file.

This is an electronic directory. If a freelancer offers 2 languages into one (e.g. French and Italian into English), why should he not be listed twice, once in each appropriate place?

So my first thought is that if I were in charge of such a directory, I would discard any restrictions which I do not see as serving any useful purpose.

That would, I feel, be "most useful to users of the directory".



Samuel Murray  Identity Verified
Netherlands
Local time: 23:00
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
A directory of what? Nov 16, 2008

Charlie Bavington wrote:
This is not 1908. You are not talking about some filing system where everyone gets one file and one file only (no photocopying!) and you have to find the best place for each file.


I'm talking about a directory of web sites, not a directory of services. If a business offers two services, but has only one web site, we'd expect to find it listed in two places in a directory of services, but in one place in a directory of sites. Does that make sense?

My question is not about how to classify translator services, but how to classify translator sites. But let's forget about the "one site, one listing" rule for the moment, and allow multiple listings in our directory. What classification system would you find most logical?

In my opinion, smaller categories are more useful to visitors. For example, if a visitor wants a Spanish/English translation, you won't be doing the visitor any favours by having listed all those agencies that offer "all languages" in the Spanish/English category. Or if a visitor needs a medical translation, your directory won't be very useful to him if you had listed all those agencies offering "all subject fields" (or even the standard "medical/technical/legal" lie) in the medical category.

If a freelancer offers 2 languages into one (e.g. French and Italian into English), why should he not be listed twice, once in each appropriate place?


Okay, but at how many listings would you draw the line?

Apart from being less useful to users, the "one site, many listings" principle also puts smaller, more specialised businesses at a disadvantage. They get lesser exposure, even though they're not offering a lesser service. What do you think about that?



Samuel Murray  Identity Verified
Netherlands
Local time: 23:00
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
The problem with "automatic" Nov 16, 2008

Harry Bornemann wrote:
I find neither the Yahoo nor the ODP nor your proposed classification satisfactory, but I have no better idea either. However, this does not stop me from adding another approach ... : Automatic Classification of Websites based on Keyword Extraction of Nouns.


The advantage of automatic systems is that they are not so labour intensive, but their disadvantage is that they depend on the assumption that black hat web designers won't figure out how to beat their system.

The advantage of a human edited directory is that black hat SEO techniques will have almost no effect on the quality of the listing. I believe that even if black hat SEO'ers figure out ways to beat the humans, the human editors will have a greater chance of catching them out.

I'm very skeptical about automatic directory creation systems. In my opinion, such systems are good for creating a product that is then sold to a client who doesn't have the resources to test the system adequately and has to accept the directory creator's word (and fancy presentation) about the quality of the directory.


[Edited at 2008-11-16 07:58 GMT]



Harry Bornemann  Identity Verified
Mexico
English to German
+ ...
The problem with "manually" Nov 16, 2008

Samuel Murray wrote:
I'm very skeptical about automatic directory creation systems.


Given the huge number of relevant websites (>100,000?) I think you could not do without it.

Takatomo Honda et al. wrote:
The experimental results show that the proposed method can classify websites into proper categories with a high degree of precision, and that by setting a threshold evaluation value it can detect unrelated websites not classified in any category.


I don't think they have a very special method.

The idea is to first fetch a list of URLs, extract relevant keywords (languages, specialisations, countries), and then apply set operations on these contents, to create categories of sensible sizes, including a hodgepodge box for "universal agencies".
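
A rough sketch of that idea, assuming the page text has already been fetched. The keyword list `LANGUAGES`, the `UNIVERSAL_THRESHOLD` cut-off and both function names are hypothetical choices made for illustration; this is not the method from the paper quoted above.

```python
import re
from collections import defaultdict

# Hypothetical keyword list; a real one would be much longer.
LANGUAGES = {"english", "german", "spanish", "french", "afrikaans"}
# Assumption: a site naming more than this many languages is a "universal agency".
UNIVERSAL_THRESHOLD = 3


def extract_languages(page_text: str) -> set:
    """Return the known language names mentioned in the page text."""
    words = set(re.findall(r"[a-z]+", page_text.lower()))
    return words & LANGUAGES  # set intersection


def build_categories(pages: dict) -> dict:
    """Map each category name to the set of URLs placed in it.

    `pages` maps URL -> already-fetched page text.
    """
    categories = defaultdict(set)
    for url, text in pages.items():
        langs = extract_languages(text)
        if not langs:
            categories["unclassified"].add(url)
        elif len(langs) > UNIVERSAL_THRESHOLD:
            # The hodgepodge box for sites that claim to do everything.
            categories["universal agencies"].add(url)
        else:
            categories["/".join(sorted(langs))].add(url)
    return dict(categories)
```

Specialisations and countries could be handled the same way with further keyword sets, at the cost of more categories to keep to a sensible size.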

But since I am still spending my spare time to find out whether my idea of automatic website alignment would work, I don't even want to estimate how much time such a program would take me.



Samuel Murray  Identity Verified
Netherlands
Local time: 23:00
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
Less than ten thousand Nov 16, 2008

Harry Bornemann wrote:
Given the huge number of relevant websites (>100,000?) I think you could not do without it.


There may be over 100 000 profiles on ProZ.com, but you won't find more than about 10 000 translators' web sites. Ten editors working two hours a day, five days a week, doing 20 sites an hour, will take about five weeks to do 10 000 sites.
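
Worked through with the figures above (the variable names are only for illustration), the estimate comes to five working weeks, a little over a month:

```python
editors = 10
hours_per_day = 2
days_per_week = 5
sites_per_hour = 20
total_sites = 10_000

# Sites the whole team reviews in one working week.
sites_per_week = editors * hours_per_day * days_per_week * sites_per_hour
weeks_needed = total_sites / sites_per_week

print(sites_per_week)  # 2000
print(weeks_needed)    # 5.0
```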

I agree that some measure of automation could be useful. I suspect the real productivity booster would not be partially automated sorting but access to powerful reviewing tools.


Charlie Bavington  Identity Verified
Local time: 22:00
French to English
Categories and attributes Nov 16, 2008

Samuel Murray wrote:

Charlie Bavington wrote:
This is not 1908. You are not talking about some filing system where everyone gets one file and one file only (no photocopying!) and you have to find the best place for each file.


I'm talking about a directory of web sites, not a directory of services. If a business offers two services, but has only one web site, we'd expect to find it listed in two places in a directory of services, but in one place in a directory of sites. Does that make sense?

Depends on your objective.
The way I see it, if you insist on 1 site = 1 entry, then your categorisation can only be based on an attribute that any site has one, and only one, of.
Hence a geography based directory would work OK, as any site probably only has one main (head office type) address.
Ditto your legal form idea.
This would also apply to a straightforward alphabetical list.

The problem, for me, is that any such attributes I can think of are not necessarily exactly what is most helpful for users (with the possible exception of geography).
Users want to know, I would contend, what they get out of the site, what the site is actually for, what it does/sells/provides.

This kind of stuff is likely to mean > 1 occurrence of a given attribute (lang. pairs, specialisations, and combinations and variants thereof).


My question is not about how to classify translator services, but how to classify translator sites. But let's forget about the "one site, one listing" rule for the moment, and allow multiple listings in our directory. What classification system would you find most logical?

The purpose of translator sites is to promote services, and services are what users are most likely to be searching for.

In my opinion, smaller categories are more useful to visitors. For example, if a visitor wants a Spanish/English translation, you won't be doing the visitor any favours by having listed all those agencies that offer "all languages" in the Spanish/English category. Or if a visitor needs a medical translation, your directory won't be very useful to him if you had listed all those agencies offering "all subject fields" (or even the standard "medical/technical/legal" lie) in the medical category.

Why not? Does an agency that offers "all languages" offer "Span/Eng" or not? If so, you should list it, if that is what the user is looking for. If it doesn't, don't.
Likewise, in principle, any entity offering all subjects.

If a freelancer offers 2 languages into one (e.g. French and Italian into English), why should he not be listed twice, once in each appropriate place?

Okay, but at how many listings would you draw the line?

I'm sure I am imaginative enough to come up with some kind of rule, perhaps based on a percentage of turnover.
Although that might actually be unfair for someone who genuinely can provide Icelandic to Latvian, but has never actually done a job in that pair.
Hmmm, maybe I wouldn't bother with a line.

Apart from being less useful to users, the "one site, many listings" principle also puts smaller, more specialised businesses at a disadvantage. They get lesser exposure, even though they're not offering a lesser service. What do you think about that?

Immediate reaction - list those offering only one service ahead of those offering 2 services ahead of those offering 3 services and so on and so forth. The more services you offer, the lower down you will appear in the list for any one of them.
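
A minimal sketch of that ordering, assuming each site is stored with the set of services it offers. The function name, the data layout and the alphabetical tie-break are assumptions made for illustration.

```python
def rank_for_service(sites: dict, service: str) -> list:
    """Return the URLs offering `service`, fewest services first.

    `sites` maps URL -> set of services offered.
    """
    offering = [url for url, services in sites.items() if service in services]
    # Sort by number of services offered; ties are broken alphabetically (an assumption).
    return sorted(offering, key=lambda url: (len(sites[url]), url))


sites = {
    "big-agency": {"fr-en", "it-en", "es-en"},
    "solo": {"fr-en"},
    "duo": {"fr-en", "it-en"},
}
print(rank_for_service(sites, "fr-en"))  # ['solo', 'duo', 'big-agency']
```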


[Edited at 2008-11-16 19:03 GMT]



