How to count all words in an online web site
Thread poster: Ali Bayraktar
Ali Bayraktar  Identity Verified
Turkey
Member (2007)
English to Turkish
+ ...
Dec 22, 2008

Dear Colleagues,

I need help finding the exact word count of a website (not Flash, just a simple commercial website).

Can anybody explain the process?

Thanks in advance,

Best Regards,

M. Ali



Shouguang Cao
China
Local time: 03:09
Member (2007)
English to Chinese
+ ...
make a local mirror Dec 22, 2008

If the site is plain HTML, you can:

1. Use WebSnake or Teleport Pro to download the whole site to your local hard drive.
2. Use PractiCount or another counting tool to batch-count all the files.
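If no counting tool is at hand, the second step can be roughly approximated with a short script. The following is only a minimal sketch, assuming the mirror consists of UTF-8 `.htm`/`.html` files; the names `TextExtractor`, `count_words_in_html` and `count_words_in_mirror` are made up for illustration, and the notion of a "word" used here (a run of letters or digits, optionally joined by apostrophes or hyphens) will not match any particular commercial tool exactly. Text stored in attributes (for example `alt` texts or meta descriptions) is not counted.

```python
import re
from html.parser import HTMLParser
from pathlib import Path

# Assumed definition of a "word": letters/digits, optionally joined
# by an apostrophe or hyphen (so "don't" counts once).
WORD_RE = re.compile(r"\w+(?:['’-]\w+)*")


class TextExtractor(HTMLParser):
    """Collects text between tags, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def count_words_in_html(html):
    """Return the number of words in the text content of one HTML page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return len(WORD_RE.findall(" ".join(parser.chunks)))


def count_words_in_mirror(root):
    """Map every .htm/.html file under `root` to its word count."""
    counts = {}
    for path in sorted(Path(root).rglob("*.htm*")):
        html = path.read_text(encoding="utf-8", errors="replace")
        counts[str(path)] = count_words_in_html(html)
    return counts
```

Calling `count_words_in_mirror("my_mirror")` on the download folder (the folder name is a placeholder) gives a per-file count, and `sum(...)` over the values gives a site total. Because markup is stripped before counting, tags and scripts do not inflate the figure.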

Hope this helps.

Dallas



John Di Rico  Identity Verified
France
Local time: 21:09
Member (2006)
French to English
dynamic Dec 22, 2008

I use WinHTTrack for HTML sites.

Does anybody know the best way to deal with a dynamic site?

Thanks,

John



gianfranco  Identity Verified
Brazil
Local time: 16:09
Member (2001)
English to Italian
+ ...
Asking the site owner Dec 22, 2008

John Di Rico wrote:
Does anybody know the best way to deal with a dynamic site?


While a static website can, in most cases, be downloaded to prepare a quote, it is still better to ask the site owner, or their webmaster, to provide the official version to be translated. This avoids discrepancies between what they expect to receive back and what the translator is able to extract.

For dynamic sites, where the content shown online usually comes from a database, this is the only reasonable route.
Extracting the content through the web interface is not a reliable way to assemble the source material, and it can be extremely time-consuming.

In this case, translators should always work from content provided by the site owner, in a format convenient for translation.


Gianfranco


Ali Bayraktar  Identity Verified
Turkey
Member (2007)
English to Turkish
+ ...
TOPIC STARTER
Thanks to all Dec 22, 2008

Thank you for your kind advice.

You helped me very much.



Cristina Heraud-van Tol  Identity Verified
Peru
Local time: 14:09
Member (2005)
English to Spanish
+ ...
I count the words in Word Dec 23, 2008

I go to the website, highlight the text, then copy and paste it into Word. I do this for every page and link. I do not copy text that is repeated; for example, words like "home" or "about", or the contact details, are sometimes displayed on every page.

This way, you go through all the pages and check the content and the repetitions, and you can ask the customer whether he/she wants everything translated. Often they don't want things like Terms & Conditions or forums translated. As for repetitions, many titles, subtitles, headers or footers start the same way; those would be 100% matches that a machine can't count well.
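The "don't count what is repeated on every page" part can be partly automated once the text of each page has been pasted or extracted. Below is a rough sketch, assuming each page's text is available as a string with one segment (menu item, heading, paragraph) per line; `count_unique_segments` is a made-up name, and only exact repeats are detected (ignoring case and extra spaces), not headings that merely start the same way.

```python
import re

# Assumed definition of a "word", for illustration only.
WORD_RE = re.compile(r"\w+(?:['’-]\w+)*")


def count_words(text):
    return len(WORD_RE.findall(text))


def count_unique_segments(pages):
    """Count words across pages, counting each repeated line only once.

    `pages` maps a page name to its text, one segment per line.
    Returns (words_in_new_segments, words_in_repeated_segments).
    """
    seen = set()
    new_words = 0
    repeated_words = 0
    for text in pages.values():
        for line in text.splitlines():
            segment = " ".join(line.split())  # normalise whitespace
            if not segment:
                continue
            key = segment.casefold()  # treat "Home" and "HOME" as the same
            if key in seen:
                repeated_words += count_words(segment)
            else:
                seen.add(key)
                new_words += count_words(segment)
    return new_words, repeated_words


pages = {
    "index": "Home\nAbout\nWelcome to our shop",
    "about": "Home\nAbout\nWe sell handmade rugs",
}
print(count_unique_segments(pages))  # (10, 2)
```

The second number shows how many words sit in navigation items and other exact repeats, which is useful when discussing with the customer what should be charged or left out.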


Regards,
Cristina



Rod Walters  Identity Verified
Japan
Local time: 04:09
Japanese to English
Build a client relationship Dec 23, 2008

Asking the client to provide the document in the form that is easiest for you to translate is good practice for a number of reasons.

It makes the client responsible for the document that you end up translating, so there are no quibbles about completeness.

It allows you to schedule the right amount of time.

It educates the client and their organization in the requirements of translation.

It helps to build a relationship with the client and their organization through the various transactions that arise.

However, there's no doubt that knowing the dirty way of grabbing a whole website will be useful at times...



Samuel Murray  Identity Verified
Netherlands
Local time: 21:09
Member (2006)
English to Afrikaans
+ ...
Another opinion Dec 23, 2008

M. Ali Bayraktar wrote:
I need help for finding the exact word count of a web site (not flash, simple commercial web site). Can anybody explain the process?


1. Ensure that you know which pages are included in the web site. It is easy to miss a page, even if you use a download manager or a web site ripper. The best way to ensure that you know of all pages is to ask the client for the pages.

2. For HTML pages, I use OmegaT to do the counting. Simply create a new project as if you're going to translate it, and then look in the project folder for the statistics file.
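As a cross-check for point 1, the links found in the downloaded pages can be listed and compared with the page list the client provides. This is only a sketch, assuming plain `<a href="...">` links; `LinkCollector` and `internal_links` are hypothetical names, and pages reachable only through scripts or forms will not show up.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_links(html, page_url):
    """Return the sorted, de-duplicated same-host links found in `html`.

    `page_url` is the address the page was downloaded from; relative
    links are resolved against it and #fragments are dropped.
    """
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    site = urlparse(page_url).netloc
    found = set()
    for href in collector.hrefs:
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.netloc == site:
            found.add(absolute)
    return sorted(found)
```

Any address that appears in this list but not among the downloaded files (or not in the client's list) is a page worth asking about before quoting.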

Which program will you be using to do the translation in? Does that program not have a word counting function?


Ali Bayraktar  Identity Verified
Turkey
Member (2007)
English to Turkish
+ ...
TOPIC STARTER
The best solution Dec 24, 2008

Samuel Murray wrote:
Which program will you be using to do the translation in?


I am planning to do it with TagEditor.


Does that program not have a word counting function?


Yes, but:

FineCount and Trados do not give the right word count.

Trados counts the codes too.


So I have contacted the client, and they will send me the original files (the webmaster's files).

So, IMHO, the final solution in such situations is to contact the client (in fact, the client's webmaster).

Thank you again,

