How to count all words in an online web site
Thread poster: Ali Bayraktar
I need help finding the exact word count of a website (a simple commercial site, no Flash).
Can anybody explain the process?
Thanks in advance,
| | Dallas Cao
English to Chinese
| make a local mirror || Dec 22, 2008 |
If the site is plain HTML, you can:
1. Use WebSnake or Teleport Pro to download the whole site to your local hard drive.
2. Use PractiCount or another counting tool to batch-count all the files.
Hope this helps.
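For readers without a dedicated counting tool, the batch-count step can be roughly sketched in Python. This is only an illustration, not how PractiCount or any other tool works: the function names and the `*.htm*` file pattern are assumptions, and "word" here simply means a whitespace-separated token of visible text.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects text content, skipping the bodies of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def count_words_in_html(html):
    """Return the number of whitespace-separated tokens in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps text from adjacent tags apart,
    # at the cost of splitting a word that is broken up by inline tags.
    return len(" ".join(parser.chunks).split())


def count_words_in_folder(root):
    """Count words in every .htm/.html file under root (assumed UTF-8)."""
    per_file = {}
    for path in sorted(Path(root).rglob("*.htm*")):
        if path.is_file():
            html = path.read_text(encoding="utf-8", errors="replace")
            per_file[str(path)] = count_words_in_html(html)
    return per_file, sum(per_file.values())
```

Calling `count_words_in_folder("mirror")` on a hypothetical download folder named `mirror` returns a per-file breakdown and a total. Text in attributes such as `alt` or `title` is not counted, so the figure is an estimate to check with the client rather than a final count.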
| | gianfranco
English to Italian
| Asking the site owner || Dec 22, 2008 |
John Di Rico wrote:
Does anybody know the best way to deal with a dynamic site?
While a static website can, in most cases, be downloaded to prepare a quote, it is still better to ask the site owner, or their webmaster, to provide the official version to be translated. This avoids discrepancies between what they expect to receive back and what the translator is able to extract.
For dynamic sites, where the content shown online usually comes from a database, this is the only reasonable route.
Extracting the content through the web interface is not a safe way to assemble the original material for translation, and it can be extremely time-consuming.
In this case, translators should always work from content provided by the site owner, in a format convenient for translation.
| Thanks to all || Dec 22, 2008 |
Thank you for your kind advice.
You have helped me very much.
| I count the words in Word || Dec 23, 2008 |
I go to the website, highlight the text, then copy and paste it into Word. I do this for every page and link. I do not copy text that is repeated; for example, words like "home" or "about", or the contact details, are sometimes displayed on every page.
This way, you go through all the pages and check the content and the repetitions, and you can always ask the customer whether he/she wants everything translated. Often they don't want things like Terms & Conditions or forums translated. As for repetitions, many titles, subtitles, headers or footers start the same way on every page; those would be 100% matches, which a machine can't count well.
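The idea of not counting text that repeats on every page can be sketched in code as well. This is a minimal illustration under stated assumptions: each page has already been reduced to plain text with one segment (menu item, heading, paragraph) per line, and the function name is hypothetical. It only catches exact repeats (ignoring case and extra spaces), not partial matches.

```python
def count_unique_words(pages):
    """Return (new_words, repeated_words) across a list of page texts.

    A segment is one non-empty line. A segment already seen on an earlier
    page (or earlier on the same page) is counted as repeated, not as new.
    """
    seen = set()
    new_words = 0
    repeated_words = 0
    for page_text in pages:
        for line in page_text.splitlines():
            segment = " ".join(line.split())  # normalise whitespace
            if not segment:
                continue
            words = len(segment.split())
            key = segment.lower()
            if key in seen:
                repeated_words += words
            else:
                seen.add(key)
                new_words += words
    return new_words, repeated_words


pages = [
    "Home\nAbout\nWe sell widgets",
    "Home\nAbout\nContact us today",
]
print(count_unique_words(pages))  # (8, 2)
```

Reporting the repeated words separately, rather than dropping them, leaves it to the translator and the customer to decide how repetitions are charged.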
| | Rod Walters
Japanese to English
| Build a client relationship || Dec 23, 2008 |
Asking the client to provide the document in the form that is easiest for you to translate is good practice for a number of reasons.
It makes the client responsible for the document that you end up translating, so there are no quibbles about completeness.
It allows you to schedule the right amount of time.
It educates the client and their organization in the requirements of translation.
It helps to build a relationship with the client and their organization through the various transactions that arise.
However, there's no doubt that knowing the dirty way of grabbing a whole website will be useful at times...
| | Samuel Murray
English to Afrikaans
| Another opinion || Dec 23, 2008 |
M. Ali Bayraktar wrote:
I need help for finding the exact word count of a web site (not flash, simple commercial web site). Can anybody explain the process?
1. Ensure that you know which pages are included in the web site. It is easy to miss a page, even if you use a download manager or a web site ripper. The best way to ensure that you know of all pages is to ask the client for the pages.
2. For HTML pages, I use OmegaT to do the counting. Simply create a new project as if you're going to translate it, and then look in the project folder for the statistics file.
Which program will you be using to do the translation in? Does that program not have a word counting function?
| The best solution || Dec 24, 2008 |
Samuel Murray wrote:
Which program will you be using to do the translation in?
I am planning to do it with TagEditor.
Does that program not have a word counting function?
FineCount and Trados do not give the right word count.
Trados counts the codes too.
So I have contacted the client, and they will hand me the original files (the webmaster's files).
So, IMHO, the final solution in such situations is to contact the client (in fact, the client's webmaster).
Thank you again,
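
The complaint that a tool "counts the codes too" can be illustrated with a small Python comparison. This is not a reproduction of how Trados, FineCount or any other tool counts; it only shows, on a made-up HTML snippet, how much a count can inflate when markup is treated as text. The tag-stripping regular expression is deliberately crude.

```python
import re


def naive_count(html):
    """Count whitespace-separated tokens in the raw file, markup included."""
    return len(html.split())


def text_only_count(html):
    """Replace anything that looks like a tag with a space, then count."""
    text = re.sub(r"<[^>]+>", " ", html)
    return len(text.split())


sample = '<p class="intro">Welcome to <a href="index.html">our shop</a></p>'
print(naive_count(sample))      # 6
print(text_only_count(sample))  # 4
```

On real pages with heavy markup the gap is usually much larger, which is one more reason to ask the client for the source files.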