Word Count Software
Thread poster: Konstantinos88
Dec 19, 2013

I am looking for software that can calculate how frequently each word appears in a document.

I want it to produce a list of words ordered from most to least used.

I have found many software products, but...

My documents are 20,000 pages of PDF...

And they are not only in English but also in Greek and Russian...

Any help? Thank you.


 

Milan Condak
English to Czech
Balabolka and AntConc Dec 19, 2013

To extract the text from all the PDFs I use Balabolka. I sort the files by language, set AntConc's encoding to UTF-8, and then load some or all of the files.
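For anyone who would rather script the counting step, here is a minimal sketch in Python. It is not how AntConc works internally, only an illustration of the same idea. It assumes the PDF text has already been exported to UTF-8 .txt files; the function names and the one-folder-per-language layout are hypothetical.

```python
import re
import unicodedata
from collections import Counter
from pathlib import Path

# Runs of Unicode letters; digits, underscores and punctuation act as separators,
# so a hyphenated word is counted as two words.
WORD_RE = re.compile(r"[^\W\d_]+")


def tokenize(text):
    """Lowercase the text and split it into words (Latin, Greek, Cyrillic, ...)."""
    # NFC normalization keeps accented letters as single characters.
    normalized = unicodedata.normalize("NFC", text)
    return WORD_RE.findall(normalized.lower())


def word_frequencies(text):
    """Return (word, count) pairs ordered from most to least used."""
    return Counter(tokenize(text)).most_common()


def folder_frequencies(folder):
    """Count words across every UTF-8 .txt file in one folder.

    Assumed layout (hypothetical): one folder per language.
    """
    total = Counter()
    for path in sorted(Path(folder).glob("*.txt")):
        total.update(tokenize(path.read_text(encoding="utf-8")))
    return total.most_common()


if __name__ == "__main__":
    sample = "Слово и дело. Слово за слово. The cat and the dog."
    for word, count in word_frequencies(sample)[:5]:
        print(word, count)
```

Calling folder_frequencies on a folder of Russian text files, for example, would give one combined list for that language. Different inflected forms of a word are counted separately here; merging them would need a lemmatizer, which is outside this sketch.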

Have a look at a finished dictionary built from machine-translated words:

http://www.condak.net/cat_other/omegat/2013-11-24/cs/03.html

One dictionary is CS-DE, the second one is CS-DE-EN.

For the extraction I use AntConc 3.2.

An example with Cyrillic:

http://www.condak.net/corpus/antconc/cs/04.html

and

http://www.condak.net/corpus/antconc/cs/05.html

Cheers,

Milan

[Edited at 2013-12-19 21:54 GMT]


 

