Do you know about any software tool to count the number of identical segments in a document?
Thread poster: Rafał Kotlicki

Rafał Kotlicki
Poland
Local time: 07:19
English to Polish
May 7, 2009

Hello,

Do you know of any tool that can count the number of identical segments in a document? I have a text of more than 40,000 entries with a lot of repetitions, and I need to calculate the actual (not total) number of entries. Thank you in advance!

Rafał

[Subject edited by staff or moderator 2009-05-07 13:07 GMT]



Heinrich Pesch  Identity Verified
Finland
Local time: 08:19
Member (2003)
Finnish to German
Any CAT-tool May 7, 2009

Run an analysis with Trados Workbench or Wordfast and it will show the percentage of repetitions. But do you need to know how many times a specific segment is repeated? That would be difficult.
Regards
Heinrich
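Outside a CAT tool, counting how often each segment repeats can be sketched in a few lines of Python. This is only a minimal sketch, assuming the text has already been split into one segment per line; the function name `count_segment_repeats` and the sample sentences are made up for illustration.

```python
from collections import Counter


def count_segment_repeats(segments):
    """Return a Counter mapping each non-empty segment to its number of occurrences."""
    # Strip surrounding whitespace and skip blank lines (assumption: one segment per line).
    cleaned = [s.strip() for s in segments if s.strip()]
    return Counter(cleaned)


# Hypothetical sample data, not taken from the original document.
sample = [
    "Open the file.",
    "Save the file.",
    "Open the file.",
    "",
    "Close the file.",
    "Open the file.",
]

counts = count_segment_repeats(sample)

# List only the segments that occur more than once, most frequent first.
for segment, n in counts.most_common():
    if n > 1:
        print(n, segment)
```

Here `len(counts)` gives the number of distinct segments and `sum(counts.values())` the total number of non-empty segments.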



Gerard de Noord  Identity Verified
France
Local time: 07:19
Member (2003)
German to Dutch
Wordfast? May 7, 2009

Hi Rafał,

A Wordfast analysis reports, for example, the following for each document and for all documents combined.

Regards,
Gerard


Number of files: 6. Totals:
Analogy        segments    words    char.     %
---------------------------------------------------------
Repetitions         240     2292    15286    39%
100%                 78      134     1019     2%
95%-99%               3        4       31     0%
85%-94%               2       30      219     1%
75%-84%              14       53      378     1%
00%-74%             315     3426    23243    58%
Total               652     5939    40176
=========================================================
Note: The character count includes spaces.



Adam Łobatiuk  Identity Verified
Poland
Local time: 07:19
Member (2009)
English to Polish
Excel would work too May 7, 2009

If you need to calculate unique segments, you can paste the content into Excel.

In Word, replace all full stops with ".^p", all colons with ":^p" and all tabs (^t) with just the paragraph symbol (^p). Remove all empty paragraphs, and paste everything into Excel. This should give you all segments in separate rows. Now go to Data - Filter - Advanced filter, and check the Only unique records checkbox.

You can then paste the result back into Word to get the unique word count.
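For anyone who prefers a script, the same idea can be approximated in Python. This is a rough sketch rather than an exact equivalent of the Word and Excel steps: it assumes the document has been saved as plain text, it breaks after every full stop and colon and at every tab or line break, and the helper names `split_segments` and `unique_segments` are invented here. Like the Word replacement, it will also break inside abbreviations such as "e.g." and inside decimal numbers.

```python
import re


def split_segments(text):
    """Split text into segments after '.' and ':' and at tabs or line breaks."""
    # Insert a line break after every full stop or colon,
    # similar to replacing "." with ".^p" and ":" with ":^p" in Word.
    marked = re.sub(r"([.:])", r"\1\n", text)
    # Treat tabs as segment breaks too (the ^t to ^p replacement).
    marked = marked.replace("\t", "\n")
    # Drop empty paragraphs and surrounding whitespace.
    return [part.strip() for part in marked.split("\n") if part.strip()]


def unique_segments(segments):
    """Return the segments with duplicates removed, keeping first-seen order."""
    return list(dict.fromkeys(segments))


# Hypothetical sample text, not taken from the original document.
text = "Press OK. Press Cancel.\tPress OK.\nNote: Press OK."

segments = split_segments(text)
unique = unique_segments(segments)
unique_word_count = sum(len(s.split()) for s in unique)

print("Total segments: ", len(segments))
print("Unique segments:", len(unique))
print("Unique words:   ", unique_word_count)
```

The comparison is exact and case-sensitive, so segments that differ only in capitalisation or trailing punctuation count as different entries.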



Rafał Kotlicki
Poland
Local time: 07:19
English to Polish
TOPIC STARTER
Excel works May 8, 2009

Thank you all for your replies. Yes, I needed to calculate the exact number of unique segments. Adam, your way does the trick most of the time. Thanks!


