Tool for checking for word repetition in TUs of TM
Thread poster: Samuel Murray
Samuel Murray Netherlands Local time: 12:44 Member (2006) English to Afrikaans + ...
May 13, 2011
G'day everyone
Do you know of a standalone tool (or any tool) that can do a QA on a TMX (or other TM format) to check whether any segments have duplicated words in them? I don't mean words that occur right next to each other, but words that occur more than once in the segment. Ideally the matching should be case insensitive.
So if my TM has this:
SL: I wonder if this can be done. TL: Ek wonder of dit kan gedoen kan word.
Then I would like that segment to be flagged because the word "kan" is repeated.
Any ideas? My main CAT format is uncleaned RTF, so a macro that could check such files would actually be the best option for me, but... a TMX checker would also do nicely.
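For anyone who would rather script the check described above, here is a minimal Python sketch, not a finished tool. It assumes a TMX file with the usual `tu`/`tuv`/`seg` layout and an `xml:lang` attribute on each `tuv`. The function names `repeated_words` and `check_tmx` and the `SAMPLE` string are made up for illustration, and a "word" is taken to be any run of letters.

```python
import re
from collections import Counter
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def repeated_words(text):
    """Return the words that occur more than once in text, ignoring case."""
    # Assumption: a word is any run of letters (no digits or underscores).
    words = re.findall(r"[^\W\d_]+", text.casefold())
    counts = Counter(words)
    return sorted(w for w, n in counts.items() if n > 1)

def check_tmx(tmx_text):
    """Yield (language, segment text, repeated words) for each flagged segment."""
    root = ET.fromstring(tmx_text)
    for tuv in root.iter("tuv"):
        seg = tuv.find("seg")
        if seg is None:
            continue
        # itertext() also collects text inside inline tags within <seg>.
        text = "".join(seg.itertext())
        repeats = repeated_words(text)
        if repeats:
            yield tuv.get(XML_LANG, "?"), text, repeats

# Hypothetical sample built from the segment pair in the post above.
SAMPLE = """<tmx version="1.4"><header/><body>
<tu>
<tuv xml:lang="en"><seg>I wonder if this can be done.</seg></tuv>
<tuv xml:lang="af"><seg>Ek wonder of dit kan gedoen kan word.</seg></tuv>
</tu>
</body></tmx>"""

for lang, text, repeats in check_tmx(SAMPLE):
    print(lang, repeats, text)
```

On the sample, only the Afrikaans segment is flagged, because "kan" occurs twice. Source and target segments are checked alike, so a real run would probably want a filter on language.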
I'm playing around with xbench as an alternative concordance app, and it looks like you can customize the searches pretty easily if you know what you're doing (I don't!). Have you had a look at it yet?
Olly
Oscar Martin Spain Local time: 12:44 English to Spanish + ...
Xbench
May 13, 2011
Hi Samuel,
You can download Xbench from the ApSIC website. Install the software and add the TMX file. You can then use a regular expression to find duplicate words anywhere in the segment.
The following expression will find duplicate words in the source or the target:
"(<[:alpha:]+>)=1.*<@1>"
Switch the search mode from simple to regular expression, and press Ctrl+P (PowerSearch) when searching.
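For comparison, the same idea (capture a whole word, then look for that word again later in the segment) can be sketched with Python's `re` module. This is an assumed analogue, not a translation checked against Xbench: `\b` stands in for the `<` and `>` word boundaries, `\1` for `@1`, and the `IGNORECASE` flag is added here only because the original question asked for case-insensitive matching. The helper name `first_duplicate` is made up.

```python
import re

# Assumed Python analogue of the Xbench expression above:
# a whole word, captured, then the same whole word later in the segment.
DUPLICATE_WORD = re.compile(r"\b([^\W\d_]+)\b.*\b\1\b", re.IGNORECASE)

def first_duplicate(segment):
    """Return the first word that recurs later in segment, or None."""
    match = DUPLICATE_WORD.search(segment)
    return match.group(1) if match else None

print(first_duplicate("Ek wonder of dit kan gedoen kan word."))
print(first_duplicate("I wonder if this can be done."))
```

The first call reports "kan" and the second finds nothing. Like the Xbench search, this only tells you that a segment contains a repeated word; common function words that repeat legitimately will be flagged too.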