free tool to check unclean DOC/RTF files for technical errors?
Thread poster: Gergely Vandor

Gergely Vandor
Hungary
Local time: 23:03
English to Hungarian
Apr 14, 2009

Dear All,

Is there a free tool (preferably a Word macro for easy corrections) that can check Trados RTF/DOC files for errors like corrupted/deleted segmentation codes, corrupted/missing inline tags, translated text entered with tw4wininternal style, etc.? Ideally, it should state the problem clearly and in a friendly way, making it possible for the average user to actually spot and fix the errors. If this does not exist, how well does the latest Trados support the translator in this? Thank you.

Bests,
Gergely



Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 00:03
Member (2008)
English to Russian
+ ...
What about... Apr 14, 2009

... switching on all verification options available? (No other ideas so far...)


Gergely Vandor
Hungary
Local time: 23:03
English to Hungarian
TOPIC STARTER
Verification options? Apr 14, 2009

Thank you, but I can't find any verification options anywhere, and I was always convinced that Trados provides nothing in this regard when using Word + Workbench. When running Cleanup in Workbench, that does create a log file, but I found it less than useful.

Also, I'm looking for something that is free and isn't part of Trados. The Trados bilingual DOC/RTF format is used by several other CAT tools, so many translators working with this format don't have Trados.

Gergely



Grzegorz Gryc  Identity Verified
Local time: 23:03
French to Polish
+ ...
Trados clean up... Apr 14, 2009

Gergely Vandor wrote:

Is there a free tool (preferably a Word macro for easy corrections) that can check Trados RTF/DOC files for errors like corrupted/deleted segmentation codes, corrupted/missing inline tags, translated text entered with tw4wininternal style, etc.? Ideally, it should state the problem clearly and in a friendly way, making it possible for the average user to actually spot and fix the errors. If this does not exist, how well does the latest Trados support the translator in this?


Digging into the stupidity of the Trados user interface:

Workbench > Tools > Cleanup
Select Don't clean up
Click Clean up.

Quite intuitive...
This option should be named Test or something like that.

Open your RTF/DOC file in Word and search for the tw4winError style.
I.e.:
Edit > Find (Ctrl+F)
In the dialog box, More > Format > Style
Select tw4winError.
Search for all the occurrences and fix 'em manually.
The light green arrows point in the direction of the error.

The inline codes (related to the tw4winInternal style) are not taken into account.

Cheers
GG

[Edited at 2009-04-14 11:40 GMT]



Grzegorz Gryc  Identity Verified
Local time: 23:03
French to Polish
+ ...
Wordfast 5.5... Apr 14, 2009

Gergely Vandor wrote:

Is there a free tool (preferably a Word macro for easy corrections) that can check Trados RTF/DOC files for errors like corrupted/deleted segmentation codes, corrupted/missing inline tags, translated text entered with tw4wininternal style, etc.?


Wordfast 5.5 demo

Quality check tab (QC)
Identical tags in source/target segments.

See the Wordfast manual

[...]

Cheers
GG



Grzegorz Gryc  Identity Verified
Local time: 23:03
French to Polish
+ ...
Counterintuitive and useless... Apr 14, 2009

Gergely Vandor wrote:

Thank you, but I can't find any verification options anywhere, and I was always convinced that Trados provides nothing in this regard when using Word + Workbench.

As I said, the Trados user interface is damn counterintuitive here.
But if you follow my scenario, it works.
For sure.
Just try it.

BTW.
You can use the Trados demo version.
Officially, it's no longer available but...

When running Cleanup in Workbench, that does create a log file, but I found it less than useful.

The log file in Trados is a big mistake, it's completely useless.
That's why you must open the RTF/DOC files and check the errors manually.
Of course, if you have several files, you can use Ando Tools to speed up the searches.

Also, I'm looking for something that is free and isn't part of Trados. The Trados bilingual DOC/RTF format is used by several other CAT tools, so many translators working with this format don't have Trados.

In fact, I use Trados mainly for pre- and postprocessing.
That's why I have no problems with the codes.


I think the job should be rather simple for a skilled programmer.
The algorithm should be like this:

1) Check for the tw4winMark occurrences.
You should have only three types of strings, i.e.:
{0>, <}x{> and <0}
where x is 0 or an integer from the interval [30, 100]. (You can modify this interval, as some tools, e.g. Logoport, use special values such as 101.)
Otherwise, flag the error.
2) Make sure the tw4winMark occurrences form sequences like:
{0>aaa aaa aaa<}x{>bbb bbb bbb<0}
Otherwise, flag the error.
3) Check for hard returns inside these sequences.
If you find one, flag the error.
4) Define the tw4winInternal and DO_NOT_TRANSLATE occurrences in the source text as variables, then check whether these variables are present before the next <0}.
Otherwise, flag the error.

Of course, it's just a sketch and must be optimized.
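
Steps 1 to 3 above could look roughly like the following in Python. This is only a sketch under assumptions that are not from the thread: it works on plain text already extracted from the RTF/DOC file (so, like Workbench, it checks only the strings and ignores styles), and the names MARK, match_ok and check_segments are made up for illustration. Step 4 is not covered, because it needs the style information.

```python
import re

# Anything that looks like a complete delimiter: "{0>", "<}NN{>" or "<0}".
MARK = re.compile(r"\{0>|<\}(\d+)\{>|<0\}")


def match_ok(value, low=30, high=100):
    """Step 1: the match value is 0 or an integer in [low, high]."""
    return value == 0 or low <= value <= high


def check_segments(text, low=30, high=100):
    """Return a list of (offset, message) pairs for steps 1-3.

    The scan does not stop at the first problem: after an error it
    simply carries on from the delimiter it has just seen.
    """
    errors = []
    state = "outside"      # outside -> source -> target -> outside
    prev_end = 0           # end offset of the previous delimiter
    for m in MARK.finditer(text):
        tok = m.group(0)
        if tok == "{0>":
            expected, nxt = "outside", "source"
        elif tok == "<0}":
            expected, nxt = "target", "outside"
        else:
            expected, nxt = "source", "target"
            if not match_ok(int(m.group(1)), low, high):
                errors.append((m.start(), f"match value out of range in {tok!r}"))
        if state != expected:
            # Step 2: delimiters must come in the order {0> ... <}x{> ... <0}
            errors.append((m.start(), f"unexpected {tok!r} (state: {state})"))
        elif state != "outside" and "\n" in text[prev_end:m.start()]:
            # Step 3: no hard return inside the source or target text
            errors.append((m.start(), f"hard return before {tok!r}"))
        state, prev_end = nxt, m.end()
    if state != "outside":
        errors.append((len(text), "segment not closed at end of text"))
    return errors


if __name__ == "__main__":
    sample = "{0>Hello<}150{>Szia<0}\n{0>Bye<}0{>Vi\nszlat<0}"
    for offset, message in check_segments(sample):
        print(offset, message)
```

A delimiter damaged beyond recognition (say "{0" without the ">") is not matched at all, so it shows up indirectly as an out-of-order delimiter later on. The low and high parameters allow for tools that use values such as 101.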

Cheers
GG

[Edited at 2009-04-14 14:31 GMT]



Vito Smolej
Germany
Local time: 23:03
Member (2004)
English to Slovenian
+ ...
hunting for Tw4WinError places Apr 14, 2009

Gergely Vandor wrote:

Dear All,

Is there a free tool (preferably a Word macro for easy corrections) that can check Trados RTF/DOC files for errors like corrupted/deleted segmentation codes, corrupted/missing inline tags, translated text entered with tw4wininternal style, etc.?


The problem in such cases is not (just) the error per se, but the error recovery, i.e. how and from where to continue the search once things get screwed up. A simple test: try to convert a questionable file with PlusToyz into a double-column Word file.

My way out of these (rare) occasions was always the hunt for Tw4WinError points. Maybe a more "humane" tool could be developed, starting from such a version of a file.

regards

Vito



Grzegorz Gryc  Identity Verified
Local time: 23:03
French to Polish
+ ...
PlusToyz... Apr 14, 2009

Vito Smolej wrote:

Gergely Vandor wrote:

Is there a free tool (preferably a Word macro for easy corrections) that can check Trados RTF/DOC files for errors like corrupted/deleted segmentation codes, corrupted/missing inline tags, translated text entered with tw4wininternal style, etc.?


The problem in such cases is not (just) the error per se, but the error recovery, i.e. how/where from to continue with the search, once things get screwed up.

Exactly.
Some errors are (or seem) easy to fix automatically (e.g. a hard return in the middle of a segment), but a lot of them (e.g. all the tw4winInternal errors) need human assistance.

A simple test: try to convert a questionable file with PlusToyz into a double-column word file.

BTW.
PlusToyz contains a macro for segmentation validation.
In most cases it should work well, but in some heavily corrupted documents, e.g. recovered (i.e. text-only) RTF/DOC documents, the macro fails.
As Trados Workbench ignores the style and checks only the strings, it's more bulletproof.
Another problem with PlusToyz is that batch processing is unavailable (unlike in Workbench).

[...]

Cheers
GG



Arkady Vysotsky  Identity Verified
Local time: 00:03
English to Ukrainian
+ ...
Batch mode in PlusToyz Apr 15, 2009

Batch segment verification could be implemented easily, but I think it would not be useful, because most verification errors have to be fixed manually, one by one.

BR
Arkady



Gergely Vandor
Hungary
Local time: 23:03
English to Hungarian
TOPIC STARTER
PlusToyz: spots only one error? Apr 15, 2009

I've tried the segmentation code checker tool in PlusToyz, and it just displays a message box for the first error, after which it simply quits. A quick look at the source seems to confirm it was designed to do this.

In my opinion, it is not realistic to use it on real life documents for this reason. It is a bit disappointing, since writing a loop to take care of continuing the check seems to be the easy part of the problem. It also seems to return an error for subsegments (footnotes). But the fact that the source is open makes it easier to create a "real" checker tool based on it. (At least in theory, because I'm not sure what rights, if any, the user has to the source code in this case.)

And checking the segmentation codes is only part of the problem: there are also the internal tags and the use of the tw4winInternal style, paragraph marks in segment text, and who knows what else.
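
For the internal tags, a rough Python sketch of a source/target comparison might look like this. It rests on assumptions that go beyond this thread: the text has been extracted as plain text, the segmentation codes are already intact, and the inline tags show up as angle-bracket tokens such as <b> or </b>. In a real bilingual file the tags are identified by the tw4winInternal style, which plain text does not carry, so the TAG pattern below is purely hypothetical, as are the names SEGMENT and tag_mismatches.

```python
import re
from collections import Counter

# One well-formed segment: {0>source<}NN{>target<0}
SEGMENT = re.compile(r"\{0>(.*?)<\}(\d+)\{>(.*?)<0\}", re.S)
# Hypothetical stand-in for text carrying the tw4winInternal style.
TAG = re.compile(r"</?[A-Za-z][^<>{}]*>")


def tag_mismatches(text):
    """Return (segment_number, missing_tags, extra_tags) for every segment
    whose target does not contain exactly the tags found in its source."""
    report = []
    for number, m in enumerate(SEGMENT.finditer(text), start=1):
        source = Counter(TAG.findall(m.group(1)))
        target = Counter(TAG.findall(m.group(3)))
        missing = sorted((source - target).elements())  # in source only
        extra = sorted((target - source).elements())    # in target only
        if missing or extra:
            report.append((number, missing, extra))
    return report


if __name__ == "__main__":
    sample = ("{0>Click <b>OK</b>.<}0{>Kattintson az <b>OK gombra.<0}"
              "{0>Done<}100{>Kész<0}")
    for number, missing, extra in tag_mismatches(sample):
        print(f"segment {number}: missing {missing}, extra {extra}")
```

It lists every mismatching segment instead of stopping at the first one, and it only reports: deciding where a missing tag belongs in the translation is still a manual job.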

Thanks everyone for your help and insights.

Gergely

[Edited at 2009-04-15 20:09 GMT]

