Fine Reader and Wordfast
Thread poster: teddd76
teddd76
Local time: 23:18
English to French
Nov 17, 2008

Hi,
I just bought Fine Reader 9. I used it this morning to convert a PDF file to Word. The conversion went OK and I could translate the document with Wordfast. However, when I tried to clean it I got the following message: “Failure segmented document, analysis was dropped”. I suspect it’s got to do with the fact the document was originally a PDF file. Do you have any idea what happened and how I could avoid this in the future? Thanks in advance for your help!



Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 00:18
Member (2008)
English to Russian
+ ...
only manual work is of value Nov 17, 2008

Never let FR do anything in Automatic mode!

Segment the pages yourself. Only then can you be sure the formatting is OK. After the text has been recognised, check and adjust the formatting in Word (don't forget to switch on the display of non-printing characters so that you can see the formatting marks).

Only then start translating. It is better to spend one hour preparing the document than not to know what to do at the end.

FR is good at recognition, but PDF is not an ordinary text format, and the program does not process it like a scanned page. Where possible, it tries to extract the text by copying and pasting. (The proof is the error you get when you try to recognise a file that has content extraction protection switched on. If FR simply rendered the page to a bitmap and recognised that, instead of using the copy-paste method, it would not produce such error messages.)

Any automation still requires good manual input.


teddd76
Local time: 23:18
English to French
TOPIC STARTER
Thank you! Nov 17, 2008

Thank you Sergei!

Just another (dumb) question: how do I segment pages myself? I've just bought FR and the only mode I know is "automatic conversion"!



Claire Cox
United Kingdom
Local time: 22:18
French to English
+ ...
Have you added Fine Reader to your Word add-ins? Nov 17, 2008

Just a thought: I know that if you allow Abbyy to be installed as part of the Word set-up (which is the default option when you set it up, unless you do a custom install), it can mess up the settings for Wordfast big time! If Fine Reader is shown as an add-in under Templates and Add-ins, uninstall it and reinstall it without letting it become part of Word's set-up, and you should avoid a conflict with the Wordfast template. Apologies if you've already done this, but it just struck me as something worth checking.

Best of luck!

[Edited at 2008-11-17 23:04 GMT]


teddd76
Local time: 23:18
English to French
TOPIC STARTER
Thanks Nov 18, 2008

Ok I'll do that! Thanks for your help Claire and Sergei!


Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 00:18
Member (2008)
English to Russian
+ ...
there are tools Nov 18, 2008

how do I segment pages myself?


There are many toolbars. Use the Image toolbar to draw the rectangular areas for OCR. You can change the shape of an area and add or cut away space. Specify whether each area is text, an image or a table; FR treats each type differently.

Play with the tools. Customize them. There are many more buttons than appear on the default toolbar.



Roman Bulkiewicz  Identity Verified
Ukraine
Local time: 00:18
Member (2004)
English to Ukrainian
+ ...
segmenting and segmenting Nov 18, 2008

teddd76 wrote:
However, when I tried to clean it I got the following message: “Failure segmented document, analysis was dropped”. I suspect it’s got to do with the fact the document was originally a PDF file.


The "segmentation" referred to in WF's message is WF's own segmentation, not the segmentation you apply in FR for text recognition/conversion. The two have nothing in common. FR's segmentation determines how the various pieces of text are arranged in relation to each other and thus may affect the document's formatting, but it should not leave any traces in the converted document that might interfere with WF's segmentation.

On the other hand, the conversion from PDF does leave traces in the resulting document (regardless of the segmentation), and these are known to cause trouble when the document is opened in TagEditor. I've never had any problems with such documents in WF, though. Also, if the conversion were the cause, I would expect the WF segmentation problems to occur while you are translating. But if you completed the document successfully and WF then could not clean it, that probably means the segmentation got messed up during translation. Did you check it? What Claire said makes sense, too.
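
If you want to check the segmentation mechanically rather than by eye, here is a rough sketch in Python. It is not anything Wordfast provides. It assumes the uncleaned document has been saved as plain text with the hidden text included, and that the segment delimiters have the usual Trados-style shape ({0> source <}NN{> target <0}). The function name find_segmentation_errors is made up for this example.

```python
import re

# Assumed delimiter shapes: {0> source <}NN{> target <0}
TOKEN = re.compile(r"(?P<open>\{0>)|(?P<mid><\}\d{1,3}\{>)|(?P<close><0\})")

# Which marker must follow which: open -> mid -> close -> open ...
EXPECTED_NEXT = {"close": "open", "open": "mid", "mid": "close"}


def find_segmentation_errors(text):
    """Return (offset, message) pairs for delimiters that appear out of order."""
    errors = []
    last = "close"  # before the first segment, an opening marker is expected
    for match in TOKEN.finditer(text):
        kind = match.lastgroup
        expected = EXPECTED_NEXT[last]
        if kind != expected:
            errors.append(
                (match.start(), f"expected {expected} marker, found {kind}")
            )
        last = kind
    if last != "close":
        errors.append((len(text), "document ends inside an unfinished segment"))
    return errors


if __name__ == "__main__":
    sample = "{0>Hello<}0{>Bonjour {0>World<}100{>Monde<0}"
    for offset, message in find_segmentation_errors(sample):
        print(offset, message)
```

An empty result only means the markers are in the expected order. It says nothing about styles or other hidden formatting, so the add-in conflict Claire describes would still need to be checked separately.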



