Abbyy PDF Transformer: breaking tags around non-English chars?!
Thread poster: Jan Sundström

Jan Sundström  Identity Verified
Sweden
Local time: 10:57
English to Swedish
Sep 9, 2008

Hi all,

I came across an annoying bug in Abbyy PDF Transformer, and I wonder if anyone else has encountered it.

I want to OCR a layered PDF with Swedish text. In Abbyy, I select "Remove all formatting", and convert it to DOC.

When I open the resulting DOC file in Word, everything looks fine, and I can translate it with Trados TWB using the Word interface. Even with "show hidden characters" turned on, nothing unusual shows up.

BUT if I open it in TagEditor instead, it reveals that the file has segment breaks (previously invisible) around every Swedish character (å, ä, ö).

For instance "räksmörgås" is displayed in TE as räksmörgås
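
To make the symptom concrete, here is a minimal Python sketch of the kind of splitting described above. It assumes the converter starts a new run at every boundary between an ASCII and a non-ASCII character, which is only a guess at the mechanism. The `<tag>` marker and the function names are hypothetical placeholders, not actual TagEditor output.

```python
from itertools import groupby
import unicodedata


def split_runs(text):
    """Split text into runs wherever ASCII and non-ASCII characters meet."""
    # Normalize first so that "ä" is one code point, not "a" plus a combining mark.
    text = unicodedata.normalize("NFC", text)
    return ["".join(group) for _, group in groupby(text, key=lambda ch: ord(ch) < 128)]


def show_with_tags(text, tag="<tag>"):
    """Join the runs with a placeholder marker to mimic a tag-aware view."""
    return tag.join(split_runs(text))


print(show_with_tags("räksmörgås"))
# prints: r<tag>ä<tag>ksm<tag>ö<tag>rg<tag>å<tag>s
```

A word with three Swedish characters ends up in seven pieces, which is why the text becomes so awkward to translate in a tag-aware editor.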

My guess is that it must be a problem with the character encoding, assigning the wrong code page to the output file.

But there is no setting that lets me assign the character encoding, in either PDF Transformer 1.0 or 2.0.

I don't want any advice about switching to another OCR program. I have access to most of the other programs on the market, including FineReader. But I'd like to know if there is a solution to this bug?

Thanks a lot for your input!

/J



Ahmed Maher  Identity Verified
Local time: 10:57
English to Arabic
Copy & Paste
Jan 11, 2009

Hello,

I remember running into the same problem. I opened the Word file, used Select All to select the whole document, and pasted it into a new Word file. Then I imported the new file into TagEditor, and everything worked well.


Regards,
Ahmed Maher

