How to get rid of junk OCR character leftover in Word
Thread poster: Susan Welsh

Susan Welsh  Identity Verified
United States
Local time: 16:36
Member (2008)
Russian to English
Apr 16, 2013

I have converted a PDF to Word using ABBYY FineReader, and wherever there was a hyphen at a line ending, the Word version has put in a junk character that I cannot search and replace to get rid of. It looks like a horizontal line with a short vertical line hanging down from the back of it -- like an L rotated 90 degrees clockwise. I have copied it into my Find field, but Word can't find it.

There are hundreds of these things in this rather long document, and I would really like to get a clean text to make translating easier.

Any suggestions?

Thanks in advance!


Kevin Fulton  Identity Verified
United States
Local time: 16:36
German to English
Look under special characters Apr 16, 2013

If I recall correctly, this is the optional hyphen, which Word's Find field matches with the code ^-.


Sam Pinson  Identity Verified
United States
Local time: 14:36
Member (2011)
Russian to English
Optional hyphens can be replaced in Word Apr 16, 2013

Hi, Susan.

Please see my blog article on how to replace these "optional hyphens".
http://pinsonlingo.com/blog/2011/05/27/tag-char-namesoftbreakhyphen-removed/.



LEXpert  Identity Verified
United States
Local time: 15:36
Member (2008)
Croatian to English
Easy! Apr 16, 2013

This is very common in multi-column articles.
Open Word's Find and Replace dialog.
Under Find, click the button "More >>"
Place the cursor in the Find box, and from the Special drop-down menu select "optional hyphen".
Leave the Replace box blank.
Replace All.


That's it.
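
If the text has already been pulled out of Word as plain text, the same cleanup can be sketched in a few lines of Python. This is only a rough illustration, not part of the Word procedure above: it assumes the optional hyphens arrive as the Unicode soft hyphen (U+00AD), which may not hold for every export path, and the function name strip_optional_hyphens is made up for this example.

```python
# Assumption: optional hyphens appear in the exported text as U+00AD (SOFT HYPHEN).
SOFT_HYPHEN = "\u00ad"


def strip_optional_hyphens(text: str) -> tuple[str, int]:
    """Return the text without soft hyphens, plus how many were removed."""
    removed = text.count(SOFT_HYPHEN)
    # Ordinary hyphens ("-") are a different character and are left untouched.
    return text.replace(SOFT_HYPHEN, ""), removed


if __name__ == "__main__":
    sample = "trans\u00adlation mem\u00adory, well-known"
    cleaned, removed = strip_optional_hyphens(sample)
    print(cleaned)   # translation memory, well-known
    print(removed)   # 2
```

Returning the count alongside the cleaned text gives a quick sanity check that the character was actually found, which is the part that failed when pasting it into the Find field.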


esperantisto  Identity Verified
Local time: 00:36
Member (2006)
English to Russian
Better take care of it in FR Apr 16, 2013

In FineReader, go to Tools → Options → 4. Save → Format Settings → RTF/DOC/Word XML and tick Remove Optional Hyphens and re-export your document.

[Edited at 2013-04-16 07:57 GMT]



Susan Welsh  Identity Verified
United States
Local time: 16:36
Member (2008)
Russian to English
TOPIC STARTER
Thanks! Apr 16, 2013

I used Rudolf's solution, and it worked like a charm. (I didn't want to go back to FR, because I had already done some formatting work on the Word file, like moving footnotes around.)

Thanks to all.

