Can't find/replace a character in Word Thread poster: Kevin Fulton
Kevin Fulton United States Local time: 11:20 German to English
(Word 2003, Windows XP Home) Whenever I use OCR to extract a Word text file from a faxed/PDF document, I frequently get the character "¬" (ALT 0172) in place of hyphens. I've tried a "Find and replace" operation to find each occurrence and replace it with nothing, but I get a message saying that this character is not found in the document. I tried the same operation on the same document in Open Office, hoping to find a solution, but with largely the same result. The difference is that each instance of "¬" was highlighted in OO, so I could find it quickly. In a multi-page document, however, this was still a major task. I'd appreciate any thoughts on solving this problem. Thanks, Kevin
Did you try special Tab? | Nov 3, 2004
This occurs frequently in scanned texts. Usually, such special characters are either optional hyphens or nonbreaking hyphens. Whenever this arises, I solve the problem with the procedure below.
Open the normal "Find & Replace" window and expand the "More" tab if it is not visible.
1. Press "Special"
2. Choose "Soft Hyphen"
3. Press "Find"
If Word finds nothing, or if the character found is not the one you intend to find, repeat the above steps with "Nonbreaking Hyphen" (in step 2), but do not forget to clear the search box first.
Forgot to mention: "Manual Line Break" is another special character found in such OCRed documents, so you may consider repeating the above-mentioned procedure with "Manual Line Break".
Hope it helps.
[Edited at 2004-11-04 00:02]
[Edited at 2004-11-04 09:06]
IanW (X) Local time: 16:20 German to English + ... Search and Replace | Nov 4, 2004
Hi Kevin,
In German, this is called a "Bedingter Trennstrich" - I've no idea how this translates into English. Here's how you get rid of them: Call up Search and Replace (Ctrl+H). Enter "^-" (the two symbols inside the inverted commas) in the Search field and leave the Replace field empty. Press "Replace all" and this deletes all these Trennstriche. Hope this helps, Ian
Kevin Fulton United States Local time: 11:20 German to English TOPIC STARTER Solution worked | Nov 4, 2004
Thank you, Selçuk and Ian! I was approaching the problem from the wrong direction, not realizing that "¬" represented a Word function instead of a character. Once again, thanks to both of you. Kevin
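For text that has already been saved out of Word as plain Unicode text, the distinction Kevin describes can be checked outside Word. The sketch below is only an illustration: `count_hyphen_controls` is a hypothetical helper, and it assumes the export writes the optional hyphen as U+00AD (SOFT HYPHEN), the nonbreaking hyphen as U+2011 and the manual line break as a vertical tab. The "¬" typed with ALT 0172 is a different character, U+00AC (NOT SIGN), which would be consistent with a literal search for it finding nothing.

```python
from collections import Counter

# Code points assumed to correspond to the entries in Word's "Special" menu.
# How a given export actually writes them is an assumption, not a guarantee.
SUSPECTS = {
    "\u00ac": "NOT SIGN, the visible character typed with ALT 0172",
    "\u00ad": "soft (optional) hyphen",
    "\u2011": "nonbreaking hyphen",
    "\u000b": "manual line break (vertical tab)",
}


def count_hyphen_controls(text: str) -> dict:
    """Return {character: count} for each suspect character present in text."""
    return dict(Counter(ch for ch in text if ch in SUSPECTS))


if __name__ == "__main__":
    sample = "Über\u00adsetzung und Trans\u00adlation\u2011Memory"
    for ch, n in count_hyphen_controls(sample).items():
        # Print the code point and the description rather than the raw character,
        # because most of these characters are invisible on screen.
        print(f"U+{ord(ch):04X} ({SUSPECTS[ch]}): {n}")
```

If U+00AC never shows up in the counts but U+00AD does, the marks on screen are optional hyphens rather than literal "¬" characters.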
The soft hyphens, protected spaces etc. should not have any impact on pre- or postprocessing, analysis or translation. I don't think I'd even bother to filter these characters out if I were you. But maybe I just don't understand your reasons. These ¬ characters don't appear in a printout of the document, do they? Benjamin
Kevin Fulton United States Local time: 11:20 German to English TOPIC STARTER Screws up matches in DVX | Nov 4, 2004
> The soft hyphens, protected spaces etc. should not have any impact on pre- or postprocessing
The ¬ character appears in the source text when I import it into Deja Vu X and prevents matching. I recently had a document with passages identical to those in a doc I had translated previously. Since there was some new text, there had been some formatting changes: different line breaks resulting in different hyphenation. One doc had justified text, and the document editor had used hyphenation; the first document had used no hyphenation at the end of lines. When I processed the second document with OCR, I had to manually remove all the ¬ characters to get the text to match. Even though removing this character took only a few keystrokes each time, it added up over 20 pages. Plus I didn't want my TM to have the characters. Cheers, Kevin
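The matching problem can be illustrated outside Deja Vu X. This is a minimal sketch, not a description of how DVX compares segments: `strip_layout_hyphens` is a hypothetical helper, the German sentences are invented sample data, and the code assumes the OCR export stores optional hyphens as U+00AD and nonbreaking hyphens as U+2011.

```python
# Assumed code points: U+00AD (soft hyphen) is deleted,
# U+2011 (nonbreaking hyphen) is replaced by an ordinary hyphen.
_LAYOUT_HYPHENS = str.maketrans({"\u00ad": None, "\u2011": "-"})


def strip_layout_hyphens(segment: str) -> str:
    """Remove layout-only hyphenation so identical sentences compare equal."""
    return segment.translate(_LAYOUT_HYPHENS)


# Invented sample segments: the same sentence, once without and once with
# optional hyphens left behind by the OCR export.
old = "Die Übersetzung wurde geliefert."
new = "Die Über\u00adsetzung wurde ge\u00adliefert."

print(old == new)                        # False: the hidden characters differ
print(old == strip_layout_hyphens(new))  # True: an exact match after cleaning
```

Running a clean-up like this on the exported text before import would also keep the characters out of the TM, which is the second concern raised above.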