Remove hard returns in pdf to word doc
Thread poster: Sonja Marks

Sonja Marks  Identity Verified
France
Local time: 15:25
Member (2006)
German to English
Apr 12, 2010

When I copy text from a PDF into a Word document, it of course has a hard return at the end of each line, and all the free converter programs I have found either do this too or use text boxes. Both of these methods cause problems with CAT tools. Is there an easy way to remove all those irritating returns?

 

Tony M
France
Local time: 15:25
Member
French to English
Remove hard returns (in general) Apr 12, 2010

I've encountered the same problem under other circumstances.

What I do is this, using search-&-replace under Word:

To preserve genuine paragraph breaks, I first search for double paragraph breaks ^p^p and replace them with a placeholder character, such as §, that doesn't occur anywhere else in your text.

Then I replace all the remaining single paragraph breaks ^p with spaces.

And finally, I go back and replace the § with a single proper paragraph break ^p

It takes longer to describe than to do!
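
For anyone who would rather script this outside Word, the three replacements above can be sketched in Python. This is only an illustration under assumptions: the function name `unwrap_with_placeholder` is made up, and the copied text is assumed to be a plain string in which each paragraph mark (^p) is a newline character.

```python
def unwrap_with_placeholder(text: str, placeholder: str = "§") -> str:
    """Join hard-wrapped lines while keeping blank-line paragraph breaks.

    Assumption: each Word paragraph mark (^p) corresponds to one "\n".
    """
    if placeholder in text:
        # The placeholder must not already occur anywhere in the text.
        raise ValueError("placeholder already occurs in the text; pick another")
    # Normalise Windows line endings so every break is a single "\n".
    text = text.replace("\r\n", "\n")
    # Step 1: protect genuine paragraph breaks (^p^p) with the placeholder.
    text = text.replace("\n\n", placeholder)
    # Step 2: turn the remaining single breaks (^p) into spaces.
    text = text.replace("\n", " ")
    # Step 3: restore one proper paragraph break per placeholder.
    return text.replace(placeholder, "\n")


sample = "line one\nline two\n\nnew para\ncontinues"
print(unwrap_with_placeholder(sample))
```

The check at the top is there because the method only works if the placeholder really is unique; if § appears in your text, pass a different character.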

[Edited at 2010-04-12 21:49 GMT]


 

Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 16:25
Member (2008)
English to Russian
my way Apr 12, 2010

1) Replace double spaces with single spaces. It takes several cycles.
2) Replace ^p with a space. It may take several cycles.
3) Replace double space with a ^p
4) Replace ^L with space.
Done.

(It always works in an ideal world... in practice, things can go wrong, so act accordingly and use logic.)

Sometimes, you need to...

5) Replace double spaces with single spaces again.
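
The same steps can be sketched in Python for text handled outside Word. This is a rough illustration, not a tested tool: the function name `unwrap_with_double_spaces` is made up, ^p is assumed to be a newline, and ^L is assumed to mean Word's manual line break, represented here as a vertical tab ("\v").

```python
import re


def unwrap_with_double_spaces(text: str) -> str:
    """Join hard-wrapped lines using the double-space trick.

    Assumptions: ^p is "\n" and ^L (manual line break) is "\v".
    """
    text = text.replace("\r\n", "\n")
    # 1) Collapse runs of spaces to one space (the "several cycles" in one pass).
    text = re.sub(r" {2,}", " ", text)
    # 2) Every paragraph mark becomes a space, so a blank line becomes two spaces.
    text = text.replace("\n", " ")
    # 3) Each double space marks a genuine paragraph break.
    text = text.replace("  ", "\n")
    # 4) Manual line breaks become spaces.
    text = text.replace("\v", " ")
    # 5) Clean up any double spaces that are left over.
    return re.sub(r" {2,}", " ", text)


sample = "first  line\nsecond line\n\nnext para\vcontinued"
print(unwrap_with_double_spaces(sample))
```

Note that a line ending in a space followed by a return will also turn into a double space in step 2 and so become a paragraph break in step 3; that is one of the "things" that can go wrong.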

[Edited at 2010-04-12 19:51 GMT]


 

Pablo Bouvier  Identity Verified
Local time: 15:25
German to Spanish
Remove hard returns in pdf to word doc Apr 12, 2010

Sonja Marks wrote:

When I copy text from a PDF into a Word document, it of course has a hard return at the end of each line, and all the free converter programs I have found either do this too or use text boxes. Both of these methods cause problems with CAT tools. Is there an easy way to remove all those irritating returns?


Try codezapper.


 

Hester Eymers  Identity Verified
Netherlands
Local time: 15:25
Member (2005)
English to Dutch
Autounbreak Apr 13, 2010

Or try Autounbreak: http://digital.hollmen.dk/products/autounbreak/index.htm
It's quite good (not perfect) and it's freeware.


 

Katherine Mérignac  Identity Verified
France
Local time: 15:25
Member (2004)
French to English
Thanks Apr 13, 2010

To all - these are really useful tips, so thank you!

K


 

Pablo Bouvier  Identity Verified
Local time: 15:25
German to Spanish
Remove hard returns in pdf to word doc Apr 13, 2010

Katherine Mérignac wrote:

To all - these are really useful tips, so thank you!

K



It was a pleasure, Katherine. Since I published the shared link to the codezapper template from Dave Turner, it has been downloaded more than 25 times. I am proud to belong to such a well-educated community. Thank you all.


 

