No usable .doc after .pdf conversion
Thread poster: Evelyne Morel

Evelyne Morel  Identity Verified
France
Local time: 13:03
English to French
Jul 3, 2006

Hi guys,

Could someone help me with this? I received a PDF document to be translated and returned as a PDF. I have already managed to convert PDFs into Word documents successfully, but for this one I cannot obtain anything workable to translate. I tried:
- using Microsoft Office Document Imaging (scanning and then exporting to .doc)
- downloading the trial version of the conversion software from verypdf.com

In both cases I get something far from the original documents, which are diplomas.
Does that mean I would need a more powerful tool like QuarkXPress?

Thanks for your answers.

[Edited at 2006-07-03 20:06]



Vito Smolej
Germany
Local time: 13:03
Member (2004)
English to Slovenian
+ ...
there's pdfs and pdfs... Jul 3, 2006

QuarkXPress would do no good...

I'm surprised that Microsoft's imaging tool did not produce anything useful. I have NitroPDF, which is a rather good Adobe Acrobat clone with DOC export. Why don't you send me the file and I'll see what I can squeeze out of the document.

Regards

Vito



Evelyne Morel  Identity Verified
France
Local time: 13:03
English to French
TOPIC STARTER
Jul 3, 2006



[Edited at 2006-07-03 19:57]


Kevin Fulton
United States
Local time: 07:03
German to English
Image files hard to convert Jul 3, 2006

Generally, the only files that are immediately usable after exporting from PDF are those created from a word-processing program. Your diplomas were scanned into the PDF as images.

An OCR program like OmniPage or FineReader may help, but the effort required for a diploma would exceed what it takes to translate the document without such aids.

QuarkXPress is a desktop publishing program and would be of no use to you in converting to a usable word-processing document.
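
To tell the two kinds of PDF apart before trying a converter, a rough check can be sketched in Python. This is only a heuristic, not a real PDF parser: the function name `looks_like_scanned_pdf` is made up for illustration, and it assumes that a file with a text layer references font resources (`/Font`) somewhere in its raw bytes, while a pure scan contains only image objects (`/Subtype /Image`). It can be wrong for PDFs that store their dictionaries in compressed object streams, and a scan that already went through OCR will carry fonts for its hidden text layer.

```python
import re


def looks_like_scanned_pdf(data: bytes) -> bool:
    """Rough heuristic: True if the raw PDF bytes mention images but no fonts.

    Assumption: text pages need a /Font resource, scanned pages only embed
    image XObjects. Compressed object streams can hide both markers.
    """
    # "/Font" as a whole name; "/FontDescriptor" does not match because of \b.
    has_font = re.search(rb"/Font\b", data) is not None
    # Image XObjects are declared with "/Subtype /Image".
    has_image = re.search(rb"/Subtype\s*/Image\b", data) is not None
    return has_image and not has_font
```

To try it, read the file in binary mode and pass the bytes in, for example `looks_like_scanned_pdf(open("diploma.pdf", "rb").read())`, where `diploma.pdf` is a placeholder name. If it returns True, a converter will most likely just hand back pictures, and OCR or retyping is the realistic route.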



Vito Smolej
Germany
Local time: 13:03
Member (2004)
English to Slovenian
+ ...
well, as I said - "there's pdfs and there's pdfs" Jul 3, 2006

The file you sent me (thank you) consists of two pictures of text, apparently sent by fax. I have had even worse cases of "this kind of PDF": hand-annotated faxes of copies, etc.

The PDF format can handle a lot: pictures, tags, graphics. And text. But if the text is inside pictures, there is nothing one can do except what you have already done, namely try OCR.

So my NitroPDF faithfully produced a DOC file with two pictures.

Sorry, but you will have to cut and paste and write.



Anna Fitzgerald  Identity Verified
France
Local time: 13:03
French to English
A good question Jul 3, 2006

Your question is a good one - never mind the errors, we all make them.

I recently had the same problem and ended up buying software called Able2Extract (I tried verypdf but it created a bunch of text boxes). The formatting was strange and I had to spend a lot of time fixing it up, so I'll have to look into NitroPDF and MS imaging, as well as any other suggested solutions.



Fernando Toledo  Identity Verified
Germany
Local time: 13:03
Member (2005)
German to Spanish
You can not extract Jul 3, 2006

Anna Fitzgerald wrote:

Your question is a good one - never mind the errors, we all make them.

I recently had the same problem and ended up buying software called Able2Extract (I tried verypdf but it created a bunch of text boxes). The formatting was strange and I had to spend a lot of time fixing it up, so I'll have to look into NitroPDF and MS imaging, as well as any other suggested solutions.


text from an image with PDF converters.

You need an OCR application.

Use the trial version of ABBYY FineReader (the best):
http://download.abbyy.com/content/default.aspx

You have 30 days.

If the image is OK, you'll do it in a few seconds.




Regards

P.S.: As for converters, by which I mean tools that turn tagged documents in PDF format into Word documents, my favorite is:

http://www.solidpdf.com/





[Edited at 2006-07-03 20:01]



Evelyne Morel  Identity Verified
France
Local time: 13:03
English to French
TOPIC STARTER
OUOUOUOUPPPPS Jul 3, 2006

Have to apologize for the MISTAKES (you could not have missed).....

Well until now "convertion" has always been written "conversion" (at least in English)
Hi guys ! would be more appropriate I suppose.

2 mistakes (maybe more????) in a few words.... Too much translating lately I guess...:)



Vito Smolej
Germany
Local time: 13:03
Member (2004)
English to Slovenian
+ ...
hint - Microsoft office imaging does OCR Jul 3, 2006

The same old story: wherever there's a piece of the action, Microsoft will move in on it.

It does a respectable job. I prefer ABBYY FineReader, though.



Giles Watson  Identity Verified
Italy
Local time: 13:03
Italian to English
I wish I could write French as well as you write English ; -) Jul 3, 2006

Evelyne Morel wrote:

Have to apologize for the MISTAKES (you could not have missed).....

Well until now "convertion" has always been written "conversion" (at least in English)
Hi guys ! would be more appropriate I suppose.

2 mistakes (maybe more????) in a few words.... Too much translating lately I guess...:)



Dear Evelyne,

Why apologise when everyone understands you?

You make no claim to translate professionally into English so who cares if your spelling is not quite perfect (it's still excellent, though)? If you have found out something about PDFs from our wonderful colleagues, then everyone's happy.

There's a difference between a "communication language" (English in your case) and a "language of culture" (French for you), in other words the "music with which we charm the serpents guarding another's treasure" (Ambrose Bierce, The Devil's Dictionary).

Bonne chance,

Giles



Evelyne Morel  Identity Verified
France
Local time: 13:03
English to French
TOPIC STARTER
The end of the story.... Jul 3, 2006

First of all, thanks for the tips, help and support

After trying different ways of converting my .pdf document, downloading some of the software that was suggested, and realizing, as you said, that there would be no way of properly converting the document into a .doc, I finally asked my contact at the agency if we could find another, quicker solution.
I got an answer along these lines: "Thank you for your help ... it's a scanned image and even Adobe Acrobat Professional is having problems recognizing the text, so I will send you the doc in .rtf format".

So in the end, it was really worth the trouble of striving to find a solution (thanks to the precious help of my ProZ colleagues): I now have plenty of PDF converter software and a good first contact with my agency.

Welcome to the tricky PDF world!!!






[Edited at 2006-07-03 22:14]

[Edited at 2006-07-03 22:47]

[Edited at 2006-07-03 22:49]



ViktoriaG  Identity Verified
Canada
Local time: 07:03
English to French
+ ...
Hi Évelyne Jul 5, 2006

Nice to know you've been doing too much translation lately

I can't help you fix your problem, but while we're at it, as this has to do with the original subject, here is a link to a nice little piece of freeware:

http://digital.hollmen.dk/products/autounbreak/index.htm

What this does is the following: when you take a PDF - one that's not a scan of something but rather has text that's recognizable by the computer - once you have pasted it into a Word doc, you turn it into RTF and then use this to delete all the unnecessary carriage returns. It can take about 65,000 characters at a time. It basically makes any PDF-to-RTF document instantly editable using a CAT tool - and it preserves all other formatting!

You may want to give it a try... It has saved me lots of time and many many headaches already. Can't live without it anymore...
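
For anyone who would rather not install a tool, the basic idea (removing the hard line breaks that PDF export leaves inside paragraphs) can be sketched in a few lines of Python. This is not how the program above works internally, which I don't know; it is a minimal sketch that assumes plain text pasted from the PDF, with a blank line between paragraphs. Unlike the tool, it does not touch RTF and preserves no formatting, and it does not rejoin words hyphenated across lines. The function name `unbreak` is just a label chosen here.

```python
import re


def unbreak(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph.

    Paragraphs are assumed to be separated by at least one blank line;
    those breaks are kept, every other line break becomes a space.
    """
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Split on blank lines (possibly containing stray spaces).
    paragraphs = re.split(r"\n\s*\n", text)
    joined = []
    for para in paragraphs:
        lines = [line.strip() for line in para.split("\n") if line.strip()]
        if lines:
            joined.append(" ".join(lines))
    return "\n\n".join(joined)
```

Paste the extracted text into a string (or read it from a .txt file), run it through `unbreak`, and each paragraph comes back as a single line that a CAT tool can segment by sentence instead of by line ending.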

Good luck!


Sabine Knorr
Germany
Local time: 13:03
Member (2005)
Spanish to German
+ ...
Virus found in "autounbreak" software Jul 9, 2006

Warning!

I just downloaded the freeware program from the site Viktoria had suggested.
When extracting the ZIP file, my antivirus program found and neutralized a virus ..... I don't think I will install this nice little piece of program.



neilmac  Identity Verified
Spain
Local time: 13:03
Spanish to English
+ ...
My final solution Jul 13, 2006

I eventually gave up trying; it takes more time and bother than it's worth, IMO. I now charge 50% extra for PDF docs (or anything on paper/fax) and try to convince regular clients to factor in translation when planning ahead, in order to avoid these PDF hassles...
