No usable .doc after .pdf conversion Thread poster: Evelyne Morel
Hi guys,
Could someone help me with this? I received a PDF document to be translated and returned as a PDF. I have already managed to convert PDFs into Word documents successfully before, but for this one I cannot obtain anything workable to translate. I tried:
- using Microsoft Office Word Imaging (scanning and then exporting to a .doc version)
- downloading the trial version of a free conversion program: verypdf.com
In both cases I get something far from the original documents, which are diplomas. Does that mean I would need a more powerful tool like QuarkXPress? Thanks for your answers.
[Edited at 2006-07-03 20:06]
Vito Smolej Germany Local time: 13:07 Member (2004) English to Slovenian + ... SITE LOCALIZER there's pdfs and pdfs... | Jul 3, 2006 |
QuarkXPress would do no good... I am surprised that Microsoft imaging did not produce anything useful. I have NitroPDF, which is a rather good Adobe clone with DOC export. Why don't you send the file over and I'll see what I can squeeze out of the document. Regards, Vito
Evelyne Morel France Local time: 13:07 English to French TOPIC STARTER
[Edited at 2006-07-03 19:57]
Kevin Fulton United States Local time: 07:07 German to English Image files hard to convert | Jul 3, 2006 |
Generally the only files that are immediately usable after exporting from PDF are those created from a word-processing program. Your diplomas were read into PDF format as images. Possibly an OCR program like OmniPage or FineReader may help, but the amount of effort required for a diploma would exceed what you would have to do to translate the document without such aids. QuarkXPress is a desktop publishing program and would be of no use to you in converting to a usable word-processing document.
Vito Smolej Germany Local time: 13:07 Member (2004) English to Slovenian + ... SITE LOCALIZER well, as I said - "there's pdfs and there's pdfs" | Jul 3, 2006 |
The file you have sent me - thank you - consists of two pictures (...of text etc, sent in a fax ...). I have had even worse cases of "this kind of PDF": hand-annotated faxes of copies etc. The PDF format can handle a lot: pictures, tags, graphics. And text. But if the text is in pictures, there's nothing one can do except what you have already done, namely try to OCR it. So my NitroPDF faithfully produced a doc file with two pictures. Sorry, but you will have to cut and paste and write.
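A rough way to tell the two kinds of PDF apart before spending time on converters is to extract the text of each page with whatever PDF tool is at hand and see whether anything comes back. The sketch below assumes that extraction has already been done and that the per-page text is passed in as a list of strings. The function name `pages_needing_ocr` and the 20-character threshold are made up for illustration, not taken from any tool mentioned in this thread.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return the 1-based numbers of pages that look like pictures of text.

    page_texts: one string (or None) per page, as returned by some PDF
    text extractor. The extraction step itself is assumed, not shown.
    min_chars: hypothetical threshold; a page with fewer visible
    characters than this is treated as having no real text layer.
    """
    flagged = []
    for number, text in enumerate(page_texts, start=1):
        # Count only visible characters; whitespace alone is not a text layer.
        visible = sum(1 for ch in (text or "") if not ch.isspace())
        if visible < min_chars:
            flagged.append(number)
    return flagged
```

If every page is flagged, as would presumably happen with a faxed diploma like the one described above, a converter can only hand back the pictures, and OCR or retyping is what is left.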
A good question | Jul 3, 2006 |
Your question is a good one - never mind the errors, we all make them. I recently had the same problem and ended up buying software called Able2Extract (I tried verypdf but it created a bunch of textboxes). The formatting was strange and I had to spend a lot of time fixing it up, so I'll have to look into NitroPDF and MS imaging, as well as any other suggested solutions.
You can not extract | Jul 3, 2006 |
Anna Fitzgerald wrote:
Your question is a good one - never mind the errors, we all make them. I recently had the same problem and ended up buying software called Able2Extract (I tried verypdf but it created a bunch of textboxes). The formatting was strange and I had to spend a lot of time fixing it up, so I'll have to look into NitroPDF and MS imaging, as well as any other suggested solutions.

...text from an image with PDF converters. You need an OCR application. Use the trial of ABBYY FineReader (the best): http://download.abbyy.com/content/default.aspx You have 30 days. If the image is OK, you'll do it in a few seconds. Regards
P.S.: As for converters, by which I mean tagged docs in PDF format to Word docs, my favorite is: http://www.solidpdf.com/
[Edited at 2006-07-03 20:01]
Evelyne Morel France Local time: 13:07 English to French TOPIC STARTER OUOUOUOUPPPPS | Jul 3, 2006 |
I have to apologize for the MISTAKES (which you could not have missed)... Well, until now "convertion" has always been written "conversion" (at least in English), and "Hi guys!" would be more appropriate, I suppose. Two mistakes (maybe more?) in a few words... Too much translating lately, I guess... :)
Vito Smolej Germany Local time: 13:07 Member (2004) English to Slovenian + ... SITE LOCALIZER hint - Microsoft office imaging does OCR | Jul 3, 2006 |
The same old story: wherever there's a piece of the action, Microsoft will move in on it. It does a respectable job. I prefer ABBYY FineReader, though.
Giles Watson Italy Local time: 13:07 Italian to English In memoriam I wish I could write French as well as you write English ; -) | Jul 3, 2006 |
Evelyne Morel wrote:
Have to apologize for the MISTAKES (you could not have missed)..... Well until now "convertion" has always been written "conversion" (at least in English) Hi guys ! would be more appropriate I suppose. 2 mistakes (maybe more????) in a few words.... Too much translating lately I guess...:)

Dear Evelyne,
Why apologise when everyone understands you? You make no claim to translate professionally into English so who cares if your spelling is not quite perfect (it's still excellent, though)? If you have found out something about PDFs from our wonderful colleagues, then everyone's happy. There's a difference between a "communication language" (English in your case) and a "language of culture" (French for you), in other words the "music with which we charm the serpents guarding another's treasure" (Ambrose Bierce, The Devil's Dictionary).
Bonne chance, Giles
Evelyne Morel France Local time: 13:07 English to French TOPIC STARTER The end of the story.... | Jul 3, 2006 |
First of all, thanks for the tips, help and support. After trying different ways of converting my .pdf document, downloading some of the software that was suggested, and realizing, as you said, that there would be no way of properly converting the document into a .doc, I finally asked my contact at the agency if we could find another, quicker solution. I got an answer along these lines: "Thank you for your help... it's a scanned image and even Adobe Acrobat Professional is having problems recognizing the text, so I will send you the doc in .rtf format". So in the end, it was really worth the trouble of striving to find a solution (thanks to the precious help of my ProZ colleagues): I now have plenty of PDF conversion software and a good first contact with my agency. Welcome to the tricky PDF world!!!
[Edited at 2006-07-03 22:14]
[Edited at 2006-07-03 22:47]
[Edited at 2006-07-03 22:49]
Nice to know you've been doing too much translation lately. I can't help you fix your problem, but while we're at it, as this has to do with the original subject, here is a link to a nice little piece of freeware: http://digital.hollmen.dk/products/autounbreak/index.htm What this does is the following: you take a PDF (one that's not a scan of something but rather has text that's recognizable by the computer), paste it into a Word doc, turn it into RTF and then use this tool to delete all the unnecessary carriage returns. It can take about 65,000 characters at a time. It basically makes any PDF-to-RTF document instantly editable using a CAT tool - and it preserves all other formatting! You may want to give it a try... It has saved me lots of time and many, many headaches already. Can't live without it anymore... Good luck!
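For plain text pasted out of a text-based PDF, the same idea of deleting the unnecessary carriage returns can be sketched in a few lines of Python. This is not how the freeware above works internally (that is unknown here), and unlike that tool it handles plain text only, so no RTF formatting is preserved. It assumes that a blank line marks a real paragraph break and that every other line break is a leftover from the PDF layout. The function name `unbreak` is made up for illustration.

```python
import re


def unbreak(text):
    """Join hard-wrapped lines into paragraphs.

    Assumption: blank lines separate real paragraphs; any other
    line break is layout residue and is replaced by a space.
    """
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = re.split(r"\n\s*\n", text)
    joined = []
    for para in paragraphs:
        lines = [line.strip() for line in para.split("\n") if line.strip()]
        if not lines:
            continue
        merged = lines[0]
        for line in lines[1:]:
            if merged.endswith("-") and line[:1].islower():
                # Probably a word hyphenated across the break: rejoin it.
                merged = merged[:-1] + line
            else:
                merged += " " + line
        joined.append(merged)
    return "\n\n".join(joined)
```

The hyphen rule is a guess: a genuine compound that happens to break at its hyphen will lose the hyphen, so the result still deserves a quick read-through before it goes into a CAT tool.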
Sabine Knorr Germany Local time: 13:07 Spanish to German + ... Virus found in "autounbreak" software | Jul 9, 2006 |
Warning! I just downloaded the freeware program from the site Viktoria had suggested. When extracting the ZIP file, my antivirus program found and neutralized a virus... I don't think I will install this nice little piece of software.
neilmac Spain Local time: 13:07 Spanish to English + ... My final solution | Jul 13, 2006 |
I eventually gave up trying; it takes more time and bother than it is worth, IMO. I now charge 50% extra for PDF docs (or anything on paper/fax) and try to convince regular clients to factor in translation when planning ahead in order to avoid these PDF hassles...