New PDF > Word program
Thread poster: Mats Wiman

Mats Wiman  Identity Verified
Sweden
Local time: 02:37
Member (2000)
German to Swedish
+ ...

MODERATOR
Mar 9, 2004

I found and have downloaded a trial version of a program called Solid Converter PDF at:

http://www.solidpdf.com

It is purported to make an excellent conversion, keeping the formatting better than other programs (Acrobat 6.0?)

Please have a look and report back to this list.

Mats J C Wiman
Übersetzer/Translator/Traducteur/Traductor > swe
http://www.MatsWiman.com
http://www.Deutsch-Schwedisch.com
http://www.proz.com/pro/1749
(Proz.com moderator, deu>swe, Swedish)
Träsk 201
SE-872 97 Skog
Schweden/Sweden/Suède/Suecia
Tel:+46-612-54112 Fax:+46-612-54181 Mobile:+46-70-5769797



Natalie  Identity Verified
Poland
Local time: 02:37
Member (2002)
English to Russian
+ ...

Moderator of this forum
Another similar program Mar 9, 2004

http://www.scansoft.com/pdfconverter/

I wonder which one is better



Magda Dziadosz  Identity Verified
Poland
Local time: 02:37
Member (2004)
English to Polish
+ ...
Miracle... Mar 9, 2004

Hi Mats,
I've just tried it: it works miracles. The text comes out ready for CAT tools, and Polish diacritics are supported.

Haven't tried the second one, so can't tell the difference.

Magda



Acarte
France
Local time: 02:37
French to German
+ ...
Great tip Mar 10, 2004

Many thanks for this tip: a piece of software that can do everything (except perhaps wash the dishes and do the ironing)!!

Barnaby Capel-Dunn  Identity Verified
Local time: 02:37
French to English
The jury is out Mar 11, 2004

Dear Mats
I find it difficult to reach a conclusion as the trial version only works on a small percentage of a given PDF document. On the basis of my limited experience, Solid Converter looks good but not significantly better (if at all) than PDF Converter. Having already bought the latter, I'm unlikely to fork out more cash for the former! But those considering purchase will certainly be interested in your survey.
Best regards
Barnaby


Ken Cox  Identity Verified
Local time: 02:37
German to English
+ ...
not convinced Mar 15, 2004

The basic idea seems good (why use an OCR program when you already have text?), so I downloaded and tried the trial version on several PDF documents. SolidPDF can be used in several different modes that are intended to preserve the original formatting and layout to various degrees, but if your objective is to extract text so it can be translated, you're probably more interested in the integrity of the text than the fidelity of the layout. And with regard to this there are several features of SolidPDF that I find at least questionable, if not annoying or simply unsatisfactory.

1. It insists on converting all text in fonts not embedded in the PDF document (and/or not present on the host system, I'm not sure about this) into 'standard' fonts. Since Word will happily accept just about any font you throw at it and display it using a substitute font if necessary, it's hard to understand why this should be (considered to be) necessary (and if you copy and paste text from a PDF document to a Word document, the original fonts are preserved, so it is certainly possible to retain the original fonts). Replacing fonts may (severely) annoy clients who want to use the translated text in a different version of the original document and will be forced to convert it back to the original fonts, and the replacement fonts may (typically will) not have the same dimensions as the original fonts, which can cause layout problems.

2. The way SolidPDF handles tables in the original is rather unsatisfactory.
First, unless you use the mode (sorry, I forget the exact name just now) in which each block of text is placed in its own text box, it aligns texts located in the various cells of a table on a line-by-line basis using multiple spaces (nary a tab to be seen), which is naturally a nightmare for translation (maybe it doesn't do this if you select plain vanilla text extraction; I didn't try that). Of course, you can (probably) get around this by using the individual text boxes mode, but that's not particularly convenient for translation, and it's an absolute nuisance for doing anything with the result other than printing it.
Second, SolidPDF tries to reproduce the cell boundaries (rules) of tables using Word drawing objects (lines), rather than by constructing Word tables. Besides the fact that the result will most likely be useless if you significantly edit (read: translate) the text, in many cases the extracted text does not accurately fit within the lines (probably as much as anything due to font replacement and converting all fractional font sizes to the nearest 0.5 point size).

3. Finally, at least in all the modes I tried (which did not include plain vanilla text extraction), SolidPDF tries to preserve variable word spacing in justified text by inserting multiple spaces as necessary. Of course, you can do a global find and replace to change all multiple spaces to single spaces, but that will clobber all of your pseudo-tables, bullet lists, etc. I'm far from being technically knowledgeable about PDF, but this may be a logical consequence of extracting the text info from the PDF file rather blindly (if you copy and paste text from a PDF file into a Word file, you don't get the extra spaces...).
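
For anyone who wants to try a gentler clean-up than a blanket find and replace, here is a rough Python sketch that works on a plain-text export of the converted document. It is not anything SolidPDF provides. It rests on an assumption of mine: a line containing two or more runs of three or more spaces is a pseudo-table row, and any other run of spaces is justification padding. The names `normalize_line`, `normalize_text` and the `min_gaps` threshold are made up for illustration, and the heuristic will misjudge heavily padded justified lines.

```python
import re

# Assumption: a run of 3+ spaces is a column gap, 2+ spaces is padding.
COLUMN_GAP = re.compile(r" {3,}")
PADDING = re.compile(r" {2,}")


def normalize_line(line: str, min_gaps: int = 2) -> str:
    """Collapse padding spaces; turn column gaps into tabs on table-like lines."""
    # Keep leading indentation untouched so indented blocks survive.
    indent = line[: len(line) - len(line.lstrip(" "))]
    body = line.strip(" ")
    # Treat the line as a pseudo-table row only if it has several wide gaps.
    if len(COLUMN_GAP.findall(body)) >= min_gaps:
        body = COLUMN_GAP.sub("\t", body)
    # Whatever multiple spaces remain are collapsed to a single space.
    return indent + PADDING.sub(" ", body)


def normalize_text(text: str) -> str:
    """Apply normalize_line to every line of a plain-text export."""
    return "\n".join(normalize_line(line) for line in text.splitlines())


if __name__ == "__main__":
    sample = "The  quick   brown fox\nItem A     12     3.50"
    print(repr(normalize_text(sample)))
```

With tabs in place of the column gaps, Word's "Convert Text to Table" can rebuild a real table from the rows, which is easier to translate than space-aligned lines.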

However, if you can live with the quirks and limitations, the price is nice.

[Edited at 2004-03-15 23:00]




Jan Sundström  Identity Verified
Sweden
Local time: 02:37
English to Swedish
+ ...
What version of Solid PDF? Feb 24, 2005

Hi Kenneth and all,

This doesn't sound too good. Do you remember what version of Solid PDF you used? There is a v2 on the site now, maybe it's improved?

Kenneth: Since you're not happy with Solid PDF, what do you use instead?

Other Solid PDF users: Do you agree with Kenneth on these limitations, or have you found ways to overcome them?!

Jan



Orla Ryan  Identity Verified
Ireland
Local time: 01:37
Trial software that converts ALL pages? Feb 24, 2005

I was going to post a question about this today.

Is there a freebie download software that converts _all_ pages of a PDF file to DOC/RTF?

I am going absolutely CRAZY this morning trying to find a suitable program for this purpose :~(

I have tried the following: CD from Softinterface inc., Omniformat, PDF2Word, InstallAble2Doc100, Transformer01TB from ABBYY and PractiCount, but their free trial downloads generally convert only the first 5 pages.

While I realise that trial downloads are not going to have the same range as the full version, time is against me right now. By the time I get the software, I will have missed my deadline.

Any recommendations at all? pleeeeease??

Orla



