Why do ever more outsourcers come with the .pdf file and demand word format
Thread poster: Brandis (X)
Brandis (X)
Local time: 02:05
English to German
+ ...
Sep 26, 2006

Don't they have a secretary? Normally, a .pdf file that is ready for final printing is well structured, but after running OCR I see only patches that do not fit together: an index that does not match the actual paragraph headings, page numbers that do not fit, and embedded text that nobody mentions. I find that processing .pdf files is not easy. Despite a couple of good programs, they will never manage to extract the full content into a processable form, and it is convenient for the outsourcer to generate a stupid .pdf that the translator then has to regenerate in the same form (optically). We all know that a .pdf file is not directly processable, and the extra work involved to re-generate the optical .pdf structure is never paid. Since when are translators also secretaries? We can only translate the content given to us, not the missing index links or the line graphics with embedded text that is simply not available in the file. Well, an outburst has to happen sometimes. Brandis


 
Antonín Otáhal
Local time: 02:05
Member (2005)
English to Czech
+ ...
You can say that again Sep 26, 2006

It is frustrating.

I usually ask for the source from which the pdf was made. Sometimes that is impossible, for instance when they scanned a paper source; this cannot be helped. But sometimes an agency is reluctant to request the source from their client, who would most likely not object (especially if the issue is explained properly). Such instances drive me mad.

My rule of thumb is:

(1) If I do not need the job money-wise, I refuse it (with the exception of jobs I find interesting or stimulating);

(2) I charge at least 30% extra for "pure pdf" jobs.

Antonin


 
Luisa Ramos, CT
Luisa Ramos, CT  Identity Verified
United States
Local time: 20:05
English to Spanish
Convert yourself Sep 26, 2006

Ask the outsourcer not to convert the file. Purchase a pdf converter (I use Solid Converter) and do it yourself. It works pretty well.

 
Niina Lahokoski
Niina Lahokoski  Identity Verified
Finland
Local time: 03:05
Member (2008)
English to Finnish
+ ...
Handling PDFs should not be the translator's job Sep 26, 2006

Translators should not have to worry about formatting etc. That is the agency's job! This applies, of course, only to jobs done for agencies. IMO it is an agency's DUTY to see that their translators only receive easily editable files.

I would most certainly be happier to receive several (even if big) xxx files to be translated with TagEditor, for example, than to have to extract the text from a 100-page PDF and transfer it to Word before translating...

After all, many (hopefully most) PDFs are made with professional software, and Trados, for example, is able to handle many of those file types. Then WHY is it so hard to send us the original files from which the PDF was created? Even a scanned paper document was originally created with a computer program (unless it is handwritten, which I hope does not happen too often), so that original file must be somewhere. The agency should at least ask for it.





[Edited at 2006-09-26 21:33]


 
Brandis (X)
Local time: 02:05
English to German
+ ...
TOPIC STARTER
I have all kinds of top-level converters and extractors Sep 27, 2006

Luisa Ramos wrote:

Ask the outsourcer not to convert the file. Purchase a pdf converter (I use Solid Converter) and do it yourself. It works pretty well.
Hi! Converting a single page or a five-page document is not the problem. But with large files of more than 80 pages, with pictures and pictures with embedded text, it takes a lot of energy to guess, compare with the original .pdf file, and retype the content. In the end they want a document that mirrors the .pdf (optical structure). And when you run the OCR, the result is sometimes astonishingly bad: some outputs have no workable structure at all, or they are patched together. It is a tough job and a loss of time. Best, Brandis


 
Yolande Haneder (X)
Yolande Haneder (X)  Identity Verified
Local time: 02:05
German to French
+ ...
2 problems as far as I know... Sep 27, 2006

In the past I have sometimes asked for the source document of the PDF, and I encountered two problems:

1. The PDF had been produced by a graphic or printing agency and was a preview of the finished work. The original had been created in some graphics program, and the agency simply refuses to hand over the source in case it does not get the printing job in the end (the work is usually calculated as part of the printing job).

It is like asking the photographer who took my wedding pictures to give me the negatives. They would rather store them for 10 years, and only then can I ask for them. Most of them assume that after ten years nobody will want duplicates any more, and until then a duplicate costs four times the price of a picture processing center (at least, no kidding).

2. Size of the document: a 20 MB PDF is sometimes the output of a 100 MB document made in PageMaker or something else.

The client prefers to get a half-corrupt document and put everything back into their software themselves rather than look for a translator who has the right software AND is willing to work on a 100 MB document, only for the translator's computer to shut down (especially if you are using Trados and do not have the kind of computer the client may have).

In Adobe Acrobat 7 there is a function to save as text. It saves the document in txt format and you can work on it properly. It even saves the page breaks, so if you copy everything back into Word you get the same pagination, just without the pictures. When case 1 or 2 applies, the client is happy with that too, because they process the text themselves afterwards, with pictures of higher resolution than in the PDF or the converted Word file.


 
Heinrich Pesch
Heinrich Pesch  Identity Verified
Finland
Local time: 03:05
Member (2003)
Finnish to German
+ ...
They would often even save money Sep 27, 2006

by converting them prior to outsourcing. Not only would the translator charge less, but usually the amount of text diminishes when it is converted into Word format.
It is funny how agencies struggle for the cent here and there but do not care to save the euro when they could.
For instance, the table of contents in a pdf can be left out completely when converting to Word, because Word can create the TOC from the final translation.

Regards
Heinrich


 
Brandis (X)
Local time: 02:05
English to German
+ ...
TOPIC STARTER
Ah!Heinrich you speak into the soul Sep 27, 2006

Heinrich Pesch wrote:

by converting them prior to outsourcing. Not only would the translator charge less, but usually the amount of text diminishes when it is converted into Word format.
It is funny how agencies struggle for the cent here and there but do not care to save the euro when they could.
For instance, the table of contents in a pdf can be left out completely when converting to Word, because Word can create the TOC from the final translation.

Regards
Heinrich
A .pdf may look good optically, but when it comes to processing, one faces the devil. I have often heard from outsourcers that their translators have no trouble processing a .pdf file and can reproduce it with all its structure in an editable form (MS Word). What a blatant way of doing things. Who is really saving money here, and who is damaging the market? Top-level tools like ABBYY, IRIS and OmniPage do a good job to a certain extent, but they can never give a translator a full reproduction of the original with everything a translation source should naturally have. Surely there are many tools, but they have many limitations, and ultimately having to purchase all these expensive tools is not a feasible solution either. I also think that many of these agencies should get some schooling in various areas of translation technology. Right now I am also confronted with .xml tag splitting (which has nothing to do with .pdf), but that is yet another headache. Best, Brandis

[Edited at 2006-09-27 06:15]


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 02:05
Member (2006)
English to Afrikaans
+ ...
I produce plain text for PDF translations Sep 27, 2006

Brandis wrote:
...the extra work involved to re-generate the optical .pdf structure is never paid.


If the text can be extracted from the PDF, I copy it section by section to a plain text file (Notepad, etc) in such a way that a DTP artist with a reasonable IQ who is a non-speaker of my language can easily see which parts of the translation are the translations of which parts of the source text. I make generous use of open lines or dashes in the margin to indicate the start or end of sections, paragraphs, blocks, etc. But I don't apply formatting (you can't do that in Notepad, anyway). No bold, no underline, no justification, no fonts, no graphs, no nothing -- just plain text against the left margin. If the client wants an MS Word document, I open the TXT file in MS Word and save it as an MS Word file (optionally I change the font to Verdana or something so that it doesn't look too geeky).

I'm a translator, not a DTP artist. If the client wants me to deliver a formatted document, he should provide me with an editable formatted file such as MS Word or OpenDocument.

PS. I do translate the stuff in Wordfast, which is done in MS Word, but the end product is still unformatted.


 
Marie-Céline GEORG
Marie-Céline GEORG  Identity Verified
France
Local time: 02:05
German to French
+ ...
pdf to plain text Sep 27, 2006

Hi,
Same as Samuel: one of my customers sends me pdf files because that's what he gets from the company's publishing agency. If he can't get the source file, or if the source is in a format that I can't edit (such as Quark XPress), I make a Word doc with very little formatting because I know that the agency will take care of the DTP work.

I was recently asked for a quote for a large user's manual. I got a pdf file created from a Word file, with lots of links, images, etc. I immediately told the client that if he gave me the Word file, I'd be able to give back a complete translation, with the images and links and references. At first he said it was impossible. So I offered a plain text translation: no links, no references, no TOC... He told me he would try to get the Word file. Sometimes you can educate them! Unfortunately, others just try to save money by having us work for free...

Marie-Céline


 
David Brown
David Brown  Identity Verified
Spain
Local time: 02:05
Spanish to English
Long live PDF!!!! Sep 27, 2006

PDF is the stuff of translators who do not have a translation memory tool. Forget trying to convert them into Word documents electronically: there are far too many small errors, and it takes longer to proofread and correct them than to translate. For an agency to retype the document in Word would increase costs, and they would have to charge the client more or pay the translator less.

 
Tony M
France
Local time: 02:05
Member
French to English
+ ...
SITE LOCALIZER
Sometimes we're just worrying for nothing! Sep 27, 2006

David Brown wrote:

PDF is the stuff of translators who do not have a translation memory tool. Forget trying to convert them into word documents electronically. There are far too many small errors it takes longer to proofread and correct than to translate.


Quite! Although I like the comfort and re-assurance of having a formatted Word document to work with, many of my PDF customers specifically ask me NOT to attempt to reformat my output doc, but merely to provide 'text by the km'

Of course, if you're trying to use CAT tools, then it's a different kettle of fish! But in that case, you have to decide whether the trade-off in time saved using CAT justifies the time spent recovering the text. Personally, I find for the type of work I do, CAT is pretty useless, and if anything slows me up, so I don't mind... But then again, by supplying only a PDF, the customer is doing themselves out of any saving that might have been made through repetitions etc.

One thing I really hate is having to proof-read work that the translator has done from a PDF > DOC conversion, where they have not cleaned up the formatting! All those wretched text boxes etc. can be a real pain, and especially where padding spaces have not been replaced with tabs, etc. Grrr! And of course, it's even worse when it's an uncleaned tagged file to boot!

Sometimes I wonder why I even bother doing proofing at all... but that's a whole other thread!


 
Giovanni Guarnieri MITI, MIL
Giovanni Guarnieri MITI, MIL  Identity Verified
United Kingdom
Local time: 01:05
Member (2004)
English to Italian
I ask... Sep 27, 2006

the client to convert it... sorry, I'm too busy for that, and if you want me to do the conversion, the deadline will have to be extended... miraculously, a Word doc appears after a couple of hours...


Giovanni



[Edited at 2006-09-27 13:57]


 
NMR (X)
France
Local time: 02:05
French to Dutch
+ ...
Why do ever more outsourcers come with the .pdf file and demand word format Sep 27, 2006

I wondered too... until I had to translate a magazine that was slightly political, and where every word was "weighed", diplomatically correct, etc., and had been corrected several times. The client, when giving you the PDF, is sure to give you the latest version, the published one. The same thing is true when lay-out has been handled by a printing house on the basis of an old lay-out (annual reports, etc., but also all kinds of labels). But I agree that agencies could do a bit more work. If you ask them for an editable version, they send you a .jpg picture in a Word file, as happened to me today!

And I agree with Samuel, if the client gives me a PDF and asks for Word I give him a plain text, only bold and underlined (most printing houses don't like extracted texts with lots of wrong colours and fonts). He also can ask me for an XPress file, but then he should give me the original file (clients rarely do because layout companies are sitting on it (it's their source of income)). And client education doesn't work here.

[Edited at 2006-09-27 12:43]


 