Can I delete the tags in Alchemy Publisher 3.0?
Thread poster: Fredrik Pettersson

Fredrik Pettersson  Identity Verified
Hong Kong
Member (2009)
English to Swedish
Sep 19, 2011

I received five scanned PDF files from a customer. The PDFs must have been OCR'd, because the text is searchable. But they can't be opened in SDL Trados Studio at all; all I get is a blank page. They also can't be converted from PDF to plain text (.txt): if I try, the result is a blank file.

The only solution that has worked for me so far is to open them in Alchemy Publisher 3.0, where all the text is opened up for translation (although some of the source text is distorted with symbols and signs like ¨^¨'^¨~¨).

However, there are three problems with this solution:

1. Each TU consists of too many tags so it would be very cumbersome to translate the words between each tag. Here is a screencast of all the tags in each TU: http://screencast.com/t/FW4RTHnPa

2. Some of the source text is distorted. This is not a big problem though, I could always just have a quick look at the original PDF and type the translation and disregard the faulty source text in the window Translate Expert. Here is a screencast of some text that is distorted: http://screencast.com/t/ACCrsbtbGv

3. I get the error message "Failed to create autosave file" continuously in the window Translate Expert. Will this be a problem later when I create the finished and cleaned target file? Here is a screencast with this error message: http://screencast.com/t/IKtdt5uL

So the most important thing for me is how to deal with all the tags in each TU. Can I delete them? Or can I just type all the translated words for a TU one after another before all the tags? That would be the easiest: leave the tags as they are and type the whole translated sentence before the tags begin.

Below is the posting I made in the support forum for SDL Trados Studio 2009:

Why do I get a blank PDF when I open it in SDL Studio 2009?

I first tried to open a TM and, with the TM selected in the left pane, to open a PDF file "inside" that TM.
Usually I have no problem with this and the .sdlx version of the PDF opens up for translation.

But now, all I get is an empty editor window in Studio. Why?

I also tried creating a project, and opening the file without a TM. But I only get an empty .sdlx version of the PDF.

Scanned PDF? 19:05

If the PDF file is a scanned one, Studio will open it and produce an empty file.
Use a decent PDF converter then.

It looks scanned but it's searchable text 19:18

The PDFs I received from the client do seem to be scanned, but I can search for any word in them, so they are searchable. I always thought that if a scanned PDF is searchable, it has been OCR'd.

Actually, I managed to open for translation in Alchemy Publisher 3.0:

http://screencast.com/t/FW4RTHnPa

As you can see, I get a lot of tags for each TU, so if I can't delete those tags it would be too cumbersome to translate between all of them.

Use a decent PDF converter 19:26

Convert the PDF to plain text, open it in Word, recreate the formatting, and be glad to have no tags.
Any automated PDF conversion process will end up with dozens or tons of tags; you cannot expect none.
There is NO PDF converter on the market that converts a PDF and formats it accordingly without using formatting tricks, and those tricks cause tags.

PDF to .txt only gives a blank .txt 19:53

I tried to convert from PDF to .txt in both Adobe Acrobat X Pro and Nitro PDF Professional but all I get is a blank .txt with no contents at all.

So I think the only option I have is to translate in Alchemy Publisher 3.0, although I would need to translate between each tag. Or remove them and see what happens with the target file.


Alchemy Support
Local time: 19:36
English
FYI Sep 19, 2011

Hi Fredrik,

I have also sent you the same as an email response:

1. Each TU consists of too many tags so it would be very cumbersome to translate the words between each tag. Here is a screencast of all the tags in each TU: http://screencast.com/t/FW4RTHnPa

In Publisher if you navigate to Tools > Options > Application and uncheck "Display Inline Tags In String View", you shouldn't see any inline tags in the translator toolbar after clicking OK.

2. Some of the source text is distorted. This is not a big problem though, I could always just have a quick look at the original PDF and type the translation and disregard the faulty source text in the window Translate Expert. Here is a screencast of some text that is distorted: http://screencast.com/t/ACCrsbtbGv

Publisher cannot handle scanned PDF files; it only handles text PDFs. Are you able to select text with your mouse in the PDF?

3. I get the error message "Failed to create autosave file" continuously in the window Translate Expert. Will this be a problem later when I create the finished and cleaned target file? Here is a screencast with this error message: http://screencast.com/t/IKtdt5uL

You can change the autosave settings or turn this feature off under Tools > Options > Application. If you turn it off, just make sure you save the PPF from time to time.
It might also be worth checking whether you have admin rights to the path specified; it could simply be a case of insufficient rights for saving the PPF.
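
The rights check suggested above can also be done outside Publisher. The following is only a rough sketch in Python, not anything Publisher itself provides: it tries to create a temporary file in a folder and reports whether that succeeded. The function name `can_write_to` and the example path are made up for illustration; substitute the autosave path shown in your own settings.

```python
import os
import tempfile


def can_write_to(directory: str) -> bool:
    """Return True if a file can actually be created in `directory`."""
    if not os.path.isdir(directory):
        return False
    try:
        # Creating (and automatically deleting) a real temporary file is a
        # more direct test than inspecting permission flags.
        with tempfile.NamedTemporaryFile(dir=directory, prefix="autosave_test_"):
            pass
        return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical autosave folder; replace with the path from your settings.
    autosave_dir = r"C:\Projects\Autosave"
    print("writable" if can_write_to(autosave_dir) else "not writable")
```

If this reports "not writable" for the autosave folder, pointing the autosave setting at a folder where it reports "writable" may be enough to make the error go away.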

Thanks and regards,
Jette



Fredrik Pettersson  Identity Verified
Hong Kong
Member (2009)
English to Swedish
TOPIC STARTER
The PDF is definitely searchable, although originally scanned and OCR'd Sep 20, 2011

The PDF is definitely a searchable PDF, although originally scanned and OCR'd. Yes, I can select any text in the PDF with the cursor and highlight it.

The PDFs really seem to have been OCR'd, not saved as .doc

In this screencast you can see the perforations on the left of the page, so it has surely been scanned: http://screencast.com/t/w1WWLnZhP3a

And then it must have been OCR'd, because the text is searchable.

But the main problem for me now is that I want to save the finished translation in Publisher as PDF and not as Word. Word is the only option given to me when I choose Export As: to original file format, which, for some strange reason, is Word. If I try Save As..., the only file extension I can choose is .ppf.

If I export to the original file format and get my translation in Word, all the formatting is distorted. So I can't create a PDF from Word (with Adobe Professional or similar software), because the formatting will be lost.



Fredrik Pettersson  Identity Verified
Hong Kong
Member (2009)
English to Swedish
TOPIC STARTER
The Display inline tags checkbox was already unchecked Sep 20, 2011

The Display inline tags checkbox was already unchecked but all the tags are still displayed in each TU.

N.B.! I just discovered this regarding distortion of text:

If I open the PDF separately in Adobe Acrobat Reader and select a sentence to paste in Alchemy Publisher 3.0, I get exactly the same distortion as is already present in the Alchemy Publisher application itself.

Here is an example:

I copy this text from the PDF:

XXX products are manufactured with the most
advanced techniques assuring durability and
high qualtiy.The metals used are top quality.

And this is how it gets rendered:

b4~ p’oducis are manukciursd with the most
ydvance4 recjiniques assu,lpg durabuity and
high qua?ity.Tiie rneials used are hep qualLy

N.B.! If I copy the same text from this same PDF and paste it into this posting right here, it also gets distorted. So there must be some system setting on my computer that needs to be changed.
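
Since the same garbling shows up when pasting from Adobe Reader into any application, the distortion may well sit in the PDF's own text layer rather than in Publisher. As a rough way to put a number on how far the extracted text is from what the page shows, here is a small Python sketch using the standard `difflib` module. The function name `similarity` is made up for illustration, and the two strings are simply the samples from this post.

```python
from difflib import SequenceMatcher


def similarity(expected: str, extracted: str) -> float:
    """Return a 0..1 similarity score between two text snippets."""
    # Collapse whitespace so that line breaks do not affect the score.
    a = " ".join(expected.split())
    b = " ".join(extracted.split())
    return SequenceMatcher(None, a, b).ratio()


# Text as it is visible on the page (typed by hand).
visible = """XXX products are manufactured with the most
advanced techniques assuring durability and
high qualtiy.The metals used are top quality."""

# Text as it comes out when copied from the PDF.
pasted = """b4~ p’oducis are manukciursd with the most
ydvance4 recjiniques assu,lpg durabuity and
high qua?ity.Tiie rneials used are hep qualLy"""

if __name__ == "__main__":
    print(f"similarity: {similarity(visible, pasted):.2f}")
```

A score of 1.0 means the two snippets are identical after whitespace is normalized. A clearly lower score on several sample sentences would support the idea that the text layer needs to be redone with a fresh OCR pass, but that is an assumption to verify on the actual files.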


Alchemy Support
Local time: 19:36
English
Need file Sep 20, 2011

Hi Fredrik,

Could you send us a sample PDF?

It could be encoded characters in the PDF that are not handled properly. Please also make sure you set the right source and target languages when opening the file.

Thanks,
Jette



