Strange tags in source text
Thread poster: Paul Klassen

Paul Klassen
Canada
Local time: 06:19
French to English
Mar 7, 2013

I have had this problem using Trados on Word files, but thought it was a Word thing. Now I'm translating a LaTeX file using OmegaT and am encountering the same thing: tags that seem to serve no purpose, inserted between each character in a paragraph. It only affects one or two paragraphs; the rest are normal. Please take a look:

tags.jpg

Any idea what could be causing this? and how to get rid of them?

Thanks

Paul


 

Didier Briel  Identity Verified
France
Local time: 11:19
Member (2007)
English to French
+ ...
It might be real formatting Mar 7, 2013

Paul Klassen wrote:
I have had this problem using Trados on Word files, but thought it was a Word thing. Now I'm translating a LaTeX file using OmegaT and am encountering the same thing: tags that seem to serve no purpose, inserted between each character in a paragraph. It only affects one or two paragraphs; the rest are normal. Please take a look:

tags.jpg

Any idea what could be causing this?

If it is not a bug (difficult to say without discussing the file in detail, which would be better done in the OmegaT Yahoo support group), it might be real formatting.
LaTeX can position characters very precisely.

and how to get rid of them?

The simplest fix would be to clean the source document, but that requires understanding LaTeX syntax well enough to know what to remove.

Didier


 

Paul Klassen
Canada
Local time: 06:19
French to English
TOPIC STARTER
Please clarify Mar 8, 2013

Thanks for your prompt reply, Didier.

Since *.tex files are pure text, I wasn't sure what to make of your comments (i.e. what kind of formatting could be embedded in the text like that). But you inspired me to try some stuff.

First of all, in text editors that bit of text looks like this:

avec $\ell=\|y\|=(\sum_{p}y_{i}^{2})^{1/2}$ la norme euclidienne de $y$ et $m(\theta)$ une fonction de transformation des coordonnées polaires d'angles $\theta$ telle que $m(\theta)=y/\ell(y)$ et $\|m(\theta)\|=1$. $m(\theta)$ peut également s'exprimer comme suit :

That seems ok.

If I create another *.tex file including only that paragraph and view it in TeXnicCenter (without running build, as it has no front matter), the paragraph looks fine, but loading that file into OmegaT shows the same gibberish. If I load that file in a text editor (Notepad++) and then save it as a text file (*.txt), it looks normal in OmegaT. (but saving that file back to *.tex does not eliminate the problem).

I'm really not sure whether this is an OmegaT problem, or what.

Thanks again,
Paul


 

Didier Briel  Identity Verified
France
Local time: 11:19
Member (2007)
English to French
+ ...
Probably a bug Mar 8, 2013

Paul Klassen wrote:
Since *.tex files are pure text,

They are not; they are no more plain text than XML files are.
LaTeX files are formatted files. The difficulty is that we had to hard-code a parser, as there is no standard parser we could use.
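
To illustrate why a .tex file is markup rather than plain text, here is a minimal Python sketch that separates inline math from translatable text in a fragment of the paragraph under discussion. It is only an illustration: the function name `split_math` and the `<m0>` placeholder style are made up for this example, and this is not how OmegaT's LaTeX filter is implemented.

```python
import re

# Hypothetical illustration only; this is NOT OmegaT's LaTeX filter.
# Matches inline math delimited by single dollar signs, e.g. $m(\theta)$.
INLINE_MATH = re.compile(r"\$[^$]*\$")


def split_math(line: str):
    """Replace each inline-math span with a numbered placeholder.

    Returns the text with placeholders and the list of math spans removed.
    """
    spans = INLINE_MATH.findall(line)
    counter = iter(range(len(spans)))
    text = INLINE_MATH.sub(lambda m: f"<m{next(counter)}>", line)
    return text, spans


sample = r"la norme euclidienne de $y$ et $m(\theta)$ une fonction"
text, spans = split_math(sample)
print(text)   # text a translator would see, with placeholders
print(spans)  # markup that must survive translation unchanged
```

Even this toy version has to decide what counts as markup; a real filter must also handle commands, environments and escaped characters, which is where a misinterpretation could creep in.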

First of all, in text editors that bit of text looks like this:

avec $\ell=\|y\|=(\sum_{p}y_{i}^{2})^{1/2}$ la norme euclidienne de $y$ et $m(\theta)$ une fonction de transformation des coordonnées polaires d'angles $\theta$ telle que $m(\theta)=y/\ell(y)$ et $\|m(\theta)\|=1$. $m(\theta)$ peut également s'exprimer comme suit :

That seems ok.

Supposing there are no other things explaining the behaviour, it looks like a bug in OmegaT.

If I create another *.tex file including only that paragraph and view it in TeXnicCenter (without running build, as it has no front matter), the paragraph looks fine, but loading that file into OmegaT shows the same gibberish. If I load that file in a text editor (Notepad++) and then save it as a text file (*.txt)

No need to "save as". Renaming it from the desktop would do the same thing.

, it looks normal in OmegaT.

When renamed as .txt, you are not using the LaTeX parser, but the Text filter. It has pros and cons. If your LaTeX file is not heavily formatted, it might be simpler to translate it as a text file. You see more LaTeX "tags", and you might have very odd line breaks, but you have no risk of a wrong interpretation.
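
The rename workaround can be scripted if several files are involved. The following Python sketch assumes nothing about OmegaT itself; the helper names `as_text_copy` and `back_to_tex` are hypothetical, and the only thing that changes is the file extension.

```python
import shutil
from pathlib import Path


def as_text_copy(tex_path: str) -> Path:
    """Copy e.g. paper.tex to paper.txt, leaving the original untouched.

    The contents are identical; only the extension differs, so a CAT tool
    that picks its filter by extension treats the copy as plain text.
    """
    src = Path(tex_path)
    dst = src.with_suffix(".txt")
    shutil.copyfile(src, dst)
    return dst


def back_to_tex(txt_path: str) -> Path:
    """Rename a translated .txt file back to .tex and return the new path."""
    src = Path(txt_path)
    return src.rename(src.with_suffix(".tex"))
```

Working on a copy keeps the original .tex available in case you want to try the LaTeX filter again later.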

(but saving that file back to *.tex does not eliminate the problem).

There's no reason it should; you are just renaming your file.

I'm really not sure whether this is an OmegaT problem, or what.

It's probably a bug.
You could open a bug report on Sourceforge.

Didier


 

Paul Klassen
Canada
Local time: 06:19
French to English
TOPIC STARTER
Thanks Mar 8, 2013

Thank you Didier. I've filed a report. I've been a Trados user for years, but am tired of the constant drain on my wallet. OmegaT seems to be well reviewed, so I thought I'd give it a try. This is my first project on this CAT.

 

