How to remove tags from the original text
Thread poster: Oscar Rivera

Oscar Rivera
Hungary
Local time: 04:17
English to Spanish
+ ...
Dec 10, 2014

Hello everyone,

I moved from OmegaT 2.5 to 3.1 because 3.1 has some features that 2.5 lacked, e.g. search and replace (Ctrl+K). However, 2.5 had an "advantage" over 3.1 in that I never had problems with tags in the source text. I have just started a new project and, within one paragraph, a single word is broken into syllables by tags. This happens throughout the text, in every paragraph.

I would like to remove these tags from the broken-up words in the source document, as they make it impossible to spot words right away. This is what I get: "decidido manifestarnos y actuar en el continente, involucrando a diversos actores sociales, y buscando concientizar a la sociedad."

I went to the user's manual and I couldn't find the "Remove tags" feature/option. I learned about CodeZapper but I'd like to remove the tags directly from OmegaT.

Thanks in advance.


 

Susan Welsh  Identity Verified
United States
Local time: 22:17
Member (2008)
Russian to English
+ ...
I suspect... Dec 10, 2014

if you tried this particular file in 2.5 you would get the same thing, because as I understand it, it has to do with the .docx format. I use CodeZapper, which works fine. Alternatively, if you don't need ANY tags in the document, use the "Remove tags" setting at Project > Properties (Ctrl+E).
Someone else may have another solution for you.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 04:17
Member (2006)
English to Afrikaans
+ ...
This is what you get Dec 11, 2014

Oscar Rivera wrote:
This is what I get: "decidido manifestarnos y actuar en el continente, involucrando a diversos actores sociales, y buscando concientizar a la sociedad."


Thanks to a bug in ProZ.com's forum software that hasn't been fixed in (oh, I think) ten years, that is not what you get. What you get, is this:

"de<t20/>ci<t21/>di<t22/>do <t23/>ma<t24/>ni<t25/>fes<t26/>tar<t27/>nos<t28/> y ac<t29/>tuar en el con<t30/>ti<t31/>nen<t32/>te, in<t33/>vo<t34/>lu<t35/>cran<t36/>do a di<t37/>ver<t38/>sos ac<t39/>to<t40/>res so<t41/>cia<t42/>les, y bus<t43/>can<t44/>do con<t45/>cien<t46/>ti<t47/>zar a la so<t48/>cie<t49/>dad."

I went to the user's manual and I couldn't find the "Remove tags" feature/option.


"Remove tags" is in the Project > Properties dialog. But that will remove all tags, not just the unuseful ones. Are you sure the file would look better in an earlier version of OmegaT?

Samuel
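
To illustrate what dropping every tag does to a segment like the one quoted above, here is a minimal Python sketch. It is not OmegaT's own code: the regular expression is only an assumption about what placeholder tags such as <t20/> look like, and the name strip_tags is made up for this example.

```python
import re

# Assumed shape of a placeholder tag: <t20/>, <f3> or </f3>.
# This is a guess for illustration, not OmegaT's actual tag grammar.
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")


def strip_tags(segment: str) -> str:
    """Return the segment with every placeholder tag deleted."""
    return TAG_PATTERN.sub("", segment)


segment = "de<t20/>ci<t21/>di<t22/>do <t23/>ma<t24/>ni<t25/>fes<t26/>tar<t27/>nos<t28/> y ac<t29/>tuar"
print(strip_tags(segment))  # prints: decidido manifestarnos y actuar
```

Note that this throws away all tags, including any that carry formatting you might want to keep, which is the same trade-off as with the "Remove tags" setting.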


 

esperantisto  Identity Verified
Local time: 05:17
Member (2006)
English to Russian
+ ...
Some advice Dec 11, 2014

What you show makes me think that the document was produced by OCRing or converting a PDF file. Anyway, read Marc Prior’s guide on DOCX compatibility. If your document contains no particular formatting, try clearing it (select the text and, as I remember, press Shift+Ctrl+O). Unlike removing tags in the OmegaT project properties dialog, resetting the formatting will keep certain elements represented as tags at their respective positions. Also, try a simple trick: convert the document to RTF and back to DOCX.

 

Oscar Rivera
Hungary
Local time: 04:17
English to Spanish
+ ...
TOPIC STARTER
Belated thanks for the prompt help Dec 15, 2014

Susan Welsh wrote:

if you tried this particular file in 2.5 you would get the same thing, because as I understand it, it has to do with the .docx format. I use CodeZapper, which works fine. Alternatively, if you don't need ANY tags in the document, use the "Remove tags" setting at Project > Properties (Ctrl+E).
Someone else may have another solution for you.



Susan, I also think it had to do with the document itself, more specifically the .docx format. Thanks for the "Remove tags" advice. It worked and solved the problem. The target document without the tags looks exactly like the original source document.

Samuel Murray wrote:

Oscar Rivera wrote:
This is what I get: "decidido manifestarnos y actuar en el continente, involucrando a diversos actores sociales, y buscando concientizar a la sociedad."


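Thanks to a bug in ProZ.com's forum software that hasn't been fixed in (oh, I think) ten years, that is not what you get. What you get, is this:

"de<t20/>ci<t21/>di<t22/>do <t23/>ma<t24/>ni<t25/>fes<t26/>tar<t27/>nos<t28/> y ac<t29/>tuar en el con<t30/>ti<t31/>nen<t32/>te, in<t33/>vo<t34/>lu<t35/>cran<t36/>do a di<t37/>ver<t38/>sos ac<t39/>to<t40/>res so<t41/>cia<t42/>les, y bus<t43/>can<t44/>do con<t45/>cien<t46/>ti<t47/>zar a la so<t48/>cie<t49/>dad."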

I went to the user's manual and I couldn't find the "Remove tags" feature/option.


"Remove tags" is in the Project > Properties dialog. But that will remove all tags, not just the unuseful ones. Are you sure the file would look better in an earlier version of OmegaT?

Samuel




Samuel, you're right. I am not sure if it'd look better in an earlier OmegaT version and the "remove tags" worked really well. Fortunately, the document hasn't been altered.

esperantisto wrote:

What you show makes me think that the document was produced by OCRing or converting a PDF file. Anyway, read Marc Prior’s guide on DOCX compatibility. If your document contains no particular formatting, try clearing it (select the text and, as I remember, press Shift+Ctrl+O). Unlike removing tags in the OmegaT project properties dialog, resetting the formatting will keep certain elements represented as tags at their respective positions. Also, try a simple trick: convert the document to RTF and back to DOCX.


Esperantisto, thanks for the advice. I'd seen Marc Prior's guide before but had never gotten round to reading it.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 04:17
Member (2006)
English to Afrikaans
+ ...
Remember, though, "Remove tags" doesn't actually remove tags Dec 15, 2014

Oscar Rivera wrote:
Thanks for the "Remove tags" advice. It worked and solved the problem. The target document without the tags looks exactly like the original source document.


FWIW, as Marc mentioned in another thread, the "Remove tags" feature doesn't actually remove any tags. It simply shoves all the tags of the segment to the end of the segment, without letting the translator see them. So, theoretically, if there is any loss of formatting, it should not affect more than the one sentence that the formatting is in.
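
A rough Python sketch of that behaviour, as I understand it from the description in this post (I have not checked it against OmegaT's source, and the tag pattern and the names hide_tags and rebuild are invented for the illustration): the tags are collected, the translator sees only the plain text, and the tags are tacked on after the translation.

```python
import re

# Assumed shape of a placeholder tag such as <t45/>; illustration only.
TAG_PATTERN = re.compile(r"</?[a-zA-Z]+\d+/?>")


def hide_tags(segment):
    """Split a segment into the text shown to the translator and the hidden tags."""
    tags = TAG_PATTERN.findall(segment)
    visible = TAG_PATTERN.sub("", segment)
    return visible, tags


def rebuild(translation, tags):
    """Append the hidden tags, in their original order, to the end of the translation."""
    return translation + "".join(tags)


visible, tags = hide_tags("con<t45/>cien<t46/>ti<t47/>zar a la so<t48/>cie<t49/>dad.")
print(visible)  # prints: concientizar a la sociedad.
print(rebuild("raise awareness in society.", tags))
```

Because every tag ends up after the last word of its own segment, any formatting that goes wrong stays confined to that segment.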


 

Oscar Rivera
Hungary
Local time: 04:17
English to Spanish
+ ...
TOPIC STARTER
The tags at the end haven't altered the document so far. Dec 15, 2014

Samuel Murray wrote:

Oscar Rivera wrote:
Thanks for the "Remove tags" advice. It worked and solved the problem. The target document without the tags looks exactly like the original source document.


FWIW, as Marc mentioned in another thread, the "Remove tags" feature doesn't actually remove any tags. It simply shoves all the tags of the segment to the end of the segment, without letting the translator see them. So, theoretically, if there is any loss of formatting, it should not affect more than the one sentence that the formatting is in.


Thanks once again, Samuel. Although the tags were pushed to the end of the segment, so far this hasn't altered the formatting.


 

