Glossary comments formatting - line breaks possible?
Thread poster: peregrinus

peregrinus
United Kingdom
Dec 30, 2017

Good morning all,

I was wondering if it's possible to break lines inside comments in glossaries?

I was aiming at something with a visual effect similar to this:


adjective
1.
having or involving several parts, elements, or members.
"multiple occupancy"
noun
1.
a number that may be divided by another a certain number of times without a remainder.
"15, 20, or any multiple of five"
2.
BRITISH
a shop with branches in many places, especially one selling a specific type of product.
"the major food multiples"


Also, I've noticed that when the comment becomes too long, it's cut off at the first full stop it encounters. Is there a way to remedy this (other than not using full stops, that is)?


Thank you in advance!



Didier Briel  Identity Verified
France
Local time: 20:47
Member (2007)
English to French
Glossary definitions should wrap automatically Jan 1

peregrinus wrote:
I was wondering if it's possible to break lines inside comments in glossaries?

I was aiming at something with a visual effect similar to this:


adjective
1.
having or involving several parts, elements, or members.
"multiple occupancy"
noun
1.
a number that may be divided by another a certain number of times without a remainder.
"15, 20, or any multiple of five"
2.
BRITISH
a shop with branches in many places, especially one selling a specific type of product.
"the major food multiples"


Normally, it happens automatically (see image below).

[Image: Glossary]

Also, I've noticed that when the comment becomes too long, it's cut off at the first encounter of a full stop. Is there a way to remedy it (other than not using full stops, that is...)


Never seen that.

What version are you using?
On which platform?

For more in-depth support, I suggest using the Yahoo support group:
https://groups.yahoo.com/neo/groups/OmegaT/info

Didier



Samuel Murray  Identity Verified
Netherlands
Local time: 20:47
Member (2006)
English to Afrikaans
Could we be a little more specific, please? Jan 1

peregrinus wrote:
I was wondering if it's possible to break lines inside comments in glossaries?


The example you give appears to relate to the display of TBX glossaries, but TBX files don't have a "comments" field, as far as I know. Are you talking about tab delimited files here or about TBX files? If TBX, which field in the TBX file are you referring to when you say "comments"? Do you possibly mean the "definition" field instead?

Using <br> for a line break inside the "definition" text appears to work as a line break within OmegaT.

Also, I've noticed that when the comment becomes too long, it's cut off at the first encounter of a full stop.


I tried this with both a TXT glossary (in the "comments" field) and a TBX glossary (in the "description" field), but even if the text goes on for over 10 000 characters, OmegaT has no problem with the length.


[Edited at 2018-01-01 16:32 GMT]



peregrinus
United Kingdom
TOPIC STARTER
TBX Glossary? Jan 1

Thank you, both.

I think I've located the problem when it comes to OmegaT losing parts of the comments - it happens when I insert text from a website. There must be some invisible characters that are forcing OmegaT to stop.
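
For anyone hitting the same thing, here is a minimal Python sketch for spotting and stripping such characters before pasting a comment into the glossary. The function names are made up for illustration, and which characters actually make OmegaT stop reading is an assumption, not something confirmed in this thread; the sketch simply flags control characters, zero-width/format characters and Unicode line separators.

```python
import unicodedata

# Unicode categories treated as "invisible" here (an assumption):
# Cc = control (tab, CR, LF...), Cf = format (zero-width space, soft hyphen...),
# Zl / Zp = line and paragraph separators.
SUSPICIOUS = {"Cc", "Cf", "Zl", "Zp"}

def find_invisible(text):
    """Return (index, code point, Unicode name) for each suspicious character."""
    hits = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) in SUSPICIOUS:
            hits.append((i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")))
    return hits

def clean_comment(text):
    """Drop format characters, turn the other suspicious ones into spaces,
    then collapse runs of whitespace into single spaces."""
    pieces = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cf":
            continue              # zero-width characters are simply removed
        pieces.append(" " if category in SUSPICIOUS else ch)
    return " ".join("".join(pieces).split())

if __name__ == "__main__":
    pasted = "First sentence.\nSecond\u200b part.\u2028End"
    for hit in find_invisible(pasted):
        print(hit)
    print(clean_comment(pasted))
```

Running find_invisible on the pasted text first shows what is hiding in it; clean_comment then gives a single-line version that is safe to put in a tab-delimited field.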

Didier, I wasn't able to replicate your example. When I type the word 'noun' after the full stop at the end of the comment, it stays on the same line. Is it perhaps in a fourth column? Or is the problem that I'm using the .txt file extension and should be using something else?

My version is 4.1.3_1 on Windows.

And thank you for the link to the Yahoo group!



Samuel, I'm talking about the glossary.txt tab-delimited file. I use the word 'comment' because that's what it says when I open 'Add glossary entry' dialogue box.

Do I get it right - is there a way to use a ".tbx" file extension instead of ".txt" to achieve this effect? If so I need to do some more research...



Didier Briel  Identity Verified
France
Local time: 20:47
Member (2007)
English to French
TBX glossaries are different Jan 3

peregrinus wrote:
I think I've located the problem when it comes to OmegaT losing parts of the comments - it happens when I insert text from a website. there must be some invisible characters that are forcing OmegaT to stop.

Didier, I wasn't able to replicate your example. When I type word 'noun' after full stop at the end of the comment it stays in the same line.

That's normal.
It's because my example is a TBX file, the Microsoft terminology, available here:
https://www.microsoft.com/en-us/language/Terminology

Is it perhaps in a fourth column?

There are only three columns in text files.
No, it's a different "column", because it's a TBX file.
You can read more about that standard here:
http://www.ttt.org/oscarstandards/tbx/

Or is the problem with the fact that I'm using .txt file extension and should be using something else?

You cannot convert a format by changing the extension.

Didier



Samuel Murray  Identity Verified
Netherlands
Local time: 20:47
Member (2006)
English to Afrikaans
@Per Jan 3

peregrinus wrote:
I'm talking about the glossary.txt tab-delimited file.


Okay, sorry for the confusion. The OmegaT glossary options are:

1. TBX

TBX as a glossary format is really (IMO) only intended as export format, and not as a format that can or should be edited on the fly. I know of no TBX editor in existence (i.e. a program with which one can create, modify and add to TBX files), but there are a number of TBX converters out there. Using such a converter with e.g. OmegaT would mean always adding new terms to the import format and then exporting a TBX again for every new term that you add, and I would consider that cumbersome.

Still, if you have or want to have a finished glossary that you only ever read (not edit or add entries to), then TBX is a useful format, because you can specify e.g. parts of speech.

2. TXT

OmegaT's tab-delimited glossary format is quite primitive (and it has a bu^H^Hfeature whereby it merges the information from entries ostensibly to save screen space), but you can try to identify glossary labels by putting them in certain types of brackets (although all text will appear on a single line). For example:

one     een     [[noun]] {{singular}} <mathematics> DEF: a nice little three-letter word
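
If you generate such entries with a script, the bracket convention above could be sketched in Python like this. The function names and the label order are hypothetical (the brackets are only a visual convention, not something OmegaT interprets); the one hard rule, taken from the format itself, is that the three columns are separated by tabs and each entry sits on one line.

```python
def labelled_comment(definition, pos=None, number=None, domain=None):
    """Build a single-line comment using the bracket convention shown above."""
    parts = []
    if pos:
        parts.append("[[" + pos + "]]")
    if number:
        parts.append("{{" + number + "}}")
    if domain:
        parts.append("<" + domain + ">")
    parts.append("DEF: " + definition)
    return " ".join(parts)

def glossary_line(source, target, comment=""):
    """Join source, target and comment with tabs.

    Tabs and line breaks inside a field would break the three-column
    layout, so any run of whitespace is collapsed to a single space.
    """
    fields = (source, target, comment)
    return "\t".join(" ".join(field.split()) for field in fields)

if __name__ == "__main__":
    comment = labelled_comment(
        "a nice little three-letter word",
        pos="noun", number="singular", domain="mathematics",
    )
    print(glossary_line("one", "een", comment))
```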

3. CSV

OmegaT also accepts CSV files as glossary files, but oddly it can't handle line breaks within fields (regardless of whether the line breaks are CR, LF or CRLF). When a field contains a line break, OmegaT simply stops reading at that point. In other words, this CSV entry:

"the internet","das interwebs","a place
where one can find pictures
of cats"


will display in OmegaT as:

the internet = das interwebs
1. a place
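
A possible workaround is to flatten the line breaks before giving the file to OmegaT. The following is a minimal Python sketch using the standard csv module; the function name is made up, and quoting every field in the output is my own choice rather than an OmegaT requirement.

```python
import csv
import io

def flatten_csv(text):
    """Return the CSV text with line breaks inside fields replaced by spaces."""
    # newline="" lets the csv module see CR, LF and CRLF exactly as written.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in reader:
        # split() + join() collapses any whitespace run, including CR/LF.
        writer.writerow([" ".join(field.split()) for field in row])
    return out.getvalue()

if __name__ == "__main__":
    original = (
        '"the internet","das interwebs","a place\n'
        'where one can find pictures\n'
        'of cats"\n'
    )
    print(flatten_csv(original), end="")
```

The whole definition then sits on one line, so nothing after the first line break is lost, although you also lose the visual line breaks.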


4. OmegaT-DSL

There is a dictionary format in OmegaT that can be edited in a text editor quite easily, namely OmegaT-DSL. OmegaT-DSL is sort of based on ABBYY DSL, but it's not the same thing. For example, formatting tags in ABBYY DSL use square brackets, whereas formatting tags in OmegaT-DSL are HTML-style.

If you use OmegaT-DSL glossaries, the matches will appear in the Dictionary pane instead of the Glossary pane, and your ".dsl" file will reside in the /dictionary/ folder instead of the /glossary/ folder. You can keep an OmegaT-DSL file open in a text editor while using the same file in OmegaT. You can't add entries to a DSL file directly from within OmegaT.

With an OmegaT-DSL file you have some control over formatting (bold, italic, underline, font face, font color, etc) and you can have line breaks within entries. One great thing about using OmegaT-DSL is that OmegaT doesn't merge your entries in the pane. One downside of using OmegaT-DSL is that TransTips don't work. DSL support is also a bit buggy (for example, an entry for "one two" will not match the source text "one two" but will match the source text "ones and twos"). Also, Ctrl+comma can't insert dictionary target texts. OmegaT shows a hyphen between the source term and translation (sorry, you can't change that, which can look a bit odd if you put the target term on a separate line).

The DSL format:

There is no formal format description for the OmegaT-DSL format, but basically, it's a plain text file in UTF-16LE encoding, with the file extension ".dsl". Any word that starts directly against the left margin is a source text term. If any line directly after the source text term has a space or a tab in front of it (i.e. the text is not against the left margin), it is considered part of the entry. Line breaks within entries are treated as spaces, but you can add visual line breaks using <br>.

So, this is a valid OmegaT-DSL entry:

computer
     ordinateur<br><i>noun</i><br><font face="courier" color="red">A thingy that computes.</font>


and so is this:

computer
     ordinateur<br>
     <i>noun</i><br>
     <font face="courier" color="red">A thingy that computes.</font>
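
If you would rather generate the file than type it, the layout described above could be produced with a short Python sketch like the one below. The function names are hypothetical, and writing a byte order mark at the start of the file is an assumption on my part (ABBYY-style DSL files usually have one; I have not checked whether OmegaT requires it).

```python
def dsl_entry(source, body_lines):
    """Format one entry: headword flush left, body lines indented.

    <br> is appended to every body line except the last, because plain
    line breaks inside an entry are treated as spaces.
    """
    lines = [source.strip()]
    last = len(body_lines) - 1
    for i, line in enumerate(body_lines):
        suffix = "<br>" if i < last else ""
        lines.append("     " + line.strip() + suffix)
    return "\n".join(lines) + "\n"

def write_dsl(path, entries):
    """Write (source, body_lines) pairs to a UTF-16LE file."""
    with open(path, "w", encoding="utf-16-le") as fh:
        fh.write("\ufeff")  # byte order mark (assumed to be wanted)
        for source, body_lines in entries:
            fh.write(dsl_entry(source, body_lines))

if __name__ == "__main__":
    write_dsl("example.dsl", [
        ("computer", [
            "ordinateur",
            "<i>noun</i>",
            '<font face="courier" color="red">A thingy that computes.</font>',
        ]),
    ])
```

The resulting example.dsl would then go in the /dictionary/ folder of the project, as described above.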




[Edited at 2018-01-03 11:18 GMT]



peregrinus
United Kingdom
TOPIC STARTER
Thank you! Jan 3

Thank you, both,

This is brilliant, I really appreciate it. I've recently made a switch over from memoQ and I don't think I'm ever going to look back. I did not appreciate how versatile OmegaT is. Once again thank you for sharing your knowledge!

Best,
p.



