A question on segmentation in MetaTexis
Thread poster: Callmeaspade

Callmeaspade
Russian Federation
Local time: 22:43
English to Russian
Mar 23, 2008

Greetings everyone.

I am trying to translate a small text in MetaTexis (Lite Trial version) and can't figure out how to configure it for this project.

The text I am translating is a subtitle file. It is a plain .txt file organised like this:

1 - [segment number]
00:00:03,196 --> 00:00:05,156 - [subtitle time and duration mark]
Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. - [two lines of subtitle text divided by manual break]

Segments are separated from each other by an empty line.
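
For anyone who wants to check such a file before importing it, the layout described above can be read with a few lines of Python. This is only a sketch under the assumptions stated in this post (number line, timing line, then text lines, with blocks separated by empty lines); the function name parse_blocks is made up for illustration and has nothing to do with MetaTexis itself.

```python
import re

# Timing line such as "00:00:03,196 --> 00:00:05,156"
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$")

def parse_blocks(text):
    """Split subtitle text into (number, timing, text_lines) tuples."""
    blocks = []
    text = text.replace("\r\n", "\n").strip()
    # Blocks are separated by one or more empty lines.
    for chunk in re.split(r"\n\s*\n", text):
        lines = [ln.strip() for ln in chunk.split("\n")]
        if len(lines) < 2 or not lines[0].isdigit() or not TIMING.match(lines[1]):
            raise ValueError(f"unexpected block layout: {chunk!r}")
        blocks.append((int(lines[0]), lines[1], lines[2:]))
    return blocks

sample = """1
00:00:03,196 --> 00:00:05,156
Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
"""
blocks = parse_blocks(sample)
```

Each tuple keeps the number and timing lines apart from the text lines, so only the last element would need translating.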

By default, MetaTexis skips the segment number but opens the subtitle time and duration line as a segment to be translated. So the first question is: can I configure it to skip segments/fragments that contain no letters?

The next problem is that it treats a manual line break as the end of a segment. This would not be much of a problem, since I can expand a segment with a hotkey, but every time I do it, MetaTexis displays an annoying "are-you-sure-you-want-to-do-this" message. So, is there a way to automatically treat two lines as one segment? And if not, can I somehow disable the warning popup?

Thanks in advance. I am currently trying to find a solution in help and options, but hope someone already knows the answer.

Natalie
Local time: 21:43
Member (2002)
English to Russian
Hi Vladimir Mar 23, 2008

If you don't need to keep the lines as in the source text, then before using MetaTexis you could replace the manual line breaks with spaces (i.e. simply remove them). Would this solve your problem with segmentation?
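
If doing that replacement by hand is tedious, it could be scripted before the file goes anywhere near MetaTexis. Below is a minimal sketch in Python, assuming the layout from the first post (number line, timing line, then the text lines). The function name join_subtitle_lines and the second sample block are invented for illustration. Note that the original line division is lost, exactly as with manual removal.

```python
import re

def join_subtitle_lines(text, joiner=" "):
    """Merge the text lines of each subtitle block into a single line."""
    out = []
    text = text.replace("\r\n", "\n").strip()
    # One or more empty lines separate the blocks.
    for block in re.split(r"\n{2,}", text):
        lines = block.split("\n")
        head = lines[:2]  # number line and timing line stay untouched
        body = [ln.strip() for ln in lines[2:] if ln.strip()]
        out.append("\n".join(head + ([joiner.join(body)] if body else [])))
    return "\n\n".join(out) + "\n"

# Made-up sample data in the layout described in the first post.
sample = (
    "1\n00:00:03,196 --> 00:00:05,156\nLorem ipsum,\nsed do eiusmod.\n\n"
    "2\n00:00:05,200 --> 00:00:07,000\nUt enim.\n"
)
joined = join_subtitle_lines(sample)
```

The joiner parameter is there so that something other than a space can be used between the former lines if that suits the project better.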

As for skipping time and duration at segmentation: go to MetaTexis > Document options > Segmentation and, under "Do not segment paragraphs with numbers only", uncheck the option "except when containing dots or commas".

[Edited at 2008-03-23 23:41]

Boyan Brezinsky
Local time: 22:43
English to Bulgarian
Ask support Mar 24, 2008

Regarding your first problem - it seems that by default the option "Skip paragraphs without letters" in the segmentation settings is turned on. Therefore the time/duration paragraph should not be segmented. I just checked and the paragraph is still segmented. I suppose this is a bug. I'd suggest you send a bug report. Unchecking the option "except when containing dots or commas", as suggested by Natalie, and resegmenting the whole document worked as expected, so there's a workaround. But I still think this is a bug.
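
To see which paragraphs a "no letters" rule would be expected to skip, here is a rough check in Python. It is only an illustration of the idea, not how MetaTexis actually decides; the helper name has_letters is hypothetical.

```python
def has_letters(paragraph):
    """Return True if the paragraph contains at least one letter (any alphabet)."""
    return any(ch.isalpha() for ch in paragraph)

paragraphs = [
    "1",
    "00:00:03,196 --> 00:00:05,156",
    "Lorem ipsum dolor sit amet,",
]
# Only paragraphs with letters would be offered for translation.
to_translate = [p for p in paragraphs if has_letters(p)]
```

By this check the time/duration paragraph contains no letters at all, which is why one would expect it to be skipped.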
On your second problem with the line breaks: are you certain they are line breaks, not paragraph breaks? I just checked, and a two-line paragraph with a line break is treated as a single segment.
If the file is really a txt file, I believe that upon import every break in it is treated as a paragraph break. So I'd suggest you play a little with find and replace, and replace the paragraph breaks between the text lines with line breaks. I don't see an easy way to do it, though, apart from confirming or rejecting the replacement for every occurrence of the mark.
P.S. As I was reading another thread, I came across AutoUnbreak, a free tool that's supposed to remove extraneous line or paragraph breaks from text.
You may try it on your source text before pasting it into Word; it might remove the extra paragraph breaks between lines.

[Edited at 2008-03-24 10:26]

Hermann Bruns
Local time: 21:43
English to German
Remove manual paragraph breaks manually Mar 24, 2008

Hi Vladimir,

As regards the segmentation of number-only paragraphs, Natalie has given the right suggestion. I have tested Natalie's suggestion, and it works with your text sample.

As regards the issue with manual paragraph breaks, there are two ways to solve it. One is the way you have described. If the MetaTexis prompt annoys you, the other way is to remove the paragraph breaks before you navigate through the affected paragraphs.

Kind regards

Callmeaspade
Russian Federation
Local time: 22:43
English to Russian
Thanks for your advice, Natalie, bsb_2 and Hermann Mar 24, 2008

I was checking/unchecking "Do not segment paragraphs with numbers only" and other items, but didn't realise I had to resegment the document to apply these changes. It works after resegmenting.

As for manually removing the paragraph breaks (of course not line breaks, my fault) or changing them to line breaks: this was the first workaround I thought of.

It would remove the problem, but I was hoping to find a MetaTexis-based solution.

Anyway, thanks everyone once more. It really helped.

