Segmentation issue in Studio 2014
Thread poster: tvtruc

tvtruc
Vietnam
Local time: 00:32
English to Vietnamese
Oct 13, 2013

When I open a TTX file in Studio 2014, the segmentation is not what I want, like this:



The segmentation rules in Studio 2014 do not treat line breaks the way they used to.

I don't have this issue in Studio 2011, or in Studio 2014 when it is used with a certain TM. For the same file in Studio 2014 with that TM, the segmentation is like this:



This is what I want.

I know it can be fixed by changing Segmentation rules in TM settings, but I can't figure out how.

Can anyone please help?

Thanks,
Truc



SDL Community  Identity Verified
United Kingdom
Local time: 19:32
English
I think you need... Oct 13, 2013

... to change the segmentation rules in Trados 2007 before this file even becomes a TTX. It doesn't sound to me as though you are using the same TTX in 2011 as you are in 2014 because the TTX is already segmented.

Are you sure you are comparing apples with apples?

I mocked up your example and tested in both 2011 and 2014 with exactly the same results... all in one segment. If I look in the TTX itself with a text editor I can also see that this is one Translation Unit, so this is to be expected.
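
For anyone who would rather script that check than read the file by eye, here is a minimal sketch in Python. It assumes a TTX is XML in which every translation unit is a `Tu` element holding `Tuv` children, with the source language first. That layout, the sample content, and the helper name `source_units` are assumptions for illustration, not something confirmed in this thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TTX body: one translation unit whose
# source text still contains a line break (element names are assumed).
SAMPLE_TTX = """<TRADOStag Version="2.0">
  <Body>
    <Raw>
      <Tu MatchPercent="0">
        <Tuv Lang="EN-US">Item one\nItem two</Tuv>
        <Tuv Lang="VI">(target text)</Tuv>
      </Tu>
    </Raw>
  </Body>
</TRADOStag>"""


def source_units(ttx_text):
    """Return the source text of every <Tu> element found in the TTX."""
    root = ET.fromstring(ttx_text)
    units = []
    for tu in root.iter("Tu"):
        first_tuv = tu.find("Tuv")  # assumed to be the source-language variant
        units.append("".join(first_tuv.itertext()) if first_tuv is not None else "")
    return units


units = source_units(SAMPLE_TTX)
print(len(units), "translation unit(s)")
for unit in units:
    print(repr(unit))
```

If the count is lower than the number of lines you expected to see as separate segments, the line breaks sit inside a single translation unit, which is the situation described above.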

Regards

Paul



Shai Navé  Identity Verified
Israel
Local time: 20:32
Member
English to Hebrew
May (or may not) be a missing segmentation rule in the TM Oct 13, 2013

If the same TTX file is segmented "correctly" against a different TM, perhaps the main project-specific TM is missing the line-break segmentation rule.


tvtruc
Vietnam
Local time: 00:32
English to Vietnamese
TOPIC STARTER
It works Oct 14, 2013

Shai Nave wrote:

If the same TTX file is segmented "correctly" against a different TM, perhaps the main project-specific TM is missing the line-break segmentation rule.


I followed the instructions in this KB article and it works.

Thank you very much, Shai.

Rgds,
Truc



SDL Community  Identity Verified
United Kingdom
Local time: 19:32
English
Confused... I am! Oct 14, 2013

Hi Shai, Truc,

I'd love to know how you made this work, and am very happy to stand corrected. But for me, making changes to the TM won't help with a TTX at all because it's already segmented. This does help with a Word doc for example, but not with a TTX. Even creating an unsegmented TTX doesn't help me here.

How did you manage this?

Regards

Paul



tvtruc
Vietnam
Local time: 00:32
English to Vietnamese
TOPIC STARTER
Just add segmentation rule for soft return Oct 14, 2013

Hi Paul,

I just added a segmentation rule for soft returns to my TM by following the instructions in the KB article.

Regards,
Truc

SDL Support wrote:

Hi Shai, Truc,

I'd love to know how you made this work, and am very happy to stand corrected. But for me, making changes to the TM won't help with a TTX at all because it's already segmented. This does help with a Word doc for example, but not with a TTX. Even creating an unsegmented TTX doesn't help me here.

How did you manage this?

Regards

Paul



Shai Navé  Identity Verified
Israel
Local time: 20:32
Member
English to Hebrew
Might be a Studio 2014 issue Oct 15, 2013

Hi Paul,
What I'm about to write is just speculation at this point, as I need to investigate this further (and don't have the time); I don't know enough about the inner workings of the Studio 2014 parser, so the following may or may not be valid.

This might be a Studio 2014-specific issue. The TTX might have been segmented correctly (i.e. after a line break), but for some reason Studio 2014 ignores it.
I discovered this by accident. A colleague of mine called me in despair after switching to Studio 2014 and told me that she had problems with a TTX file that was part of a long-running project of very similarly structured files, with which she had never had such problems before. The TTX was, by and large, an itemized list with the items separated by a line break. In Trados 2007 and Studio 2011 the items were segmented after a line break, but not in Studio 2014, where the whole thing was displayed as a few huge segments.
I really didn't have the time to look into this too much, so I suggested something like, "let me try something quick here", added the line-break segmentation rule to the TM, and it worked: the TTX was now segmented "correctly" in Studio 2014. It left me quite puzzled, I must admit, so I can understand how you might feel.
If I recall correctly, I think that the TTX was segmented after a line break (I took a quick peek in a text editor), but I'm not too sure about it now.

Another, maybe related, issue that I have also noticed is that Studio 2014, this time in a docx file, seems to ignore non-breaking spaces. For example, it segments the text after a colon even if it is immediately followed by a non-breaking space.

However, as I've said, I need to investigate the TTX issue further.

[Edited at 2013-10-15 21:18 GMT]



tvtruc
Vietnam
Local time: 00:32
English to Vietnamese
TOPIC STARTER
Segmentation in TTX Oct 16, 2013

Hi Paul!

FYI, here is how it is segmented in the TTX (opened in TagEditor):




Luca Tutino  Identity Verified
Italy
Local time: 19:32
Member (2002)
English to Italian
but what does "[\w\p{P}]\s?" mean?? Nov 17, 2014

[\w\p{P}]\s?

I think we should be given some more insight into this. KB #3632 does not explain anything.



SDL Community  Identity Verified
United Kingdom
Local time: 19:32
English
It's just a regular expression... Nov 17, 2014

... nothing specific to Studio. But it basically means this:

[\w\p{P}]\s?[\n]+

The square brackets mean match any one of the things inside, so either \w or \p{P}.

\w means a word character.

\p{P} means any kind of punctuation.

Then the \s matches a single whitespace character. The ? after it means "zero or one", so one space (or tab) between the last character and the line break is allowed but not required.

Then the [\n] looks for a line feed character, and the + after it means "one or more". It is greedy, so it takes as many consecutive line feeds as it can find, but at least one has to be there for the rule to match.
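
To see the expression at work outside Studio, here is a rough sketch in Python. Python's built-in `re` module has no `\p{P}`, so the sketch builds an equivalent character class from the Unicode database; that workaround, the helper name `split_at_breaks`, and the sample text are assumptions for illustration only. Studio's own regex engine may differ in details such as what `\w` and `\s` cover.

```python
import re
import sys
import unicodedata

# Stand-in for \p{P}: every code point whose Unicode category starts
# with "P" (punctuation), escaped so it is safe inside [...].
PUNCT = "".join(
    re.escape(chr(cp))
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)).startswith("P")
)

# Same shape as [\w\p{P}]\s?[\n]+ from the post above.
RULE = re.compile(r"[\w" + PUNCT + r"]\s?[\n]+")


def split_at_breaks(text):
    """Cut the text after every match of RULE, mimicking a break rule."""
    segments, start = [], 0
    for match in RULE.finditer(text):
        segments.append(text[start:match.end()])
        start = match.end()
    if start < len(text):
        segments.append(text[start:])  # trailing text without a line feed
    return segments


sample = "Item one\nItem two. \n\nItem three"
matches = [m.group() for m in RULE.finditer(sample)]
print(matches)                  # ['e\n', '. \n\n']
print(split_at_breaks(sample))  # ['Item one\n', 'Item two. \n\n', 'Item three']

# Two spaces before the line feed: \s? allows at most one, so no match.
print(RULE.search("two.  \nthree"))  # None
```

The last line shows the effect of the ?: with two spaces before the line feed the rule does not fire, because only zero or one whitespace character is allowed there.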

If you're interested in this kind of stuff then I'd recommend you get a copy of Regex Buddy. You can paste the expression in there and it will tell you what it means! You could also take a look at these articles which cover some of the basics for regular expressions in Studio.

Regards

Paul



Miguel Carmona  Identity Verified
United States
Local time: 10:32
English to Spanish
... Nov 18, 2014

SDL Support wrote:

Confused... I am!

Hi Shai, Truc,

I'd love to know how you made this work, and am very happy to stand corrected. But for me, making changes to the TM won't help with a TTX at all because it's already segmented. This does help with a Word doc for example, but not with a TTX. Even creating an unsegmented TTX doesn't help me here.

How did you manage this?

Paul


I am as confused as Paul.

Are you sure you are talking about a TTX file?

