SDLXLIFF: which of source or seg-source is the source text?
Thread poster: Samuel Murray

Samuel Murray  Identity Verified
Netherlands
Local time: 19:08
Member (2006)
English to Afrikaans
Nov 29, 2012

G'day everyone

In the SDLXLIFF file itself (i.e. in the XML source code) each TU contains three elements, namely "source", "seg-source" and "target". In my translated file, the tags in "seg-source" are identical to the tags in "target", but there are fewer tags in "source" than in "seg-source". Does this mean that "seg-source" contains the actual source text that "target" should be a translation of? If so, then what is "source" for?

Thanks
Samuel


ErikAnderson3
Local time: 10:08
Explanation Nov 29, 2012

Studio is capable of allowing translators to edit the source segment. This can be *extremely* helpful when dealing with messy source documents -- i.e., nearly the entire sum total of everything that real-world translators have to deal with. (Currently this is *only* available for MS Word and MS PowerPoint, and then only if the option was enabled during project creation -- Studio devs, **PLEASE** make this option available for **ALL** file types!)

As best I've been able to figure out, the <source> element in the SDLXLIFF file contains the source text as-is from the source file. Depending on the source file type (and the file type filter thus applied), the <source> element may actually contain multiple segments.

The <seg-source> element contains the segmented version of the content in <source>. There are two main differences between what you'll see in <seg-source> versus what you'll see in <source>. For starters, if the file type filter puts multiple segments into <source>, you'll see that each of these segments is separated out into its own <mrk> element. The XLIFF 1.2 specification describes the <mrk> element as a marker, appropriately enough; the value of the mtype attribute says what that marker is for. The table of mtype values in the spec says that mtype="seg" indicates segmented text. From what little I've seen so far, SDLXLIFF seems to only ever have <mrk mtype="seg">, so it appears that SDL only uses the <mrk> element for segments. (Note that <mrk> elements also have mid attributes. These are marker ID values, and they look to be numbered sequentially from the start of the file. The mid values in the <seg-source> element will match those in the <target> element.)
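
To make the mid matching concrete, here is a minimal Python sketch that pairs each <mrk mtype="seg"> in <seg-source> with the <mrk> of the same mid in <target>. The sample trans-unit is made up for illustration (a real SDLXLIFF file carries many more attributes and SDL-specific elements), and the helper names segments and pair_segments are hypothetical, not part of any SDL tool.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Invented sample: one <source> holding two sentences, split into two segments.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file original="demo.docx" source-language="en-US" target-language="af-ZA" datatype="plaintext">
    <body>
      <trans-unit id="tu1">
        <source>First sentence. Second sentence.</source>
        <seg-source><mrk mtype="seg" mid="1">First sentence.</mrk> <mrk mtype="seg" mid="2">Second sentence.</mrk></seg-source>
        <target><mrk mtype="seg" mid="1">Eerste sin.</mrk> <mrk mtype="seg" mid="2">Tweede sin.</mrk></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def segments(parent):
    """Map mid -> plain text for every <mrk mtype="seg"> below parent."""
    result = {}
    if parent is None:
        return result
    for mrk in parent.iter(f"{{{XLIFF_NS}}}mrk"):
        if mrk.get("mtype") == "seg":
            # itertext() drops inline tags and keeps only the text.
            result[mrk.get("mid")] = "".join(mrk.itertext())
    return result


def pair_segments(xliff_text):
    """Return (mid, source segment, target segment or None) tuples."""
    root = ET.fromstring(xliff_text)
    pairs = []
    for tu in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        src = segments(tu.find(f"{{{XLIFF_NS}}}seg-source"))
        tgt = segments(tu.find(f"{{{XLIFF_NS}}}target"))
        for mid, text in src.items():
            pairs.append((mid, text, tgt.get(mid)))
    return pairs


for mid, src, tgt in pair_segments(SAMPLE):
    print(mid, "|", src, "|", tgt)
```

Segments are looked up by mid rather than by position, so an untranslated segment simply comes back with None as its target.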

The second difference only occurs if you have edited the source segment text within Studio (and again, this is currently only possible for MS Word and MS PowerPoint, due to design decisions made by SDL). In this case, it looks like <source> contains the original text from the source file, while <seg-source> contains the edited source text as shown in Studio's Editor view. You'd asked, "Does this mean that 'seg-source' contains the actual source text that 'target' should be a translation of?" Yes, as best I understand it.
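
If that reading is right, one rough way to spot edited source segments is to compare the plain text of <source> with the plain text of <seg-source> in each trans-unit. The Python sketch below does that on an invented sample; edited_units and plain_text are hypothetical helper names, and the check is only a heuristic, since text could differ for reasons other than a source edit.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"

# Invented sample: tu1 has a typo corrected in <seg-source>, tu2 is untouched.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:1.2" version="1.2">
  <file original="demo.docx" source-language="en-US" datatype="plaintext">
    <body>
      <trans-unit id="tu1">
        <source>Teh quick fox.</source>
        <seg-source><mrk mtype="seg" mid="1">The quick fox.</mrk></seg-source>
      </trans-unit>
      <trans-unit id="tu2">
        <source>Unchanged text.</source>
        <seg-source><mrk mtype="seg" mid="2">Unchanged text.</mrk></seg-source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def plain_text(elem):
    """Text content of elem without tags, with whitespace runs collapsed."""
    return " ".join("".join(elem.itertext()).split())


def edited_units(xliff_text):
    """Return ids of trans-units whose <seg-source> text differs from <source>."""
    root = ET.fromstring(xliff_text)
    changed = []
    for tu in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        source = tu.find(f"{{{XLIFF_NS}}}source")
        seg_source = tu.find(f"{{{XLIFF_NS}}}seg-source")
        if source is None or seg_source is None:
            continue  # nothing to compare in this unit
        if plain_text(source) != plain_text(seg_source):
            changed.append(tu.get("id"))
    return changed


print(edited_units(SAMPLE))
```

The comparison deliberately ignores inline tags, because the tag counts in <source> and <seg-source> can differ even when nobody has edited the text.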

I hope this answers your questions. And if anyone spots anything that I've gotten wrong above, please post a correction.

Cheers,

-- Erik Anderson



Samuel Murray  Identity Verified
Netherlands
Local time: 19:08
Member (2006)
English to Afrikaans
TOPIC STARTER
@Erik Nov 29, 2012

ErikAnderson3 wrote:
You'd asked, "Does this mean that 'seg-source' contains the actual source text that 'target' should be a translation of?" Yes, as best I understand it.


Thanks, you confirm my opinion and my guess.

The MRK tag is also used for translator comments, by the way (the comment itself is stored way up in the head of the file, with an ID number attached to it).
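
A rough Python sketch of following such a comment marker back to its definition in the head of the file. The specific names used here (mtype="x-sdl-comment", the sdl:cid attribute, the cmt-defs and cmt-def elements, and the SDL namespace URI) are assumptions that are not confirmed anywhere in this thread, so check them against one of your own files before relying on them. The function name comments_by_marker is hypothetical.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
SDL_NS = "http://sdl.com/FileTypes/SdlXliff/1.0"  # assumed namespace URI

# Invented sample; element and attribute names for comments are assumptions.
SAMPLE = f"""<xliff xmlns="{XLIFF_NS}" xmlns:sdl="{SDL_NS}" version="1.2">
  <doc-info xmlns="{SDL_NS}">
    <cmt-defs>
      <cmt-def id="c1"><Comments><Comment>Check this term.</Comment></Comments></cmt-def>
    </cmt-defs>
  </doc-info>
  <file original="demo.docx" source-language="en-US" datatype="plaintext">
    <body>
      <trans-unit id="tu1">
        <source>Hello world.</source>
        <seg-source><mrk mtype="seg" mid="1">Hello world.</mrk></seg-source>
        <target><mrk mtype="seg" mid="1">Hallo <mrk mtype="x-sdl-comment" sdl:cid="c1">wêreld</mrk>.</mrk></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def comments_by_marker(xliff_text):
    """Return (commented text, comment text or None) for each comment marker."""
    root = ET.fromstring(xliff_text)
    # Collect comment definitions from the head of the file, keyed by id.
    defs = {}
    for cmt_def in root.iter(f"{{{SDL_NS}}}cmt-def"):
        defs[cmt_def.get("id")] = " ".join("".join(cmt_def.itertext()).split())
    found = []
    for mrk in root.iter(f"{{{XLIFF_NS}}}mrk"):
        if mrk.get("mtype") == "x-sdl-comment":  # assumed mtype value
            cid = mrk.get(f"{{{SDL_NS}}}cid")     # assumed attribute name
            found.append(("".join(mrk.itertext()), defs.get(cid)))
    return found


print(comments_by_marker(SAMPLE))
```

Filtering on mtype keeps the segment markers and the comment markers apart, which matters because a comment marker sits inside a segment marker.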

Samuel


ErikAnderson3
Local time: 10:08
Thanks in return Nov 29, 2012

Samuel Murray wrote:
The MRK tag is also used for translator comments, by the way (the comment itself is stored way up in the head of the file, with an ID number attached to it).


Thank you for that, good to know!

Cheers,

-- Erik

