Technical Questions Regarding CDATA Embedded Content in Trados Studio 2009
Thread poster: lthurston
Jan 15, 2013

Hello,
I have a couple of quick technical questions regarding the CDATA Embedded Content processing in Trados 2009. I'm new to Trados, but am good with regular expressions and understand HTML / XML quite well.

First of all, are backreferences possible in the Tag Definition Rules, specifically from the end tag to the start tag? This is a somewhat minor point, but the start and end tags don't seem to be reliably associated with each other. If there is a backreference syntax, what is it? In other words, without making a rule for each tag, could I refer to a captured group from the start tag in order to match the appropriate end tag, like this (ignore that there's no slash at the beginning of the end tag; I'm having some issues with this forum's content processing):

Start tag: <([a-z][a-z0-9]*)[^<>/]*>
End tag: </\1[^<>]*>

Of course that doesn't work, or I wouldn't be asking. Trados' feedback when I enter it suggests that a backreference is accepted, but that it can only refer to a group captured within the end tag expression itself. I want to refer to the capture in the start tag.

The more important question has to do with HTML attributes in CDATA Embedded Content, and whether anyone is aware of a way to make attributes editable. Ideally, the goal would be to make a very small subset of all attributes editable (src, href, title, etc.) for all tags. I might be able to do this by modifying the regex for the start and end tag so that it doesn't include anything past the tag name, but that feels hacky and wrong.

If this isn't possible with the CDATA Embedded Content, is there another way to come at this problem? Another Embedded Content option? I guess what I'm getting at is: is this possible with Trados at all?

Thanks for your help!

best,
Lucas



SDL Community  Identity Verified
United Kingdom
English
Back references in here Jan 16, 2013

Hi,

I think back references will only work here within the same field, so capturing in the start expression and referencing that capture in the end expression will not work. What you would have to do is this (as you probably already know):

Start
<[a-z][a-z0-9]*[^<>/]*>

End
</[a-z][a-z0-9]*[^<>]*>
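
To make the scoping issue concrete, here is a minimal sketch using Python's `re` module as a stand-in for the regex engine in Studio. That is an assumption: Studio does not use Python's engine and may differ in detail. The sample string and variable names are made up for illustration. It shows that an end expression containing `\1` has nothing to refer to when compiled by itself, while the same backreference works once both halves sit in a single expression.

```python
import re

# The generic expressions above: each field is a separate, self-contained regex.
START = re.compile(r"<[a-z][a-z0-9]*[^<>/]*>")
END = re.compile(r"</[a-z][a-z0-9]*[^<>]*>")

# The end expression from the question refers to group 1, which does not
# exist when that expression is compiled by itself.
try:
    re.compile(r"</\1[^<>]*>")
    end_with_backref_compiles = True
except re.error:
    end_with_backref_compiles = False

# Only when both halves live in ONE expression can \1 see the tag name.
PAIRED = re.compile(r"<([a-z][a-z0-9]*)[^<>/]*>(.*?)</\1[^<>]*>", re.DOTALL)

# Hypothetical CDATA content, chosen to contain no "/" inside attributes.
sample = 'Click <a href="x.html">here</a> or <b>there</b>.'

starts = START.findall(sample)   # every opening tag, matched independently
ends = END.findall(sample)       # every closing tag, matched independently
pairs = [(m.group(1), m.group(2)) for m in PAIRED.finditer(sample)]

print(end_with_backref_compiles)
print(starts, ends)
print(pairs)
```

Here `starts` and `ends` each pick up both tags, but nothing ties a given closing tag to its opening tag. Only `PAIRED` does that, and that form cannot be split across the two fields.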

The current solution here is not hugely comprehensive, and you would have to go the "hacky" route to try to extract translatable content from attributes within an embedded content section. There will be an improved solution for this in a future build, but for now you'd have to use regex.

Regards

Paul


lthurston
TOPIC STARTER
. Jan 16, 2013

Thanks, Paul.

Understood regarding the start / end tag matching.

As far as the editable attributes are concerned, do you have any idea of when the "improved solution" will be released? It's an improvement to the CDATA embedded content component, is that right? Do you have any specifics on how it will allow attributes to be marked as translatable or not translatable?

And there's no other way to go about this currently?

thanks,
Lucas


ErikAnderson3
You have to get super-specific Jan 16, 2013

As Paul notes, any backreferences in an end tag expression only apply to that end tag expression -- they cannot refer to anything in the start tag expression.

Consequently, to get closing tags to match up properly with their starting tags, you need to get super-specific in your tag definition rules. For some XML file types that we deal with, which also include CDATA sections, I've just defined all of the specific tags that show up in those CDATA sections. Tedious, but effective.
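
A list of tag-specific rules like this can be generated rather than typed by hand. The sketch below is only an illustration in Python: the tag list and the `tag_rules` helper are hypothetical, and the resulting expressions would still need to be pasted into Studio and tested there, since Studio's regex engine is not Python's.

```python
import re

# Hypothetical list of the tags that show up in the CDATA sections.
TAGS = ["p", "b", "a", "span"]

def tag_rules(name):
    """Return (start, end) regex strings for one specific tag name."""
    escaped = re.escape(name)
    # Require whitespace or ">" right after the name so "p" does not match "pre".
    start = rf"<{escaped}(\s[^<>]*)?>"
    end = rf"</{escaped}\s*>"
    return start, end

RULES = {name: tag_rules(name) for name in TAGS}

for name, (start, end) in RULES.items():
    print(f"{name}: start={start}  end={end}")
```

Because each start/end pair names its tag explicitly, a closing `</b>` can only ever belong to the rule for `b`, which is the association a backreference would otherwise have provided.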

Defining attributes to translate gets uglier. I haven't tested this, but something like the following might work, with one start/end def for the element with the attributes to translate, and one start/end def for the element without the attributes to translate. It's hacky, and probably won't work if the element has more than one attribute.

Start:
<p (src|href|title|etc)="

End:
">

and

Start:
<p\s*(?!(src|href|title|etc)=".*?")>

End:
</p>
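
Since the expressions above are untested, a quick check outside Studio may help before committing to them. This is a rough sketch using Python's `re` module as a stand-in (an assumption; Studio's engine may behave differently). The sample tags are invented, and the literal `etc` placeholder has been dropped from the alternation.

```python
import re

# The two start expressions from above, without the "etc" placeholder.
ATTR_START = re.compile(r'<p (src|href|title)="')
PLAIN_START = re.compile(r'<p\s*(?!(src|href|title)=".*?")>')

# Hypothetical opening tags to try the expressions against.
samples = [
    '<p title="Hello">',            # one translatable attribute
    '<p>',                          # no attributes at all
    '<p class="x" title="Hello">',  # translatable attribute is not first
    '<p class="x">',                # only a non-translatable attribute
]

# For each sample: (matched by ATTR_START, matched by PLAIN_START)
results = {
    s: (bool(ATTR_START.match(s)), bool(PLAIN_START.match(s)))
    for s in samples
}

for tag, outcome in results.items():
    print(tag, outcome)
```

In this check the single-attribute and attribute-free cases are each picked up by exactly one expression, while both multi-attribute samples fall through unmatched, which is consistent with the caveat about elements that have more than one attribute.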


Cheers,

-- Erik Anderson


lthurston
TOPIC STARTER
Thanks Jan 22, 2013

Erik,
I appreciate the additional information, and the alternate start / end tag configuration to get translatable attributes. I'll definitely set up tag-specific rules, but will try to see if I can hold off on the translatable attributes until it's supported by the software.

Paul, I'd still be interested if you have any timeline on the features we're discussing.

thanks,
Lucas

