XML Namespace segmentation
Thread poster: Extra Consult

Extra Consult
Belgium
Local time: 19:16
Member (2008)
English
Jun 30, 2014

Hi,

I'm having problems creating a working XML file type filter in SDL Studio 2014 when the XML contains elements with namespace declarations.

The XML structure looks as follows:
[paragraph]
[title][/title]
[body]
[div]
[strong][/strong]
[/div]
[/body]
[/paragraph]

Now I can create parser rules for this, and the [strong] tag would be tagged in the SDL editor. So if I were to have this:
[div] hello [strong] stranger [/strong][/div]

the line "hello stranger" would be one segment in SDL, with a strong tag between the words. This all works fine.

However, when the [div] element contains a namespace declaration, the trouble begins. So:
[div xmlns="http://www.w3.org/1999/xhtml"] hello [strong] stranger [/strong][/div]

now consists of two segments in SDL, without any tags. It seems the [div] tag and its child element [strong] are no longer properly recognized.
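
A minimal sketch of why a rule written for plain 'strong' can stop matching. It uses Python's standard xml.etree.ElementTree rather than SDL Studio, so it only illustrates general XML namespace behaviour; how Studio evaluates its parser rules internally is an assumption here. The point is that the child [strong] inherits the default namespace declared on [div], so its full name is no longer just 'strong'.

```python
import xml.etree.ElementTree as ET

# The two variants from the post, written with real angle brackets.
plain = "<div> hello <strong> stranger </strong></div>"
namespaced = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    " hello <strong> stranger </strong></div>"
)

plain_root = ET.fromstring(plain)
ns_root = ET.fromstring(namespaced)

# The default namespace on <div> also applies to its unprefixed children.
print(plain_root.tag)  # div
print(ns_root.tag)     # {http://www.w3.org/1999/xhtml}div
print(ns_root[0].tag)  # {http://www.w3.org/1999/xhtml}strong

# A path that names 'strong' without a namespace only matches the plain file.
plain_matches = plain_root.findall(".//strong")
ns_matches = ns_root.findall(".//strong")
print(len(plain_matches), len(ns_matches))  # 1 0
```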

All text is still there to translate, and the cleaned files work fine, but it's troublesome to translate as text is segmented strangely.

I'm sure there is a way to fix this using XPath, but I'm unable to find the correct syntax. Does anyone have any idea how to build the XPath, or at least know of any proper SDL documentation on the file type filter for XML? (The SDL help covers it only briefly.)

Kind regards,
Geert

//please note I've created the tags as [ ] as XML tags aren't allowed in the post content.



SDL Community  Identity Verified
United Kingdom
Local time: 19:16
English
Try this... Jun 30, 2014

... XPath expression which will ignore the namespace:

//*[local-name()='strong']
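
To illustrate what this expression selects, here is a rough Python equivalent. The standard xml.etree.ElementTree module does not support the local-name() function, so the sketch emulates it with a small helper; local_name and find_by_local_name are hypothetical names made up for this example, not part of any library or of SDL Studio.

```python
import xml.etree.ElementTree as ET


def local_name(element):
    """Return the element name without its '{namespace}' part."""
    return element.tag.rsplit("}", 1)[-1]


def find_by_local_name(root, name):
    """Emulate //*[local-name()='name']: match on the local name only."""
    return [
        el for el in root.iter()
        if isinstance(el.tag, str) and local_name(el) == name
    ]


plain_root = ET.fromstring("<div> hello <strong> stranger </strong></div>")
ns_root = ET.fromstring(
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    " hello <strong> stranger </strong></div>"
)

# The same lookup now works with and without the namespace declaration.
plain_hits = find_by_local_name(plain_root, "strong")
ns_hits = find_by_local_name(ns_root, "strong")
print(len(plain_hits), len(ns_hits))  # 1 1
```

Because the comparison ignores the namespace entirely, the same rule keeps working for files with and without the xmlns declaration.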

Regards

Paul



Extra Consult
Belgium
Local time: 19:16
Member (2008)
English
TOPIC STARTER
local-name() Jun 30, 2014

Hi Paul,

Thanks, I didn't think of trying that. I have now, replacing the existing 'strong' entry with the XPath, and the result is the same. I'm guessing the local-name() function does extract the text from the node (as it should), but the segmentation in the Editor window is still off.
Where the lines without the namespace yield a single segment (hello [tag] stranger [tag]), introducing the namespace in the parent node creates 2 segments, 'hello' and 'stranger'.

I tried both XPath 1 and XPath 2 notation, (//*[]) and (//*:name), both with the same result.
And the funny thing is, even with the XPath //*[local-name()='strong'], the file without the namespace still works fine, but the file with the namespace gets the wrong segmentation.

Kind regards,
Geert



Extra Consult
Belgium
Local time: 19:16
Member (2008)
English
TOPIC STARTER
Works now Jun 30, 2014

Hi Paul,

Sorry, I missed a [br] tag in the XML, which caused the segmentation, even with the //*[local-name()] XPath. Now that I have both removed the br tag and replaced the strong tag with the XPath, segmentation is normal in the Editor window.
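
The behaviour described in this thread can be reproduced with a toy model. This is only a sketch under a stated assumption, not SDL Studio's actual algorithm: it assumes that any element not covered by an inline rule breaks the segment. The segments function and its inline_names parameter are hypothetical names for this illustration, and the position of the [br] in the last sample is made up.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' part from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def segments(root, inline_names):
    """Toy segmenter: every element that is not inline breaks the segment."""
    result = []
    current = []

    def flush():
        # Close the running segment, normalising whitespace.
        text = " ".join("".join(current).split())
        if text:
            result.append(text)
        current.clear()

    def walk(el):
        inline = local_name(el.tag) in inline_names
        if not inline:
            flush()
        current.append(el.text or "")
        for child in el:
            walk(child)
            current.append(child.tail or "")
        if not inline:
            flush()

    walk(root)
    return result


xhtml = 'xmlns="http://www.w3.org/1999/xhtml"'
doc = ET.fromstring(f"<div {xhtml}> hello <strong> stranger </strong></div>")
with_br = ET.fromstring(
    f"<div {xhtml}> hello <br/><strong> stranger </strong></div>"
)

print(segments(doc, set()))          # ['hello', 'stranger']
print(segments(doc, {"strong"}))     # ['hello stranger']
print(segments(with_br, {"strong"})) # ['hello', 'stranger']
```

In this model an unrecognised [strong] and a stray [br] produce the same symptom, two segments, which would explain why the XPath fix alone did not appear to help at first.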

Kind regards, and thanks again for the help.
Geert



[Edited at 2014-06-30 14:55 GMT]



SDL Community  Identity Verified
United Kingdom
Local time: 19:16
English
Any chance... Jun 30, 2014

... you can send me a file that does not work? I created a test file based on the info you provided so far and it works fine, so I think it would help to not guess any more. Maybe you can cut it down and replace the translatable text so it's a simple file showing only the problem and without potentially sensitive information.

Regards

Paul
pfilkin@sdl.com



SDL Community  Identity Verified
United Kingdom
Local time: 19:16
English
Or did this mean you're sorted? Jun 30, 2014

Extra Consult wrote:

Now with both removing the br tag and replacing the strong tag with the XPath, segmentation is normal in the Editor window.




Extra Consult
Belgium
Local time: 19:16
Member (2008)
English
TOPIC STARTER
yes, it's ok Jul 1, 2014

Hey Paul,

Sorry for the confusion, it does work now. I overlooked a [br] tag in the first test, but the XPath solution works very well.

Kind regards,
Geert

