XML Namespace segmentation
Thread poster: Extra Consult

Extra Consult
Belgium
Local time: 02:07
Member (2008)
English
Jun 30, 2014

Hi,

I'm having problems creating a working XML file type filter in SDL Studio 2014 when the XML contains elements with namespace declarations.

The XML structure looks as follows:
[paragraph]
[title][/title]
[body]
[div]
[strong][/strong]
[/div]
[/body]
[/paragraph]

I can create parser rules for this, and the [strong] element then appears as an inline tag in the SDL editor. So if I have this:
[div] hello [strong] stranger [/strong][/div]

the line "hello stranger" would be one segment in SDL, with a strong tag between the words. This all works fine.

However, when the [div] segment contains a namespace, the trouble begins. So:
[div xmlns="http://www.w3.org/1999/xhtml"] hello [strong] stranger [/strong][/div]

now consists of two segments in SDL, without any tags. It seems the [div] tag and its child element [strong] are no longer properly recognized.
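
One likely explanation, sketched here in Python rather than in Studio itself: a default namespace declared on [div] is inherited by its children, so a namespace-aware parser no longer sees an element named plain "strong". The snippet uses Python's standard xml.etree.ElementTree purely as an illustration. How Studio's own parser resolves its rules is an assumption, and the sample strings are the two lines above written with real angle brackets.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The two sample lines from the post, with real angle brackets.
plain = "<div> hello <strong> stranger </strong></div>"
namespaced = f'<div xmlns="{XHTML_NS}"> hello <strong> stranger </strong></div>'


def tags(xml_text):
    """Return the tag of every element, in document order, as the parser sees it."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


plain_tags = tags(plain)
namespaced_tags = tags(namespaced)
print(plain_tags)       # ['div', 'strong']
print(namespaced_tags)  # both names now carry the '{http://www.w3.org/1999/xhtml}' prefix

# A lookup keyed on the bare name matches only in the document without a namespace.
plain_hits = ET.fromstring(plain).findall(".//strong")
namespaced_hits = ET.fromstring(namespaced).findall(".//strong")
print(len(plain_hits), len(namespaced_hits))  # 1 0
```

The [strong] element never declares a namespace itself, yet its qualified name changes because of the declaration on its parent.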

All text is still there to translate, and the cleaned files work fine, but it's troublesome to translate as text is segmented strangely.

I'm sure there is a way to fix this using XPath, but I'm unable to find the correct syntax. Does anyone have an idea how to build the XPath, or at least know of any proper SDL documentation on the file type filter for XML? (The SDL help documents it only very briefly.)

Kind regards,
Geert

//please note I've created the tags as [ ] as XML tags aren't allowed in the post content.


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:07
English
Try this... Jun 30, 2014

... XPath expression which will ignore the namespace:

//*[local-name()='strong']
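
To see what this expression does outside Studio, here is a minimal Python sketch. Python's standard xml.etree.ElementTree does not implement the local-name() function, so the helpers below (local_name and find_by_local_name, both hypothetical names) emulate it by stripping the namespace part of each tag. It illustrates the matching logic only and is not how Studio evaluates the rule.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag, like XPath local-name()."""
    return tag.rsplit("}", 1)[-1]


def find_by_local_name(root, name):
    """Rough equivalent of //*[local-name()='name'] over an ElementTree."""
    return [
        el for el in root.iter()
        if isinstance(el.tag, str) and local_name(el.tag) == name
    ]


plain = ET.fromstring("<div> hello <strong> stranger </strong></div>")
namespaced = ET.fromstring(
    '<div xmlns="http://www.w3.org/1999/xhtml"> hello <strong> stranger </strong></div>'
)

plain_hits = find_by_local_name(plain, "strong")
namespaced_hits = find_by_local_name(namespaced, "strong")
print(len(plain_hits), len(namespaced_hits))  # 1 1
```

Because the comparison ignores the namespace part, the same rule matches both the plain and the namespaced file.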

Regards

Paul


 

Extra Consult
Belgium
Local time: 02:07
Member (2008)
English
TOPIC STARTER
local-name() Jun 30, 2014

Hi Paul,

thanks, I hadn't thought of trying that. I did now, replacing the existing 'strong' entry with the XPath, and the result is the same. I'm guessing the local-name() function does extract the text from the node (as it should), but the segmentation in the Editor window is still off.
Where the lines without the namespace yield one segment (hello [tag] stranger [tag]), once the namespace is introduced in the parent node I get two segments, 'hello' and 'stranger'.

I tried both XPath 1 and XPath 2 notation, (//*[]) and (//*:name), with the same result.
And the funny thing is, even with the XPath //*[local-name()='strong'], the file without the namespace still works fine, but the file with the namespace is segmented wrongly.

Kind regards,
Geert


 

Extra Consult
Belgium
Local time: 02:07
Member (2008)
English
TOPIC STARTER
Works now Jun 30, 2014

Hi Paul,

sorry, I missed a [br] tag in the XML, which caused the extra segmentation even with the //*[local-name()] XPath. Now with both removing the br tag and replacing the strong tag with the XPath, segmentation is normal in the Editor window.
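
An overlooked element like this can be spotted by listing every element name a file contains, ignoring namespaces, and comparing that list against the names the parser rules cover. A small Python sketch follows. The sample string and the expected set are hypothetical stand-ins, since the real file and rule list are not shown in this thread.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_counts(xml_text):
    """Count elements by local name, ignoring any namespace."""
    root = ET.fromstring(xml_text)
    return Counter(el.tag.rsplit("}", 1)[-1] for el in root.iter())


# Hypothetical sample: the line from the first post with a stray <br/> added.
sample = (
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    " hello <br/><strong> stranger </strong></div>"
)

counts = element_counts(sample)
expected = {"div", "strong"}  # assumption: names the parser rules already handle
unexpected = sorted(set(counts) - expected)
print(unexpected)  # ['br']
```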

Kind regards, and thanks again for the help.
Geert



[Edited at 2014-06-30 14:55 GMT]


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:07
English
Any chance... Jun 30, 2014

... you can send me a file that does not work? I created a test file based on the info you provided so far and it works fine, so I think it would help to not guess any more. Maybe you can cut it down and replace the translatable text so it's a simple file showing only the problem and without potentially sensitive information.

Regards

Paul
pfilkin@sdl.com


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:07
English
Or did this mean you're sorted? Jun 30, 2014

Extra Consult wrote:

Now with both removing the br tag and replacing the strong tag with the XPath, segmentation is normal in the Editor window.



 

Extra Consult
Belgium
Local time: 02:07
Member (2008)
English
TOPIC STARTER
yes, it's ok Jul 1, 2014

Hey Paul,

sorry for the confusion, it does work now. I overlooked a [br] tag in the first test, but the XPath solution works very well.

Kind regards,
Geert


 

