XML Namespace segmentation
Thread poster: Extra Consult

Extra Consult
Belgium
Local time: 12:03
Member (2008)
English
Jun 30, 2014

Hi,

I'm having trouble creating a working XML file type filter in SDL Studio 2014 when the XML contains elements with namespace declarations.

The XML structure looks as follows:
[paragraph]
[title][/title]
[body]
[div]
[strong][/strong]
[/div]
[/body]
[/paragraph]

Now I can create parser rules for this, and the [strong] tag would be tagged in the SDL editor. So if I were to have this:
[div] hello [strong] stranger [/strong][/div]

the line "hello stranger" would be one segment in SDL, with a strong tag between the words. This all works fine.

However, when the [div] segment contains a namespace, the trouble begins. So:
[div xmlns="http://www.w3.org/1999/xhtml"] hello [strong] stranger [/strong][/div]

now consists of two segments in SDL, without any tags. It seems the [div] tag and its child element [strong] are no longer properly recognized.

All text is still there to translate, and the cleaned files work fine, but it's troublesome to translate as text is segmented strangely.
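
For anyone reading along: this is consistent with how XML default namespaces work. An xmlns="..." declaration on [div] applies to [div] and to every unprefixed descendant, so [strong] ends up in the XHTML namespace too, and a rule that looks for a plain, no-namespace "strong" no longer matches it. Below is a minimal sketch of that effect using Python's standard library. It is not SDL Studio's parser, and whether Studio resolves its rules the same way is an assumption; the sample strings are just the example above written with real angle brackets.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# The two variants from the post, written as real XML.
plain = "<body><div> hello <strong> stranger </strong></div></body>"
namespaced = (
    '<body><div xmlns="' + XHTML + '"> hello <strong> stranger </strong></div></body>'
)

def qualified_tags(xml_text):
    """Return the fully qualified tag of every element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

def count_plain_strong(xml_text):
    """Count elements matched by a namespace-unaware lookup for 'strong'."""
    return len(ET.fromstring(xml_text).findall(".//strong"))

# In the namespaced variant, div and strong carry the namespace URI in
# their tag, so the plain lookup finds nothing.
print(qualified_tags(namespaced))
print(count_plain_strong(plain), count_plain_strong(namespaced))
```

The [body] element stays outside the namespace because the declaration sits on [div], below it.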

I'm sure there is a way to fix this using XPath, but I'm unable to find the correct syntax. Does anyone have any idea how to build the XPath, or at least know of any proper SDL documentation on the file type filter for XML? (The SDL help covers it only very briefly.)

Kind regards,
Geert

//please note I've created the tags as [ ] as XML tags aren't allowed in the post content.


 

SDL Community  Identity Verified
United Kingdom
Local time: 12:03
English
Try this... Jun 30, 2014

... XPath expression which will ignore the namespace:

//*[local-name()='strong']
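
A rough illustration of what this expression does: it selects every element whose name, with any namespace stripped off, is "strong". Python's standard xml.etree.ElementTree does not implement the local-name() function, so the sketch below emulates it by hand; the helper names local_name and find_by_local_name are made up for this example and have nothing to do with SDL Studio.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip a leading '{namespace-uri}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]

def find_by_local_name(xml_text, name):
    """Emulate //*[local-name()='name']: match elements in any namespace."""
    root = ET.fromstring(xml_text)
    return [el for el in root.iter() if local_name(el.tag) == name]

plain = "<body><div> hello <strong> stranger </strong></div></body>"
namespaced = (
    '<body><div xmlns="http://www.w3.org/1999/xhtml">'
    " hello <strong> stranger </strong></div></body>"
)

# The same lookup matches the element with and without the namespace.
for doc in (plain, namespaced):
    print([el.text for el in find_by_local_name(doc, "strong")])
```

This is why one rule can cover both the namespaced and the non-namespaced files.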

Regards

Paul


 

Extra Consult
Belgium
Local time: 12:03
Member (2008)
English
TOPIC STARTER
local-name() Jun 30, 2014

Hi Paul,

thanks, I hadn't thought of trying that. I have now, replacing the existing 'strong' rule with the XPath, and the result is the same. I'm guessing the local-name() function does extract the text from the node (as it should), but the segmentation in the Editor window is still off.
Where the lines without the namespace yield --> segment 1: hello [tag] stranger [tag], once the namespace is introduced in the parent node, it creates 2 segments, 'hello' and 'stranger'.

Tried both XPath1 and XPath2 notation (//*[]) and (//*:name), both with the same result.
And funny thing, even with the XPath //*[local-name()='strong'], the file without the namespace still works fine, but with the namespace, wrong segmentation.

Kind regards,
Geert


 

Extra Consult
Belgium
Local time: 12:03
Member (2008)
English
TOPIC STARTER
Works now Jun 30, 2014

Hi Paul,

sorry, I missed a [br] tag in the XML, which caused the extra segmentation even with the //*[local-name()] XPath. Now, after both removing the br tag and replacing the strong tag with the XPath, segmentation is normal in the Editor window.
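
One way to catch an overlooked element like this before building the filter is to list every element name that actually occurs in the file and compare it with the names the rules cover. The sketch below does that with Python's standard library; the sample document and the expected_rules set are invented for illustration, and how SDL Studio treats an element that has no rule is not something this code knows about.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def element_inventory(xml_text):
    """Count elements by local name, ignoring any namespace."""
    root = ET.fromstring(xml_text)
    return Counter(el.tag.rsplit("}", 1)[-1] for el in root.iter())

# Hypothetical sample: the structure from the first post plus a stray br.
sample = (
    "<paragraph><title>t</title><body>"
    '<div xmlns="http://www.w3.org/1999/xhtml">'
    " hello <br/><strong> stranger </strong></div>"
    "</body></paragraph>"
)

# Hypothetical list of element names the filter already has rules for.
expected_rules = {"paragraph", "title", "body", "div", "strong"}

inventory = element_inventory(sample)
unhandled = sorted(set(inventory) - expected_rules)
print(unhandled)  # element names in the file that no rule covers
```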

Kind regards, and thanks again for the help.
Geert



[Edited at 2014-06-30 14:55 GMT]


 

SDL Community  Identity Verified
United Kingdom
Local time: 12:03
English
Any chance... Jun 30, 2014

... you can send me a file that does not work? I created a test file based on the info you provided so far and it works fine, so I think it would help to not guess any more. Maybe you can cut it down and replace the translatable text so it's a simple file showing only the problem and without potentially sensitive information.

Regards

Paul
pfilkin@sdl.com


 

SDL Community  Identity Verified
United Kingdom
Local time: 12:03
English
Or did this mean you're sorted? Jun 30, 2014

Extra Consult wrote:

Now with both removing the br tag and replacing the strong tag with the XPath, segmentation is normal in the Editor window.



 

Extra Consult
Belgium
Local time: 12:03
Member (2008)
English
TOPIC STARTER
yes, it's ok Jul 1, 2014

Hey Paul,

sorry for the confusion, it does work now. I overlooked a [br] tag in the first test, but the XPath solution works very well.

Kind regards,
Geert


 

