XML filetype parser with embedded html
Thread poster: foremonly
Jan 8, 2013

I have a few questions.

1. Is it possible to change the segmentation rules for certain nodes within a parser? For example, I have an XML file with a list of keywords separated by commas (i.e. keyword1,keyword2,keyword3). Is it possible to have these segmented at the comma? In the editor view, I would like to have each keyword in a separate segment, with the comma excluded, as per the following:


Please note that the keywords are found in a keyword node. The other nodes in the XML do not need to be parsed in this way.
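To make the desired result concrete, here is a minimal Python sketch of the behaviour described above. It is only an illustration, not anything the XML filetype does internally; the sample XML, the `title` element, and the `segments` helper are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the element names are assumptions, not the real file.
SAMPLE = """<item>
  <title>Some title, with a comma</title>
  <keyword>keyword1,keyword2,keyword3</keyword>
</item>"""


def segments(xml_text, split_tag="keyword"):
    """Yield (tag, segment) pairs; only `split_tag` nodes are split on commas."""
    root = ET.fromstring(xml_text)
    for node in root.iter():
        text = (node.text or "").strip()
        if not text:
            continue  # skip nodes that only contain whitespace or children
        if node.tag == split_tag:
            # One segment per keyword; the comma itself is dropped.
            for part in text.split(","):
                part = part.strip()
                if part:
                    yield node.tag, part
        else:
            # All other nodes stay as a single segment, commas included.
            yield node.tag, text


result = list(segments(SAMPLE))
for tag, seg in result:
    print(f"{tag}: {seg}")
```

The point of the sketch is that the comma in the `title` node is left alone, while the same character in the `keyword` node acts as a segment boundary.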

This leads me to my second question...

2. Is it possible to set up sub-parsers? For example, I want to parse the content in one node differently from the content in another node and have the text segmented differently.

3. For XML files with embedded HTML, is it possible to import the settings from the HTML filetype into the XML filetype instead of manually setting embedded tag rules? It seems like a lot of work to manually add every possible HTML tag to the XML filetype when the default settings for the HTML filetype seem to be very comprehensive.

I would be very grateful if someone could provide some insight on these questions.

Thank you!


SDL Community  Identity Verified
United Kingdom
Tricky questions! Jan 9, 2013

Hi Emily,

I created a test file like this (maybe not exactly what you have but it may give you some ideas):

Then I created a filetype that does this:

So not perfect, but it may be enough to provide you with what you need, or a few examples for starters. The parser rules I used were these:

Note that I added some context to li, which contains the lists separated by commas. This allowed me to then add two embedded content rules like this:

The first rule captures any tag at all in the CDATA section where I placed my HTML content, and the second was simply a comma. I then made the comma a non-translatable placeable and excluded it so that it was held outside the segments:

So, not the perfect embedded HTML solution you were after, but perhaps overall something you can work with?
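As a rough illustration of the two embedded content rules described above (one matching any tag, one matching a comma), here is a minimal Python sketch. The regular expressions, the function names, and the sample string are assumptions made for the example; they are not the exact patterns entered in the filetype settings.

```python
import re

# Assumed patterns: any HTML-like tag, or a single comma.
TAG = r"</?[A-Za-z][^>]*>"
COMMA = r","
PLACEABLE = re.compile(f"({TAG}|{COMMA})")


def split_embedded(content):
    """Return (kind, text) pairs where kind is 'tag', 'comma' or 'text'."""
    out = []
    # re.split keeps the matched separators because the pattern is captured.
    for piece in PLACEABLE.split(content):
        if not piece.strip():
            continue  # drop empty or whitespace-only fragments
        if piece == ",":
            out.append(("comma", piece))
        elif re.fullmatch(TAG, piece):
            out.append(("tag", piece))
        else:
            out.append(("text", piece.strip()))
    return out


def translatable_segments(content):
    """Keep only the text pieces; tags and commas stay outside the segments."""
    return [text for kind, text in split_embedded(content) if kind == "text"]


print(translatable_segments("<b>apples</b>,pears, plums"))
```

Because the tag pattern is tried first at each position, a comma inside a tag's attributes is consumed as part of the tag rather than treated as a separator.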




Thanks! Jan 9, 2013

Thanks, Paul! Helpful as always!

I will have a look and see how it goes.

