XML import--how to hide unwanted content?
Thread poster: Marketing-Lang.

Marketing-Lang.  Identity Verified
Germany
Local time: 02:41
English to German
Jun 5, 2014

Dear all,
I've been playing with the parser for hours now and I'm getting nowhere.
A customer supplies their own type of XML file containing comments which are not to be translated.
The comments are marked like this:
<draft-comment> ...blah... </draft-comment>

With the parser I have so far managed to mark the tags themselves as "not translatable" and hide them, but the text between the tags - the actual comments themselves - is still there, very much editable and still included in the word count.

Unfortunately I can't make head or tail of the help documents.

Any tips would be very welcome...

With thanks in advance,
-Mike

[Edited at 2014-06-05 13:32 GMT]


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:41
English
Insufficient information Jun 5, 2014

Hi,

Clearly you understand how to make an element non-translatable, so the problem you are encountering is more likely to be based on the nesting of elements where a parent element is affecting the behaviour of the child... or something like this.

If you can share more of the file I'm sure someone can help you.

Regards

Paul


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:41
English
ok... on rereading your.... Jun 5, 2014

... post more carefully. I think the issue is simply that you need to make these tags inline and non-translatable. This should place a lock around them and allow you to copy them over from the source in one go as a complete item.

Regards

Paul


 

Marketing-Lang.  Identity Verified
Germany
Local time: 02:41
English to German
TOPIC STARTER
copy them over from the source in one go? Jun 5, 2014

Hi Paul,
Thanks for your input.

Am I laboring under a misconception? You write "copy them over in one go", which sounds like the comments will still appear to the translator, even if the content is not intended for translation. Is that right? So I could then somehow lock the content (preferably automatically) so that I don't translate it by mistake?

If I set the tag type to "in line", the comments still appear and they can be translated - but at least everything between the start and end of the comment is bundled into a single segment and no segmentation (e.g. after full stops) occurs.

If I set the tag type to "structure" I can at least set the structure information properties to "Comment", and the editor marks the segment with "COM", which is at least a hint.

But since these comments are not meant for the translator at all, I would prefer to have them hidden entirely and discounted from the analysis.

-Mike-


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:41
English
This is why... Jun 5, 2014

... it would really help to see more of the XML file you are translating. I will try and mock something up to cover a few examples but you are not helping when we're guessing about how the overall content is structured.

Regards

Paul


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:41
English
In the absence of any idea about the file... Jun 5, 2014

... here's an example of how it works. I have a file that looks like this:


I deliberately tried to paint a few different scenarios in one.

  1. First, separate elements. Clear and straightforward.
  2. Second, inline tags but the text either side should ideally be segmented. So it should be external and not inline - but I'll ignore this.
  3. Third, inline tags where it really is an inline tag.

I create a filetype and keep default undefined structure for each one and just make the draft-comment tags "Not translatable":


When this opens in Studio I get just that. All the draft-comment tags are excluded. But my last example is segmented when it really should not be because the last one really is inline.


So in order to make sure segments #5 and #6 are joined I have to make the draft-comment tag inline:


This time I open the file in Studio and it looks like this:


So now they are inline, and the comments can be seen but they are locked. So if I copy the tag over and then try to edit it I see this:


In this example I would have the problem of how to deal with segment #3 now because this should really be two segments. Making the tag inline joins them together because this is what inline is meant to do. If the content is really prepared this way with all the examples I have here in one file then you should probably go back to your client because this would be pretty poor.

Finally the analysis ignores the non-translatable content as follows:


The inline content is treated as a tag, and also as a placeable. The word count is 19, which is correct.
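For anyone who wants to sanity-check a count like this outside Studio, here is a minimal Python sketch. It assumes a simple whitespace-based word count, which is not necessarily how Studio counts, and the sample XML is a hypothetical mock-up rather than the file from the screenshots. It skips everything inside `draft-comment` (including nested elements) but keeps the text that follows the closing tag.

```python
import xml.etree.ElementTree as ET

SKIP_TAG = "draft-comment"  # element whose content should not be counted

# Hypothetical mock-up, not the file used in the example above.
SAMPLE = """<doc>
  <p>First paragraph to translate.</p>
  <draft-comment>Please check this.</draft-comment>
  <p>Text before <draft-comment author="jdoe">is this right?</draft-comment> and text after.</p>
</doc>"""


def translatable_chunks(elem):
    """Yield text chunks in document order, skipping SKIP_TAG subtrees."""
    if elem.tag != SKIP_TAG:
        if elem.text:
            yield elem.text
        for child in elem:
            yield from translatable_chunks(child)
    # The tail is the text after the closing tag; it belongs to the parent,
    # so it is kept even when the element itself is skipped.
    if elem.tail:
        yield elem.tail


def count_words(xml_text):
    """Count whitespace-separated words outside SKIP_TAG elements."""
    root = ET.fromstring(xml_text)
    return len(" ".join(translatable_chunks(root)).split())


word_count = count_words(SAMPLE)
print(word_count)  # prints 9
```

The attribute on the second comment makes no difference here: the element is matched by name only.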

This example is obviously completely oversimplified, but unless it actually helped you, it's pretty hard to know what your problem really is without more information. A mocked-up file we can look at, or even one of your own files, would do.

Regards

Paul


 

Marketing-Lang.  Identity Verified
Germany
Local time: 02:41
English to German
TOPIC STARTER
Terrific! Jun 6, 2014

Hi Paul,
wow, I can't tell you how much I appreciate your help here. Your mini-tutorial is fantastic, thanks!

I have in the meantime found (a part of?) the problem - the draft comments are (sometimes) supplemented with the author's name (J. Doe) as follows:

<draft-comment author="jdoe">Please check, is this right?</draft-comment>

I've tried using the XPath wildcards (i.e. in the form "//draft-comment*" and "//draft-comment@*"), but the comments are then not recognized as such at all!

Here's a sample of the format, in this case without the author name and with elements *within* the comments:

<paramsheet id="this" xml:lang="de">

<parambody>
...
<valuelist>
<draft-comment><codeph>valuelist</codeph> ... (Inhalt) ... <codeph>arglist</codeph> und <codeph>specialval</codeph> im Abschnitt <codeph>valuefield</codeph>.</draft-comment>

It's a bit of a beast :-/
-Mike-


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:41
English
Attribute values Jun 6, 2014

Hello Mike,

What's the use case for the attribute author? Unless it is going to determine whether you translate or not, you can ignore it altogether.

Does the author's name have any particular relevance? So if it's jdoe you translate it, and if it's anyone else you don't? Or maybe you have a variety of names you must translate and others you don't?

You can do whatever you like here, but you need to know what you want in order to create the rule.

Regards

Paul


 

Marketing-Lang.  Identity Verified
Germany
Local time: 02:41
English to German
TOPIC STARTER
Want to ignore comments Jun 6, 2014

Hi Paul,
For this customer in general, the comments are irrelevant to the translation. Background: I translate iterations of the (preliminary) documentation to achieve as near as possible simultaneous publication in both languages. The comments are purely for the end client during the release cycles of the source-language documentation.

Basically, whoever the author, anything marked "draft-comment" has to be ignored.

I tried to achieve this by using the XPath wildcards, but I'm not getting the desired result.

-Mike


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:41
English
In this case... Jun 6, 2014

... you don't need to do anything. It's enough to say the draft-comment should be non-translatable. The attribute has nothing to do with this at all unless it's relevant to something, which is why I asked.
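To illustrate why the attribute can be ignored, here is a small sketch using Python's standard `xml.etree.ElementTree`, which supports a subset of XPath. It is not Studio's parser, and the sample file is a hypothetical mock-up. In standard XPath, `*` is a name test on its own rather than a suffix wildcard, so "//draft-comment*" is not a valid way to say "with any attributes". The plain element name already matches the element whatever attributes it carries, and a predicate such as `[@author]` is only needed to narrow the match.

```python
import xml.etree.ElementTree as ET

# Hypothetical mock-up with one comment without and one with an author.
SAMPLE = """<topic>
  <draft-comment>No author here.</draft-comment>
  <draft-comment author="jdoe">Please check, is this right?</draft-comment>
  <p>Translate me.</p>
</topic>"""

root = ET.fromstring(SAMPLE)

# The element name alone matches every draft-comment, with or without attributes.
all_comments = root.findall(".//draft-comment")

# Predicates narrow the match; only needed if the attribute decides the rule.
with_author = root.findall(".//draft-comment[@author]")
by_jdoe = root.findall(".//draft-comment[@author='jdoe']")

print(len(all_comments), len(with_author), len(by_jdoe))  # prints 2 1 1
```

So for "ignore every draft-comment, whoever wrote it", a rule on the element name alone should be sufficient.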

Regards

Paul


 

Marketing-Lang.  Identity Verified
Germany
Local time: 02:41
English to German
TOPIC STARTER
The problem... Jun 6, 2014

...seems to relate to the XML itself, because different instances of <draft-comment> are being handled in different ways, even within the same file! So it's back to the customer, next...
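One way to see how the instances differ before going back to the customer is to list the context of each comment. The sketch below is only a diagnostic aid written with Python's standard `xml.etree.ElementTree`; the function name `comment_contexts`, the `p` element, and the sample file are all hypothetical. For every `draft-comment` it reports the parent element and whether the comment sits alone or in the middle of running text, which is the standalone versus inline distinction discussed earlier in the thread.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical mock-up; element names other than draft-comment and
# valuelist are invented for illustration.
SAMPLE = """<paramsheet>
  <valuelist>
    <draft-comment>Standalone note.</draft-comment>
  </valuelist>
  <p>Some text <draft-comment author="jdoe">inline note</draft-comment> more text.</p>
</paramsheet>"""


def comment_contexts(xml_text, tag="draft-comment"):
    """Count (parent tag, 'inline' or 'standalone') for each matching element."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    contexts = Counter()
    for node in root.iter(tag):
        parent = parent_of.get(node)
        if parent is None:  # the root itself is a comment; nothing to report
            continue
        siblings = list(parent)
        index = siblings.index(node)
        # Text directly before the comment: the parent's text or the previous sibling's tail.
        before = parent.text if index == 0 else siblings[index - 1].tail
        after = node.tail
        has_text_around = bool((before or "").strip() or (after or "").strip())
        contexts[(parent.tag, "inline" if has_text_around else "standalone")] += 1
    return contexts


contexts = comment_contexts(SAMPLE)
for (parent_tag, kind), count in sorted(contexts.items()):
    print(parent_tag, kind, count)
```

If the same file shows both kinds, that would fit the earlier point that a single inline or structure setting cannot suit every instance.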

OK Paul, thanks a million for your help so far. I've certainly learned something, and this forum means that others will too.

With best regards,
-Mike-


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:41
English
ok - maybe a couple of useful links... Jun 6, 2014

... I think they're useful anyway.

One on using XPath in XML files : http://wp.me/s2xDjK-xpath

One on some neat features using the XML filetype : http://wp.me/p2xDjK-Dl

And one loooong one on XML generally plus embedded content : http://wp.me/p2xDjK-Ff

Regards

Paul


 

