Translate Joomla! export (import jDiction XLIFF file)
Thread poster: Languageman

Languageman  Identity Verified
United Kingdom
Local time: 05:32
German to English
Jun 2, 2016

Hi,

I've recently invested in a new website built on Joomla! because it has good support for multilingual sites. The plan is to use the jDiction plugin to export to XLIFF in order to translate the site with an existing TM.

Unfortunately, when jDiction creates source segments it does not segment on punctuation, but rather puts the entire content of each page into a single segment. The file imports fine, but because there is so much in each segment most of them don't come up with TM matches (only the headings work as you'd hope). An example of how this looks when imported to MemoQ is shown at the bottom of this post.

Can anyone advise how I can get MemoQ to show me segments at sentence level so I can use my TMs? Or recommend an alternative way of extracting the Joomla pages for translation?

Thanks and kind regards,

Stephen



Stanislav Okhvat
Local time: 07:32
English to Russian
Use XLF filter options Jun 2, 2016

Hi Stephen,

Use the Import with Options command instead and activate the cryptic "Segment text if no <seg-source> is present for a 'trans-unit'" option. Also add the Regex tagger as a cascading filter with the expression <[^>]+> to turn HTML tags into memoQ tags (or run the Regex tagger on the document after import).
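
To see what the <[^>]+> expression picks up before setting it in memoQ, here is a minimal Python sketch. It only illustrates the pattern itself, not how memoQ applies it internally; the sample string and the find_tags helper are made up for this illustration.

```python
import re

# The same expression suggested for the Regex tagger: a "<", then one or
# more characters that are not ">", then the closing ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def find_tags(text):
    """Return every substring the pattern would treat as a tag."""
    return TAG_PATTERN.findall(text)


# Hypothetical page content, similar in shape to what ends up in one
# trans-unit when a whole Joomla! article is exported as a single segment.
sample = "<h4>Our services</h4><p>We translate <b>websites</b>.</p>"

tags = find_tags(sample)
# Replacing the tags with spaces shows the translatable text left over.
text_only = TAG_PATTERN.sub(" ", sample)

print(tags)
print(text_only)
```

Note that the pattern matches anything between angle brackets, so a literal "<" followed later by ">" in ordinary text would also be tagged.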

Best regards,
Stanislav



Languageman  Identity Verified
United Kingdom
Local time: 05:32
German to English
TOPIC STARTER
Part way there Jun 2, 2016

Hi Stanislav,

Thanks for the suggestions. I implemented the two points in several combinations ("cryptic option" only, Regex Tagger only, "Cryptic Option" + regex tagger).

All resulted in warnings along the lines of: "Segmentation in source and target content may have resulted in a different number of segments in the following trans-unit elements:"

Some of the target segments were split better, but others were not, and the changes to source and target were not the same (see image at end of post).

I can think of a couple of things that might be contributing to the continuing problems:

1/ The XLIFF contains identical text in source and target for some reason, not blank targets.
2/ I haven't set up the Regex Tagger correctly (first time using this) - I copied the settings in the image at the bottom

Thanks for any further suggestions you can offer.

Best wishes, Stephen





Stanislav Okhvat
Local time: 07:32
English to Russian
Re: Part way there Jun 3, 2016

Hello again, Stephen,

The settings for the cascading filter look correct.

In my opinion, the problem lies in two areas:
1) The XLIFF file is set up in a specific way which causes memoQ to display these warnings.
2) The segments contain structural HTML elements such as Level 4 headings (h4), paragraphs (p), etc., all in the same segments. If you used the HTML cascading filter, memoQ would split the segments by structural boundaries. However, there is no way to use the HTML cascading filter after the XLIFF filter (only Regex tagger is supported as the cascading filter for the XLIFF filter), so we cannot split the text by boundaries of HTML structural elements. The only way to ensure proper segmentation is for you to split the segments manually in memoQ.

You can send me the document privately at stasokhvat AT gmail DOT com so I can have a better look.

Also, it may be possible to find a toolkit for XLIFF file processing that will clear the translation (target) from your files. Ideally, this is what your XLIFF file records should look like (note that the target is empty):

<trans-unit id='p137'>
<source>RIL – Mega PP </source>
<target></target>
</trans-unit>

Best regards,
Stanislav Okhvat
TransTools – Useful tools for every translator



