Translate Joomla! export (import jDiction XLIFF file)
Thread poster: Languageman

Languageman  Identity Verified
United Kingdom
Local time: 03:38
German to English
Jun 2, 2016

Hi,

I've recently invested in a new website built on Joomla! because it has good support for multilingual sites. The plan is to use the jDiction plugin to export to XLIFF in order to translate the site with an existing TM.

Unfortunately, when jDiction creates source segments it does not segment on punctuation, but instead puts the entire content of each page into a single segment. The file imports fine, but because each segment contains so much text, most of them get no TM matches (only the headings work as you'd hope). An example of how this looks when imported into memoQ is shown at the bottom of this post.

Can anyone advise how I can get memoQ to show me segments at sentence level so I can use my TMs? Or recommend an alternative way of extracting the Joomla! pages for translation?

Thanks and kind regards,

Stephen

Import_j_Diction_XLIFF.png


 

Stanislav Okhvat
Local time: 05:38
English to Russian
Use XLF filter options Jun 2, 2016

Hi Stephen,

You should use the Import with Options command instead and activate the cryptic "Segment text if no <seg-source> is present for a 'trans-unit'" option. Also add the Regex tagger as a cascading filter with the expression <[^>]+> in order to turn HTML tags into memoQ tags (or run the Regex tagger on the document after import instead).
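As a rough illustration of what the expression <[^>]+> picks out, here is a minimal Python sketch. It is not how memoQ implements the Regex tagger; the `split_tags` helper and the sample text are made up for the example. The pattern matches a "<", then one or more characters that are not ">", then a ">".

```python
import re

# The expression from the post: "<", one or more non-">" characters, ">".
TAG_PATTERN = re.compile(r"<[^>]+>")


def split_tags(text):
    """Return (tags, plain_text) for a piece of HTML-bearing segment text.

    Hypothetical helper for illustration only.
    """
    tags = TAG_PATTERN.findall(text)      # everything that would become a tag
    plain = TAG_PATTERN.sub("", text)     # what is left as translatable text
    return tags, plain


# Made-up sample resembling a page body exported as one segment.
sample = "<h4>Heading</h4><p>First sentence. Second sentence.</p>"
tags, plain = split_tags(sample)
print(tags)
print(plain)
```

Note that the expression treats every tag the same way, so it only tags the markup; it does not split the text at headings or paragraphs.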

Best regards,
Stanislav


 

Languageman  Identity Verified
United Kingdom
Local time: 03:38
German to English
TOPIC STARTER
Part way there Jun 2, 2016

Hi Stanislav,

Thanks for the suggestions. I tried the two points in several combinations (the "cryptic" option only, the Regex tagger only, and the "cryptic" option plus the Regex tagger).

All resulted in warnings along the lines of: "Segmentation in source and target content may have resulted in a different number of segments in the following trans-unit elements:"

Some of the target segments were split better, but others were not, and the changes to source and target were not the same (see image at end of post).

I can think of a couple of things that might be contributing to the continuing problems:

1/ For some reason, the XLIFF contains identical text in source and target rather than blank targets.
2/ I haven't set up the Regex tagger correctly (this is my first time using it). I copied the settings shown in the image at the bottom.

Thanks for any further suggestions you can offer.

Best wishes, Stephen

2016_06_02_20_54_10_memo_Q_Joomla_Chinese_Omflo.jpg
2016_06_02_21_37_05_memo_Q_Joomla_Chinese_Omflo.png


 

Stanislav Okhvat
Local time: 05:38
English to Russian
Re: Part way there Jun 3, 2016

Hello again, Stephen,

The settings for the cascading filter look correct.

In my opinion, the problem lies in two areas:
1) The XLIFF file is set up in a specific way which causes memoQ to display these warnings.
2) The segments contain structural HTML elements such as Level 4 headings (h4), paragraphs (p), etc., all in the same segments. If you used the HTML cascading filter, memoQ would split the segments by structural boundaries. However, there is no way to use the HTML cascading filter after the XLIFF filter (only Regex tagger is supported as the cascading filter for the XLIFF filter), so we cannot split the text by boundaries of HTML structural elements. The only way to ensure proper segmentation is for you to split the segments manually in memoQ.
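To make "splitting by structural boundaries" concrete, here is a minimal Python sketch of the idea. It is only an illustration and not a memoQ feature; the list of block-level tags, the `split_on_blocks` helper and the sample text are all assumptions. Inline tags such as <b> stay inside their piece.

```python
import re

# Hypothetical list of block-level tags; adjust it to the actual export.
BLOCK_TAGS = r"(?:h[1-6]|p|div|li)"
# Matches an opening or closing block-level tag, with optional attributes.
BOUNDARY = re.compile(rf"</?{BLOCK_TAGS}\b[^>]*>", re.IGNORECASE)


def split_on_blocks(html):
    """Split text at block-level tags and drop empty pieces."""
    return [part.strip() for part in BOUNDARY.split(html) if part.strip()]


# Made-up sample: a heading and two paragraphs stored in a single segment.
sample = "<h4>Products</h4><p>We sell <b>pumps</b>.</p><p>Contact us.</p>"
pieces = split_on_blocks(sample)
print(pieces)
```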

You can send me the document privately at stasokhvat AT gmail DOT com so I can have a better look.

Also, it may be possible to find a toolkit for XLIFF file processing that will clear the translation (target) from your files. Ideally, this is how the records in your XLIFF file should look (note that the target is empty):

<trans-unit id='p137'>
<source>RIL – Mega PP </source>
<target></target>
</trans-unit>
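If no ready-made toolkit turns up, a short script could empty the targets. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`, assuming the file is well-formed XML. The `clear_targets` function and the sample document are made up for illustration, and the real jDiction export may carry namespaces and attributes not shown here. Work on a copy of the file.

```python
import xml.etree.ElementTree as ET


def clear_targets(xliff_text):
    """Return the XLIFF text with every <target> element emptied."""
    root = ET.fromstring(xliff_text)
    for elem in list(root.iter()):
        # Strip any "{namespace}" prefix so this works with or without one.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "target":
            # Remove inline child elements and text; attributes are kept.
            for child in list(elem):
                elem.remove(child)
            elem.text = None
    # Caveat: ElementTree may rewrite namespace prefixes on output.
    return ET.tostring(root, encoding="unicode")


# Made-up sample with the target duplicating the source.
sample = (
    "<xliff><file><body>"
    "<trans-unit id='p137'>"
    "<source>RIL – Mega PP </source>"
    "<target>RIL – Mega PP </target>"
    "</trans-unit>"
    "</body></file></xliff>"
)
cleared = clear_targets(sample)
print(cleared)
```

The source text is left untouched; only the content of each target element is removed.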

Best regards,
Stanislav Okhvat
TransTools – Useful tools for every translator


 

