Embedded HTML in .xlsx file
Thread poster: Ian Giles

Ian Giles  Identity Verified
United Kingdom
Local time: 00:09
Member (2012)
Swedish to English
Mar 25, 2013

I am working on a website translation where the source text is provided in one column of an Excel file and the translation goes in the column beside it. As is often the case, the source contains HTML such as the following (< replaced with - to avoid posting issues):

-b-text-/b-
-strong-text-/strong-
-other tag-text-/other tag-

And so on. Additionally there are dynamic markers (or whatever) that look like this:

{0}
{9}

etc.

I must retain all of these things while translating. How do I get Studio 2011 to recognise these as tags and "lock" them to ensure I keep them in the translation? I've googled around and browsed the forums here and found a couple of suggestions regarding the embedded content settings for this Excel file type, but having implemented them without success (the HTML remains standard text) I'm at a loss. I suspect I'm being dense and missing an obvious issue, but pointers would be appreciated.

[Edited at 2013-03-25 18:30 GMT]


 

Grzegorz Gryc  Identity Verified
Local time: 01:09
French to Polish
DVX or memoQ Mar 26, 2013

iangiles wrote:

I must retain all of these things while translating. How do I get Studio 2011 to recognise these as tags and "lock" them to ensure I keep them in the translation? I've googled around and browsed the forums here and found a couple of suggestions regarding the embedded content settings for this Excel file type, but having implemented them without success (the HTML remains standard text) I'm at a loss.

Indeed, it's rather intractable.

I suspect I'm being dense and missing an obvious issue, but pointers would be appreciated.

Download a demo version of memoQ from http://kilgray.com or DVX from http://www.atril.com
Then you can use cascaded filters (memoQ) or the HTML content processing feature (DVX).
It takes just a few clicks.

Do you have to deliver .xlsx or .sdlxliff?

Cheers
GG


 

SDL Community  Identity Verified
United Kingdom
Local time: 01:09
English
It's not so hard... Mar 26, 2013

... and you can probably get away with one expression for the HTML tag pair content and then one more for the dynamic placeables, or whatever you call them. So I take an Excel file like this:


Then add these rules:


I open the excel file and see this:


So it's not too complicated, and the tag pair expression is a catch-all, so it should pick up anything written in this style. The important things to note are that you must have embedded content enabled, as shown in the second screenshot, and that you must select the "cell" reference for the Document Structure Information:


This question has been asked so many times, and answered, I'm quite surprised the only response you got was to use another tool!
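
As a rough illustration of the two kinds of rule described above (one for tags, one for the {n} markers), here is a small Python sketch. The patterns are assumptions made up for this example, not necessarily the expressions used in the Studio rules, and Python's `re` module is only standing in for Studio's regex engine. The helper names and the sample sentences are hypothetical too. The sketch lists the protected items in a source string and reports any that have gone missing from a translation:

```python
import re
from collections import Counter

# Assumed patterns for illustration only; not the rules from the screenshots.
OPEN_TAG = r"<[A-Za-z][^<>]*>"    # e.g. <b>, <strong>
CLOSE_TAG = r"</[A-Za-z][^<>]*>"  # e.g. </b>, </strong>
PLACEHOLDER = r"\{\d+\}"          # e.g. {0}, {9}

PROTECTED = re.compile("|".join([CLOSE_TAG, OPEN_TAG, PLACEHOLDER]))


def protected_tokens(text):
    """Return every tag or placeholder found in text, in order of appearance."""
    return PROTECTED.findall(text)


def missing_tokens(source, target):
    """Return protected items that occur more often in source than in target."""
    diff = Counter(protected_tokens(source)) - Counter(protected_tokens(target))
    return sorted(diff.elements())


# Made-up sample strings, not taken from the real file.
source = "Klicka på <b>Spara</b> för att spara {0} filer."
target = "Click <b>Save</b> to save files."

print(protected_tokens(source))
print(missing_tokens(source, target))
```

A check like this does not replace proper tagging in the editor, but it can serve as a quick sanity test on a delivered column.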

Regards

Paul


 

Ian Giles  Identity Verified
United Kingdom
Local time: 00:09
Member (2012)
Swedish to English
TOPIC STARTER
that seems to have done it Mar 27, 2013

Thanks Paul. That seems to have done the trick. I really don't know why it wasn't playing ball previously, but the step-by-step instructions were the fix.

One follow-up question. It seems to have made most of the tags invisible: if they surround a full segment (tag-segmenttext-tag), they are not visible in the editor, only when viewing a target preview. Only the tags that occur in the middle of a segment (such as one word being bolded) are shown. Is there any way to make all the HTML tags visible even if they don't have to be? (For peace of mind.)


 

Grzegorz Gryc  Identity Verified
Local time: 01:09
French to Polish
Attributes Mar 27, 2013

SDL Support wrote:

(...)

This question has been asked so many times, and answered, I'm quite surprised the only response you got was to use another tool!


Because in this way you don't handle tag attributes like alt.

Cheers
GG


 

SDL Community  Identity Verified
United Kingdom
Local time: 01:09
English
Showing tags Mar 27, 2013

Hi,

If you click on the rule and then Edit -> Advanced you can change the segmentation hint to Include and then all the tags will be inside the segment whether they can be excluded or not.

@GG
On the attributes... indeed you won't get these with these rules, but you can always add more. Obviously, the more complex the HTML, the harder it gets with this method, but we are talking about HTML inside an Excel file here and I don't think it often gets really complex. So I can take this:


Make it look like this:


Or if I want all the tags to show anyway like this:


Clearly, being able to simply run this through the HTML filter and still have the flexibility to handle the {1} tags, or whatever else wasn't HTML and needed to be a tag, would be better. But I think this is suitable for the majority of use cases.

How would you handle the html and the {1} tags in the other tools? Just curious as I have not played with these a lot... in fact I've never played with the DVX embedded content feature.

Regards

Paul


 

Grzegorz Gryc  Identity Verified
Local time: 01:09
French to Polish
{1} like tags... Mar 28, 2013

SDL Support wrote:

(...)

@GG
On the attributes... indeed you won't get these with these rules, but you can always add more. Obviously, the more complex the HTML, the harder it gets with this method, but we are talking about HTML inside an Excel file here and I don't think it often gets really complex.

(...)

Or if I want all the tags to show anyway like this:


Wow.
How did you manage to display "Smiley face" in blue?
Can you share the RegEx sets you used?
Or suggest to the marketing team that they bundle a programmer with every Studio license :)

Clearly, being able to simply run this through the HTML filter and still have the flexibility to handle the {1} tags, or whatever else wasn't HTML and needed to be a tag, would be better. But I think this is suitable for the majority of use cases.

How would you handle the html and the {1} tags in the other tools?
Just curious as I have not played with these a lot... in fact I've never played with the DVX embedded content feature.

RegEx.
But I can combine the HTML content processing and RegEx, so the RegEx rules are less complex.
The DVX help for RegEx basically doesn't exist, but you can find all the necessary info at
http://www.regular-expressions.info/
For memoQ, I would use cascaded filters, i.e. XLS(X), HTML, then RegEx.
I have done it a few times in the past; it's really simple, at least in my cases.
Generally memoQ is more user-friendly here; their RegEx generator is really nice, e.g. the basic RegEx syntax checking is awesome.
One of the main DVX/memoQ advantages is that testing is faster; reimporting a file into an existing Studio project is a PITA.
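
The cascade idea (cell text first, then HTML, then RegEx on whatever is left) can be tried outside any CAT tool. The Python sketch below is only an illustration of the principle, not how memoQ or DVX implement it: it assumes the cell text has already been read from the spreadsheet, uses the standard-library `html.parser` for the HTML step, and applies an assumed `{n}` pattern to the remaining text. The names `CellTagger` and `tag_cell` and the sample sentence are hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumed pattern for the {0}, {1}, ... markers.
PLACEHOLDER = re.compile(r"\{\d+\}")


class CellTagger(HTMLParser):
    """Second and third stage of the cascade: HTML first, then RegEx on the text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Keep the opening tag exactly as written, attributes included.
        self.parts.append(("tag", self.get_starttag_text()))

    def handle_endtag(self, tag):
        self.parts.append(("tag", f"</{tag}>"))

    def handle_data(self, data):
        # RegEx stage: split the plain text around the placeholders.
        pos = 0
        for match in PLACEHOLDER.finditer(data):
            if match.start() > pos:
                self.parts.append(("text", data[pos:match.start()]))
            self.parts.append(("placeholder", match.group(0)))
            pos = match.end()
        if pos < len(data):
            self.parts.append(("text", data[pos:]))


def tag_cell(cell_text):
    """Return (kind, value) pairs for one spreadsheet cell."""
    parser = CellTagger()
    parser.feed(cell_text)
    parser.close()
    return parser.parts


for kind, value in tag_cell("You have <b>{0}</b> new messages."):
    print(kind, repr(value))
```

Because the HTML stage deals with the markup, the RegEx stage only has to know about the `{n}` markers, which is why the RegEx rules can stay simple.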

BTW, recently I provided a simple scenario for the DVX {1} like codes in two column RTF EV:


 

SDL Community  Identity Verified
United Kingdom
Local time: 01:09
English
Regex Mar 29, 2013

Hi GG,

Very good... I'm not sure whether you're having fun or you really want to know. Sometimes it's hard to tell! But it may be helpful for others too. All you do is use the Edit button when you edit the regex rule, and you can set the formatting to whatever style you like. So you could make these <strong> and these <italic> and these <blue> if you like:


The rule could be as clever as you like to catch more variations but I kept it simple for this case, so the rules I used were these:


For this example I don't think memoQ or DVX could be any simpler, but I imagine more complex HTML would be easier with the cascading filters and regex tagger. And for testing... just use Open Document. All you do is this:

Ctrl+Shift+O
Double click file
Enter

Not too hard ;) Once the file type is set up the way you want, create your project. If the file was already part of a project that you created before getting the file type right, and it was already partially translated, then you would probably want to add the file to that project rather than use Open Document. That would be a PITA, as you'd have to copy the corrected settings into the project's file type settings and then add the file to the project again.

Regards

Paul


 

EGolubtsova
Local time: 02:09
English to Russian
Regex excluding text for 'title' attribute Jul 16, 2013

Hi Paul,

Thank you for your examples of how to handle tags in Excel files with the regex rules, they are really helpful.

Is there any way to make the text of a 'title' attribute in the below example available for translation?

E.g. in the string "abbr title="World Health Organization"WHO/abbr was founded in 1948." (the tags are in bold, because I cannot add the angled quotes) the text "World Health Organization" should also be translatable along with WHO.

I'm not really good with regex, could you please help? Thank you very much in advance!


 

SDL Community  Identity Verified
United Kingdom
Local time: 01:09
English
For this... and perhaps other similar... Jul 17, 2013

EGolubtsova wrote:

Hi Paul,

Thank you for your examples of how to handle tags in Excel files with the regex rules, they are really helpful.

Is there any way to make the text of a 'title' attribute in the below example available for translation?

E.g. in the string "abbr title="World Health Organization"WHO/abbr was founded in 1948." (the tags are in bold, because I cannot add the angled quotes) the text "World Health Organization" should also be translatable along with WHO.

I'm not really good with regex, could you please help? Thank you very much in advance!


... examples you could do this:


Using these expressions:

Tag Pair:

Start Tag:
<(\w|\s)+="

End Tag:
">


Placeholder:

Start Tag:
</abbr>


But it may depend on what else you have in there. So this is based on this example and this alone. If it doesn't work for you then I'd need to see what else you are doing... and I can't do this for a week or so as I'm on leave and killing time before I leave today :)
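
To see what the three expressions above pick out of the example sentence, here is a small Python sketch. Python's `re` module is used as a stand-in for Studio's regex engine, which is an assumption, and the inner group of the start tag expression is made non-capturing purely for the demo. The helper `split_segment` is a hypothetical name.

```python
import re

START_TAG = r'<(?:\w|\s)+="'   # same as <(\w|\s)+=" but with a non-capturing group
END_TAG = r'">'
PLACEHOLDER = r'</abbr>'

TOKEN = re.compile(
    f"(?P<start>{START_TAG})|(?P<end>{END_TAG})|(?P<placeholder>{PLACEHOLDER})"
)


def split_segment(text):
    """Split text into (kind, value) pairs: matched markup and translatable text."""
    parts = []
    pos = 0
    for match in TOKEN.finditer(text):
        if match.start() > pos:
            parts.append(("text", text[pos:match.start()]))
        parts.append((match.lastgroup, match.group(0)))
        pos = match.end()
    if pos < len(text):
        parts.append(("text", text[pos:]))
    return parts


sample = '<abbr title="World Health Organization">WHO</abbr> was founded in 1948.'
for kind, value in split_segment(sample):
    print(f"{kind:12} {value!r}")
```

In this sketch the attribute value sits between the start and end matches as ordinary text, so it stays translatable along with WHO. Note that the start tag expression would expose the value of any attribute, not only title.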

Regards

Paul


 

EGolubtsova
Local time: 02:09
English to Russian
Thank you Jul 17, 2013

Thank you Paul!

While waiting for your reply I experimented and changed the order of rules and added two rules for the attributes I need to localize, so it looks like this now:

[Screenshot: 2013-07-17_16-39_ProjectSettings_zps00977898.jpg]

As far as I can see, it works the same, and the ending tag 'abbr' is treated as a separate placeholder. With the same rules in a different order it behaved differently, so everyone should probably take rule order into account as well.

As there are other attributes that have to stay in English, I think I'll stick to that configuration for now :)

Thank you anyway, and have a beautiful vacation!

Best regards,
Ekaterina


 

Brossard Mael
Local time: 01:09
English to French
HTML tags in Trados Studio 2011 Nov 13, 2013

Hello,

I'm having some trouble with html tags embedded in xls files.

I am using the following rules:



I find that everything seems to work, except that the br tags tend to be treated as tags and not placeable if other tags are present in the segment.

For example



Any idea what I should do?

Thanks!

Mael

[Edited at 2013-11-13 21:04 GMT]

[Edited at 2013-11-13 22:42 GMT]


 

SDL Community  Identity Verified
United Kingdom
Local time: 01:09
English
If this is still a problem... Nov 21, 2013

... can you show the full tag details so we can see what the text was?

Regards

Paul


 

Brossard Mael
Local time: 01:09
English to French
Still weird... Nov 28, 2013

Hello Paul,

Thanks for your help.

The main issue is that "br" is treated as a tag and not a placeable, like in the following simple example:

[Screenshot: 6Hy8G5z.png]

Trados then just "closes" br with any closing tag following.

Thank you!

Mael


 

SDL Community  Identity Verified
United Kingdom
Local time: 01:09
English
Generic rules Nov 28, 2013

Hi Mael,

The rules you are using are the generic ones I used to demonstrate a catch-all earlier on. It may be that your specific use case makes them inappropriate and you need to tackle it in a different way.

If you have an example file you can share I can take a look and make a suggestion?

Regards

Paul
pfilkin@sdl.com


 

