Filter Segments in TMX File that contain more than e.g. 5 Words
Thread poster: Sarah Jackowski
Sarah Jackowski
Germany
Local time: 15:18
English to German
Sep 18, 2014

Hi there,

I have a TMX export of a software user interface, and I would like to create a glossary or termbase from it for translators' reference.
However, I would like to reduce the number of segments by removing all segments with more than 5 words in the source text.

I'm open to any tool, but I thought I could do this with Olifant or Trados Studio 2014...

Olifant:
When I choose "View > Filter Settings" there are a few example filters, and one of them already comes close to what I want:

"Source Text is longer than 255 characters"
The condition for it looks like this:
LEN(Text_DE_DE) > 255

It would be great if I could adapt this rule to "Source Text is longer than 5 words".
I also tried to work with the "longer than 255 characters" filter, turning it down to "longer than 30 / 40 characters", but I was not satisfied with the result. I think for my needs it's best to filter by the number of words.
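
To illustrate why a character limit is a poor stand-in for a word limit, here is a small Python sketch. The two sample segments are made up for illustration and do not come from the TMX in question; word counting is assumed to mean splitting on whitespace.

```python
def char_length(segment):
    # Number of characters, which is what a LEN(...) style filter compares.
    return len(segment)


def word_count(segment):
    # Number of whitespace-separated words.
    return len(segment.split())


# Hypothetical UI strings, chosen only to show the mismatch.
segments = [
    "Benutzerkontensteuerung deaktivieren",
    "Do you want to save it now?",
]

for segment in segments:
    print(char_length(segment), word_count(segment), segment)
```

The first segment has 2 words but 36 characters, the second has 7 words but only 27 characters, so a "longer than 30 characters" filter would drop the term-like entry and keep the full sentence.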


In Trados Studio 2014 there is a similar feature in the Translation Memories tab.
I choose the TM, create a filter and a condition
"Source Segment" + "greater than", but I have no idea what kind of value to enter then... obviously, just writing "5 words" wouldn't work...

Thanks a lot in advance for your help and happy translating



SDL Community  Identity Verified
United Kingdom
Local time: 15:18
English
If you use Studio... Sep 18, 2014

... and you seem to, then maybe try this approach, as it's fairly straightforward.

1. Convert the TMX to Excel using the Glossary Converter on the OpenExchange
2. Sort your rows in Excel by length, or use a formula to find the ones with 5 words or fewer
3. Remove the rows you don't want in Excel
4. Convert the Excel to a termbase using the Glossary Converter

Should be fairly simple I think.
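
For anyone who would rather script this than go through Excel, the filtering step could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions: it uses the standard library's `xml.etree.ElementTree`, assumes the usual TMX layout (`tu` elements under `body`, each with `tuv` variants holding a `seg`), and counts words by splitting on whitespace. The function name `filter_tmx`, the sample data and the language codes are made up for illustration; it has nothing to do with how Studio or the Glossary Converter work internally.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny made-up TMX sample, for illustration only.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en-US"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Save as</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Speichern unter</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en-US"><seg>The file could not be saved to disk.</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Die Datei konnte nicht gespeichert werden.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def word_count(text):
    # Words are taken to be whitespace-separated chunks.
    return len(text.split())


def filter_tmx(tmx_text, source_lang, max_words=5):
    """Return the TMX with every unit removed whose source has more than max_words words."""
    root = ET.fromstring(tmx_text)
    body = root.find("body")
    for tu in list(body.findall("tu")):
        for tuv in tu.findall("tuv"):
            # Older TMX versions use a plain "lang" attribute instead of xml:lang.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            if lang.lower() != source_lang.lower():
                continue
            seg = tuv.find("seg")
            # itertext() also picks up text nested inside inline tags.
            text = "".join(seg.itertext()) if seg is not None else ""
            if word_count(text) > max_words:
                body.remove(tu)
            break
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(filter_tmx(SAMPLE_TMX, "en-US", max_words=5))
```

With the sample above, the "Save as" unit is kept and the eight-word sentence is dropped. For a real export you would read the file contents first and write the result back to a new file, then convert that to a termbase.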

Regards

Paul


Sarah Jackowski
Germany
Local time: 15:18
English to German
TOPIC STARTER
Tried that before, not happy with Excel filtering options Sep 18, 2014

Hi Paul,

thanks, but I already tried #2 in Excel and could not work out what kind of formula I need. I'm not a pro when it comes to creating regular expressions and stuff like that...


Sarah Jackowski
Germany
Local time: 15:18
English to German
TOPIC STARTER
Glossary Converter in the Open Exchange App Store Sep 18, 2014

By the way,

I see two versions of the Glossary Converter on the OpenExchange App Store, a Charity Edition and a Free Edition, but it seems I cannot access the Free Edition...

I wanted to have a look at the Free Edition, but its product page won't open; instead I am forwarded to a general page with solutions for freelancers, translation agencies etc.

Not that I wouldn't want to pay for the Charity Edition, but I'd like to have a look at the app's reviews, which are not available on the Charity Edition's overview page...



SDL Community  Identity Verified
United Kingdom
Local time: 15:18
English
Excel and stuff... Sep 18, 2014

Hi Sarah,

I used this formula:

=IF(LEN(TRIM(A1))=0,0,LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1)

I attached a spreadsheet here that will show you how to use it: https://www.dropbox.com/s/i10maoi58q2pyfx/count.xlsx?dl=0
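
In case it helps to see what the formula is doing: it counts the spaces left after trimming and adds one. Below is a rough Python equivalent, purely as an illustration. The helper names `excel_trim` and `word_count_like_formula` are made up, and the assumption is that Excel's TRIM removes leading and trailing spaces and collapses repeated spaces into one (ordinary spaces only, so non-breaking spaces would not be caught).

```python
def excel_trim(text):
    # Mimics TRIM: drop leading/trailing spaces, collapse runs of spaces to one.
    return " ".join(part for part in text.split(" ") if part)


def word_count_like_formula(cell):
    # =IF(LEN(TRIM(A1))=0,0,LEN(TRIM(A1))-LEN(SUBSTITUTE(A1," ",""))+1)
    trimmed = excel_trim(cell)
    if len(trimmed) == 0:
        return 0
    # Length with single spaces minus length without any spaces = number of gaps,
    # and the number of words is the number of gaps plus one.
    return len(trimmed) - len(cell.replace(" ", "")) + 1


if __name__ == "__main__":
    for cell in ["Save as", "  Open   the file  ", "Cancel", ""]:
        print(repr(cell), word_count_like_formula(cell))
```

Once the count is in a helper column, you can filter or sort on it and delete every row where the value is greater than 5.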

On the links... I'm not sure why you can't get to it. It might be related to you having to tell the site what kind of user you are (I recall seeing a problem like this for someone else), so a Freelancer, an LSP or a Corporate... it can then display different content. Maybe look here, which is the direct link to the developer's site, where you can download it anyway: http://www.cerebus.de/glossaryconverter/

Regards

Paul


Sarah Jackowski
Germany
Local time: 15:18
English to German
TOPIC STARTER
Finally... Worked out Sep 18, 2014

Hi Paul,

thank you very much. I finally managed to get those things sorted in Excel.

I will check out the link for the Glossary Converter.
When I tell the site that I am a company, I am forwarded to the Trados Studio 2014 product information and not to the Glossary Converter in the App Store.

[Edited at 2014-09-18 12:42 GMT]

[Edited at 2014-09-18 12:54 GMT]



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 14:18
Member (2009)
Dutch to English
PS: Jun 10

You can also easily sort columns in an Excel file by text length (or anything else you want), using http://www.asap-utilities.com/

See e.g.: http://www.asap-utilities.com/blog/index.php/2013/04/23/tip-sort-your-data-on-anything-you-can-think-of/

Michael

