regular expression in Xbench
Thread poster: bourriquet

bourriquet
Poland
Local time: 14:33
English to Polish
Apr 25, 2015

Could someone help me create a regular expression for Xbench? I need this program to find the mistake where there's a redundant period between words in the target (i.e. "word.word") instead of a space (i.e. "word word"). I don't know much about regular expressions but I heard they could help find mistakes like these.

 

Mikhail Zavidin
Ukraine
Local time: 15:33
English to Russian
For example Apr 25, 2015

Hi!

bourriquet wrote:

Could someone help me create a regular expression for Xbench? I need this program to find the mistake where there's a redundant period between words in the target (i.e. "word.word") instead of a space (i.e. "word word"). I don't know much about regular expressions but I heard they could help find mistakes like these.


Something like that:

([a-zA-Z0-9]+)=1\.@1


Hope this helps.


 

Riccardo Schiaffino  Identity Verified
United States
Local time: 06:33
Member (2003)
English to Italian
Not sure that is what Bourriquet needs Apr 26, 2015

Mikhail Zavidin wrote:

Hi!

bourriquet wrote:

Could someone help me create a regular expression for Xbench? I need this program to find the mistake where there's a redundant period between words in the target (i.e. "word.word") instead of a space (i.e. "word word"). I don't know much about regular expressions but I heard they could help find mistakes like these.


Something like that:

([a-zA-Z0-9]+)=1\.@1


Hope this helps.


Mikhail,

I think that the regular expression you suggest would find not only periods between words (what Bourriquet needs), but also dots inside numbers.

I.e., your regular expression would not only flag

word.word,

but also

12.456,39

I would suggest instead a simple

[a-zA-Z]\.[a-zA-Z]
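
To illustrate the difference outside Xbench, here is a minimal Python sketch. It is only an approximation: Python's `re` module is not the Xbench engine, the Xbench-specific `=1` and `@1` parts of the first expression are left out, and the names `ALNUM_DOT`, `LETTER_DOT` and `flags` are made up for this example.

```python
import re

# Rough Python stand-ins for the two ideas; Xbench's own syntax differs.
ALNUM_DOT = re.compile(r"[a-zA-Z0-9]\.[a-zA-Z0-9]")  # letters or digits around the dot
LETTER_DOT = re.compile(r"[a-zA-Z]\.[a-zA-Z]")        # letters only around the dot


def flags(pattern, segment):
    """Return True if the pattern finds a hit anywhere in the segment."""
    return pattern.search(segment) is not None


for sample in ["word.word", "12.456,39", "word. Word"]:
    print(sample, flags(ALNUM_DOT, sample), flags(LETTER_DOT, sample))
```

In this sketch both patterns flag "word.word", only the letters-or-digits pattern flags the number, and neither flags a correctly spaced sentence break.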


 

Rolf Keller
Germany
Local time: 14:33
English to German
"A to Z" restricts the search to plain English Apr 26, 2015

I propose
([:letter:]+)[:punctuation:]([:letter:]+)

(No XBench here to test it, though.)


 

bourriquet
Poland
Local time: 14:33
English to Polish
TOPIC STARTER
no results so far Apr 26, 2015

Thank you very much for fast replies.
However, none of these work in Xbench after adding them to a checklist (checked against a Studio file with a "word.word" mistake in the target), but maybe I'm doing something wrong.
It would be great if someone could verify for themselves whether their regular expression works in Xbench, using a Studio file that contains this kind of mistake.

[Edited at 2015-04-26 10:22 GMT]


 

pep
Local time: 14:33
English to Spanish
Remember to set the Regular Expression search mode Apr 26, 2015

These expressions should both work:

[:letter:]+\.[:letter:]+
[a-z]+\.[a-z]+

but you should set the search mode as Regular Expression.


[Edited at 2015-04-26 13:10 GMT]


 

Mikhail Zavidin
Ukraine
Local time: 15:33
English to Russian
Seems to be a RegEx bug Apr 26, 2015

bourriquet wrote:

Could someone help me create a regular expression for Xbench? I need this program to find the mistake where there's a redundant period between words in the target (i.e. "word.word") instead of a space (i.e. "word word"). I don't know much about regular expressions but I heard they could help find mistakes like these.


The RegEx I suggested seems to work in plain text files (including UTF-8). It also works with Cyrillic characters: I tried ([а-я]+)=1\.@1.

With Trados Studio 2011 files, however, Xbench 2.9 behaves oddly. The expression doesn't work correctly even with Latin characters.

I suggest you report a bug to ApSIC. Or maybe it will work as expected in Xbench 3.0, since as far as I know that version supports Unicode.


 

Riccardo Schiaffino  Identity Verified
United States
Local time: 06:33
Member (2003)
English to Italian
A correction to my suggestion (and to Pep's) Apr 26, 2015

pep wrote:

These expressions should both work:

[:letter:]+\.[:letter:]+
[a-z]+\.[a-z]+



While [:letter:]+\.[:letter:]+ (or even just [:letter:]\.[:letter:]) works correctly,

[a-z]+\.[a-z]+ (or, for that matter, the regular expression I had suggested: [a-zA-Z]\.[a-zA-Z])

only works for non-accented characters.

So the solution for Bourriquet in Xbench should be

[:letter:]+\.[:letter:]+
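
The same limitation can be reproduced outside Xbench. The following Python sketch is an assumption-laden illustration, not Xbench behavior: Python has no `[:letter:]` class, so `[^\W\d_]` (a word character that is neither a digit nor an underscore) is used as a close approximation of "any letter", and the Polish sample segment is invented.

```python
import re

# ASCII-only range: accented letters fall outside a-z.
ASCII_ONLY = re.compile(r"[a-z]+\.[a-z]+")

# Approximation of "any letter" for Python 3 str patterns (an assumption,
# standing in for Xbench's [:letter:] class).
ANY_LETTER = re.compile(r"[^\W\d_]+\.[^\W\d_]+")

segment = "gęś.żółw"  # invented example with accented letters on both sides of the dot

print(ASCII_ONLY.search(segment))   # no hit: the letters next to the dot are not in a-z
print(ANY_LETTER.search(segment))   # hit: accented letters count as letters
```

Because the characters directly next to the dot are accented, the ASCII-only pattern finds nothing in this segment, while the letter-class approximation matches the whole "gęś.żółw" string.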


 

bourriquet
Poland
Local time: 14:33
English to Polish
TOPIC STARTER
now it works Apr 27, 2015

Thank you all for your contributions. I created a new checklist item with one of the regular expressions in the target column and I get results after selecting "Check ongoing translation..." This is what I wanted, thank you!

 

Oscar Martin
Spain
Local time: 14:33
English to Spanish
RegEx Apr 27, 2015

Hi,

While [:letter:]+\.[:letter:]+ will find the segments you're looking for, there are a few things to take into account:

- When using regular expressions, Xbench evaluates the regex from left to right. That is, if the first element is [:letter:]+, it will evaluate every segment that contains a sequence of one or more letters. If the Xbench project contains a large number of segments, the search may take several minutes.
- Search first for an element that must be present (or absent) and that is not a regex construct but a plain character, a word, etc. For instance, this search can be optimized as
"\." "[:letter:]+\.[:letter:]+"
as a PowerSearch.

First, it will discard all segments that do not contain a dot. Then, in the remaining segments only, it will look for two sequences of letters separated by a dot instead of a space.
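
The filter-first idea can be sketched in Python as follows. This is only an illustration of the principle, not how Xbench is implemented: the function name `find_suspects`, the sample segments, and the `[^\W\d_]` stand-in for `[:letter:]` are all assumptions made for the example.

```python
import re

# Approximation of [:letter:]+\.[:letter:]+ for Python (an assumption).
LETTER_DOT_LETTER = re.compile(r"[^\W\d_]+\.[^\W\d_]+")


def find_suspects(segments):
    """Yield segments that contain a dot squeezed between two letters."""
    for segment in segments:
        # Cheap literal test first: skip segments without any dot at all.
        if "." not in segment:
            continue
        # Only the remaining segments pay for the regex evaluation.
        if LETTER_DOT_LETTER.search(segment):
            yield segment


segments = ["No dot here", "Ends with a dot.", "word.word here"]
print(list(find_suspects(segments)))
```

The plain substring test is much cheaper than a regex search, so on a large project most segments are rejected before the regular expression is ever evaluated.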

Regards,

Oscar


 

Mikhail Zavidin
Ukraine
Local time: 15:33
English to Russian
Now it works for me too Apr 27, 2015

bourriquet wrote:

Thank you all for your contributions. I created a new checklist item with one of the regular expressions in the target column and I get results after selecting "Check ongoing translation..." This is what I wanted, thank you!


I have just figured it out: the segments in question must have Translated status in Trados Studio. Then everything works OK.


 

kirsty morgan  Identity Verified
Spain
Local time: 14:33
Spanish to English
XBench regular expressions tutorial? Oct 13, 2016

Does anyone know of a tutorial or guide that teaches the basic language of regular expressions as used by XBench? I currently use them to check punctuation differences but would like to find an easy-to-follow tutorial or guide to learn how to write more expressions. Most of the material I have found on regular expressions makes my head spin a little!
Any ideas gratefully received.


 

Dan Lucas  Identity Verified
United Kingdom
Local time: 13:33
Member (2014)
Japanese to English
Try RegexBuddy Oct 13, 2016

kirsty morgan wrote:
Any ideas gratefully received.

XBench's regex flavor is POSIX extended (ERE), about which more here. That site also has many useful-looking tutorials. However, might I suggest - as an independent but satisfied customer - that you invest just under 30 euro in RegexBuddy.

I have been using regexes for nearly two decades, starting from when I was doing a lot of Perl in the late 1990s, so I'm not a stranger to this arcane branch of text processing. But I don't use them every day and it's easy to forget the details, so when I'm trying to make up a new regex, unless it's really simple I reach for RegexBuddy first.

I find the menus, little symbols and text explanations make it far easier to build up a regex, brick by brick. Then you can test it right there with some text from your document.

Support is top-notch. RegexBuddy includes access to an excellent forum where the developer helps out with regex questions for free and typically replies within 24 hours.

Regards
Dan


 

