Find and Replace with Regular Expressions in Studio
Thread poster: Jacques DP

Jacques DP  Identity Verified
Switzerland
Local time: 02:20
Member (2003)
English to French
Jul 14, 2011

Hello there,

I am familiar with regular expressions and I tried to use them with the Find/Replace function in Studio. However I am not sure how to refer, in the Replace field, to the expressions matched in the Find field.

For example, suppose I want to change

$4

to

4 $

In the Find field I might use

\$([0-9]+)

Then in the Replace field I would like to write something like

\1 $

However, "\1" is not understood by the system as a reference to the first bracketed expression; instead it is inserted literally.
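For comparison, here is a minimal sketch of the behaviour being asked for, written with Python's standard `re` module rather than Studio. The function name `move_dollar_after` is made up for illustration; only the Find pattern and the `\1 $` replacement come from the post above.

```python
import re

# Find pattern from the post: a literal "$" followed by one or more digits.
PRICE = re.compile(r"\$([0-9]+)")


def move_dollar_after(text: str) -> str:
    """Rewrite "$4" as "4 $" (hypothetical helper, not a Studio function)."""
    # \1 in the replacement refers back to the first bracketed group (the digits).
    return PRICE.sub(r"\1 $", text)


print(move_dollar_after("It costs $4, or $125 in total."))
# It costs 4 $, or 125 $ in total.
```

This is the kind of backreference the Replace field would need to support for the one-step solution to work.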

Any idea? Thank you.


 

Emma Goldsmith  Identity Verified
Spain
Local time: 02:20
Member (2010)
Spanish to English
find&replace in Studio Jul 14, 2011

I asked a similar question here:
http://www.proz.com/forum/sdl_trados_support/197884-using_findreplace_to_change_commas_to_periods.html
and Istvan explained that "\" doesn't work in find&replace in Studio. :(


 

Jacques DP  Identity Verified
Switzerland
Local time: 02:20
Member (2003)
English to French
TOPIC STARTER
Thanks Jul 14, 2011

Thanks Emma!

That's a strange oversight, and it seems to greatly reduce the usefulness of regular expressions in a find-replace function. Also, the lookahead/lookbehind mechanism doesn't seem to help in my case.


 

FarkasAndras
Local time: 02:20
English to Hungarian
+ ...
Lookahead/lookbehind Jul 14, 2011

... should work. You just have to do it in two steps.
First, add the $ to the end with something like:
(Quote my post to see the regex, I gave up on fighting the forum)

replace '(?\$


 

Jacques DP  Identity Verified
Switzerland
Local time: 02:20
Member (2003)
English to French
TOPIC STARTER
Lookahead Jul 14, 2011

I reasoned in the same way and thought it would work, but it didn't, and unfortunately I can't afford to spend more time researching this right now. I noted in particular that putting only a lookbehind construct in Find (with no main regular expression) doesn't work as expected: it matches (and replaces) one character, which I think should not be the case. My guess is that the implementation is not very general and that they made assumptions about how people would use it.

 

István Hirsch  Identity Verified
Local time: 02:20
English to Hungarian
In Replace... Jul 14, 2011

… Studio does not remember what it found during the Find operation, so you cannot use "\".

But you can use a workaround for this special case, if: the numbers are directly preceded by a "$" and followed by a space character, and do not have any decimal separator (including space). Working out a more general solution would be much more challenging.

1. Find: (?<=\$[0-9]+)_ Replace with: _$_ ("Use": Regular expressions; note the spaces, written here as _)

2. Find: \$(?=[0-9]) Replace with nothing.

In the 1st step you find the spaces that are preceded by a number preceded by "$" and replace them with _$_; then in the 2nd step you delete the old $s.
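The two steps above can be sketched outside Studio as follows, purely as an illustration. Python's `re` module only accepts fixed-width lookbehinds, so the `[0-9]+` inside the lookbehind is approximated here with one lookbehind per digit count (1 to 3 digits, which is my assumption, not part of the original recipe). The names `STEP1`, `STEP2` and `two_step` are invented for the example.

```python
import re

# Step 1: a space preceded by "$" and 1 to 3 digits.
# Each alternative is its own fixed-width lookbehind (a Python re restriction).
STEP1 = re.compile(r"(?:(?<=\$[0-9])|(?<=\$[0-9]{2})|(?<=\$[0-9]{3})) ")

# Step 2: a "$" that is directly followed by a digit.
STEP2 = re.compile(r"\$(?=[0-9])")


def two_step(text: str) -> str:
    """Apply both replacements in order, without any backreference."""
    text = STEP1.sub(" $ ", text)  # add the new "$" after the number
    return STEP2.sub("", text)     # delete the old "$" before the number


print(two_step("Pay $4 now and $125 later"))
# Pay 4 $ now and 125 $ later

# When the stated precondition fails (no space after the number),
# step 1 does nothing and step 2 still deletes the "$":
print(two_step("Pay $4."))
# Pay 4.
```

The second call shows why the conditions listed above matter: with any other character after the number, the currency sign is lost rather than moved.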


[Edited at 2011-07-14 15:21 GMT]

[Edited at 2011-07-14 15:22 GMT]


 

Jacques DP  Identity Verified
Switzerland
Local time: 02:20
Member (2003)
English to French
TOPIC STARTER
Thanks István but Jul 14, 2011

Thanks István, but a number can be followed by a comma, a period, nothing, a closing bracket, a colon, etc. My first thought when I learned about lookahead and lookbehind was that they would make it possible to work around the absence of references to matched expressions, but I soon realized this was actually not the case.

 

István Hirsch  Identity Verified
Local time: 02:20
English to Hungarian
In that case... Jul 14, 2011

...I would replace $ with ß$ in Studio, and after saving, would open the file in Notepad++ and
(regex) Find: (ß\$)([0-9]+) Replace with: \2_$
("Save" in Notepad++, open in Studio, delete ß if any are left).
The ß was needed to restrict the changes to the target text in Notepad++.
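A rough sketch of this marker idea in Python, with the dollar sign escaped in the pattern since an unescaped $ is an end-of-line anchor in most regex flavours. The two-line "file" is a made-up stand-in for the real SDLXLIFF, and the helper names are hypothetical; the point is only that unmarked (source) text is left alone.

```python
import re

MARK = "ß"  # marker character suggested in the post


def mark_dollars(target_segment: str) -> str:
    """Plain replace of "$" with "ß$", as would be done on target text in Studio."""
    return target_segment.replace("$", MARK + "$")


def swap_marked(file_text: str) -> str:
    """Move the "$" after the number, but only where the marker is present."""
    swapped = re.sub(MARK + r"\$([0-9]+)", r"\1 $", file_text)
    # Drop the marker wherever it is still attached to a "$" (no digits followed).
    return swapped.replace(MARK + "$", "$")


# Made-up stand-in for a bilingual file: source untouched, target marked.
file_text = "source: Price: $4\ntarget: " + mark_dollars("Prix : $4")
print(swap_marked(file_text))
# source: Price: $4
# target: Prix : 4 $
```

Only the marker-plus-dollar pair is cleaned up at the end, so a genuine ß elsewhere in the text is not touched.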


 

Jacques DP  Identity Verified
Switzerland
Local time: 02:20
Member (2003)
English to French
TOPIC STARTER
Thanks all Jul 15, 2011

Thanks all for your replies and suggestions (including FarkasAndras). István, you are right that using a plain text editor is always a possibility. However it is not exactly convenient and it is always a bit dangerous to mess with the underlying files...

 

FarkasAndras
Local time: 02:20
English to Hungarian
+ ...
But... Jul 15, 2011

Jacques DP wrote:

I noted in particular that including only a lookbehind construct in Find (no main regular expression) doesn't work as expected as it matches (and replaces) one character which I think should not be the case.


I just tested this and you're right. As far as I'm concerned, that's definitely a bug in the regex engine.
I even tried adding a \b after the lookbehind so that it's not on its own, and Studio still deleted the character following the number.
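For reference, this is how a lookbehind on its own behaves in an engine that treats it as zero-width, sketched with Python's `re` module (a different engine from Studio's, so this only shows the expected semantics, not Studio's behaviour). The pattern uses a single digit because Python requires fixed-width lookbehinds; the helper names are invented.

```python
import re

# A lookbehind with no main expression: "the position right after $ and one digit".
LOOKBEHIND_ONLY = re.compile(r"(?<=\$[0-9])")


def lookbehind_span(text: str):
    """Return the (start, end) span of the first match, or None."""
    m = LOOKBEHIND_ONLY.search(text)
    return m.span() if m else None


def insert_after_amount(text: str) -> str:
    """Insert " $" at the matched position without consuming any character."""
    return LOOKBEHIND_ONLY.sub(" $", text)


print(lookbehind_span("$4."))      # (2, 2)  -> an empty match
print(insert_after_amount("$4."))  # $4 $.   -> the "." is still there
```

The match is empty (start equals end), so the character following the number survives the replacement.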


 

Jacques DP  Identity Verified
Switzerland
Local time: 02:20
Member (2003)
English to French
TOPIC STARTER
I agree... Jul 15, 2011

...that it's a bug :)

 

SDL Community  Identity Verified
United Kingdom
Local time: 02:20
English
Not a bug... Jul 15, 2011

... a missing feature ;)

Hi both,

Unfortunately you are correct and the regex search only works on the search part of the search and replace, so you always have to use the sort of clever workarounds that István introduces, or work directly on the SDLXLIFF with a decent editor. But as Farkas has pointed out in the past, this can be dangerous if you happen to match something unintended and break the file.

I did investigate the Batch Search and Replace application on the OpenExchange here:

SDL Batch Find/Replace : http://tinyurl.com/SDLbatchfind-replace

But this also seems to have a problem with this and it looks as though named groups are not supported in this application.

So, we have added an item to the Product Backlog for regex support in the Replace component of Search and Replace for Studio to be added... and we have contacted the developer of the OpenExchange application to see what can be done here too.

I did have an idea though. I used the SDLXLIFF Converter to export the SDLXLIFF to MSWord and then used the S&R with wildcards in there as follows:

Search ($)(?)
Replace \2 \1

And that correctly transformed your simpler example (I'm not clever enough to tackle the other one) so I could then import the amended SDLXLIFF back into Studio with the SDLXLIFF Converter. So, another workaround, but maybe useful?
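To illustrate what that wildcard search does, here is an approximate Python equivalent. In Word's wildcard syntax `?` stands for any single character, so `($)(?)` is rendered below as `(\$)(.)`; this mapping and the function name are my own assumptions for the sketch, not something taken from Word's documentation.

```python
import re

# Approximation of Word's wildcard search ($)(?): a "$" and the one character after it.
WORD_STYLE = re.compile(r"(\$)(.)")


def word_style_swap(text: str) -> str:
    """Swap the "$" and the following character, with a space in between."""
    return WORD_STYLE.sub(r"\2 \1", text)


print(word_style_swap("$4"))   # 4 $
print(word_style_swap("$45"))  # 4 $5
```

The second call shows why this only covers the simpler example: with more than one digit, the dollar sign lands in the middle of the number.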

Regards

Paul


 

Jacques DP  Identity Verified
Switzerland
Local time: 02:20
Member (2003)
English to French
TOPIC STARTER
One missing feature and one bug, actually Jul 15, 2011

Hi Paul,

Well, in this thread we mentioned one missing feature, namely backreferences, and one bug, namely the fact that a lookaround (lookahead or lookbehind) construct alone in the Find field uses (and replaces) one character, which is wrong by the definition of lookaround constructs.

SDL Community wrote:

so you always have to use the sort of clever workarounds that István introduces


Note that lookaround constructs do not generally allow one to solve the challenges created by the absence of backreferences. On the other hand, if we have backreferences, then lookaround constructs are a slight convenience but not a necessity. There is no equivalence between the two: backreferences allow arbitrary manipulation of matched substrings, while lookaround constructs (which were introduced late in the history of regular expressions) simply allow one to drop some backreferences.
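A small sketch of that last point, using Python's `re` module and an invented task (putting a space between the $ and the number). Both functions are hypothetical; they are there only to show a case where lookarounds let you drop the backreferences.

```python
import re


def space_with_backrefs(text: str) -> str:
    """Capture both sides and write them back with a space in between."""
    return re.sub(r"(\$)([0-9])", r"\1 \2", text)


def space_with_lookaround(text: str) -> str:
    """Match only the empty position between "$" and a digit; no groups needed."""
    return re.sub(r"(?<=\$)(?=[0-9])", " ", text)


print(space_with_backrefs("Total: $45."))    # Total: $ 45.
print(space_with_lookaround("Total: $45."))  # Total: $ 45.
```

The two give the same result here because nothing is reordered. Moving the $ to the other side of the number means writing the matched digits back in a new position, which a single replace can only do with a backreference.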

Thanks.


 

SDL Community  Identity Verified
United Kingdom
Local time: 02:20
English
Noted. Jul 15, 2011

Thanks Jacques.

 