Searching TM with RegEx
Thread poster: bluegrasstrans
bluegrasstrans  Identity Verified
Austria
Local time: 03:56
German to English
Apr 16, 2014

Hello!

Is it possible to search through a TM in the Translation Memories view using regular expressions in Studio 2014? Is this possible in the latest version of WorldServer? If so, how?

Thanks!



Jonathan Hopkins  Identity Verified
Germany
Local time: 03:56
German to English
+ ...
Regex missing from WorldServer 10.1.207 Apr 16, 2014

Hi,

We're in the process of upgrading WorldServer from 10.1 to 10.4, so I hope that regex will be supported there. In our current version of WS, though, regex doesn't seem to be supported at all in either the search or the search-and-replace fields.

I also just tried using regex in the Translation Memories View of Studio 2014 and was disappointed to see that it does not work there either.

You can search an xliff file (in the editor of Studio 2014) using regex, but in order to search and replace in xliff files you actually have to use the Batch Find and Replace app from OpenExchange. When trying to search and replace in Studio's editor, it *appears* at first glance to do something, but in fact, all it does is find all the matching characters and change their segment status to draft without actually replacing the character you searched for.

I'd also be interested to hear a response if later versions of WS support regex, though I guess I'll find out soon enough once we've completed the upgrade.

Cheers,
Jon


FarkasAndras
Local time: 03:56
English to Hungarian
+ ...
Usually not Apr 16, 2014

Most tools can't do regex TM searches. Xbench and TMLookup have this feature.
Note that one of the reasons regex search is not widely supported is that it is resource intensive (=slow). The software needs to iterate through each TM entry one by one instead of using the various optimized methods it uses for normal searches. If the TM contains a few thousand entries, this is not a problem. At a few hundred thousand or a few million, it can become crippling.
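
To illustrate the point about iterating through each entry, here is a minimal Python sketch of a brute-force regex search over a TM exported to TMX. It is not how any of the tools mentioned work internally; the function name `regex_search_tmx` and the tiny sample are made up for illustration, and the sample leaves out the header attributes a real TMX file carries.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def regex_search_tmx(tmx_text, pattern, lang):
    """Linear scan: test every translation unit's `lang` segment against the regex.

    Hypothetical helper. Returns one {language: text} dict per matching unit.
    """
    rx = re.compile(pattern)
    root = ET.fromstring(tmx_text)
    hits = []
    for tu in root.iter("tu"):  # every entry is visited, no index is used
        segs = {}
        for tuv in tu.iter("tuv"):
            seg = tuv.find("seg")
            # itertext() flattens inline tags inside <seg> to plain text
            segs[tuv.get(XML_LANG)] = "".join(seg.itertext()) if seg is not None else ""
        if rx.search(segs.get(lang, "")):
            hits.append(segs)
    return hits


# Simplified sample for illustration only.
SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="de-DE"><seg>Größe 3×4 cm</seg></tuv><tuv xml:lang="en-US"><seg>Size 3×4 cm</seg></tuv></tu>
<tu><tuv xml:lang="de-DE"><seg>Hallo Welt</seg></tuv><tuv xml:lang="en-US"><seg>Hello world</seg></tuv></tu>
</body></tmx>"""

hits = regex_search_tmx(SAMPLE_TMX, r"\d×\d", "en-US")
```

The cost grows with the number of entries because the pattern has to be tried against each segment in turn, which is the slowdown described above.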



SDL Community  Identity Verified
United Kingdom
Local time: 03:56
English
Search & Replace Apr 16, 2014

Hi,

You can search and replace with regex in Studio 2014. But only in the files you have open in the Editor.

You can search across multiple files with regex but not replace with regex, using the Batch Find & Replace app that is installed with Studio 2014, and also only the target segments.

You can search across multiple files and replace with regex using the SDLXLIFF Toolkit from the OpenExchange, and this works in both source and target.

So one alternative for your clean up operations might be to use the SDLTmConvert app from the OpenExchange ( http://wp.me/p2xDjK-mq ) as you can convert your SDLTM into a bunch of XLIFF files, then create a project in Studio with these files. You then add your SDLTM and use regex in Studio or on the SDLXLIFFs to make your corrections and then update the TM with the changes, overwriting the values in the SDLTM as you update.

Not perfect, and it does carry some risks if you do a batch update, but perhaps this would work for you?
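
One way to reduce the risk of a batch update would be to preview what a regex replacement would change before touching the files. The Python sketch below is only an illustration under assumptions: it reads generic XLIFF 1.2 `<target>` elements, it is not part of Studio or any OpenExchange app, the name `preview_target_changes` is made up, and real SDLXLIFF files carry additional SDL-specific markup that this does not handle.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of plain XLIFF 1.2 documents (assumption: the files use it).
XLIFF_NS = "{urn:oasis:names:tc:xliff:document:1.2}"


def preview_target_changes(xliff_text, pattern, replacement):
    """Dry run: list (old, new) target texts that the replacement would alter.

    Hypothetical helper. Nothing is written back to the file.
    """
    rx = re.compile(pattern)
    root = ET.fromstring(xliff_text)
    changes = []
    for target in root.iter(XLIFF_NS + "target"):
        old = "".join(target.itertext())  # inline tags are flattened to text
        new = rx.sub(replacement, old)
        if new != old:
            changes.append((old, new))
    return changes


# Simplified sample for illustration only.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
<file original="sample" source-language="de-DE" target-language="en-US" datatype="plaintext"><body>
<trans-unit id="1"><source>Größe 3×4 cm</source><target>Size 3×4 cm</target></trans-unit>
<trans-unit id="2"><source>Hallo</source><target>Hello</target></trans-unit>
</body></file></xliff>"""

changes = preview_target_changes(SAMPLE_XLIFF, r"(?<=\d)×(?=\d)", " x ")
```

Reviewing the list of pairs first makes it easier to spot an over-eager pattern before the TM is overwritten.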

Regards

Paul


bluegrasstrans  Identity Verified
Austria
Local time: 03:56
German to English
TOPIC STARTER
Thanks for the info! Apr 17, 2014

I appreciate all the help. Although it would be a nice feature, I understand now that it would be a handful to implement, especially for large memories. I will definitely check out the app mentioned by Paul and hope that meets my requirements. And Jonathan, maybe when the upgrade is finished, you could report back here about whether RegEx is possible in the WS memory?

Thanks again to everyone!



Jonathan Hopkins  Identity Verified
Germany
Local time: 03:56
German to English
+ ...
Replacing regex-matched characters doesn't work in Studio editor Apr 17, 2014

Hi Paul,

Thanks for the link to your blog post. That's definitely an interesting idea which could come in handy. I'll keep that in mind for when I may need it.

Would you mind expanding on what you mean here:

SDL Support wrote:

Hi,

You can search and replace with regex in Studio 2014. But only in the files you have open in the Editor.

You can search across multiple files with regex but not replace with regex, using the Batch Find & Replace app that is installed with Studio 2014, and also only the target segments.

You can search across multiple files and replace with regex using the SDLXLIFF Toolkit from the OpenExchange, and this works in both source and target.


Maybe it's just me, but I'm not really sure of the distinctions you're trying to make here. Are you aware that you can't replace characters in the Studio editor (2014) that are matched using regex? If I search for characters using regex in the editor, it finds them all right, but it behaves oddly when trying to replace these matched characters using the search and replace (CTRL+H) function.

If, for example, I want to match only instances of a multiplication sign that is directly preceded by a digit (without a space) and immediately followed by a digit (again, no space), this regex (?



Jonathan Hopkins  Identity Verified
Germany
Local time: 03:56
German to English
+ ...
Message cut off Apr 17, 2014

Strange. For some reason my message has been truncated. It seems to have something to do with the regex. Are these illegal entities here?

If, for example, I want to match only instances of a multiplication sign that is directly preceded by a digit (without a space) and immediately followed by a digit (again, no space), this regex [regex should have appeared here] finds them just fine once I select "find next":



But, when I select "Replace All" to replace the multiplication sign with a standard x surrounded by a space on either side, all Studio does is find all instances of the multiplication sign and change their segment status to draft without replacing the matched character:
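
Since the regex itself was lost from the post, the following Python sketch only shows what a pattern of that kind could look like, using a lookbehind and a lookahead so that just the multiplication sign is matched and replaced. The pattern is an assumption, not the one originally posted, and it says nothing about how Studio's own Replace All behaves.

```python
import re

# Assumed pattern: a multiplication sign with a digit directly before and after.
# The lookarounds are zero-width, so only the sign itself is replaced.
MULT_BETWEEN_DIGITS = re.compile(r"(?<=\d)×(?=\d)")


def normalize_multiplication(text):
    """Replace digit×digit with 'digit x digit'; leave other × signs alone."""
    return MULT_BETWEEN_DIGITS.sub(" x ", text)


result = normalize_multiplication("Panel 10×20 mm, a × b")
```

Here `result` keeps the already-spaced `a × b` untouched, because that sign has no digit on either side.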



@bluegrasstrans: I'll try to remember to update you on WS 10.4 functionality once our system has been upgraded, unless of course someone from SDL would care to tell us in advance.



