Pages in topic:   [1 2] >
A specific regular expression for Trados Studio display filter needed
Thread poster: Alex Marshall

Alex Marshall  Identity Verified
United States
Member (2011)
Russian to English
+ ...
Aug 29, 2013

This is probably a no-brainer for people like Jerzy Czopik, but I have to ask this question anyway:

I need the regular expression for the Trados Studio display filter that will filter out all English-only source segments in my Russian-English file. The file contains multiple source segments that are already in English (and do not need translating as such). I'd like to copy them all to target and confirm just to get them out of the way.

I know there is a regular expression for this (I did this once before), but now it eludes me.

Thanks!



SDL Community  Identity Verified
United Kingdom
Local time: 02:52
English
Not tested in Studio yet... Aug 29, 2013

... but perhaps you could use something like this;

[^\p{IsCyrillic}]+

Regards

Paul
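One thing to watch with the negated class: if the display filter keeps every segment in which the pattern is found somewhere, then [^\p{IsCyrillic}]+ is satisfied by a single space, digit, or Latin letter, and most Russian segments contain at least one. Anchoring the pattern to the whole segment avoids that. Below is a minimal Python sketch of the idea, not a tested Studio recipe. Python's re module does not support \p{IsCyrillic}, so the range [\u0400-\u04FF] stands in for it; the sample segments and the is_cyrillic_free helper are made up for illustration. Whether Studio's filter treats ^ and $ the same way is an assumption to verify in Studio.

```python
import re

# Stand-in for ^[^\p{IsCyrillic}]+$ : Python's re has no \p{...} blocks,
# so the Cyrillic block is written as an explicit code point range.
NO_CYRILLIC = re.compile(r"^[^\u0400-\u04FF]+$")


def is_cyrillic_free(segment: str) -> bool:
    """True if the segment is non-empty and has no character in U+0400..U+04FF."""
    return NO_CYRILLIC.search(segment) is not None


# Hypothetical source segments from a mixed Russian/English file.
segments = [
    "Press OK to continue.",
    "Нажмите OK для продолжения.",
    "Версия 2.0",
]

cyrillic_free = [s for s in segments if is_cyrillic_free(s)]
print(cyrillic_free)
```

Only the first sample passes, because the other two contain at least one Cyrillic letter even though they also contain Latin letters or digits.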


E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
The same ones Aug 29, 2013

[\u0400-\u04FF]+

(Cyrillic Range of Unicode)
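To make clear what this range selects, here is a small Python sketch. It assumes the filter shows every segment in which the pattern is found anywhere ("containing" behavior); the sample segments and the contains_cyrillic helper are hypothetical. Under that assumption the pattern shows the segments that contain Cyrillic, so the English-only segments are the ones it does not show.

```python
import re

# Same range as above: the Cyrillic block U+0400..U+04FF.
CYRILLIC_RUN = re.compile(r"[\u0400-\u04FF]+")


def contains_cyrillic(segment: str) -> bool:
    """True if at least one Cyrillic character occurs anywhere in the segment."""
    return CYRILLIC_RUN.search(segment) is not None


# Hypothetical source segments.
segments = [
    "Настройки",
    "Settings",
    "Откройте файл README",
    "v1.2.3",
]

shown = [s for s in segments if contains_cyrillic(s)]       # what the filter displays
hidden = [s for s in segments if not contains_cyrillic(s)]  # the English-only rest
print(shown)
print(hidden)
```

Note that the mixed segment lands in the "shown" group, since one Cyrillic letter is enough for a match.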


E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
You'd better Aug 29, 2013

try it first, before posting it.


A negated class ("the negative") often matches far too much to control.


Are you sure? Really?

Mine are tested as always.



[EDIT]
On second thought, "in a sense" you (your code) could be right and I could be wrong (though I tested/proved it).


[Edited at 2013-08-29 22:54 GMT]



Alex Marshall  Identity Verified
United States
Member (2011)
Russian to English
+ ...
TOPIC STARTER
Thanks, Jaesang Aug 29, 2013

This seems to work... in a sense.
What would be the Latin range of Unicode? If I wanted to eliminate all Cyrillic segments?


E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
You know the rules. Aug 29, 2013

So you can look it up on the Unicode website.

Regards






[a-zA-Z]

But test it again and again to be sure.

[Edited at 2013-08-29 22:59 GMT]



Alex Marshall  Identity Verified
United States
Member (2011)
Russian to English
+ ...
TOPIC STARTER
[\u0000-\u007F]+ doesn't work Aug 29, 2013

It appears that the basic Latin range [\u0000-\u007F]+ doesn't work.

Neither does [\u0400-\u04FF]+ for Cyrillic, for that matter. It eliminates some of the segments, but does so very inconsistently.

I remember using a very simple regex that included [Latin] or something.
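A possible explanation for the inconsistent results with the basic Latin range: spaces, digits, and most punctuation are ASCII too, so an unanchored [\u0000-\u007F]+ is found in almost any Russian segment that has more than one word. The Python sketch below illustrates this with made-up segments; the "found anywhere" behavior of the filter is an assumption, and the anchored variant is only a suggestion to try.

```python
import re

ASCII_RUN = re.compile(r"[\u0000-\u007F]+")     # found anywhere in the segment
ASCII_ONLY = re.compile(r"^[\u0000-\u007F]+$")  # must cover the whole segment

# Hypothetical source segments.
segments = [
    "Hello, world!",
    "Привет, мир!",  # the comma, space, and "!" are ASCII
    "Привет",        # no ASCII characters at all
]

found_anywhere = [s for s in segments if ASCII_RUN.search(s)]
ascii_only = [s for s in segments if ASCII_ONLY.search(s)]
print(found_anywhere)
print(ascii_only)
```

The unanchored pattern skips only the one-word Russian segment, which would look erratic in a real file; the anchored one keeps just the segment that is ASCII from start to end.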


E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
\w Aug 29, 2013

Is this what you mean?

(Not tested, though.)



Alex Marshall  Identity Verified
United States
Member (2011)
Russian to English
+ ...
TOPIC STARTER
\w Aug 29, 2013

It does not work with Trados Studio. As SDL Support said earlier, it was not tested... Well, I tested it, and this regular expression does not work with Studio.

E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
So, what do we have now ? Aug 29, 2013

How about

\p{L}

very basic.


E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
Maybe Aug 29, 2013

it is SDL's job, I guess.


The filter needs a "does not match" condition, because right now we only have a "matches" condition.





[EDIT]
The segment "Display Filter" needs a "Not Containing" condition too, because right now it only has "Containing".

[Edited at 2013-08-30 02:12 GMT]
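Until such a condition exists, a negative lookahead is one way to express "not containing" inside a pattern that still has to match. The sketch below is in Python with hypothetical sample segments. Lookahead is supported by the .NET regex engine, but whether Studio's display filter accepts it is an assumption I have not verified.

```python
import re

# "Does not contain Cyrillic": the lookahead (?!...) fails the match as soon as
# any character in U+0400..U+04FF can be found ahead of the start position.
NOT_CONTAINING_CYRILLIC = re.compile(r"^(?!.*[\u0400-\u04FF]).+$", re.DOTALL)

# Hypothetical source segments.
segments = [
    "Copy to target",
    "Скопировать",
    "File Файл",
    "100%",
]

english_only = [s for s in segments if NOT_CONTAINING_CYRILLIC.search(s)]
print(english_only)
```

The mixed segment is rejected along with the all-Russian one, which is the behavior wanted for copying English-only source segments to target.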


E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
Have you tried Aug 29, 2013

[a-zA-Z]

?




or


[^a-zA-Z]

It works fine for me (Korean).

[Edited at 2013-08-29 23:24 GMT]



Alex Marshall  Identity Verified
United States
Member (2011)
Russian to English
+ ...
TOPIC STARTER
This: [a-zA-Z]... Aug 29, 2013

... works a little better, but still too erratically to be reliable (skips some of the all-Cyrillic segments, but still leaves many of them).

\p{L} doesn't work.

[^a-zA-Z] doesn't work at all

Thanks for trying, Jaesang!


E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
Keep on. Aug 29, 2013

Gotta go.

If you find it, let me know.

Good Luck.


E_Kelly
South Korea
Local time: 09:22
Member
English to Korean
Have to make it clear. Aug 30, 2013

Korean [\uAC00-\uD7A3]
ASCII [\u0000-\u007F]
Cyrillic [\u0400-\u04FF]

The above work fine with SDL Trados Studio segment filtering.


The following, however, produce an unwanted error message:

Korean \p{IsHangul}
ASCII \p{IsLatin} or \p{IsASCII} or \p{IsAlphabetic} or the like



-End-




[EDIT]
You have to know the difference between a pattern that raises an error and one that runs but works in vain.

[Edited at 2013-08-30 00:29 GMT]
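A likely reason for the error messages: as far as I know, the .NET regex engine only accepts its own block names, such as \p{IsBasicLatin}, \p{IsCyrillic}, and \p{IsHangulSyllables}, and rejects names like \p{IsHangul} or \p{IsLatin}. Explicit ranges sidestep the naming question. As a quick way to check which of the three ranges listed above actually occur in a problem segment, here is a small Python sketch; the scripts_in helper and the samples are hypothetical and only cover those three ranges.

```python
# The three code point ranges quoted in the post above.
RANGES = {
    "Korean": (0xAC00, 0xD7A3),
    "ASCII": (0x0000, 0x007F),
    "Cyrillic": (0x0400, 0x04FF),
}


def scripts_in(segment: str) -> set[str]:
    """Return the names of the listed ranges that have a character in the segment."""
    found = set()
    for ch in segment:
        for name, (low, high) in RANGES.items():
            if low <= ord(ch) <= high:
                found.add(name)
    return found


# Hypothetical segments to inspect.
for sample in ["Версия 2.0", "확인", "OK 확인"]:
    print(sample, sorted(scripts_in(sample)))
```

A segment that reports more than one range is mixed, and an unanchored single-range pattern will be found in it regardless of which language dominates.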

