Regex glossaries?
Thread poster: Piotr Bienkowski

Piotr Bienkowski  Identity Verified
Poland
Local time: 13:18
Member (2005)
English to Polish
+ ...
Jan 28, 2016

I want to learn something new but can't find the resources.

I want to use the regex glossary feature to quickly insert all-uppercase words such as ZORDRETSPECSALES. So I created a glossary and entered:

[A-Z]+{tab}$0

I reloaded the glossary, but nothing happens, i.e. the uppercase word is not highlighted. Is the regex wrong, or is it something else?

{tab} represents the tab character.


 

FarkasAndras
Local time: 13:18
English to Hungarian
+ ...
? Jan 28, 2016

I don't know CT, so I can't offer much help; I'm sure CT users will show up with insider info any minute now.
Still:
1) A-Z usually matches only the unaccented Latin letters, i.e. it won't match É or Ł. This varies from tool to tool and should be tested. Possible expressions for matching all uppercase letters include [[:upper:]]+ and \p{Uppercase}+. Check the CT docs for info.
2) What's the $0? In a pattern, $ usually stands for "end of string". $1 stands for "the first captured group", but that requires putting parentheses around something earlier in the pattern, e.g. ([A-Z]+){tab}$1
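
For anyone testing this outside the tool: the sketch below shows how $0 and $1 behave in a replacement string under Java's java.util.regex. That CafeTran's regex glossary follows the same rules is an assumption, not something confirmed in this thread. The class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class BackreferenceDemo {
    // Replace every run of ASCII capitals, no capturing group in the pattern.
    static String replaceCaps(String text, String replacement) {
        return Pattern.compile("[A-Z]+").matcher(text).replaceAll(replacement);
    }

    // Same pattern with explicit parentheses, so that group 1 exists.
    static String replaceCapsGrouped(String text, String replacement) {
        return Pattern.compile("([A-Z]+)").matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String text = "see ZORDRETSPECSALES now";

        // $0 is the whole match and needs no parentheses.
        System.out.println(replaceCaps(text, "<$0>"));        // see <ZORDRETSPECSALES> now

        // $1 is the first capturing group, so the pattern must define one.
        System.out.println(replaceCapsGrouped(text, "<$1>")); // see <ZORDRETSPECSALES> now

        // Referring to $1 without a group is an error in Java.
        try {
            replaceCaps(text, "<$1>");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("no group 1 in the pattern");
        }
    }
}
```

Under these rules, [A-Z]+{tab}$0 and ([A-Z]+){tab}$1 would produce the same target text, so the $0 alone is unlikely to be what breaks the entry.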

[Edited at 2016-01-28 16:13 GMT]


 

Selcuk Akyuz  Identity Verified
Turkey
Local time: 14:18
Member (2006)
English to Turkish
+ ...
nontranslatables Jan 28, 2016

Hi Piotr,

I think you can use nontranslatables (shortcut F4 on Windows) for this purpose as well.

You store your nontranslatables in:
C:\CafeTran Espresso\cafetran\resources\placeables\nontranslatables.txt

I have the following in my list:
|[A-Z][a-z]+\s[A-Z][a-z]+
|[A-Z][a-z]+\s[A-Z][a-z]+\s[A-Z][a-z]+
|[A-Z][A-Z]+
|[a-z]+[A-Z]+[a-z]+
|[A-z]*+[0-9]+[A-z]*+
|((mailto\:|(news|(ht|f)tp(s?))\://){1}\S+)
|[\w-]+([\w-]+\.)+[\w-]+
|[\w-]+@([\w-]+\.)+[\w-]+
|[A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]*+
|[a-z]([a-z0-9]*[A-Z][A-Z0-9]*[a-z]|[A-Z0-9]*[a-z][a-z0-9]*[A-Z])[a-zA-Z0-9]*+
GmbH

I have no idea why your solution does not work.

Is that stored in your regex glossary? If not, a pipe character may be needed:
|[A-Z]+{tab}$0
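
A quick way to sanity-check lines like the ones in the list above is to run them against sample words in plain Java. This is only a sketch: it assumes the leading pipe merely flags a line as a regex and is not part of the pattern, and that the tool uses Java's regex syntax. The helper names are hypothetical.

```java
import java.util.regex.Pattern;

class NontranslatableCheck {
    // Assumption: a leading '|' only marks the line as a regex entry.
    static String stripMarker(String line) {
        return line.startsWith("|") ? line.substring(1) : line;
    }

    // True if the pattern on the given list line matches the whole sample.
    static boolean matchesWhole(String line, String sample) {
        return Pattern.matches(stripMarker(line), sample);
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"|[A-Z][A-Z]+", "ZORDRETSPECSALES"},          // two or more capitals
            {"|[A-Z][A-Z]+", "A"},                         // a single capital is not enough
            {"|[A-Z][a-z]+\\s[A-Z][a-z]+", "New York"},    // two capitalised words
            {"|[a-z]+[A-Z]+[a-z]+", "iPhone"},             // capitals inside a word
            {"|[A-z]*+[0-9]+[A-z]*+", "AB_12"},            // [A-z] also covers [ \ ] ^ _ `
            {"|[A-Za-z]*+[0-9]+[A-Za-z]*+", "AB_12"},      // stricter letter range
            {"GmbH", "GmbH"},                              // literal entry, no marker
        };
        for (String[] c : cases) {
            System.out.println(c[0] + "  vs  " + c[1] + "  ->  " + matchesWhole(c[0], c[1]));
        }
    }
}
```

One thing this makes visible: the range [A-z] spans the ASCII characters between Z and a, so it also accepts an underscore. If only letters are wanted, [A-Za-z] is the narrower choice.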


 

Piotr Bienkowski  Identity Verified
Poland
Local time: 13:18
Member (2005)
English to Polish
+ ...
TOPIC STARTER
I copied your list... Jan 28, 2016

I closed the project and reopened it, but only completely restarting CafeTran did the trick. Now I don't know whether it is your list that works or my regex glossary ;)

 

Selcuk Akyuz  Identity Verified
Turkey
Local time: 14:18
Member (2006)
English to Turkish
+ ...
my list possibly Jan 28, 2016

You can test it: simply close your glossary. If the words are still highlighted, that proves the nontranslatables are working.

A version of your regex, ([A-Z]+) or ([A-Z][A-Z]+) (I don't remember which one now), does highlight matches in a glossary, but you have to right-click on the glossary and select "match case".
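
Why "match case" would matter can be sketched in plain Java. The assumption here, which I have not verified in CafeTran, is that leaving "match case" off behaves like case-insensitive matching; the method name is hypothetical.

```java
import java.util.regex.Pattern;

class MatchCaseDemo {
    // Looks for a run of two or more capitals anywhere in the text.
    // matchCase = false is modelled with Java's CASE_INSENSITIVE flag.
    static boolean found(String text, boolean matchCase) {
        int flags = matchCase ? 0 : Pattern.CASE_INSENSITIVE;
        return Pattern.compile("[A-Z][A-Z]+", flags).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(found("ZORDRETSPECSALES", true));  // true
        System.out.println(found("ordinary words", true));    // false
        // Without case matching, [A-Z] also accepts lowercase letters.
        System.out.println(found("ordinary words", false));   // true
    }
}
```

If the glossary works this way, an uppercase-only pattern is only meaningful with "match case" switched on.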

At first I could not find a regex to highlight É; the following one failed:

|\b[\p{Upper}]\b

-----------------------

Found it :)

|\b[\p{Lu}]\b
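
A possible explanation, assuming CafeTran hands these patterns to Java's java.util.regex (not confirmed in this thread): in Java, \p{Upper} is a POSIX class that covers only A-Z by default, while \p{Lu} is the Unicode "uppercase letter" category. The sketch below leaves out the \b anchors to isolate that difference; the class name is made up.

```java
import java.util.regex.Pattern;

class UppercaseClasses {
    // Whole-string match with default flags.
    static boolean matches(String regex, String s) {
        return Pattern.matches(regex, s);
    }

    // Whole-string match with Unicode versions of the POSIX classes enabled.
    static boolean matchesUnicodeMode(String regex, String s) {
        return Pattern.compile(regex, Pattern.UNICODE_CHARACTER_CLASS).matcher(s).matches();
    }

    public static void main(String[] args) {
        String eAcute = "\u00C9";  // É
        String lStroke = "\u0141"; // Ł

        System.out.println(matches("[A-Z]", eAcute));                  // false
        System.out.println(matches("\\p{Upper}", eAcute));             // false, ASCII only by default
        System.out.println(matches("\\p{Lu}", eAcute));                // true
        System.out.println(matches("\\p{Lu}", lStroke));               // true
        System.out.println(matchesUnicodeMode("\\p{Upper}", eAcute));  // true
    }
}
```

So for accented capitals, \p{Lu}+ (or \p{Lu}\p{Lu}+ for two or more) looks like the safer replacement for [A-Z]+ in this flavor.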






[Edited at 2016-01-28 20:32 GMT]


 

