Regex glossaries?
Thread poster: Piotr Bienkowski

Piotr Bienkowski
Poland
English to Polish
Jan 28, 2016

I want to learn something new but can't find the resources.

I want to use the regex glossary feature to quickly insert all-uppercase words such as ZORDRETSPECSALES. So I created a glossary and entered:

[A-Z]+{tab}$0

I reloaded the glossary, but nothing happens, i.e. the uppercase word is not highlighted. Is the regex wrong, or is it something else?

{tab} represents the tab character.
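
For readers unfamiliar with this kind of entry, here is a minimal Python sketch of how a line of the form pattern{tab}$0 could be interpreted: the left side is a regex, the right side is a target template in which $0 stands for the whole match. This is an assumption for illustration only; the function name expand_entry is hypothetical, and CafeTran's actual handling may differ.

```python
import re

def expand_entry(entry, segment):
    """Return (source match, target text) pairs for one glossary line.

    Assumes the line is 'regex<TAB>template' and that '$0' in the template
    means 'the whole matched text' (an assumption, not documented CafeTran behaviour).
    """
    pattern, template = entry.split("\t", 1)
    pairs = []
    for m in re.finditer(pattern, segment):
        # Plain str.replace keeps '$0' literal instead of treating it as regex syntax.
        pairs.append((m.group(0), template.replace("$0", m.group(0))))
    return pairs

print(expand_entry("[A-Z]+\t$0", "Use ZORDRETSPECSALES for the report"))
```

Note that [A-Z]+ also picks up the single capital U in "Use"; a pattern such as [A-Z]{2,} would restrict matches to runs of two or more capitals.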


 

FarkasAndras
English to Hungarian
? Jan 28, 2016

I don't know CT, so I can't offer much help. I'm sure CT users will show up with insider info any minute now.
Still:
1) A-Z usually matches only the 26 unaccented Latin capitals, i.e. it won't match É or Ł. This varies from tool to tool and should be tested. Possible expressions for matching all uppercase letters include [[:upper:]]+ and \p{Uppercase}+. Check the CT docs for info.
2) What's the $0? $ usually stands for "end of string". $1 usually stands for "the first captured group", but that would require you to put parentheses around something first, e.g. ([A-Z]+){tab}$1
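
Both points can be checked outside CT. The sketch below uses Python's re module, which is not necessarily the engine CT uses, so treat it as an illustration only. It shows the ASCII range fragmenting accented words, and the difference between the whole match (group 0) and the first parenthesised group (group 1). Some engines, Java's among them, accept $0 in replacement strings for the whole match; whether CT does is an assumption to verify.

```python
import re

text = "ŁÓDŹ ÉCOLE ZORDRETSPECSALES"

# The ASCII range A-Z skips accented capitals, so such words get fragmented.
ascii_hits = re.findall(r"[A-Z]+", text)

# Group 0 is always the whole match; group 1 exists only because of the parentheses.
m = re.search(r"([A-Z]+)SALES", text)
whole, first_group = m.group(0), m.group(1)

print(ascii_hits)
print(whole, first_group)
```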

[Edited at 2016-01-28 16:13 GMT]


 

Selcuk Akyuz
Turkey
English to Turkish
nontranslatables Jan 28, 2016

Hi Piotr,

I think you can use nontranslatables (shortcut F4 on Windows) for this purpose as well.

You store your nontranslatables in:
C:\CafeTran Espresso\cafetran\resources\placeables\nontranslatables.txt

I have the following in my list:
|[A-Z][a-z]+\s[A-Z][a-z]+
|[A-Z][a-z]+\s[A-Z][a-z]+\s[A-Z][a-z]+
|[A-Z][A-Z]+
|[a-z]+[A-Z]+[a-z]+
|[A-z]*+[0-9]+[A-z]*+
|((mailto\:|(news|(ht|f)tp(s?))\://){1}\S+)
|[\w-]+([\w-]+\.)+[\w-]+
|[\w-]+@([\w-]+\.)+[\w-]+
|[A-Z]([A-Z0-9]*[a-z][a-z0-9]*[A-Z]|[a-z0-9]*[A-Z][A-Z0-9]*[a-z])[A-Za-z0-9]*+
|[a-z]([a-z0-9]*[A-Z][A-Z0-9]*[a-z]|[A-Z0-9]*[a-z][a-z0-9]*[A-Z])[a-zA-Z0-9]*+
GmbH

I have no idea why your solution does not work.

Is that stored in your regex glossary? If not, a leading pipe character may be needed:
|[A-Z]+{tab}$0
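
Entries from the list above can be tried outside CafeTran before restarting it. The following Python sketch assumes that the leading pipe only marks a line as a regex and is not part of the pattern itself (an assumption based on this post). It also shows a side effect of the [A-z] range used in one entry: it covers six punctuation characters that sit between Z and a in ASCII.

```python
import re

# [A-z] spans ASCII codes 65..122, which includes [ \ ] ^ _ ` between Z and a.
loose = re.compile(r"[A-z]")
strict = re.compile(r"[A-Za-z]")
extras = [ch for ch in "[\\]^_`" if loose.fullmatch(ch) and not strict.fullmatch(ch)]

# Two entries from the list (leading pipe removed), tried on sample strings.
all_caps = re.compile(r"[A-Z][A-Z]+")
email = re.compile(r"[\w-]+@([\w-]+\.)+[\w-]+")

print(extras)
print(all_caps.fullmatch("ZORDRETSPECSALES") is not None)
print(email.fullmatch("info@example.com") is not None)
```

If the punctuation matches are unwanted, [A-Za-z] is the narrower range.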


 

Piotr Bienkowski
Poland
English to Polish
TOPIC STARTER
I copied your list... Jan 28, 2016

I closed the project and reopened it, but only completely restarting CafeTran did the trick. Now I don't know if it is your list that works or my regex glossary ;-)

 

Selcuk Akyuz
Turkey
English to Turkish
my list possibly Jan 28, 2016

You can test it: simply close your glossary, and if the words are still highlighted, that proves the nontranslatables are working.

A version of your regex, ([A-Z]+) or ([A-Z][A-Z]+) (I don't remember which one now), highlights matches, but you have to right-click on the glossary and select Match case.

I could not find a regex to highlight É; the following one failed:

|\b[\p{Upper}]\b

-----------------------

Found it :-)

|\b[\p{Lu}]\b
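
A possible explanation, assuming CafeTran relies on Java's regex engine: there, \p{Upper} covers only ASCII A-Z by default, while \p{Lu} is the Unicode "uppercase letter" category and so includes É and Ł. As written, the pattern matches a single capital standing alone; adding + after the bracket would cover whole words. Python's built-in re module has no \p{...} syntax, so the sketch below emulates the Lu test with unicodedata; the helper names are hypothetical.

```python
import unicodedata

def is_lu(ch):
    """True if ch is in Unicode general category Lu (uppercase letter)."""
    return unicodedata.category(ch) == "Lu"

def all_caps_words(text):
    """Whitespace-separated words that consist only of Lu characters."""
    # NFC normalisation keeps accented capitals as single code points.
    text = unicodedata.normalize("NFC", text)
    return [w for w in text.split() if all(is_lu(c) for c in w)]

print(is_lu("É"), is_lu("Ł"), is_lu("e"))
print(all_caps_words("ŁÓDŹ ÉCOLE Paris ZORDRETSPECSALES"))
```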






[Edited at 2016-01-28 20:32 GMT]


 

